1
Pezzi PH, Wheeler LC, Freitas LB, Smith SD. Incomplete lineage sorting and hybridization underlie tree discordance in Petunia and related genera (Petunieae, Solanaceae). Mol Phylogenet Evol 2024; 198:108136. [PMID: 38909873 DOI: 10.1016/j.ympev.2024.108136] [Received: 03/27/2024] [Revised: 06/06/2024] [Accepted: 06/17/2024] [Indexed: 06/25/2024]
Abstract
Despite the overarching history of species divergence, phylogenetic studies often reveal distinct topologies across regions of the genome. The sources of these gene tree discordances are variable, but incomplete lineage sorting (ILS) and hybridization are among those with the most biological importance. Petunia serves as a classic system for studying hybridization in the wild. While field studies suggest that hybridization is frequent, the extent of reticulation within Petunia and its closely related genera has never been examined from a phylogenetic perspective. In this study, we used transcriptomic data from 11 Petunia, 16 Calibrachoa, and 10 Fabiana species to illuminate the relationships between these species and investigate whether hybridization played a significant role in the diversification of the clade. We inferred that gene tree discordance within genera is linked to hybridization events along with high levels of ILS due to their rapid diversification. Moreover, network analyses estimated deeper hybridization events between Petunia and Calibrachoa, genera that have different chromosome numbers. Although these genera cannot hybridize at the present time, ancestral hybridization could have played a role in their parallel radiations, as they share the same habitat and life history.
Affiliation(s)
- Pedro H Pezzi
  - Department of Genetics, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Lucas C Wheeler
  - Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, USA
- Loreta B Freitas
  - Department of Genetics, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Stacey D Smith
  - Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, USA
2
Schreiber M, Jayakodi M, Stein N, Mascher M. Plant pangenomes for crop improvement, biodiversity and evolution. Nat Rev Genet 2024; 25:563-577. [PMID: 38378816 DOI: 10.1038/s41576-024-00691-4] [Accepted: 12/14/2023] [Indexed: 02/22/2024]
Abstract
Plant genome sequences catalogue genes and the genetic elements that regulate their expression. Such inventories further research aims as diverse as mapping the molecular basis of trait diversity in domesticated plants or inquiries into the origin of evolutionary innovations in flowering plants millions of years ago. The transformative technological progress of DNA sequencing in the past two decades has enabled researchers to sequence ever more genomes with greater ease. Pangenomes - complete sequences of multiple individuals of a species or higher taxonomic unit - have now entered the geneticists' toolkit. The genomes of crop plants and their wild relatives are being studied with translational applications in breeding in mind. But pangenomes are applicable also in ecological and evolutionary studies, as they help classify and monitor biodiversity across the tree of life, deepen our understanding of how plant species diverged and show how plants adapt to changing environments or new selection pressures exerted by human beings.
Affiliation(s)
- Mona Schreiber
  - Department of Biology, University of Marburg, Marburg, Germany
- Murukarthick Jayakodi
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
- Nils Stein
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
  - Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
- Martin Mascher
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
3
McKibben MTW, Finch G, Barker MS. Species-tree topology impacts the inference of ancient whole-genome duplications across the angiosperm phylogeny. Am J Bot 2024; 111:e16378. [PMID: 39039654 DOI: 10.1002/ajb2.16378] [Received: 01/03/2024] [Revised: 06/11/2024] [Accepted: 06/12/2024] [Indexed: 07/24/2024]
Abstract
PREMISE: The history of angiosperms is marked by repeated rounds of ancient whole-genome duplications (WGDs). Here we used state-of-the-art methods to provide an up-to-date view of the distribution of WGDs in the history of angiosperms that considers both uncertainty introduced by different WGD inference methods and different underlying species-tree hypotheses. METHODS: We used the distribution of synonymous divergences (Ks) of paralogs and orthologs from transcriptomic and genomic data to infer and place WGDs across two hypothesized angiosperm phylogenies. We further tested these WGD hypotheses with syntenic inferences and Bayesian models of duplicate gene gain and loss. RESULTS: The predicted number of WGDs in the history of angiosperms (~170) based on the current taxon sampling is largely similar across different inference methods, but varies in the precise placement of WGDs on the phylogeny. Ks-based methods often yield alternative hypothesized WGD placements due to variation in substitution rates among lineages. Phylogenetic models of duplicate gene gain and loss are more robust to topological variation. However, errors in species-tree inference can still produce spurious WGD hypotheses, regardless of the method used. CONCLUSIONS: Here we showed that different WGD inference methods largely agree on an average of 3.5 WGDs in the history of individual angiosperm species. However, the precise placement of WGDs on the phylogeny depends on the WGD inference method and tree topology. As researchers continue to test hypotheses regarding the impacts ancient WGDs have on angiosperm evolution, it is important to consider the uncertainty of the phylogeny as well as of WGD inference methods.
Affiliation(s)
- Michael T W McKibben
  - Department of Ecology & Evolutionary Biology, University of Arizona, Tucson, AZ, USA
- Geoffrey Finch
  - Department of Ecology & Evolutionary Biology, University of Arizona, Tucson, AZ, USA
- Michael S Barker
  - Department of Ecology & Evolutionary Biology, University of Arizona, Tucson, AZ, USA
4
Stiller J, Feng S, Chowdhury AA, Rivas-González I, Duchêne DA, Fang Q, Deng Y, Kozlov A, Stamatakis A, Claramunt S, Nguyen JMT, Ho SYW, Faircloth BC, Haag J, Houde P, Cracraft J, Balaban M, Mai U, Chen G, Gao R, Zhou C, Xie Y, Huang Z, Cao Z, Yan Z, Ogilvie HA, Nakhleh L, Lindow B, Morel B, Fjeldså J, Hosner PA, da Fonseca RR, Petersen B, Tobias JA, Székely T, Kennedy JD, Reeve AH, Liker A, Stervander M, Antunes A, Tietze DT, Bertelsen MF, Lei F, Rahbek C, Graves GR, Schierup MH, Warnow T, Braun EL, Gilbert MTP, Jarvis ED, Mirarab S, Zhang G. Complexity of avian evolution revealed by family-level genomes. Nature 2024; 629:851-860. [PMID: 38560995 PMCID: PMC11111414 DOI: 10.1038/s41586-024-07323-1] [Received: 04/25/2023] [Accepted: 03/15/2024] [Indexed: 04/04/2024]
Abstract
Despite tremendous efforts in the past decades, relationships among main avian lineages remain heavily debated without a clear resolution. Discrepancies have been attributed to diversity of species sampled, phylogenetic method and the choice of genomic regions [1-3]. Here we address these issues by analysing the genomes of 363 bird species [4] (218 taxonomic families, 92% of total). Using intergenic regions and coalescent methods, we present a well-supported tree but also a marked degree of discordance. The tree confirms that Neoaves experienced rapid radiation at or near the Cretaceous-Palaeogene boundary. Sufficient loci rather than extensive taxon sampling were more effective in resolving difficult nodes. Remaining recalcitrant nodes involve species that are a challenge to model due to either extreme DNA composition, variable substitution rates, incomplete lineage sorting or complex evolutionary events such as ancient hybridization. Assessment of the effects of different genomic partitions showed high heterogeneity across the genome. We discovered sharp increases in effective population size, substitution rates and relative brain size following the Cretaceous-Palaeogene extinction event, supporting the hypothesis that emerging ecological opportunities catalysed the diversification of modern birds. The resulting phylogenetic estimate offers fresh insights into the rapid radiation of modern birds and provides a taxon-rich backbone tree for future comparative studies.
Affiliation(s)
- Josefin Stiller
  - Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Shaohong Feng
  - Center for Evolutionary & Organismal Biology, Liangzhu Laboratory & Women's Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - Department of General Surgery, Sir Run-Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - Innovation Center of Yangtze River Delta, Zhejiang University, Jiashan, China
- Al-Aabid Chowdhury
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
- David A Duchêne
  - Center for Evolutionary Hologenomics, The Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Qi Fang
  - BGI Research, Shenzhen, China
- Yuan Deng
  - BGI Research, Shenzhen, China
  - BGI Research, Wuhan, China
- Alexey Kozlov
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Alexandros Stamatakis
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
  - Institute of Computer Science, Foundation for Research and Technology Hellas, Heraklion, Greece
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Santiago Claramunt
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario, Canada
  - Department of Natural History, Royal Ontario Museum, Toronto, Ontario, Canada
- Jacqueline M T Nguyen
  - College of Science and Engineering, Flinders University, Adelaide, South Australia, Australia
  - Australian Museum Research Institute, Sydney, New South Wales, Australia
- Simon Y W Ho
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
- Brant C Faircloth
  - Department of Biological Sciences and Museum of Natural Science, Louisiana State University, Baton Rouge, LA, USA
- Julia Haag
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Peter Houde
  - Department of Biology, New Mexico State University, Las Cruces, NM, USA
- Joel Cracraft
  - Department of Ornithology, American Museum of Natural History, New York, NY, USA
- Metin Balaban
  - Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA, USA
- Uyen Mai
  - Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Guangji Chen
  - BGI Research, Wuhan, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Rongsheng Gao
  - BGI Research, Wuhan, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Yulong Xie
  - Center for Evolutionary & Organismal Biology, Liangzhu Laboratory & Women's Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Zijian Huang
  - Center for Evolutionary & Organismal Biology, Liangzhu Laboratory & Women's Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Zhen Cao
  - Department of Computer Science, Rice University, Houston, TX, USA
- Zhi Yan
  - Department of Computer Science, Rice University, Houston, TX, USA
- Huw A Ogilvie
  - Department of Computer Science, Rice University, Houston, TX, USA
- Luay Nakhleh
  - Department of Computer Science, Rice University, Houston, TX, USA
- Bent Lindow
  - Natural History Museum Denmark, University of Copenhagen, Copenhagen, Denmark
- Benoit Morel
  - Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
  - Institute of Computer Science, Foundation for Research and Technology Hellas, Heraklion, Greece
- Jon Fjeldså
  - Natural History Museum Denmark, University of Copenhagen, Copenhagen, Denmark
- Peter A Hosner
  - Natural History Museum Denmark, University of Copenhagen, Copenhagen, Denmark
  - Center for Global Mountain Biodiversity, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Rute R da Fonseca
  - Center for Global Mountain Biodiversity, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Bent Petersen
  - Center for Evolutionary Hologenomics, The Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Centre of Excellence for Omics-Driven Computational Biodiscovery, Faculty of Applied Sciences, AIMST University, Bedong, Malaysia
- Joseph A Tobias
  - Department of Life Sciences, Imperial College London, Silwood Park, Ascot, UK
- Tamás Székely
  - Milner Centre for Evolution, University of Bath, Bath, UK
  - ELKH-DE Reproductive Strategies Research Group, University of Debrecen, Debrecen, Hungary
- Jonathan David Kennedy
  - Center for Macroecology, Evolution, and Climate, The Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Andrew Hart Reeve
  - Natural History Museum Denmark, University of Copenhagen, Copenhagen, Denmark
- Andras Liker
  - HUN-REN-PE Evolutionary Ecology Research Group, University of Pannonia, Veszprém, Hungary
  - Behavioural Ecology Research Group, Center for Natural Sciences, University of Pannonia, Veszprém, Hungary
- Agostinho Antunes
  - CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Porto, Portugal
  - Department of Biology, Faculty of Sciences, University of Porto, Porto, Portugal
- Mads F Bertelsen
  - Centre for Zoo and Wild Animal Health, Copenhagen Zoo, Frederiksberg, Denmark
- Fumin Lei
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - College of Life Science, University of Chinese Academy of Sciences, Beijing, China
- Carsten Rahbek
  - Center for Global Mountain Biodiversity, Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Center for Macroecology, Evolution, and Climate, The Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Institute of Ecology, Peking University, Beijing, China
  - Danish Institute for Advanced Study, University of Southern Denmark, Odense, Denmark
- Gary R Graves
  - Center for Macroecology, Evolution, and Climate, The Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- Tandy Warnow
  - University of Illinois Urbana-Champaign, Champaign, IL, USA
- Edward L Braun
  - Department of Biology, University of Florida, Gainesville, FL, USA
- M Thomas P Gilbert
  - Center for Evolutionary Hologenomics, The Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - University Museum, NTNU, Trondheim, Norway
- Erich D Jarvis
  - Vertebrate Genome Lab, The Rockefeller University, New York, NY, USA
  - Howard Hughes Medical Institute, Durham, NC, USA
- Guojie Zhang
  - Center for Evolutionary & Organismal Biology, Liangzhu Laboratory & Women's Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - Innovation Center of Yangtze River Delta, Zhejiang University, Jiashan, China
  - BGI Research, Wuhan, China
  - Villum Center for Biodiversity Genomics, Department of Biology, University of Copenhagen, Copenhagen, Denmark
5
Lao XL, Meng Y, Wu J, Wen J, Nie ZL. Plastid genomes provide insights into the phylogeny and chloroplast evolution of the paper daisy tribe Gnaphalieae (Asteraceae). Gene 2024; 901:148177. [PMID: 38242378 DOI: 10.1016/j.gene.2024.148177] [Received: 11/13/2023] [Revised: 01/03/2024] [Accepted: 01/16/2024] [Indexed: 01/21/2024]
Abstract
Chloroplast genomes, as an essential source of phylogenetic information, are increasingly utilized in the evolutionary study of angiosperms. Gnaphalieae is a medium-sized tribe of the sunflower family (Asteraceae), with about 2,100 species in 178 genera distributed in temperate habitats worldwide. There has been considerable progress in our understanding of their phylogenetic evolution using both nuclear and chloroplast sequences, but no focus on chloroplast genomic data. In this study, we performed sequencing, assembly, and annotation of 16 representative chloroplast genomes from all the major lineages of Gnaphalieae. Our results showed that the plastomes exhibited a typical circular tetrad structure with similar genomic structure and gene content, but there were differences in genome size, SSRs, and codon usage within the tribe. Phylogenetic analysis revealed that the Relhania clade is the earliest-diverging lineage, with the Lasiopogon clade and the Gnaphalium s.s. clade diverging subsequently. The core group includes the FLAG clade, which is sister to the HAP and Australasian groups. Compared with the outgroup species, the chloroplast genome size of the FLAG clade is much reduced, whereas those of the Australasian, HAP, Gnaphalium s.s., Lasiopogon and Relhania clades are relatively expanded. Insertions and deletions in the intergenic regions associated with repetitive sequence variations are thought to be the main factor leading to length variations in the chloroplast genomes of Gnaphalieae. The comparative analyses of chloroplast genomes provide useful insights into the taxonomic and evolutionary history of Gnaphalieae.
Affiliation(s)
- Xiao-Lin Lao
  - College of Biology and Environmental Sciences, Jishou University, Jishou, Hunan 416000, China
- Ying Meng
  - College of Biology and Environmental Sciences, Jishou University, Jishou, Hunan 416000, China
- Jue Wu
  - College of Biology and Environmental Sciences, Jishou University, Jishou, Hunan 416000, China
- Jun Wen
  - Department of Botany, National Museum of Natural History, Smithsonian Institution, Washington, DC 20013-7012, USA
- Ze-Long Nie
  - College of Biology and Environmental Sciences, Jishou University, Jishou, Hunan 416000, China
6
Dietz L, Mayer C, Stolle E, Eberle J, Misof B, Podsiadlowski L, Niehuis O, Ahrens D. Metazoa-level USCOs as markers in species delimitation and classification. Mol Ecol Resour 2024; 24:e13921. [PMID: 38146909 DOI: 10.1111/1755-0998.13921] [Received: 10/15/2023] [Revised: 12/06/2023] [Accepted: 12/13/2023] [Indexed: 12/27/2023]
Abstract
Metazoa-level universal single-copy orthologs (mzl-USCOs) are universally applicable markers for DNA taxonomy in animals that can replace or supplement single-gene barcodes. Previously, mzl-USCOs from target enrichment data were shown to reliably distinguish species. Here, we tested whether USCOs are an evenly distributed, representative sample of a given metazoan genome and therefore able to cope with past hybridization events and incomplete lineage sorting. This is relevant for coalescent-based species delimitation approaches, which critically depend on the assumption that the investigated loci do not exhibit autocorrelation due to physical linkage. Based on 239 chromosome-level assembled genomes, we confirmed that mzl-USCOs are genetically unlinked for practical purposes and a representative sample of a genome in terms of reciprocal distances between USCOs on a chromosome and of distribution across chromosomes. We tested the suitability of mzl-USCOs extracted from genomes for species delimitation and phylogeny in four case studies: Anopheles mosquitos, Drosophila fruit flies, Heliconius butterflies and Darwin's finches. In almost all instances, USCOs allowed delineating species and yielded phylogenies that corresponded to those generated from whole genome data. Our phylogenetic analyses demonstrate that USCOs may complement single-gene DNA barcodes and provide more accurate taxonomic inferences. Combining USCOs from sources that used different versions of ortholog reference libraries to infer marker orthology may be challenging and, at times, impact taxonomic conclusions. However, we expect this problem to become less severe as the rapidly growing number of reference genomes provides a better representation of the number and diversity of organismal lineages.
Affiliation(s)
- Lars Dietz
  - Museum A. Koenig, Leibniz Institute for the Analysis of Biodiversity Change, Bonn, Germany
- Christoph Mayer
  - Museum A. Koenig, Leibniz Institute for the Analysis of Biodiversity Change, Bonn, Germany
- Eckart Stolle
  - Museum A. Koenig, Leibniz Institute for the Analysis of Biodiversity Change, Bonn, Germany
- Jonas Eberle
  - Museum A. Koenig, Leibniz Institute for the Analysis of Biodiversity Change, Bonn, Germany
  - Paris-Lodron-University, Salzburg, Austria
- Bernhard Misof
  - Museum A. Koenig, Leibniz Institute for the Analysis of Biodiversity Change, Bonn, Germany
  - Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Lars Podsiadlowski
  - Museum A. Koenig, Leibniz Institute for the Analysis of Biodiversity Change, Bonn, Germany
- Oliver Niehuis
  - Abt. Evolutionsbiologie und Ökologie, Institut für Biologie I, Albert-Ludwigs-Universität Freiburg, Freiburg, Germany
- Dirk Ahrens
  - Museum A. Koenig, Leibniz Institute for the Analysis of Biodiversity Change, Bonn, Germany
7
Jiang Z, Zang W, Ericson PGP, Song G, Wu S, Feng S, Drovetski SV, Liu G, Zhang D, Saitoh T, Alström P, Edwards SV, Lei F, Qu Y. Gene flow and an anomaly zone complicate phylogenomic inference in a rapidly radiated avian family (Prunellidae). BMC Biol 2024; 22:49. [PMID: 38413944 PMCID: PMC10900574 DOI: 10.1186/s12915-024-01848-7] [Received: 11/11/2023] [Accepted: 02/15/2024] [Indexed: 02/29/2024]
Abstract
BACKGROUND: Resolving the phylogeny of rapidly radiating lineages presents a challenge when building the Tree of Life. The Old World avian family Prunellidae (accentors) comprises twelve species that rapidly diversified at the Pliocene-Pleistocene boundary. RESULTS: Here we investigate the phylogenetic relationships of all species of Prunellidae using a chromosome-level de novo assembly of Prunella strophiata and 36 high-coverage resequenced genomes. We use homologous alignments of thousands of exonic and intronic loci to build the coalescent and concatenated phylogenies and recover four different species trees. Topology tests show a large degree of gene tree-species tree discordance, but only 40-54% of intronic gene trees and 36-75% of exonic gene trees can be explained by incomplete lineage sorting and gene tree estimation errors. Estimated branch lengths for three successive internal branches in the inferred species trees suggest the existence of an empirical anomaly zone. The most common topology recovered for species in this anomaly zone was not similar to any of the coalescent or concatenated phylogenies, suggesting the presence of anomalous gene trees. However, this interpretation is complicated by the presence of gene flow, because extensive introgression was detected among these species. When exploring tree topology distributions, introgression, and regional variation in recombination rate, we find that many autosomal regions contain signatures of introgression and thus may mislead phylogenetic inference. Conversely, the phylogenetic signal is concentrated in regions with low recombination rates, such as the Z chromosome, which are also more resistant to interspecific introgression. CONCLUSIONS: Collectively, our results suggest that phylogenomic inference should consider the underlying genomic architecture to maximize the consistency of phylogenomic signal.
Affiliation(s)
- Zhiyong Jiang
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Wenqing Zang
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Per G P Ericson
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, PO Box 50007, Stockholm, SE-104 05, Sweden
- Gang Song
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Shaoyuan Wu
  - Jiangsu International Joint Center of Genomics, Jiangsu Key Laboratory of Phylogenomics & Comparative Genomics, School of Life Sciences, Jiangsu Normal University, Xuzhou, 221116, Jiangsu, China
- Shaohong Feng
  - Center for Evolutionary & Organismal Biology, Zhejiang University School of Medicine, Hangzhou, 310058, China
  - Liangzhu Laboratory, Zhejiang University, 1369 West Wenyi Road, Hangzhou, 311121, China
  - Innovation Center of Yangtze River Delta, Zhejiang University, Jiashan, 314102, China
- Sergei V Drovetski
  - National Museum of Natural History, Smithsonian Institution, Washington, DC, 20004, USA
  - Present address: U.S. Geological Survey, Eastern Ecological Science Center at Patuxent Research Refuge, Laurel, MD, 20708, USA
- Gang Liu
  - Chinese Academy of Forestry, Institute of Ecological Conservation and Restoration, Beijing, 100091, China
- Dezhi Zhang
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Takema Saitoh
  - Yamashina Institute for Ornithology, Abiko, Chiba, Japan
- Per Alström
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - Animal Ecology, Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18 D, 752 36, Uppsala, Sweden
- Scott V Edwards
  - Museum of Comparative Zoology and Department of Organismic & Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA, 02138, USA
- Fumin Lei
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Yanhua Qu
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, PO Box 50007, Stockholm, SE-104 05, Sweden
8
Ané C, Fogg J, Allman ES, Baños H, Rhodes JA. Anomalous networks under the multispecies coalescent: theory and prevalence. J Math Biol 2024; 88:29. [PMID: 38372830 DOI: 10.1007/s00285-024-02050-7] [Received: 07/15/2023] [Revised: 01/18/2024] [Accepted: 01/21/2024] [Indexed: 02/20/2024]
Abstract
Reticulations in a phylogenetic network represent processes such as gene flow, admixture, recombination and hybrid speciation. Extending definitions from the tree setting, an anomalous network is one in which some unrooted tree topology displayed in the network appears in gene trees with a lower frequency than a tree not displayed in the network. We investigate anomalous networks under the Network Multispecies Coalescent Model with possible correlated inheritance at reticulations. Focusing on subsets of 4 taxa, we describe a new algorithm to calculate quartet concordance factors on networks of any level, faster than previous algorithms because of its focus on 4 taxa. We then study topological properties required for a 4-taxon network to be anomalous, uncovering the key role of $3_2$-cycles: cycles of 3 edges parent to a sister group of 2 taxa. Under the model of common inheritance, that is, when each gene tree coalesces within a species tree displayed in the network, we prove that 4-taxon networks are never anomalous. Under independent and various levels of correlated inheritance, we use simulations under realistic parameters to quantify the prevalence of anomalous 4-taxon networks, finding that truly anomalous networks are rare. At the same time, however, we find a significant fraction of networks close enough to the anomaly zone to appear anomalous, when considering the quartet concordance factors observed from a few hundred genes. These apparent anomalies may challenge network inference methods.
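The definition of an anomalous network given in this abstract can be illustrated for the 4-taxon case with a minimal sketch. It is not the authors' algorithm: the function name `is_anomalous`, the topology labels, and all frequencies are hypothetical, and the quartet frequencies are assumed to have been computed elsewhere.

```python
def is_anomalous(frequencies, displayed):
    """Check the 4-taxon anomaly condition stated in the abstract.

    frequencies: dict mapping each of the three unrooted quartet topologies
                 to its frequency among gene trees (hypothetical values).
    displayed:   set of topologies that the network displays.
    Returns True if some displayed topology is less frequent than some
    topology that is not displayed.
    """
    shown = [t for t in frequencies if t in displayed]
    hidden = [t for t in frequencies if t not in displayed]
    if not shown or not hidden:
        return False  # nothing to compare against
    return min(frequencies[t] for t in shown) < max(frequencies[t] for t in hidden)


# Hypothetical example: the network displays AB|CD and AC|BD only.
displayed = {"AB|CD", "AC|BD"}
anomalous_case = {"AB|CD": 0.30, "AC|BD": 0.36, "AD|BC": 0.34}
ordinary_case = {"AB|CD": 0.50, "AC|BD": 0.30, "AD|BC": 0.20}

print(is_anomalous(anomalous_case, displayed))  # displayed AB|CD is rarer than AD|BC
print(is_anomalous(ordinary_case, displayed))
```

The check reduces to comparing the rarest displayed topology with the most frequent non-displayed one, which is equivalent to the "some displayed tree is less frequent than some non-displayed tree" wording.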
Affiliation(s)
- Cécile Ané
  - Department of Statistics, University of Wisconsin - Madison, Madison, WI, 53706, USA
  - Department of Botany, University of Wisconsin - Madison, Madison, WI, 53706, USA
- John Fogg
  - Department of Statistics, University of Wisconsin - Madison, Madison, WI, 53706, USA
- Elizabeth S Allman
  - Department of Mathematics and Statistics, University of Alaska Fairbanks, Fairbanks, AK, 99775-6660, USA
- Hector Baños
  - Department of Biochemistry & Molecular Biology, Dalhousie University, Halifax, NS, Canada
  - Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada
- John A Rhodes
  - Department of Mathematics and Statistics, University of Alaska Fairbanks, Fairbanks, AK, 99775-6660, USA
9
Wu Z, Solís-Lemus C. Ultrafast learning of four-node hybridization cycles in phylogenetic networks using algebraic invariants. Bioinform Adv 2024; 4:vbae014. [PMID: 38384862 PMCID: PMC10879748 DOI: 10.1093/bioadv/vbae014] [Received: 07/25/2023] [Revised: 12/23/2023] [Accepted: 02/06/2024] [Indexed: 02/23/2024]
Abstract
Motivation: The abundance of gene flow in the Tree of Life challenges the notion that evolution can be represented with a fully bifurcating process, which cannot capture important biological realities like hybridization, introgression, or horizontal gene transfer. Coalescent-based network methods are increasingly popular, yet not scalable for big data, because they need to perform a heuristic search in the space of networks as well as numerical optimization that can be NP-hard. Here, we introduce a novel method to reconstruct phylogenetic networks based on algebraic invariants. While there is a long tradition of using algebraic invariants in phylogenetics, our work is the first to define phylogenetic invariants on concordance factors (frequencies of four-taxon splits in the input gene trees) to identify level-1 phylogenetic networks under the multispecies coalescent model. Results: Our novel hybrid detection methodology is optimization-free as it only requires the evaluation of polynomial equations, and as such, it bypasses the traversal of network space, yielding a computational speed at least 10 times faster than the fastest-to-date network methods. We illustrate our method's performance on simulated and real data from the genus Canis. Availability and implementation: We present PhyloDiamond.jl, an open-source Julia package available at https://github.com/solislemuslab/PhyloDiamond.jl, with broad applicability within the evolutionary community.
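The concordance factors mentioned in this abstract (frequencies of four-taxon splits in the input gene trees) can be sketched as follows. This is a minimal illustration, not code from PhyloDiamond.jl: gene trees are assumed to be given as nested Python tuples of taxon names, and the function names and example trees are hypothetical.

```python
from collections import Counter


def internal_clades(tree):
    """Return (leaf set of `tree`, leaf sets of all its internal nodes)."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_found = internal_clades(child)
        leaves |= child_leaves
        found.extend(child_found)
    found.append(leaves)
    return leaves, found


def quartet_split(tree, quartet):
    """Return the split of `quartet` shown by `tree`, e.g. 'A,B|C,D'.

    Returns None if the tree lacks one of the taxa or leaves the quartet unresolved.
    """
    quartet = frozenset(quartet)
    leaves, clades = internal_clades(tree)
    if not quartet <= leaves:
        return None
    for clade in clades:
        pair = clade & quartet
        if len(pair) == 2:  # a clade separating two quartet taxa from the other two
            sides = [",".join(sorted(pair)), ",".join(sorted(quartet - pair))]
            return "|".join(sorted(sides))
    return None


def concordance_factors(gene_trees, quartet):
    """Frequency of each resolved four-taxon split among the gene trees."""
    counts = Counter()
    for tree in gene_trees:
        split = quartet_split(tree, quartet)
        if split is not None:
            counts[split] += 1
    total = sum(counts.values())
    return {split: n / total for split, n in counts.items()} if total else {}


# Hypothetical gene trees on taxa A-D (nested tuples stand in for parsed Newick).
gene_trees = [
    ((("A", "B"), "C"), "D"),
    (("A", "B"), ("C", "D")),
    ((("A", "C"), "B"), "D"),
    ((("A", "B"), "D"), "C"),
]
print(concordance_factors(gene_trees, ["A", "B", "C", "D"]))
```

Splits that never occur are simply absent from the returned dictionary; a real analysis would parse Newick files and loop over all four-taxon subsets.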
Affiliation(s)
- Zhaoxing Wu
- Department of Statistics, Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53706, United States
| | - Claudia Solís-Lemus
- Department of Plant Pathology, Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53706, United States
| |
|
10
|
Zhang Q, Folk RA, Mo ZQ, Ye H, Zhang ZY, Peng H, Zhao JL, Yang SX, Yu XQ. Phylotranscriptomic analyses reveal deep gene tree discordance in Camellia (Theaceae). Mol Phylogenet Evol 2023; 188:107912. [PMID: 37648181 DOI: 10.1016/j.ympev.2023.107912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 08/09/2023] [Accepted: 08/27/2023] [Indexed: 09/01/2023]
Abstract
Gene tree discordance is a significant legacy of biological evolution. Multiple factors can result in incongruence among genes, such as introgression, incomplete lineage sorting (ILS), gene duplication or loss. Resolving the background of gene tree discordance is a critical way to uncover the process of species diversification. Camellia, the largest genus in Theaceae, has controversial taxonomy and systematics due in part to a complex evolutionary history. We used 60 transcriptomes of 55 species, which represented 15 sections of Camellia to investigate its phylogeny and the possible causes of gene tree discordance. We conducted gene tree discordance analysis based on 1,617 orthologous low-copy nuclear genes, primarily using coalescent species trees and polytomy tests to distinguish hard and soft conflict. A selective pressure analysis was also performed to assess the impact of selection on phylogenetic topology reconstruction. Our results detected different levels of gene tree discordance in the backbone of Camellia, and recovered rapid diversification as one of the possible causes of gene tree discordance. Furthermore, we confirmed that none of the currently proposed sections of Camellia was monophyletic. Comparisons among datasets partitioned under different selective pressure regimes showed that integrating all orthologous genes provided the best phylogenetic resolution of the species tree of Camellia. The findings of this study reveal rapid diversification as a major source of gene tree discordance in Camellia and will facilitate future investigation of reticulate relationships at the species level in this important plant genus.
Affiliation(s)
- Qiong Zhang
- CAS Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Ryan A Folk
- Department of Biological Sciences, Mississippi State University, MS 39762, United States
| | - Zhi-Qiong Mo
- CAS Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
| | - Hang Ye
- Guangxi Key Laboratory of Special Non-wood Forest Cultivation and Utilization, Guangxi Forestry Research Institute, Nanning 530002, Guangxi, China
| | - Zhao-Yuan Zhang
- Guangxi Key Laboratory of Special Non-wood Forest Cultivation and Utilization, Guangxi Forestry Research Institute, Nanning 530002, Guangxi, China
| | - Hua Peng
- CAS Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
| | - Jian-Li Zhao
- Yunnan Key Laboratory of Plant Reproductive Adaptation and Evolutionary Ecology and Institute of Biodiversity, School of Ecology and Environmental Science, Yunnan University, Kunming 650091, China.
| | - Shi-Xiong Yang
- CAS Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China.
| | - Xiang-Qin Yu
- CAS Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China.
| |
|
11
|
San Jose M, Doorenweerd C, Geib S, Barr N, Dupuis JR, Leblanc L, Kauwe A, Morris KY, Rubinoff D. Interspecific gene flow obscures phylogenetic relationships in an important insect pest species complex. Mol Phylogenet Evol 2023; 188:107892. [PMID: 37524217 DOI: 10.1016/j.ympev.2023.107892] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/02/2023] [Revised: 07/07/2023] [Accepted: 07/28/2023] [Indexed: 08/02/2023]
Abstract
As genomic data proliferates, the prevalence of post-speciation gene flow is making species boundaries and relationships increasingly ambiguous. Although current approaches inferring fully bifurcating phylogenies based on concatenated datasets provide simple and robust answers to many species relationships, they may be inaccurate because the models ignore inter-specific gene flow and incomplete lineage sorting. To examine the potential error resulting from ignoring gene flow, we generated both a RAD-seq and a 500 protein-coding loci highly multiplexed amplicon (HiMAP) dataset for a monophyletic group of 12 species defined as the Bactrocera dorsalis sensu lato clade. With some of the world's worst agricultural pests, the taxonomy of the B. dorsalis s.l. clade is important for trade and quarantines. However, taxonomic confusion confounds resolution due to intra- and interspecific phenotypic variation and convergence, mitochondrial introgression across half of the species, and viable hybrids. We compared the topological convergence of our datasets using concatenated phylogenetic and various multispecies coalescent approaches, some of which account for gene flow. All analyses agreed on species delimitation, but there was incongruence between species relationships. Under concatenation, both datasets suggest identical species relationships with mostly high statistical support. However, multispecies coalescent and multispecies network approaches suggest markedly different hypotheses and detected significant gene flow. We suggest that the network approaches are likely more accurate because gene flow violates the assumptions of the concatenated phylogenetic analyses, but the data-reductive requirements of network approaches resulted in reduced statistical support and could not unambiguously resolve gene flow directions. 
Our study highlights the importance of testing for gene flow, particularly with phylogenomic datasets, even when concatenated approaches receive high statistical support.
Affiliation(s)
- Michael San Jose
- University of Hawaii, College of Tropical Agriculture and Human Resources, Department of Plant and Environmental Protection Sciences, Entomology Section, 3050 Maile Way, Honolulu, HI, 96822-2231, USA.
| | - Camiel Doorenweerd
- University of Hawaii, College of Tropical Agriculture and Human Resources, Department of Plant and Environmental Protection Sciences, Entomology Section, 3050 Maile Way, Honolulu, HI, 96822-2231, USA
| | - Scott Geib
- Tropical Crop and Commodity Protection Research Unit, Daniel K Inouye U.S. Pacific Basin Agricultural Center, USDA Agricultural Research Services, Hilo, HI, USA
| | - Norman Barr
- United States Department of Agriculture, Animal and Plant Health Inspection Service, Plant Protection and Quarantine, Science & Technology, Insect Management and Molecular Diagnostics Laboratory, 22675 N. Moorefield Road, Edinburg, TX 78541, USA
| | - Julian R Dupuis
- University of Kentucky, Department of Entomology, S225 Ag Science Center North, 1100 South Limestone, Lexington, KY, 40546-0091, USA
| | - Luc Leblanc
- University of Idaho, Department of Entomology, Plant Pathology and Nematology, 875 Perimeter Drive, MS2329, Moscow, ID, 83844-2329, USA
| | - Angela Kauwe
- Tropical Crop and Commodity Protection Research Unit, Daniel K Inouye U.S. Pacific Basin Agricultural Center, USDA Agricultural Research Services, Hilo, HI, USA
| | - Kimberley Y Morris
- University of Hawaii, College of Tropical Agriculture and Human Resources, Department of Plant and Environmental Protection Sciences, Entomology Section, 3050 Maile Way, Honolulu, HI, 96822-2231, USA; Tropical Crop and Commodity Protection Research Unit, Daniel K Inouye U.S. Pacific Basin Agricultural Center, USDA Agricultural Research Services, Hilo, HI, USA
| | - Daniel Rubinoff
- University of Hawaii, College of Tropical Agriculture and Human Resources, Department of Plant and Environmental Protection Sciences, Entomology Section, 3050 Maile Way, Honolulu, HI, 96822-2231, USA
| |
|
12
|
Congrains C, Dupuis JR, Rodriguez EJ, Norrbom AL, Steck G, Sutton B, Nolazco N, de Brito RA, Geib SM. Phylogenomic analysis provides diagnostic tools for the identification of Anastrepha fraterculus (Diptera: Tephritidae) species complex. Evol Appl 2023; 16:1598-1618. [PMID: 37752958 PMCID: PMC10519418 DOI: 10.1111/eva.13589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2022] [Revised: 07/24/2023] [Accepted: 08/10/2023] [Indexed: 09/28/2023] Open
Abstract
Insect pests cause tremendous impact to agriculture worldwide. Species identification is crucial for implementing appropriate measures of pest control but can be challenging in closely related species. True fruit flies of the genus Anastrepha Schiner (Diptera: Tephritidae) include some of the most serious agricultural pests in the Americas, with the Anastrepha fraterculus (Wiedemann) complex being one of the most important due to its extreme polyphagy and wide distribution across most of the New World tropics and subtropics. The eight morphotypes described for this complex as well as other closely related species are classified in the fraterculus species group, whose evolutionary relationships are unresolved due to incomplete lineage sorting and introgression. We performed multifaceted phylogenomic approaches using thousands of genes to unravel the evolutionary relationships within the A. fraterculus complex to provide a baseline for molecular diagnosis of these pests. We used a methodology that accommodates variable sources of data (transcriptome, genome, and whole-genome shotgun sequencing) and developed a tool to align and filter orthologs, generating reliable datasets for phylogenetic studies. We inferred 3031 gene trees that displayed high levels of discordance. Nevertheless, the topologies of the inferred coalescent species trees were consistent across methods and datasets, except for one lineage in the A. fraterculus complex. Furthermore, network analysis indicated introgression across lineages in the fraterculus group. We present a robust phylogeny of the group that provides insights into the intricate patterns of evolution of the A. fraterculus complex supporting the hypothesis that this complex is an assemblage of closely related cryptic lineages that have evolved under interspecific gene flow. 
Despite this complex evolutionary scenario, our subsampling analysis revealed that a set of as few as 80 loci has a similar phylogenetic resolution as the genome-scale dataset, offering a foundation to develop more efficient diagnostic tools in this species group.
Affiliation(s)
- Carlos Congrains
- U.S. Department of Agriculture‐Agricultural Research Service, Daniel K. Inouye U.S. Pacific Basin Agricultural Research Center, Tropical Pest Genetics and Molecular Biology Research Unit, Hilo, Hawaii, USA
- Department of Plant and Environmental Protection Services, University of Hawaii at Manoa, Honolulu, Hawaii, USA
| | - Julian R. Dupuis
- Department of Entomology, University of Kentucky, Lexington, Kentucky, USA
| | - Erick J. Rodriguez
- Division of Plant Industry, Florida Department of Agriculture and Consumer Services, Gainesville, Florida, USA
| | - Allen L. Norrbom
- Systematic Entomology Lab, USDA, ARS c/o Smithsonian Institution, Washington DC, USA
| | - Gary Steck
- Division of Plant Industry, Florida Department of Agriculture and Consumer Services, Gainesville, Florida, USA
| | - Bruce Sutton
- Department of Entomology (Research Associate), National Museum of Natural History, Smithsonian Institution, Gainesville, Florida, USA
| | - Norma Nolazco
- Centro de Diagnóstico de Sanidad Vegetal, Servicio Nacional de Sanidad Agraria, Peru
| | - Reinaldo A. de Brito
- Departamento de Genética e Evolução, Universidade Federal de São Carlos, São Carlos, São Paulo, Brazil
| | - Scott M. Geib
- U.S. Department of Agriculture‐Agricultural Research Service, Daniel K. Inouye U.S. Pacific Basin Agricultural Research Center, Tropical Pest Genetics and Molecular Biology Research Unit, Hilo, Hawaii, USA
| |
|
13
|
Ané C, Fogg J, Allman ES, Baños H, Rhodes JA. Anomalous networks under the multispecies coalescent: theory and prevalence. bioRxiv: the preprint server for biology 2023:2023.08.18.553582. [PMID: 37662314 PMCID: PMC10473666 DOI: 10.1101/2023.08.18.553582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2023]
Abstract
Reticulations in a phylogenetic network represent processes such as gene flow, admixture, recombination and hybrid speciation. Extending definitions from the tree setting, an anomalous network is one in which some unrooted tree topology displayed in the network appears in gene trees with a lower frequency than a tree not displayed in the network. We investigate anomalous networks under the Network Multispecies Coalescent Model with possible correlated inheritance at reticulations. Focusing on subsets of 4 taxa, we describe a new algorithm to calculate quartet concordance factors on networks of any level, faster than previous algorithms because of its focus on 4 taxa. We then study topological properties required for a 4-taxon network to be anomalous, uncovering the key role of 3₂-cycles: cycles of 3 edges parent to a sister group of 2 taxa. Under the model of common inheritance, that is, when each gene tree coalesces within a species tree displayed in the network, we prove that 4-taxon networks are never anomalous. Under independent and various levels of correlated inheritance, we use simulations under realistic parameters to quantify the prevalence of anomalous 4-taxon networks, finding that truly anomalous networks are rare. At the same time, however, we find a significant fraction of networks close enough to the anomaly zone to appear anomalous, when considering the quartet concordance factors observed from a few hundred genes. These apparent anomalies may challenge network inference methods.
Affiliation(s)
- Cécile Ané
- Department of Statistics, University of Wisconsin - Madison, WI, 53706, USA
- Department of Botany, University of Wisconsin - Madison, WI, 53706, USA
| | - John Fogg
- Department of Statistics, University of Wisconsin - Madison, WI, 53706, USA
| | - Elizabeth S Allman
- Department of Mathematics and Statistics, University of Alaska - Fairbanks, AK, 99775-6660, USA
| | - Hector Baños
- Department of Biochemistry & Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
| | - John A Rhodes
- Department of Mathematics and Statistics, University of Alaska - Fairbanks, AK, 99775-6660, USA
| |
|
14
|
Crossman CA, Fontaine MC, Frasier TR. A comparison of genomic diversity and demographic history of the North Atlantic and Southwest Atlantic southern right whales. Mol Ecol 2023. [PMID: 37577945 DOI: 10.1111/mec.17099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2023] [Revised: 07/25/2023] [Accepted: 07/31/2023] [Indexed: 08/15/2023]
Abstract
Right whales (genus Eubalaena) were among the first, and most extensively pursued, targets of commercial whaling. However, understanding the impacts of this persecution requires knowledge of the demographic histories of these species prior to exploitation. We used deep whole genome sequencing (~40×) of 12 North Atlantic (E. glacialis) and 10 Southwest Atlantic southern (E. australis) right whales to quantify contemporary levels of genetic diversity and infer their demographic histories over time. Using coalescent- and identity-by-descent-based modelling to estimate ancestral effective population sizes from genomic data, we demonstrate that North Atlantic right whales have lived with smaller effective population sizes (Ne) than southern right whales in the Southwest Atlantic since their divergence and describe the decline in both populations around the time of whaling. North Atlantic right whales exhibit reduced genetic diversity and longer runs of homozygosity leading to higher inbreeding coefficients compared to the sampled population of southern right whales. This study represents the first comprehensive assessment of genome-wide diversity of right whales in the western Atlantic and underscores the benefits of high coverage, genome-wide datasets to help resolve long-standing questions about how historical changes in effective population size over different time scales shape contemporary diversity estimates. This knowledge is crucial to improve our understanding of the right whales' history and inform our approaches to address contemporary conservation issues. Understanding and quantifying the cumulative impact of long-term small Ne, low levels of diversity and recent inbreeding on North Atlantic right whale recovery will be important next steps.
Affiliation(s)
- Carla A Crossman
- Biology Department, Saint Mary's University, Halifax, Nova Scotia, Canada
| | - Michael C Fontaine
- Laboratoire MIVEGEC (Université de Montpellier, CNRS 5290, IRD 224), Montpellier, France
- Groningen Institute for Evolutionary Life Sciences (GELIFES), University of Groningen, Groningen, The Netherlands
| | - Timothy R Frasier
- Biology Department, Saint Mary's University, Halifax, Nova Scotia, Canada
| |
|
15
|
Lesica P, Lavin M. Will molecular phylogenetics help decrease nomenclatural instability? American Journal of Botany 2023; 110:e16219. [PMID: 37561649 DOI: 10.1002/ajb2.16219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Revised: 06/26/2023] [Accepted: 06/27/2023] [Indexed: 08/12/2023]
Affiliation(s)
- Peter Lesica
- Division of Biological Sciences, University of Montana, Missoula, 59812, Montana, USA
| | - Matt Lavin
- Plant Sciences and Plant Pathology Department, Montana State University, Bozeman, 59717, Montana, USA
| |
|
16
|
Xu J, Ané C. Identifiability of local and global features of phylogenetic networks from average distances. J Math Biol 2022; 86:12. [PMID: 36481927 DOI: 10.1007/s00285-022-01847-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/20/2021] [Revised: 11/17/2022] [Accepted: 11/22/2022] [Indexed: 12/12/2022]
Abstract
Phylogenetic networks extend phylogenetic trees to model non-vertical inheritance, by which a lineage inherits material from multiple parents. The computational complexity of estimating phylogenetic networks from genome-wide data with likelihood-based methods limits the size of networks that can be handled. Methods based on pairwise distances could offer faster alternatives. We study here the information that average pairwise distances contain on the underlying phylogenetic network, by characterizing local and global features that can or cannot be identified. For general networks, we clarify that the root and edge lengths adjacent to reticulations are not identifiable, and then focus on the class of zipped-up semidirected networks. We provide a criterion to swap subgraphs locally, such as 3-cycles, resulting in indistinguishable networks. We propose the "distance split tree", which can be constructed from pairwise distances, and prove that it is a refinement of the network's tree of blobs, capturing the tree-like features of the network. For level-1 networks, this distance split tree is equal to the tree of blobs refined to separate polytomies from blobs, and we prove that the mixed representation of the network is identifiable. The information loss is localized around 4-cycles, for which the placement of the reticulation is unidentifiable. The mixed representation combines split edges for 4-cycles, regular tree and hybrid edges from the semidirected network, and edge parameters that encode all information identifiable from average pairwise distances.
Affiliation(s)
- Jingcheng Xu
- Department of Statistics, University of Wisconsin - Madison, Madison, WI, 53706, USA.
| | - Cécile Ané
- Department of Statistics, University of Wisconsin - Madison, Madison, WI, 53706, USA
- Department of Botany, University of Wisconsin - Madison, Madison, WI, 53706, USA
| |
|
17
|
Zaharias P, Warnow T. Recent progress on methods for estimating and updating large phylogenies. Philos Trans R Soc Lond B Biol Sci 2022; 377:20210244. [PMID: 35989607 PMCID: PMC9393559 DOI: 10.1098/rstb.2021.0244] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/12/2021] [Accepted: 01/07/2022] [Indexed: 12/20/2022] Open
Abstract
With the increased availability of sequence data and even of fully sequenced and assembled genomes, phylogeny estimation of very large trees (even of hundreds of thousands of sequences) is now a goal for some biologists. Yet, the construction of these phylogenies is a complex pipeline presenting analytical and computational challenges, especially when the number of sequences is very large. In the past few years, new methods have been developed that aim to enable highly accurate phylogeny estimations on these large datasets, including divide-and-conquer techniques for multiple sequence alignment and/or tree estimation, methods that can estimate species trees from multi-locus datasets while addressing heterogeneity due to biological processes (e.g. incomplete lineage sorting and gene duplication and loss), and methods to add sequences into large gene trees or species trees. Here we present some of these recent advances and discuss opportunities for future improvements. This article is part of a discussion meeting issue 'Genomic population structures of microbial pathogens'.
Affiliation(s)
- Paul Zaharias
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Tandy Warnow
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| |
|
18
|
Zhang C, Mirarab S. Weighting by Gene Tree Uncertainty Improves Accuracy of Quartet-based Species Trees. Mol Biol Evol 2022; 39:6750035. [PMID: 36201617 PMCID: PMC9750496 DOI: 10.1093/molbev/msac215] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 02/08/2022] [Revised: 09/20/2022] [Accepted: 10/03/2022] [Indexed: 01/07/2023] Open
Abstract
Phylogenomic analyses routinely estimate species trees using methods that account for gene tree discordance. However, the most scalable species tree inference methods, which summarize independently inferred gene trees to obtain a species tree, are sensitive to hard-to-avoid errors introduced in the gene tree estimation step. This dilemma has created much debate on the merits of concatenation versus summary methods and practical obstacles to using summary methods more widely and to the exclusion of concatenation. The most successful attempt at making summary methods resilient to noisy gene trees has been contracting low support branches from the gene trees. Unfortunately, this approach requires arbitrary thresholds and poses new challenges. Here, we introduce threshold-free weighting schemes for the quartet-based species tree inference, the metric used in the popular method ASTRAL. By reducing the impact of quartets with low support or long terminal branches (or both), weighting provides stronger theoretical guarantees and better empirical performance than the unweighted ASTRAL. Our simulations show that weighting improves accuracy across many conditions and reduces the gap with concatenation in conditions with low gene tree discordance and high noise. On empirical data, weighting improves congruence with concatenation and increases support. Together, our results show that weighting, enabled by a new optimization algorithm we introduce, improves the utility of summary methods and can reduce the incongruence often observed across analytical pipelines.
Affiliation(s)
- Chao Zhang
- Bioinformatics and Systems Biology, UC San Diego, La Jolla, CA, USA
| | | |
|
19
|
Astudillo-Clavijo V, Stiassny MLJ, Ilves KL, Musilova Z, Salzburger W, López-Fernández H. Exon-based phylogenomics and the relationships of African cichlid fishes: tackling the challenges of reconstructing phylogenies with repeated rapid radiations. Syst Biol 2022; 72:134-149. [PMID: 35880863 DOI: 10.1093/sysbio/syac051] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/12/2021] [Revised: 07/06/2022] [Accepted: 07/19/2022] [Indexed: 11/13/2022] Open
Abstract
African cichlids (subfamily: Pseudocrenilabrinae) are among the most diverse vertebrates, and their propensity for repeated rapid radiation has made them a celebrated model system in evolutionary research. Nonetheless, despite numerous studies, phylogenetic uncertainty persists, and riverine lineages remain comparatively underrepresented in higher-level phylogenetic studies. Heterogeneous gene histories resulting from incomplete lineage sorting (ILS) and hybridization are likely sources of uncertainty, especially during episodes of rapid speciation. We investigate relationships of Pseudocrenilabrinae and its close relatives while accounting for multiple sources of genetic discordance using species tree and hybrid network analyses with hundreds of single-copy exons. We improve sequence recovery for distant relatives, thereby extending the taxonomic reach of our probes, with a hybrid reference guided/de novo assembly approach. Our analyses provide robust hypotheses for most higher-level relationships and reveal widespread gene heterogeneity, including in riverine taxa. ILS and past hybridization are identified as sources of genetic discordance in different lineages. Sampling of various Blenniiformes (formerly Ovalentaria) adds strong phylogenomic support for convict blennies (Pholidichthyidae) as sister to Cichlidae, and points to other potentially useful protein-coding markers across the order. A reliable phylogeny with representatives from diverse environments will support ongoing taxonomic and comparative evolutionary research in the cichlid model system.
Affiliation(s)
- Viviana Astudillo-Clavijo
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, M5S 3B2, Canada; Department of Natural History, Royal Ontario Museum, Toronto, M5S 2C6, Canada; Department of Ecology and Evolutionary Biology and Museum of Zoology, University of Michigan, Ann Arbor, 48109, USA
| | - Melanie L J Stiassny
- Department of Ichthyology, American Museum of Natural History, New York, 10024-5102, USA
| | - Katriina L Ilves
- Research & Collections, Zoology, Canadian Museum of Nature, Ottawa, K1P 6P4, Canada
| | - Zuzana Musilova
- Department of Zoology, Charles University in Prague, Vinicna 7, Prague, CZ-128 44, Czech Republic
| | - Walter Salzburger
- Zoological Institute, University of Basel, Vesalgasse 1, CH-4051, Basel, Switzerland
| | - Hernán López-Fernández
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, M5S 3B2, Canada; Department of Natural History, Royal Ontario Museum, Toronto, M5S 2C6, Canada; Department of Ecology and Evolutionary Biology and Museum of Zoology, University of Michigan, Ann Arbor, 48109, USA
| |
|
20
|
Pang XX, Zhang DY. Impact of Ghost Introgression on Coalescent-based Species Tree Inference and Estimation of Divergence Time. Syst Biol 2022; 72:35-49. [PMID: 35799362 DOI: 10.1093/sysbio/syac047] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/17/2022] [Revised: 06/25/2022] [Accepted: 07/05/2022] [Indexed: 11/15/2022] Open
Abstract
The species studied in any evolutionary investigation generally constitute a small proportion of all the species currently existing or that have gone extinct. It is therefore likely that introgression, which is widespread across the tree of life, involves "ghosts," i.e., unsampled, unknown, or extinct lineages. However, the impact of ghost introgression on estimations of species trees has rarely been studied and is poorly understood. Here, we use mathematical analysis and simulations to examine the robustness of species tree methods based on the multispecies coalescent model to introgression from a ghost or extant lineage. We found that many results originally obtained for introgression between extant species can easily be extended to ghost introgression, such as the strongly interactive effects of incomplete lineage sorting (ILS) and introgression on the occurrence of anomalous gene trees (AGTs). The relative performance of the summary species tree method (ASTRAL) and the full-likelihood method (*BEAST) varies under different introgression scenarios, with the former being more robust to gene flow between non-sister species whereas the latter performing better under certain conditions of ghost introgression. When an outgroup ghost (defined as a lineage that diverged before the most basal species under investigation) acts as the donor of the introgressed genes, the time of root divergence among the investigated species generally was overestimated, whereas ingroup introgression, as commonly perceived, can only lead to underestimation. In many cases of ingroup introgression that may or may not involve ghost lineages, the stronger the ILS, the higher the accuracy achieved in estimating the time of root divergence, although the topology of the species tree is more prone to be biased by the effect of introgression.
Affiliation(s)
- Xiao-Xu Pang
- State Key Laboratory of Earth Surface Processes and Resource Ecology and Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing 100875, China
| | - Da-Yong Zhang
- State Key Laboratory of Earth Surface Processes and Resource Ecology and Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing 100875, China
| |
|
21
|
Young MK, Smith R, Pilgrim KL, Isaak DJ, McKelvey KS, Parkes S, Egge J, Schwartz MK. A Molecular Taxonomy of Cottus in western North America. West N Am Naturalist 2022. [DOI: 10.3398/064.082.0208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
Affiliation(s)
- Michael K. Young
- USDA Forest Service, National Genomics Center for Wildlife and Fish Conservation, Rocky Mountain Research Station, 800 E. Beckwith Avenue, Missoula, MT 59802
| | - Rebecca Smith
- USDA Forest Service, National Genomics Center for Wildlife and Fish Conservation, Rocky Mountain Research Station, 800 E. Beckwith Avenue, Missoula, MT 59802
| | - Kristine L. Pilgrim
- USDA Forest Service, National Genomics Center for Wildlife and Fish Conservation, Rocky Mountain Research Station, 800 E. Beckwith Avenue, Missoula, MT 59802
| | - Daniel J. Isaak
- USDA Forest Service, Rocky Mountain Research Station, 322 East Front Street Suite 401, Boise, ID 83702
| | - Kevin S. McKelvey
- USDA Forest Service, National Genomics Center for Wildlife and Fish Conservation, Rocky Mountain Research Station, 800 E. Beckwith Avenue, Missoula, MT 59802
| | - Sharon Parkes
- USDA Forest Service, Rocky Mountain Research Station, 322 East Front Street Suite 401, Boise, ID 83702
| | - Jacob Egge
- Department of Biology, Pacific Lutheran University, Tacoma, WA 98447
| | - Michael K. Schwartz
- USDA Forest Service, National Genomics Center for Wildlife and Fish Conservation, Rocky Mountain Research Station, 800 E. Beckwith Avenue, Missoula, MT 59802
|
22
|
Interpreting phylogenetic conflict: Hybridization in the most speciose genus of lichen-forming fungi. Mol Phylogenet Evol 2022; 174:107543. [PMID: 35690378 DOI: 10.1016/j.ympev.2022.107543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/11/2021] [Revised: 02/06/2022] [Accepted: 05/13/2022] [Indexed: 11/24/2022]
Abstract
While advances in sequencing technologies have been invaluable for understanding evolutionary relationships, increasingly large genomic data sets may result in conflicting evolutionary signals that are often caused by biological processes, including hybridization. Hybridization has been detected in a variety of organisms, influencing evolutionary processes such as generating reproductive barriers and mixing standing genetic variation. Here, we investigate the potential role of hybridization in the diversification of the most speciose genus of lichen-forming fungi, Xanthoparmelia. As Xanthoparmelia is projected to have gone through recent, rapid diversification, this genus is particularly suitable for investigating and interpreting the origins of phylogenomic conflict. Focusing on a clade of Xanthoparmelia largely restricted to the Holarctic region, we used a genome skimming approach to generate 962 single-copy gene regions representing over 2 Mbp of the mycobiont genome. From this genome-scale dataset, we inferred evolutionary relationships using both concatenation and coalescent-based species tree approaches. We also used three independent tests for hybridization. Although different species tree reconstruction methods recovered largely consistent and well-supported trees, there was widespread incongruence among individual gene trees. Despite challenges in differentiating hybridization from ILS in situations of recent rapid radiations, our genome-wide analyses detected multiple potential hybridization events in the Holarctic clade, suggesting one possible source of trait variability in this hyperdiverse genus. This study highlights the value in using a pluralistic approach for characterizing genome-scale conflict, even in groups with well-resolved phylogenies, while highlighting current challenges in detecting the specific impacts of hybridization.
|
23
|
Identifiability of species network topologies from genomic sequences using the logDet distance. J Math Biol 2022; 84:35. [PMID: 35385988 DOI: 10.1007/s00285-022-01734-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2021] [Revised: 01/12/2022] [Accepted: 03/02/2022] [Indexed: 10/18/2022]
Abstract
Inference of network-like evolutionary relationships between species from genomic data must address the interwoven signals from both gene flow and incomplete lineage sorting. The heavy computational demands of standard approaches to this problem severely limit the size of datasets that may be analyzed, in both the number of species and the number of genetic loci. Here we provide a theoretical pointer to more efficient methods, by showing that logDet distances computed from genomic-scale sequences retain sufficient information to recover network relationships in the level-1 ultrametric case. This result is obtained under the Network Multispecies Coalescent model combined with a mixture of General Time-Reversible sequence evolution models across individual gene trees. It applies to both unlinked site data, such as for SNPs, and to sequence data in which many contiguous sites may have evolved on a common tree, such as concatenated gene sequences. Thus under standard stochastic models statistically justifiable inference of network relationships from sequences can be accomplished without consideration of individual genes or gene trees.
|
24
|
Yan Z, Smith ML, Du P, Hahn MW, Nakhleh L. Species Tree Inference Methods Intended to Deal with Incomplete Lineage Sorting Are Robust to the Presence of Paralogs. Syst Biol 2022; 71:367-381. [PMID: 34245291 PMCID: PMC8978208 DOI: 10.1093/sysbio/syab056] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2020] [Revised: 06/23/2021] [Accepted: 06/30/2021] [Indexed: 11/24/2022] Open
Abstract
Many recent phylogenetic methods have focused on accurately inferring species trees when there is gene tree discordance due to incomplete lineage sorting (ILS). For almost all of these methods, and for phylogenetic methods in general, the data for each locus are assumed to consist of orthologous, single-copy sequences. Loci that are present in more than a single copy in any of the studied genomes are excluded from the data. These steps greatly reduce the number of loci available for analysis. The question we seek to answer in this study is: what happens if one runs such species tree inference methods on data where paralogy is present, in addition to or without ILS being present? Through simulation studies and analyses of two large biological data sets, we show that running such methods on data with paralogs can still provide accurate results. We use multiple different methods, some of which are based directly on the multispecies coalescent model, and some of which have been proven to be statistically consistent under it. We also treat the paralogous loci in multiple ways: from explicitly denoting them as paralogs, to randomly selecting one copy per species. In all cases, the inferred species trees are as accurate as equivalent analyses using single-copy orthologs. Our results have significant implications for the use of ILS-aware phylogenomic analyses, demonstrating that they do not have to be restricted to single-copy loci. This will greatly increase the amount of data that can be used for phylogenetic inference.[Gene duplication and loss; incomplete lineage sorting; multispecies coalescent; orthology; paralogy.].
Affiliation(s)
- Zhi Yan
- Department of Computer Science, Rice University, 6100 Main Street, Houston, TX 77005, USA
| | - Megan L Smith
- Department of Biology and Department of Computer Science, Indiana University, 1001 East Third Street, Bloomington, IN 47405, USA
| | - Peng Du
- Department of Computer Science, Rice University, 6100 Main Street, Houston, TX 77005, USA
| | - Matthew W Hahn
- Department of Biology and Department of Computer Science, Indiana University, 1001 East Third Street, Bloomington, IN 47405, USA
| | - Luay Nakhleh
- Department of Computer Science, Rice University, 6100 Main Street, Houston, TX 77005, USA
- Department of BioSciences, Rice University, 6100 Main Street, Houston, TX 77005, USA
|
25
|
Hibbins MS, Hahn MW. Phylogenomic approaches to detecting and characterizing introgression. Genetics 2022; 220:iyab173. [PMID: 34788444 PMCID: PMC9208645 DOI: 10.1093/genetics/iyab173] [Citation(s) in RCA: 51] [Impact Index Per Article: 25.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2021] [Accepted: 10/02/2021] [Indexed: 12/26/2022] Open
Abstract
Phylogenomics has revealed the remarkable frequency with which introgression occurs across the tree of life. These discoveries have been enabled by the rapid growth of methods designed to detect and characterize introgression from whole-genome sequencing data. A large class of phylogenomic methods makes use of data across species to infer and characterize introgression based on expectations from the multispecies coalescent. These methods range from simple tests, such as the D-statistic, to model-based approaches for inferring phylogenetic networks. Here, we provide a detailed overview of the various signals that different modes of introgression are expected to leave in the genome, and how current methods are designed to detect them. We discuss the strengths and pitfalls of these approaches and identify areas for future development, highlighting the different signals of introgression, and the power of each method to detect them. We conclude with a discussion of current challenges in inferring introgression and how they could potentially be addressed.
Affiliation(s)
- Mark S Hibbins
- Department of Biology, Indiana University, Bloomington, IN 47405, USA
| | - Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
|
26
|
Pyron RA, O’Connell KA, Lemmon EM, Lemmon AR, Beamer DA. Candidate-species delimitation in Desmognathus salamanders reveals gene flow across lineage boundaries, confounding phylogenetic estimation and clarifying hybrid zones. Ecol Evol 2022; 12:e8574. [PMID: 35222955 PMCID: PMC8848459 DOI: 10.1002/ece3.8574] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2021] [Revised: 01/05/2022] [Accepted: 01/10/2022] [Indexed: 12/19/2022] Open
Abstract
Dusky Salamanders (genus Desmognathus) currently comprise only 22 described, extant species. However, recent mitochondrial and nuclear estimates indicate the presence of up to 49 candidate species based on ecogeographic sampling. Previous studies also suggest a complex history of hybridization between these lineages. Studies in other groups suggest that disregarding admixture may affect both phylogenetic inference and clustering-based species delimitation. With a dataset comprising 233 Anchored Hybrid Enrichment (AHE) loci sequenced for 896 Desmognathus specimens from all 49 candidate species, we test three hypotheses regarding (i) species-level diversity, (ii) hybridization and admixture, and (iii) misleading phylogenetic inference. Using phylogenetic and population-clustering analyses considering gene flow, we find support for at least 47 candidate species in the phylogenomic dataset, some of which are newly characterized here while others represent combinations of previously named lineages that are collapsed in the current dataset. Within these, we observe significant phylogeographic structure, with up to 64 total geographic genetic lineages, many of which hybridize either narrowly at contact zones or extensively across ecological gradients. We find strong support for both recent admixture between terminal lineages and ancient hybridization across internal branches. This signal appears to distort concatenated phylogenetic inference, wherein more heavily admixed terminal specimens occupy apparently artifactual early-diverging topological positions, occasionally to the extent of forming false clades of intermediate hybrids. Additional geographic and genetic sampling and more robust computational approaches will be needed to clarify taxonomy, and to reconstruct a network topology to display evolutionary relationships in a manner that is consistent with their complex history of reticulation.
Affiliation(s)
- Robert Alexander Pyron
- Department of Biological Sciences, The George Washington University, Washington, District of Columbia, USA
- Division of Amphibians and Reptiles, Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, District of Columbia, USA
| | - Kyle A. O’Connell
- Department of Biological Sciences, The George Washington University, Washington, District of Columbia, USA
- Division of Amphibians and Reptiles, Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, District of Columbia, USA
- Global Genome Initiative, National Museum of Natural History, Smithsonian Institution, Washington, District of Columbia, USA
- Biomedical Data Science Lab, Deloitte Consulting LLP, Arlington, Virginia, USA
| | | | - Alan R. Lemmon
- Department of Scientific Computing, Florida State University, Tallahassee, Florida, USA
| | - David A. Beamer
- Department of Natural Sciences, Nash Community College, Rocky Mount, North Carolina, USA
|
27
|
Suvorov A, Kim BY, Wang J, Armstrong EE, Peede D, D'Agostino ERR, Price DK, Waddell P, Lang M, Courtier-Orgogozo V, David JR, Petrov D, Matute DR, Schrider DR, Comeault AA. Widespread introgression across a phylogeny of 155 Drosophila genomes. Curr Biol 2022; 32:111-123.e5. [PMID: 34788634 PMCID: PMC8752469 DOI: 10.1016/j.cub.2021.10.052] [Citation(s) in RCA: 95] [Impact Index Per Article: 47.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2021] [Revised: 09/29/2021] [Accepted: 10/22/2021] [Indexed: 01/12/2023]
Abstract
Genome-scale sequence data have invigorated the study of hybridization and introgression, particularly in animals. However, outside of a few notable cases, we lack systematic tests for introgression at a larger phylogenetic scale across entire clades. Here, we leverage 155 genome assemblies from 149 species to generate a fossil-calibrated phylogeny and conduct multilocus tests for introgression across 9 monophyletic radiations within the genus Drosophila. Using complementary phylogenomic approaches, we identify widespread introgression across the evolutionary history of Drosophila. Mapping gene-tree discordance onto the phylogeny revealed that both ancient and recent introgression has occurred across most of the 9 clades that we examined. Our results provide the first evidence of introgression occurring across the evolutionary history of Drosophila and highlight the need to continue to study the evolutionary consequences of hybridization and introgression in this genus and across the tree of life.
Affiliation(s)
- Anton Suvorov
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA.
| | - Bernard Y Kim
- Department of Biology, Stanford University, Stanford, CA, USA
| | - Jeremy Wang
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
| | | | - David Peede
- Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
| | | | - Donald K Price
- School of Life Sciences, University of Nevada, Las Vegas, NV 89119, USA
| | - Peter Waddell
- School of Fundamental Sciences, Massey University, Palmerston North 4442, New Zealand
| | - Michael Lang
- CNRS, Institut Jacques Monod, Université de Paris, Paris 75013, France
| | | | - Jean R David
- Laboratoire Evolution, Génomes, Comportement, Ecologie (EGCE) CNRS, IRD, Univ. Paris-sud, Université Paris-Saclay, Gif sur Yvette 91190, France; Institut de Systématique, Evolution, Biodiversité, CNRS, MNHN, UPMC, EPHE, Muséum National d'Histoire Naturelle, Sorbonne Universités, Paris 75005, France
| | - Dmitri Petrov
- Department of Biology, Stanford University, Stanford, CA, USA
| | - Daniel R Matute
- Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Daniel R Schrider
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Aaron A Comeault
- Molecular Ecology & Evolution Group, School of Natural Sciences, Bangor University, Bangor, Gwynedd LL57 2DGA, UK.
|
28
|
Jiao X, Flouri T, Yang Z. Multispecies coalescent and its applications to infer species phylogenies and cross-species gene flow. Natl Sci Rev 2022; 8:nwab127. [PMID: 34987842 PMCID: PMC8692950 DOI: 10.1093/nsr/nwab127] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2021] [Revised: 07/10/2021] [Accepted: 07/11/2021] [Indexed: 02/06/2023] Open
Abstract
Multispecies coalescent (MSC) is the extension of the single-population coalescent model to multiple species. It integrates the phylogenetic process of species divergences and the population genetic process of coalescent, and provides a powerful framework for a number of inference problems using genomic sequence data from multiple species, including estimation of species divergence times and population sizes, estimation of species trees accommodating discordant gene trees, inference of cross-species gene flow and species delimitation. In this review, we introduce the major features of the MSC model, discuss full-likelihood and heuristic methods of species tree estimation and summarize recent methodological advances in inference of cross-species gene flow. We discuss the statistical and computational challenges in the field and research directions where breakthroughs may be likely in the next few years.
Affiliation(s)
- Xiyun Jiao
- Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
| | - Tomáš Flouri
- Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
| | - Ziheng Yang
- Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
|
29
|
Merging Arcs to Produce Acyclic Phylogenetic Networks and Normal Networks. Bull Math Biol 2022; 84:26. [PMID: 34982266 PMCID: PMC8727431 DOI: 10.1007/s11538-021-00986-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2021] [Accepted: 12/11/2021] [Indexed: 11/30/2022]
Abstract
As phylogenetic networks grow increasingly complicated, systematic methods for simplifying them to reveal properties will become more useful. This paper considers how to modify acyclic phylogenetic networks into other acyclic networks by contracting specific arcs that include a set D. The networks need not be binary, so vertices in the networks may have more than two parents and/or more than two children. In general, in order to make the resulting network acyclic, additional arcs not in D must also be contracted. This paper shows how to choose D so that the resulting acyclic network is “pre-normal”. As a result, removal of all redundant arcs yields a normal network. The set D can be selected based only on the geometry of the network, giving a well-defined normal phylogenetic network depending only on the given network. There are CSD maps relating most of the networks. The resulting network can be visualized as a “wired lift” in the original network, which appears as the original network with each arc drawn in one of three ways.
|
30
|
Zhu Q, Mirarab S. Assembling a Reference Phylogenomic Tree of Bacteria and Archaea by Summarizing Many Gene Phylogenies. Methods Mol Biol 2022; 2569:137-165. [PMID: 36083447 DOI: 10.1007/978-1-0716-2691-7_7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/24/2023]
Abstract
Phylogenomics is the inference of phylogenetic trees based on multiple marker genes sampled in the genomes of interest. An important challenge in phylogenomics is the potential incongruence among the evolutionary histories of individual genes, which can be widespread in microorganisms due to the prevalence of horizontal gene transfer. This protocol introduces the procedures for building a phylogenetic tree of a large number of microbial genomes using a broad sampling of marker genes that are representative of whole-genome evolution. The protocol highlights the use of a gene tree summary method, which can effectively reconstruct the species tree while accounting for the topological conflicts among individual gene trees. The pipeline described in this protocol is scalable to tens of thousands of genomes while retaining high accuracy. We discussed multiple software tools, libraries, and scripts to enable convenient adoption of the protocol. The protocol is suitable for microbiology and microbiome studies based on public genomes and metagenomic data.
Affiliation(s)
- Qiyun Zhu
- Biodesign Center for Fundamental and Applied Microbiomics, Arizona State University, Tempe, AZ, USA.
- School of Life Sciences, Arizona State University, Tempe, AZ, USA.
| | - Siavash Mirarab
- Department of Electrical and Computer Engineering, University of California San Diego, San Diego, CA, USA
|
31
|
Wallin R, van Iersel L, Kelk S, Stougie L. Applicability of several rooted phylogenetic network algorithms for representing the evolutionary history of SARS-CoV-2. BMC Ecol Evol 2021; 21:220. [PMID: 34876022 PMCID: PMC8649988 DOI: 10.1186/s12862-021-01946-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2021] [Accepted: 11/03/2021] [Indexed: 11/13/2022] Open
Abstract
Background: Rooted phylogenetic networks are used to display complex evolutionary history involving so-called reticulation events, such as genetic recombination. Various methods have been developed to construct such networks, using for example a multiple sequence alignment or multiple phylogenetic trees as input data. Coronaviruses are known to recombine frequently, but rooted phylogenetic networks have not yet been used extensively to describe their evolutionary history. Here, we created a workflow to compare the evolutionary history of SARS-CoV-2 with other SARS-like viruses using several rooted phylogenetic network inference algorithms. This workflow includes filtering noise from sets of phylogenetic trees by contracting edges based on branch length and bootstrap support, followed by resolution of multifurcations. We explored the running times of the network inference algorithms, the impact of filtering on the properties of the produced networks, and attempted to derive biological insights regarding the evolution of SARS-CoV-2 from them. Results: The network inference algorithms are capable of constructing rooted phylogenetic networks for coronavirus data, although running-time limitations require restricting such datasets to a relatively small number of taxa. Filtering generally reduces the number of reticulations in the produced networks and increases their temporal consistency. Taxon bat-SL-CoVZC45 emerges as a major and structural source of discordance in the dataset. The tested algorithms often indicate that SARS-CoV-2/RaTG13 is a tree-like clade, with possibly some reticulate activity further back in their history. A smaller number of constructed networks posit SARS-CoV-2 as a possible recombinant, although this might be a methodological artefact arising from the interaction of bat-SL-CoVZC45 discordance and the optimization criteria used. Conclusion: Our results demonstrate that as part of a wider workflow and with careful attention paid to running time, rooted phylogenetic network algorithms are capable of producing plausible networks from coronavirus data. These networks partly corroborate existing theories about SARS-CoV-2, and partly produce new avenues for exploration regarding the location and significance of reticulate activity within the wider group of SARS-like viruses. Our workflow may serve as a model for pipelines in which phylogenetic network algorithms can be used to analyse different datasets and test different hypotheses.
Affiliation(s)
- Rosanne Wallin
- Centrum Wiskunde & Informatica (CWI), Science Park 123, 1098 XG, Amsterdam, The Netherlands
| | - Leo van Iersel
- Delft Institute of Applied Mathematics, Delft University of Technology, Van Mourik Broekmanweg 6, 2628 XE, Delft, The Netherlands
| | - Steven Kelk
- Department of Data Science and Knowledge Engineering (DKE), Maastricht University, Maastricht, The Netherlands
| | - Leen Stougie
- Centrum Wiskunde & Informatica (CWI), Science Park 123, 1098 XG, Amsterdam, The Netherlands
- School of Business and Economics, Vrije Universiteit, Amsterdam, The Netherlands
|
32
|
Santos SHD, Peery RM, Miller JM, Dao A, Lyu FH, Li X, Li MH, Coltman DW. Ancient hybridization patterns between bighorn and thinhorn sheep. Mol Ecol 2021; 30:6273-6288. [PMID: 34845798 DOI: 10.1111/mec.16136] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2020] [Revised: 07/27/2021] [Accepted: 08/18/2021] [Indexed: 12/12/2022]
Abstract
Whole-genome sequencing has advanced the study of species evolution, including the detection of genealogical discordant events such as ancient hybridization and incomplete lineage sorting (ILS). The evolutionary history of bighorn (Ovis canadensis) and thinhorn (Ovis dalli) sheep presents an ideal system to investigate evolutionary discordance due to their recent and rapid radiation and putative secondary contact between bighorn and thinhorn sheep subspecies, specifically the dark pelage Stone sheep (O. dalli stonei) and predominately white Dall sheep (O. dalli dalli), during the last ice age. Here, we used multiple genomes of bighorn and thinhorn sheep, together with snow (O. nivicola) and the domestic sheep (O. aries) as outgroups, to assess their phylogenomic history, potential introgression patterns and their adaptive consequences. Among the Pachyceriforms (snow, bighorn and thinhorn sheep) a consistent monophyletic species tree was retrieved; however, many genealogical discordance patterns were observed. Alternative phylogenies frequently placed Stone and bighorn as sister clades. This relationship occurred more often and was less divergent than that between Dall and bighorn. We also observed many blocks containing introgression signal between Stone and bighorn genomes in which coat colour genes were present. Introgression signals observed between Dall and bighorn were more random and less frequent, and therefore probably due to ILS or intermediary secondary contact. These results strongly suggest that Stone sheep originated from a complex series of events, characterized by multiple, ancient periods of secondary contact with bighorn sheep.
Affiliation(s)
- Sarah H D Santos
- Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
| | - Rhiannon M Peery
- Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
| | - Joshua M Miller
- Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
| | - Anh Dao
- Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
| | - Feng-Hua Lyu
- College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - Xin Li
- CAS Key Laboratory of Animal Ecology and Conservation Biology, Chinese Academy of Sciences (CAS), Beijing, China
- University of Chinese Academy of Sciences (UCAS), Beijing, China
| | - Meng-Hua Li
- CAS Key Laboratory of Animal Ecology and Conservation Biology, Chinese Academy of Sciences (CAS), Beijing, China
| | - David W Coltman
- Department of Biological Sciences, University of Alberta, Edmonton, AB, Canada
|
33
|
Finger N, Farleigh K, Bracken JT, Leaché AD, François O, Yang Z, Flouri T, Charran T, Jezkova T, Williams DA, Blair C. Genome-scale data reveal deep lineage divergence and a complex demographic history in the Texas horned lizard (Phrynosoma cornutum) throughout the southwestern and central US. Genome Biol Evol 2021; 14:6443127. [PMID: 34849831 PMCID: PMC8735750 DOI: 10.1093/gbe/evab260] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 11/12/2021] [Indexed: 12/03/2022] Open
Abstract
The southwestern and central United States serve as an ideal region to test alternative hypotheses regarding biotic diversification. Genomic data can now be combined with sophisticated computational models to quantify the impacts of paleoclimate change, geographic features, and habitat heterogeneity on spatial patterns of genetic diversity. In this study, we combine thousands of genotyping-by-sequencing (GBS) loci with mtDNA sequences (ND1) from the Texas horned lizard (Phrynosoma cornutum) to quantify relative support for different catalysts of diversification. Phylogenetic and clustering analyses of the GBS data indicate support for at least three primary populations. The spatial distribution of populations appears concordant with habitat type, with desert populations in AZ and NM showing the largest genetic divergence from the remaining populations. The mtDNA data also support a divergent desert population, but other relationships differ and suggest mtDNA introgression. Genotype–environment association with bioclimatic variables supports divergence along precipitation gradients more than along temperature gradients. Demographic analyses support a complex history, with introgression and gene flow playing an important role during diversification. Bayesian multispecies coalescent with introgression (MSci) analyses also suggest that gene flow occurred between populations. Paleo-species distribution models support two southern refugia that geographically correspond to contemporary lineages. We find that divergence times are underestimated and population sizes are overestimated when introgression occurred and is ignored in coalescent analyses, and furthermore, inference of ancient introgression events and demographic history is sensitive to inclusion of a single recently admixed sample. Our analyses cannot refute the riverine barrier or glacial refugia hypotheses. Results also suggest that populations are continuing to diverge along habitat gradients. Finally, the strong evidence of admixture, gene flow, and mtDNA introgression among populations suggests that P. cornutum should be considered a single widespread species under the General Lineage Species Concept.
Affiliation(s)
- Nicholas Finger
- Department of Biological Sciences, New York City College of Technology, The City University of New York, 285 Jay Street, Brooklyn, NY, 11201, USA
- Keaka Farleigh
- Department of Biology, Miami University, 501 E High St, Oxford, OH, 45056, USA
- Jason T Bracken
- Department of Biology, Miami University, 501 E High St, Oxford, OH, 45056, USA
- Adam D Leaché
- Department of Biology & Burke Museum of Natural History and Culture, University of Washington, Seattle, WA, 98195, USA
- Olivier François
- Faculty of Medicine, University Grenoble-Alpes, TIMC-IMAG UMR 5525, Grenoble, La Tronche, F38706, France 38000
- Ziheng Yang
- Department of Genetics, Evolution and Environment, University College London, Darwin Building, Gower Street, London, WC1E 6BT, UK
- Tomas Flouri
- Department of Genetics, Evolution and Environment, University College London, Darwin Building, Gower Street, London, WC1E 6BT, UK
- Tristan Charran
- Department of Biological Sciences, New York City College of Technology, The City University of New York, 285 Jay Street, Brooklyn, NY, 11201, USA
- Tereza Jezkova
- Department of Biology, Miami University, 501 E High St, Oxford, OH, 45056, USA
- Dean A Williams
- Department of Biology, Texas Christian University, 2800 S University Dr, Fort Worth, TX, 76129, USA
- Christopher Blair
- Department of Biological Sciences, New York City College of Technology, The City University of New York, 285 Jay Street, Brooklyn, NY, 11201, USA; Biology PhD Program, CUNY Graduate Center, 365 5th Ave, New York, NY, 10016, USA
34
How challenging RADseq data turned out to favor coalescent-based species tree inference. A case study in Aichryson (Crassulaceae). Mol Phylogenet Evol 2021; 167:107342. [PMID: 34785384 DOI: 10.1016/j.ympev.2021.107342] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/03/2020] [Revised: 07/05/2021] [Accepted: 10/29/2021] [Indexed: 12/24/2022]
Abstract
Analysing multiple genomic regions while incorporating detection and qualification of discordance among regions has become standard for understanding phylogenetic relationships. In plants, which usually have comparatively large genomes, this is feasible through the combination of reduced-representation library (RRL) methods and high-throughput sequencing, which enables the cost-effective acquisition of genomic data for thousands of loci from hundreds of samples. One popular RRL method is RADseq. A major disadvantage of established RADseq approaches is the rather short fragment and sequencing range, leading to loci of little individual phylogenetic information. This issue hampers the application of coalescent-based species tree inference. The modified RADseq protocol presented here, which targets ca. 5,000 loci of 300-600 nt length sequenced with the latest short-read-sequencing (SRS) technology, has the potential to overcome this drawback. To illustrate the advantages of this approach we use the study group Aichryson Webb & Berthelot (Crassulaceae), a plant genus that diversified on the Canary Islands. The data analysis approach used here aims at careful quality control of the long-loci dataset. It involves an informed selection of thresholds for accurate clustering; a thorough exploration of locus properties, such as locus length, coverage and variability, to identify potentially biased data; and a comparative phylogenetic inference of filtered datasets, accompanied by an evaluation of the resulting BS support and gene and site concordance factor values, to improve overall resolution of the resulting phylogenetic trees. The final dataset contains variable loci with an average length of 373 nt and facilitates species tree estimation using a coalescent-based summary approach. Additional improvements brought by the approach are critically discussed.
35
Calderón-Acevedo CA, Bagley JC, Muchhala N. Genome-wide ultraconserved elements resolve phylogenetic relationships and biogeographic history among Neotropical leaf-nosed bats in the genus Anoura (Phyllostomidae). Mol Phylogenet Evol 2021; 167:107356. [PMID: 34774763 DOI: 10.1016/j.ympev.2021.107356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2021] [Revised: 10/26/2021] [Accepted: 11/08/2021] [Indexed: 10/19/2022]
Abstract
Anoura Gray, 1838 are Neotropical nectarivorous bats and the most speciose genus within the phyllostomid subfamily Glossophaginae. However, Anoura species limits remain debated, and phylogenetic relationships remain poorly known, because previous studies used limited Anoura taxon sampling or focused primarily on higher-level relationships. Here, we conduct the first phylogenomic study of Anoura by analyzing 2039 genome-wide ultraconserved elements (UCEs) sequenced for 42 individuals from 8 Anoura species/lineages plus two outgroups. Overall, our results based on UCEs resolved relationships in the genus and supported (1) the monophyly of small-bodied Anoura species (previously genus Lonchoglossa); (2) monotypic status of A. caudifer; and (3) nested positions of "A. carishina", A. caudifer aequatoris, and A. geoffroyi peruana specimens within A. latidens, A. caudifer and A. geoffroyi, respectively (suggesting that these taxa are not distinct species). Additionally, (4) phylogenetic networks allowing reticulate edges did not explain gene tree discordance better than the species tree (without introgression), indicating that a coalescent model accounting for discordance solely through incomplete lineage sorting fit our data well. Sensitivity analyses indicated that our species tree results were not adversely affected by varying taxon sampling across loci. Tree calibration and Bayesian coalescent analyses dated the onset of diversification within Anoura to ∼6-9 million years ago in the Miocene, with extant species diverging mainly within the past ∼4 million years. We inferred a historical biogeographical scenario for Anoura of parapatric speciation fragmenting the range of a wide-ranging ancestral lineage centered in the Central to Northern Andes, along with Pliocene-Pleistocene dispersal or founder event speciation in Amazonia and the Brazilian Atlantic forest during the last ∼2.5 million years.
Affiliation(s)
- Camilo A Calderón-Acevedo
- Department of Biology, University of Missouri-St. Louis, One University Blvd., 223 Research Bldg., St. Louis, MO 63121, USA; Department of Earth and Environmental Science, Rutgers University, 195 University Ave., Boyden Hall 433, Newark, NJ, 07102 USA.
- Justin C Bagley
- Department of Biology, University of Missouri-St. Louis, One University Blvd., 223 Research Bldg., St. Louis, MO 63121, USA; Department of Biology, Jacksonville State University, 242 Martin Hall, 700 Pelham Rd North, Jacksonville, AL 36265, USA; Department of Biology, Virginia Commonwealth University, 1000 W Cary St., Suite 126, Richmond, VA 23284, USA.
- Nathan Muchhala
- Department of Biology, University of Missouri-St. Louis, One University Blvd., 223 Research Bldg., St. Louis, MO 63121, USA.
36
Mirarab S, Nakhleh L, Warnow T. Multispecies Coalescent: Theory and Applications in Phylogenetics. ANNUAL REVIEW OF ECOLOGY, EVOLUTION, AND SYSTEMATICS 2021. [DOI: 10.1146/annurev-ecolsys-012121-095340] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/09/2022]
Abstract
Species tree estimation is a basic part of many biological research projects, ranging from answering basic evolutionary questions (e.g., how did a group of species adapt to their environments?) to addressing questions in functional biology. Yet, species tree estimation is very challenging, due to processes such as incomplete lineage sorting, gene duplication and loss, horizontal gene transfer, and hybridization, which can make gene trees differ from each other and from the overall evolutionary history of the species. Over the last 10–20 years, there has been tremendous growth in methods and mathematical theory for estimating species trees and phylogenetic networks, and some of these methods are now in wide use. In this survey, we provide an overview of the current state of the art, identify the limitations of existing methods and theory, and propose additional research problems and directions.
Affiliation(s)
- Siavash Mirarab
- Electrical and Computer Engineering Department, University of California, San Diego, La Jolla, California 92093, USA
- Luay Nakhleh
- Department of Computer Science, Rice University, Houston, Texas 77005, USA
- Tandy Warnow
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, Illinois 61801, USA
37
Wang Y, Cao Z, Ogilvie HA, Nakhleh L. Phylogenomic assessment of the role of hybridization and introgression in trait evolution. PLoS Genet 2021; 17:e1009701. [PMID: 34407067 PMCID: PMC8405015 DOI: 10.1371/journal.pgen.1009701] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/27/2021] [Revised: 08/30/2021] [Accepted: 07/07/2021] [Indexed: 11/30/2022]
Abstract
Trait evolution among a set of species, a central theme in evolutionary biology, has long been understood and analyzed with respect to a species tree. However, the field of phylogenomics, which has been propelled by advances in sequencing technologies, has ushered in the era of species/gene tree incongruence and, consequently, a more nuanced understanding of trait evolution. For a trait whose states are incongruent with the branching patterns in the species tree, the same state could have arisen independently in different species (homoplasy) or followed the branching patterns of gene trees, incongruent with the species tree (hemiplasy). Another evolutionary process whose extent and significance are better revealed by phylogenomic studies is gene flow between different species. In this work, we present a phylogenomic method for assessing the role of hybridization and introgression in the evolution of polymorphic or monomorphic binary traits. We apply the method to simulated evolutionary scenarios to demonstrate the interplay between the parameters of the evolutionary history and the role of introgression in a binary trait's evolution (which we call xenoplasy). Very importantly, we demonstrate, including on a biological data set, that inferring a species tree and using it for trait evolution analysis in the presence of gene flow could lead to misleading hypotheses about trait evolution.
Affiliation(s)
- Yaxuan Wang
- Department of Computer Science, Rice University, Houston, Texas, United States of America
- Zhen Cao
- Department of Computer Science, Rice University, Houston, Texas, United States of America
- Huw A. Ogilvie
- Department of Computer Science, Rice University, Houston, Texas, United States of America
- Luay Nakhleh
- Department of Computer Science, Rice University, Houston, Texas, United States of America
- Department of BioSciences, Rice University, Houston, Texas, United States of America
38
Suvorov A, Scornavacca C, Fujimoto MS, Bodily P, Clement M, Crandall KA, Whiting MF, Schrider DR, Bybee SM. Deep ancestral introgression shapes evolutionary history of dragonflies and damselflies. Syst Biol 2021; 71:526-546. [PMID: 34324671 PMCID: PMC9017697 DOI: 10.1093/sysbio/syab063] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/21/2021] [Revised: 07/20/2021] [Accepted: 07/26/2021] [Indexed: 11/13/2022]
Abstract
Introgression is an important biological process affecting at least 10% of the extant species in the animal kingdom. Introgression significantly impacts inference of phylogenetic species relationships where a strictly binary tree model cannot adequately explain reticulate net-like species relationships. Here we use phylogenomic approaches to understand patterns of introgression along the evolutionary history of a unique, non-model insect system: dragonflies and damselflies (Odonata). We demonstrate that introgression is a pervasive evolutionary force across various taxonomic levels within Odonata. In particular, we show that the morphologically "intermediate" species of Anisozygoptera (one of the three primary suborders within Odonata besides Zygoptera and Anisoptera), which retain phenotypic characteristics of the other two suborders, experienced high levels of introgression likely coming from zygopteran genomes. Additionally, we find evidence for multiple cases of deep inter-superfamilial ancestral introgression.
Affiliation(s)
- Anton Suvorov
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Celine Scornavacca
- Institut des Sciences de l'Evolution Université de Montpellier, CNRS, IRD, EPHE CC 064, Place Eugène Bataillon, 34095 Montpellier Cedex 05, France
- M Stanley Fujimoto
- Department of Computer Science, Brigham Young University, Provo, UT, United States
- Paul Bodily
- Department of Computer Science, Idaho State University, Pocatello, ID, United States
- Mark Clement
- Department of Computer Science, Brigham Young University, Provo, UT, United States
- Keith A Crandall
- Computational Biology Institute, Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, George Washington University, Washington, DC, United States
- Michael F Whiting
- Department of Biology, Brigham Young University, Provo, UT, United States; M.L. Bean Museum, Brigham Young University, Provo, UT, United States
- Daniel R Schrider
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Seth M Bybee
- Department of Biology, Brigham Young University, Provo, UT, United States; M.L. Bean Museum, Brigham Young University, Provo, UT, United States
39
Gene flow in phylogenomics: Sequence capture resolves species limits and biogeography of Afromontane forest endemic frogs from the Cameroon Highlands. Mol Phylogenet Evol 2021; 163:107258. [PMID: 34252546 DOI: 10.1016/j.ympev.2021.107258] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/08/2020] [Revised: 06/28/2021] [Accepted: 07/07/2021] [Indexed: 11/21/2022]
Abstract
Puddle frogs of the Phrynobatrachus steindachneri species complex are a useful group for investigating speciation and phylogeography in Afromontane forests of the Cameroon Volcanic Line, western Central Africa. The species complex is represented by six morphologically relatively cryptic mitochondrial DNA lineages, only two of which are distinguished at the species level - southern P. jimzimkusi and Lake Oku endemic P. njiomock, leaving the remaining four lineages identified as 'P. steindachneri'. In this study, the six mtDNA lineages are subjected to genomic sequence capture analyses and morphological examination to delimit species and to study biogeography. The nuclear DNA data (387 loci; 571,936 aligned base pairs) distinguished all six mtDNA lineages, but the topological pattern and divergence depths supported only four main clades: P. jimzimkusi, P. njiomock, and only two divergent evolutionary lineages within the four 'P. steindachneri' mtDNA lineages. One of the two lineages is herein described as a new species, P. amieti sp. nov. Reticulate evolution (hybridization) was detected within the species complex with morphologically intermediate hybrid individuals placed between the parental species in phylogenomic analyses, forming a ladder-like phylogenetic pattern. The presence of hybrids is undesirable in standard phylogenetic analyses but is essential and beneficial in the network multispecies coalescent. This latter approach provided insight into the reticulate evolutionary history of these endemic frogs. Introgressions likely occurred during the Middle and Late Pleistocene climatic oscillations, due to the cyclic connections (likely dominating during cold glacials) and separations (during warm interglacials) of montane forests. The genomic phylogeographic pattern supports the separation of the southern (Mt. Manengouba to Mt. Oku) and northern mountains at the onset of the Pleistocene. 
Further subdivisions occurred in the Early Pleistocene, separating populations from the northernmost (Tchabal Mbabo, Gotel Mts.) and middle mountains (Mt. Mbam, Mt. Oku, Mambilla Plateau), as well as the microendemic lineage restricted to Lake Oku (Mt. Oku). This unique model system is highly threatened, as all the species within the complex have exhibited severe population declines in the past decade, placing them on the brink of extinction. In addition, Mount Oku is identified as being of particular conservation importance because it harbors three species of this complex. We therefore urge conservation action in the Cameroon Highlands to preserve this diversity before it is too late.
40
Walker JF, Smith SA, Hodel RGJ, Moyroud E. Concordance-based approaches for the inference of relationships and molecular rates with phylogenomic datasets. Syst Biol 2021; 71:943-958. [PMID: 34240209 DOI: 10.1093/sysbio/syab052] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/05/2020] [Revised: 06/23/2021] [Accepted: 07/01/2021] [Indexed: 11/12/2022]
Abstract
Gene tree conflict is common and finding methods to analyze and alleviate the negative effects that conflict has on species tree analysis is a crucial part of phylogenomics. This study aims to expand the discussion of inferring species trees and molecular branch lengths when conflict is present. Conflict is typically examined in two ways: inferring its prevalence, and inferring the influence of the individual genes (how strongly one gene supports any given topology compared to an alternative topology). Here, we examine a procedure for incorporating both conflict and the influence of genes in order to infer evolutionary relationships. All supported relationships in the gene trees are analyzed and the likelihood of the genes constrained to these relationships is summed to provide a likelihood for the relationship. Consensus tree assembly is conducted based on the sum of likelihoods for a given relationship and choosing relationships based on the most likely relationship assuming it does not conflict with a relationship that has a higher likelihood score. If it is not possible for all most likely relationships to be combined into a single bifurcating tree then multiple trees are produced and a consensus tree with a polytomy is created. This procedure allows for more influential genes to have greater influence on an inferred relationship, does not assume conflict has arisen from any one source, and does not force the dataset to produce a single bifurcating tree. Using this approach on three empirical datasets, we examine and discuss the relationship between influence and prevalence of gene tree conflict. We find that in one of the datasets, assembling a bifurcating consensus tree solely composed of the most likely relationships is impossible. To account for conflict in molecular rate analysis we also introduce a concordance-based approach to the summary and estimation of branch lengths suitable for downstream comparative analyses. 
We demonstrate through simulation that even under high levels of stochastic conflict, the mean and median of the concordant rates recapitulate the true molecular rate better than using a supermatrix approach. Using a large phylogenomic dataset, we examine rate heterogeneity across concordant genes with a focus on the branch subtending crown angiosperms. Notably, we find highly variable rates of evolution along the branch subtending crown angiosperms. The approaches outlined here have several limitations, but they also represent some alternative methods for harnessing the complexity of phylogenomic datasets and enrich our inferences of both species' relationships and evolutionary processes.
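The consensus-assembly step described in this abstract (sum the per-gene support for each relationship, then accept relationships in order of summed score unless they conflict with a higher-scoring one) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `compatible` and `greedy_consensus`, the representation of relationships as rooted clades (sets of taxon names), and the use of non-negative per-gene support scores in place of constrained likelihoods are all assumptions made for illustration. The sketch also omits the step in which several trees are produced and collapsed into a polytomy.

```python
from collections import defaultdict


def compatible(a, b):
    """Two rooted clades can coexist in one tree if nested or disjoint."""
    return a <= b or b <= a or a.isdisjoint(b)


def greedy_consensus(gene_supports):
    """Greedy assembly of clades ranked by summed per-gene support.

    gene_supports: list of dicts, one per gene, mapping a clade (tuple of
    taxon names) to a hypothetical support score where higher is better.
    Returns the accepted (clade, summed score) pairs, best first.
    """
    # Sum the support each clade receives across all genes.
    totals = defaultdict(float)
    for supports in gene_supports:
        for clade, score in supports.items():
            totals[frozenset(clade)] += score

    # Visit clades from highest to lowest summed support.
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

    accepted = []
    for clade, score in ranked:
        # Keep a clade only if it conflicts with no better-scoring clade.
        if all(compatible(clade, other) for other, _ in accepted):
            accepted.append((clade, score))
    return accepted


# Toy input: three genes scoring clades over taxa A, B and C.
genes = [
    {("A", "B"): 5.0, ("A", "B", "C"): 2.0},
    {("A", "B"): 3.0, ("B", "C"): 4.0},
    {("B", "C"): 1.0, ("A", "B", "C"): 2.5},
]
result = greedy_consensus(genes)
```

In the toy input, the clade (B, C) has the second-highest summed score but overlaps (A, B) without being nested in it, so it is rejected, while (A, B, C) is kept because it contains (A, B). A gene with a large score can outweigh several genes with small scores, which is the sense in which more influential genes have greater influence on the inferred relationship.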
Affiliation(s)
- Joseph F Walker
- The Sainsbury Laboratory, University of Cambridge, 47 Bateman Street, Cambridge CB2 1LR, UK; Department of Biological Sciences, University of Illinois at Chicago, Chicago, IL, 60607 U.S.A
- Stephen A Smith
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
- Richard G J Hodel
- Department of Botany, National Museum of Natural History, MRC 166, Smithsonian Institution, Washington, DC, 20013-7012, USA
- Edwige Moyroud
- The Sainsbury Laboratory, University of Cambridge, 47 Bateman Street, Cambridge CB2 1LR, UK; Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK
41
Vázquez-Miranda H, Barker FK. Autosomal, sex-linked and mitochondrial loci resolve evolutionary relationships among wrens in the genus Campylorhynchus. Mol Phylogenet Evol 2021; 163:107242. [PMID: 34224849 DOI: 10.1016/j.ympev.2021.107242] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/27/2020] [Revised: 06/14/2021] [Accepted: 06/29/2021] [Indexed: 01/18/2023]
Abstract
Although there is general consensus that sampling of multiple genetic loci is critical in accurate reconstruction of species trees, the exact numbers and the best types of molecular markers remain an open question. In particular, the phylogenetic utility of sex-linked loci is underexplored. Here, we sample all species and 70% of the named diversity of the New World wren genus Campylorhynchus using sequences from 23 loci, to evaluate the effects of linkage on efficiency in recovering a well-supported tree for the group. At a tree-wide level, we found that most loci supported fewer than half the possible clades and that sex-linked loci produced similar resolution to slower-coalescing autosomal markers, controlling for locus length. By contrast, we did find evidence that linkage affected the efficiency of recovery of individual relationships; as few as two sex-linked loci were necessary to resolve a selection of clades with long to medium subtending branches, whereas 4-6 autosomal loci were necessary to achieve comparable results. These results support an expanded role for sampling of the avian Z chromosome in phylogenetic studies, including target enrichment approaches. Our concatenated and species tree analyses represent significant improvements in our understanding of diversification in Campylorhynchus, and suggest a relatively complex scenario for its radiation across the Miocene/Pliocene boundary, with multiple invasions of South America.
Affiliation(s)
- Hernán Vázquez-Miranda
- Departamento de Zoología, Instituto de Biología, Universidad Nacional Autónoma de México, Ciudad de México C.P. 04510, Mexico
- F Keith Barker
- Department of Ecology, Evolution and Behavior, Bell Museum of Natural History, University of Minnesota, 40 Gortner Laboratory, 1479 Gortner Avenue, Saint Paul, MN 55108, USA
42
De Luca D, Piredda R, Sarno D, Kooistra WHCF. Resolving cryptic species complexes in marine protists: phylogenetic haplotype networks meet global DNA metabarcoding datasets. THE ISME JOURNAL 2021; 15:1931-1942. [PMID: 33589768 PMCID: PMC8245484 DOI: 10.1038/s41396-021-00895-0] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 06/26/2020] [Revised: 12/23/2020] [Accepted: 01/14/2021] [Indexed: 12/21/2022]
Abstract
Marine protists have traditionally been assumed to be cosmopolitan and of low diversity. Yet, several recent studies have shown that many protist species actually consist of cryptic complexes of species whose members are often restricted to particular biogeographic regions. Nonetheless, detection of cryptic species is usually hampered by sampling coverage and by the application of methods (e.g. phylogenetic trees) that are not well suited to identifying relatively recent divergence and ongoing gene flow. In this paper, we show how these issues can be overcome by inferring phylogenetic haplotype networks from global metabarcoding datasets. We use the Chaetoceros curvisetus (Bacillariophyta) species complex as a case study. Using two complementary metabarcoding datasets (Ocean Sampling Day and Tara Oceans), we equally resolve the cryptic complex in terms of number of inferred species. We detect new hypothetical species in both datasets. Gene flow between most species is absent, but no barcoding gap exists. Some species have restricted distribution patterns whereas others are widely distributed. Closely related taxa occupy contrasting biogeographic regions, suggesting that geographic and ecological differentiation drive speciation. In conclusion, we show the potential of the analysis of metabarcoding data with evolutionary approaches for systematic and phylogeographic studies of marine protists.
Affiliation(s)
- Daniele De Luca
- Department of Integrative Marine Ecology, Stazione Zoologica Anton Dohrn, Naples, Italy
- Department of Biology, Botanical Garden of Naples, University of Naples Federico II, Naples, Italy
- Roberta Piredda
- Department of Integrative Marine Ecology, Stazione Zoologica Anton Dohrn, Naples, Italy
- Diana Sarno
- Department of Research Infrastructure for Marine Biological Resources, Stazione Zoologica Anton Dohrn, Naples, Italy
- Wiebe H C F Kooistra
- Department of Integrative Marine Ecology, Stazione Zoologica Anton Dohrn, Naples, Italy.
43
Reilly SB, Stubbs AL, Arida E, Karin BR, Arifin U, Kaiser H, Bi K, Iskandar DT, McGuire JA. Phylogenomic Analysis Reveals Dispersal-Driven Speciation and Divergence with Gene Flow in Lesser Sunda Flying Lizards (Genus Draco). Syst Biol 2021; 71:221-241. [PMID: 34117769 DOI: 10.1093/sysbio/syab043] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/16/2020] [Revised: 05/30/2021] [Accepted: 06/02/2021] [Indexed: 12/13/2022]
Abstract
The Lesser Sunda Archipelago offers exceptional potential as a model system for studying the dynamics of dispersal-driven diversification. The geographic proximity of the islands suggests the possibility for successful dispersal, but this is countered by the permanence of the marine barriers and extreme intervening currents that are expected to hinder gene flow. Phylogenetic and species delimitation analyses of flying lizards (genus Draco) using single mitochondrial genes, complete mitochondrial genomes, and exome-capture data sets identified 9-11 deeply divergent lineages including single-island endemics, lineages that span multiple islands, and parapatrically-distributed non-sister lineages on the larger islands. Population clustering and PCA confirmed these genetic boundaries with isolation-by-distance playing a role in some islands or island sets. While gdi estimates place most candidate species comparisons in the ambiguous zone, migration estimates suggest 9 or 10 species exist with nuclear introgression detected across some intra-island contact zones. Initial entry of Draco into the archipelago occurred at 5.5-7.5 Ma, with most inter-island colonization events having occurred between 1-3 Ma. Biogeographical model testing favors scenarios integrating geographic distance and historical island connectivity, including an initial stepping-stone dispersal process from the Greater Sunda Shelf through the Sunda Arc as far eastward as Lembata Island. However, rather than reaching the adjacent island of Pantar by dispersing over the 15-km wide Alor Strait, Draco ultimately reached Pantar (and much of the rest of the archipelago) by way of a circuitous route involving at least five over-water dispersal events. These findings suggest that historical geological and oceanographic conditions heavily influenced dispersal pathways and gene flow, which in turn drove species formation and shaped species boundaries.
Affiliation(s)
- Sean B Reilly
- Museum of Vertebrate Zoology and Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
- Alexander L Stubbs
- Museum of Vertebrate Zoology and Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
- Evy Arida
- Museum Zoologicum Bogoriense, Indonesian Institute of Sciences, Cibinong, Indonesia
- Benjamin R Karin
- Museum of Vertebrate Zoology and Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
- Umilaela Arifin
- School of Life Sciences and Technology, Institut Teknologi Bandung, Bandung, Indonesia
- Hinrich Kaiser
- Department of Vertebrate Zoology, Zoologisches Forschungsmuseum Alexander Koenig, Adenauerallee 160, 53113 Bonn, Germany; and Department of Biology, Victor Valley College, Victorville, California 92395, USA
- Ke Bi
- Museum of Vertebrate Zoology and Department of Integrative Biology, University of California, Berkeley, CA 94720, USA; Computational Genomics Resource Laboratory, California Institute for Quantitative Biosciences, University of California, Berkeley, CA 94720, USA
- Jimmy A McGuire
- Museum of Vertebrate Zoology and Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
44
Congrains C, Zucchi RA, de Brito RA. Phylogenomic approach reveals strong signatures of introgression in the rapid diversification of neotropical true fruit flies (Anastrepha: Tephritidae). Mol Phylogenet Evol 2021; 162:107200. [PMID: 33984467 DOI: 10.1016/j.ympev.2021.107200] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/27/2020] [Revised: 01/30/2021] [Accepted: 05/03/2021] [Indexed: 01/08/2023]
Abstract
New sequencing techniques have allowed us to explore the variation on thousands of genes and elucidate evolutionary relationships of lineages even in complex scenarios, such as when there is rapid diversification. That seems to be the case of species in the genus Anastrepha, which shows great species diversity that has been divided into 21 species groups, several of which show wide geographical distribution. The fraterculus group has several economically important species and it is also an outstanding model for speciation studies, since it includes several lineages that have diverged recently possibly in the presence of interspecific gene flow. Our main goal is to test whether we can infer phylogenetic relationships of recently diverged taxa with gene flow, such as what is expected for the fraterculus group and determine whether certain genes remain informative even in this complex scenario. An analysis of thousands of orthologous genes derived from transcriptome datasets of 10 different lineages across the genus, including some of the economically most important pests, revealed signals of incomplete lineage sorting, vestiges of ancestral introgression between more distant lineages and ongoing gene flow between closely related lineages. Though these patterns affect the phylogenetic signal, the phylogenomic inferences consistently show that the morphologically identified species here investigated are in different evolutionary lineages, with the sole exception involving Brazilian lineages of A. fraterculus, which has been suggested to be a complex assembly of cryptic species. A tree space analysis suggested that genes with greater phylogenetic resolution have evolved under similar selection pressures and are more resilient to intraspecific gene flow, which would make it more likely that these genomic regions may be useful for identifying fraterculus group lineages. 
Our findings help establish relationships among the most important Anastrepha species groups, as well as bring further data to indicate that the diversification of fraterculus group lineages, and even other lineages in the genus Anastrepha, has been strongly influenced by interspecific gene flow.
Affiliation(s)
- Carlos Congrains
  - Departamento de Genética e Evolução, Universidade Federal de São Carlos, São Carlos, SP, Brazil
- Roberto A Zucchi
  - Escola Superior de Agricultura "Luiz de Queiroz" - ESALQ, Universidade de São Paulo - USP, Piracicaba, SP, Brazil
- Reinaldo A de Brito
  - Departamento de Genética e Evolução, Universidade Federal de São Carlos, São Carlos, SP, Brazil
|
45
|
Ferreira MS, Jones MR, Callahan CM, Farelo L, Tolesa Z, Suchentrunk F, Boursot P, Mills LS, Alves PC, Good JM, Melo-Ferreira J. The Legacy of Recurrent Introgression during the Radiation of Hares. Syst Biol 2021; 70:593-607. [PMID: 33263746 PMCID: PMC8048390 DOI: 10.1093/sysbio/syaa088] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 06/28/2020] [Revised: 11/06/2020] [Accepted: 11/13/2020] [Indexed: 12/30/2022] [Open Access]
Abstract
Hybridization may often be an important source of adaptive variation, but the extent and long-term impacts of introgression have seldom been evaluated in the phylogenetic context of a radiation. Hares (Lepus) represent a widespread mammalian radiation of 32 extant species characterized by striking ecological adaptations and recurrent admixture. To understand the relevance of introgressive hybridization during the diversification of Lepus, we analyzed whole exome sequences (61.7 Mb) from 15 species of hares (1-4 individuals per species), spanning the global distribution of the genus, and two outgroups. We used a coalescent framework to infer species relationships and divergence times, despite extensive genealogical discordance. We found high levels of allele sharing among species and show that this reflects extensive incomplete lineage sorting and temporally layered hybridization. Our results revealed recurrent introgression at all stages along the Lepus radiation, including recent gene flow between extant species since the last glacial maximum but also pervasive ancient introgression occurring since near the origin of the hare lineages. We show that ancient hybridization between northern hemisphere species has resulted in shared variation of potential adaptive relevance to highly seasonal environments, including genes involved in circadian rhythm regulation, pigmentation, and thermoregulation. Our results illustrate how the genetic legacy of ancestral hybridization may persist across a radiation, leaving a long-lasting signature of shared genetic variation that may contribute to adaptation. [Adaptation; ancient introgression; hybridization; Lepus; phylogenomics.].
Affiliation(s)
- Mafalda S Ferreira
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
  - Departamento de Biologia, Faculdade de Ciências da Universidade do Porto, Porto, Portugal
  - Division of Biological Sciences, University of Montana, Missoula, Montana, United States of America
- Matthew R Jones
  - Division of Biological Sciences, University of Montana, Missoula, Montana, United States of America
- Colin M Callahan
  - Division of Biological Sciences, University of Montana, Missoula, Montana, United States of America
- Liliana Farelo
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
- Zelalem Tolesa
  - Department of Biology, Hawassa University, Hawassa, Ethiopia
- Franz Suchentrunk
  - Department for Interdisciplinary Life Sciences, Research Institute of Wildlife Ecology, University of Veterinary Medicine Vienna, Vienna, Austria
- Pierre Boursot
  - Institut des Sciences de l’Évolution Montpellier (ISEM), Université de Montpellier, CNRS, IRD, EPHE, France
- L Scott Mills
  - Wildlife Biology Program, College of Forestry and Conservation, University of Montana, Missoula, Montana, United States of America
  - Office of Research and Creative Scholarship, University of Montana, Missoula, Montana, United States of America
- Paulo C Alves
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
  - Departamento de Biologia, Faculdade de Ciências da Universidade do Porto, Porto, Portugal
  - Wildlife Biology Program, College of Forestry and Conservation, University of Montana, Missoula, Montana, United States of America
- Jeffrey M Good
  - Division of Biological Sciences, University of Montana, Missoula, Montana, United States of America
  - Wildlife Biology Program, College of Forestry and Conservation, University of Montana, Missoula, Montana, United States of America
- José Melo-Ferreira
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal
  - Departamento de Biologia, Faculdade de Ciências da Universidade do Porto, Porto, Portugal

Jeffrey M. Good and José Melo-Ferreira shared the senior authorship.
|
46
|
Schrago CG, Barzilai LP. Challenges in estimating virus divergence times in short epidemic timescales with special reference to the evolution of SARS-CoV-2 pandemic. Genet Mol Biol 2021; 44:e20200254. [PMID: 33570080 PMCID: PMC7869796 DOI: 10.1590/1678-4685-gmb-2020-0254] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/21/2020] [Accepted: 01/18/2021] [Indexed: 11/21/2022] [Open Access]
Abstract
The estimation of evolutionary parameters provides essential information for designing public health policies. In short time intervals, however, nucleotide substitutions are ineffective to record all complexities of virus population dynamics. In this sense, the current SARS-CoV-2 pandemic poses a challenge for evolutionary analysis. We used computer simulation to evolve populations in scenarios of varying temporal intervals to evaluate the impact of the age of an epidemic on estimates of time and geography. Before estimating virus timescales, the shape of tree topologies can be used as a proxy to assess the effectiveness of the virus phylogeny in providing accurate estimates of evolutionary parameters. In short timescales, estimates have larger uncertainty. We compared the predictions from simulations with empirical data. The tree shape of SARS-CoV-2 was closer to shorter timescales scenarios, which yielded parametric estimates with larger uncertainty, suggesting that estimates from these datasets should be evaluated cautiously. To increase the accuracy of the estimates of virus transmission times between populations, the uncertainties associated with the age estimates of both the crown and stem nodes should be communicated. We place the age of the common ancestor of the current SARS-CoV-2 pandemic in late September 2019, corroborating an earlier emergence of the virus.
Affiliation(s)
- Carlos G. Schrago
  - Universidade Federal do Rio de Janeiro, Departamento de Genética, Rio de Janeiro, RJ, Brazil
- Lucia P. Barzilai
  - Universidade Federal do Rio de Janeiro, Departamento de Genética, Rio de Janeiro, RJ, Brazil
|
47
|
Koch H, DeGiorgio M. Maximum Likelihood Estimation of Species Trees from Gene Trees in the Presence of Ancestral Population Structure. Genome Biol Evol 2020; 12:3977-3995. [PMID: 32022857 PMCID: PMC7061232 DOI: 10.1093/gbe/evaa022] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Accepted: 01/23/2020] [Indexed: 11/12/2022] [Open Access]
Abstract
Though large multilocus genomic data sets have led to overall improvements in phylogenetic inference, they have posed the new challenge of addressing conflicting signals across the genome. In particular, ancestral population structure, which has been uncovered in a number of diverse species, can skew gene tree frequencies, thereby hindering the performance of species tree estimators. Here we develop a novel maximum likelihood method, termed TASTI (Taxa with Ancestral structure Species Tree Inference), that can infer phylogenies under such scenarios, and find that it has increasing accuracy with increasing numbers of input gene trees, contrasting with the relatively poor performances of methods not tailored for ancestral structure. Moreover, we propose a supertree approach that allows TASTI to scale computationally with increasing numbers of input taxa. We use genetic simulations to assess TASTI's performance in the three- and four-taxon settings and demonstrate the application of TASTI on a six-species Afrotropical mosquito data set. Finally, we have implemented TASTI in an open-source software package for ease of use by the scientific community.
Affiliation(s)
- Hillary Koch
  - Department of Statistics, Pennsylvania State University
- Michael DeGiorgio
  - Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University
|
48
|
Grewe F, Ametrano C, Widhelm TJ, Leavitt S, Distefano I, Polyiam W, Pizarro D, Wedin M, Crespo A, Divakar PK, Lumbsch HT. Using target enrichment sequencing to study the higher-level phylogeny of the largest lichen-forming fungi family: Parmeliaceae (Ascomycota). IMA Fungus 2020; 11:27. [PMID: 33317627 PMCID: PMC7734834 DOI: 10.1186/s43008-020-00051-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/24/2020] [Accepted: 11/29/2020] [Indexed: 11/10/2022] [Open Access]
Abstract
Parmeliaceae is the largest family of lichen-forming fungi with a worldwide distribution. We used a target enrichment data set and a qualitative selection method for 250 out of 350 genes to infer the phylogeny of the major clades in this family including 81 taxa, with both subfamilies and all seven major clades previously recognized in the subfamily Parmelioideae. The reduced genome-scale data set was analyzed using concatenated-based Bayesian inference and two different Maximum Likelihood analyses, and a coalescent-based species tree method. The resulting topology was strongly supported with the majority of nodes being fully supported in all three concatenated-based analyses. The two subfamilies and each of the seven major clades in Parmelioideae were strongly supported as monophyletic. In addition, most backbone relationships in the topology were recovered with high nodal support. The genus Parmotrema was found to be polyphyletic and consequently, it is suggested to accept the genus Crespoa to accommodate the species previously placed in Parmotrema subgen. Crespoa. This study demonstrates the power of reduced genome-scale data sets to resolve phylogenetic relationships with high support. Due to lower costs, target enrichment methods provide a promising avenue for phylogenetic studies including larger taxonomic/specimen sampling than whole genome data would allow.
Affiliation(s)
- Felix Grewe
  - Science & Education, The Grainger Bioinformatics Center, Negaunee Integrative Research Center, Gantz Family Collections Center, and Pritzker Laboratory for Molecular Systematics, The Field Museum, 1400 S. Lake Shore Drive, Chicago, IL, USA
- Claudio Ametrano
  - Science & Education, The Grainger Bioinformatics Center, Negaunee Integrative Research Center, Gantz Family Collections Center, and Pritzker Laboratory for Molecular Systematics, The Field Museum, 1400 S. Lake Shore Drive, Chicago, IL, USA
- Todd J Widhelm
  - Science & Education, The Grainger Bioinformatics Center, Negaunee Integrative Research Center, Gantz Family Collections Center, and Pritzker Laboratory for Molecular Systematics, The Field Museum, 1400 S. Lake Shore Drive, Chicago, IL, USA
- Steven Leavitt
  - Department of Biology and M. L. Bean Life Science Museum, Brigham Young University, Provo, UT, USA
- Isabel Distefano
  - Science & Education, The Grainger Bioinformatics Center, Negaunee Integrative Research Center, Gantz Family Collections Center, and Pritzker Laboratory for Molecular Systematics, The Field Museum, 1400 S. Lake Shore Drive, Chicago, IL, USA
- Wetchasart Polyiam
  - Lichen Research Unit, Biology Department, Faculty of Science, Ramkhamhaeng University, Ramkhamhaeng 24 Road, Bangkok, 10240, Thailand
- David Pizarro
  - Departamento de Farmacología, Farmacognosia y Botánica, Facultad de Farmacia, Universidad Complutense de Madrid, 28040, Madrid, Spain
- Mats Wedin
  - Department of Botany, Swedish Museum of Natural History, PO Box 50007, SE-104 05, Stockholm, Sweden
- Ana Crespo
  - Departamento de Farmacología, Farmacognosia y Botánica, Facultad de Farmacia, Universidad Complutense de Madrid, 28040, Madrid, Spain
- Pradeep K Divakar
  - Departamento de Farmacología, Farmacognosia y Botánica, Facultad de Farmacia, Universidad Complutense de Madrid, 28040, Madrid, Spain
- H Thorsten Lumbsch
  - Science & Education, The Grainger Bioinformatics Center, Negaunee Integrative Research Center, Gantz Family Collections Center, and Pritzker Laboratory for Molecular Systematics, The Field Museum, 1400 S. Lake Shore Drive, Chicago, IL, USA
|
49
|
Blair C, Ané C. Phylogenetic Trees and Networks Can Serve as Powerful and Complementary Approaches for Analysis of Genomic Data. Syst Biol 2020; 69:593-601. [PMID: 31432090 DOI: 10.1093/sysbio/syz056] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Received: 03/03/2019] [Accepted: 08/15/2019] [Indexed: 11/14/2022] [Open Access]
Abstract
Genomic data have had a profound impact on nearly every biological discipline. In systematics and phylogenetics, the thousands of loci that are now being sequenced can be analyzed under the multispecies coalescent model (MSC) to explicitly account for gene tree discordance due to incomplete lineage sorting (ILS). However, the MSC assumes no gene flow post divergence, calling for additional methods that can accommodate this limitation. Explicit phylogenetic network methods have emerged, which can simultaneously account for ILS and gene flow by representing evolutionary history as a directed acyclic graph. In this point of view, we highlight some of the strengths and limitations of phylogenetic networks and argue that tree-based inference should not be blindly abandoned in favor of networks simply because they represent more parameter rich models. Attention should be given to model selection of reticulation complexity, and the most robust conclusions regarding evolutionary history are likely obtained when combining tree- and network-based inference.
Affiliation(s)
- Christopher Blair
  - Department of Biological Sciences, New York City College of Technology, The City University of New York, 285 Jay Street, Brooklyn, NY 11201, USA
  - Biology PhD Program, CUNY Graduate Center, 365 5th Ave., New York, NY 10016, USA
- Cécile Ané
  - Department of Botany, University of Wisconsin - Madison, 1300 University Ave, Madison, WI 53706, USA
  - Department of Statistics, University of Wisconsin - Madison, 1300 University Ave, Madison, WI 53706, USA
|
50
|
Chan KO, Hutter CR, Wood PL, Grismer LL, Das I, Brown RM. Gene flow creates a mirage of cryptic species in a Southeast Asian spotted stream frog complex. Mol Ecol 2020; 29:3970-3987. [PMID: 32808335 DOI: 10.1111/mec.15603] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 11/06/2019] [Revised: 07/29/2020] [Accepted: 08/13/2020] [Indexed: 02/06/2023]
Abstract
Most new cryptic species are described using conventional tree- and distance-based species delimitation methods (SDMs), which rely on phylogenetic arrangements and measures of genetic divergence. However, although numerous factors such as population structure and gene flow are known to confound phylogenetic inference and species delimitation, the influence of these processes is not frequently evaluated. Using large numbers of exons, introns, and ultraconserved elements obtained using the FrogCap sequence-capture protocol, we compared conventional SDMs with more robust genomic analyses that assess population structure and gene flow to characterize species boundaries in a Southeast Asian frog complex (Pulchrana picturata). Our results showed that gene flow and introgression can produce phylogenetic patterns and levels of divergence that resemble distinct species (up to 10% divergence in mitochondrial DNA). Hybrid populations were inferred as independent (singleton) clades that were highly divergent from adjacent populations (7%-10%) and unusually similar (<3%) to allopatric populations. Such anomalous patterns are not uncommon in Southeast Asian amphibians, which brings into question whether the high levels of cryptic diversity observed in other amphibian groups reflect distinct cryptic species or, instead, highly admixed and structured metapopulation lineages. Our results also provide an alternative explanation to the conundrum of divergent (sometimes nonsister) sympatric lineages, a pattern that has been celebrated as indicative of true cryptic speciation. Based on these findings, we recommend that species delimitation of continuously distributed "cryptic" groups should not rely solely on conventional SDMs, but should necessarily examine population structure and gene flow to avoid taxonomic inflation.
Affiliation(s)
- Kin O Chan
  - Lee Kong Chian Natural History Museum, Faculty of Science, National University of Singapore, Singapore
- Carl R Hutter
  - Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, USA
  - Museum of Natural Sciences and Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
- Perry L Wood
  - Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, USA
  - Department of Biological Sciences & Museum of Natural History, Auburn University, Auburn, AL, USA
- L L Grismer
  - Herpetology Laboratory, Department of Biology, La Sierra University, Riverside, CA, USA
- Indraneil Das
  - Institute of Biodiversity and Environmental Conservation, Universiti Malaysia Sarawak, Kota Samarahan, Sarawak, Malaysia
- Rafe M Brown
  - Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, USA
|