1
Mo YK, Hahn MW, Smith ML. Applications of machine learning in phylogenetics. Mol Phylogenet Evol 2024; 196:108066. [PMID: 38565358 DOI: 10.1016/j.ympev.2024.108066] [Received: 10/12/2023] [Revised: 02/16/2024] [Accepted: 03/21/2024] [Indexed: 04/04/2024]
Abstract
Machine learning has increasingly been applied to a wide range of questions in phylogenetic inference. Supervised machine learning approaches that rely on simulated training data have been used to infer tree topologies and branch lengths, to select substitution models, and to perform downstream inferences of introgression and diversification. Here, we review how researchers have used several promising machine learning approaches to make phylogenetic inferences. Despite the promise of these methods, several barriers prevent supervised machine learning from reaching its full potential in phylogenetics. We discuss these barriers and potential paths forward. In the future, we expect that the application of careful network designs and data encodings will allow supervised machine learning to accommodate the complex processes that continue to confound traditional phylogenetic methods.
Affiliation(s)
- Yu K Mo
  - Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
- Matthew W Hahn
  - Department of Computer Science, Indiana University, Bloomington, IN 47405, USA; Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Megan L Smith
  - Department of Biological Sciences, Mississippi State University, Starkville, MS 39762, USA
2
Pezzi PH, Wheeler LC, Freitas LB, Smith SD. Incomplete lineage sorting and hybridization underlie tree discordance in Petunia and related genera (Petunieae, Solanaceae). Mol Phylogenet Evol 2024; 198:108136. [PMID: 38909873 DOI: 10.1016/j.ympev.2024.108136] [Received: 03/27/2024] [Revised: 06/06/2024] [Accepted: 06/17/2024] [Indexed: 06/25/2024]
Abstract
Despite the overarching history of species divergence, phylogenetic studies often reveal distinct topologies across regions of the genome. The sources of these gene tree discordances are variable, but incomplete lineage sorting (ILS) and hybridization are among those with the most biological importance. Petunia serves as a classic system for studying hybridization in the wild. While field studies suggest that hybridization is frequent, the extent of reticulation within Petunia and its closely related genera has never been examined from a phylogenetic perspective. In this study, we used transcriptomic data from 11 Petunia, 16 Calibrachoa, and 10 Fabiana species to illuminate the relationships between these species and investigate whether hybridization played a significant role in the diversification of the clade. We inferred that gene tree discordance within genera is linked to hybridization events along with high levels of ILS due to their rapid diversification. Moreover, network analyses estimated deeper hybridization events between Petunia and Calibrachoa, genera that have different chromosome numbers. Although these genera cannot hybridize at the present time, ancestral hybridization could have played a role in their parallel radiations, as they share the same habitat and life history.
Affiliation(s)
- Pedro H Pezzi
  - Department of Genetics, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Lucas C Wheeler
  - Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, USA
- Loreta B Freitas
  - Department of Genetics, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Stacey D Smith
  - Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, USA
3
Zhang R, Drummond AJ, Mendes FK. Fast Bayesian Inference of Phylogenies from Multiple Continuous Characters. Syst Biol 2024; 73:102-124. [PMID: 38085256 PMCID: PMC11129596 DOI: 10.1093/sysbio/syad067] [Received: 11/06/2023] [Revised: 03/23/2023] [Accepted: 11/07/2023] [Indexed: 05/28/2024]
Abstract
Time-scaled phylogenetic trees are an ultimate goal of evolutionary biology and a necessary ingredient in comparative studies. The accumulation of genomic data has resolved the tree of life to a great extent, yet timing evolutionary events remains challenging if not impossible without external information such as fossil ages and morphological characters. Methods for incorporating morphology in tree estimation have lagged behind their molecular counterparts, especially in the case of continuous characters. Despite recent advances, such tools are still direly needed as we approach the limits of what molecules can teach us. Here, we implement a suite of state-of-the-art methods for leveraging continuous morphology in phylogenetics, and by conducting extensive simulation studies we thoroughly validate and explore our methods' properties. While retaining model generality and scalability, we make it possible to estimate absolute and relative divergence times from multiple continuous characters while accounting for uncertainty. We compile and analyze one of the most data-type diverse data sets to date, comprised of contemporaneous and ancient molecular sequences, and discrete and continuous morphological characters from living and extinct Carnivora taxa. We conclude by synthesizing lessons about our method's behavior, and suggest future research avenues.
Affiliation(s)
- Rong Zhang
  - Programme in Emerging Infectious Diseases, Duke-NUS Medical School 169857, Singapore
- Alexei J Drummond
  - Centre for Computational Evolution, The University of Auckland, Auckland 1010, New Zealand
  - School of Biological Sciences, The University of Auckland, Auckland 1010, New Zealand
- Fábio K Mendes
  - Department of Biology, Washington University in St. Louis, St. Louis, MO 63130, USA
4
Pang XX, Zhang DY. Detection of Ghost Introgression Requires Exploiting Topological and Branch Length Information. Syst Biol 2024; 73:207-222. [PMID: 38224495 PMCID: PMC11129598 DOI: 10.1093/sysbio/syad077] [Received: 05/01/2023] [Revised: 12/17/2023] [Accepted: 12/27/2023] [Indexed: 01/17/2024]
Abstract
In recent years, the study of hybridization and introgression has made significant progress, with ghost introgression (the transfer of genetic material from extinct or unsampled lineages to extant species) emerging as a key area for research. Accurately identifying ghost introgression, however, presents a challenge. To address this issue, we focused on simple cases involving 3 species with a known phylogenetic tree. Using mathematical analyses and simulations, we evaluated the performance of popular phylogenetic methods, including HyDe and PhyloNet/MPL, and the full-likelihood method, Bayesian Phylogenetics and Phylogeography (BPP), in detecting ghost introgression. Our findings suggest that heuristic approaches relying on site-pattern counts or gene-tree topologies struggle to differentiate ghost introgression from introgression between sampled non-sister species, frequently leading to incorrect identification of donor and recipient species. By contrast, the full-likelihood method BPP, which uses multilocus sequence alignments directly and hence takes into account both gene-tree topologies and branch lengths, is capable of detecting ghost introgression in phylogenomic datasets. We analyzed a real-world phylogenomic dataset of 14 species of Jaltomata (Solanaceae) to showcase the potential of full-likelihood methods for accurate inference of introgression.
Affiliation(s)
- Xiao-Xu Pang
  - Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing 100875, China
- Da-Yong Zhang
  - Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing 100875, China
5
Mulder KP, Savage AE, Gratwicke B, Longcore JE, Bronikowski E, Evans M, Longo AV, Kurata NP, Walsh T, Pasmans F, McInerney N, Murray S, Martel A, Fleischer RC. Sequence capture identifies fastidious chytrid fungi directly from host tissue. Fungal Genet Biol 2024; 170:103858. [PMID: 38101696 DOI: 10.1016/j.fgb.2023.103858] [Received: 04/20/2023] [Revised: 12/04/2023] [Accepted: 12/12/2023] [Indexed: 12/17/2023]
Abstract
The chytrid fungus Batrachochytrium dendrobatidis (Bd) was discovered in 1998 as the cause of chytridiomycosis, an emerging infectious disease causing mass declines in amphibian populations worldwide. The rapid population declines of the 1970s-1990s were likely caused by the spread of a highly virulent lineage belonging to the Bd-GPL clade that was introduced to naïve susceptible populations. Multiple genetically distinct and regional lineages of Bd have since been isolated and sequenced, greatly expanding the known biological diversity within this fungal pathogen. To date, most Bd research has been restricted to the limited number of samples that could be isolated using culturing techniques, potentially causing a selection bias for strains that can grow on media and missing other unculturable or fastidious strains that are also present on amphibians. We thus attempted to characterize potentially non-culturable genetic lineages of Bd from distinct amphibian taxa using sequence capture technology on DNA extracted from host tissue and swabs. We focused our efforts on host taxa from two different regions that likely harbored distinct Bd clades: (1) wild-caught leopard frogs (Rana) from North America, and (2) a Japanese Giant Salamander (Andrias japonicus) at the Smithsonian Institution's National Zoological Park that exhibited signs of disease and tested positive for Bd using qPCR, but multiple attempts failed to isolate and culture the strain for physiological and genetic characterization. We successfully enriched for and sequenced thousands of fungal genes from both host clades, and Bd load was positively associated with number of recovered Bd sequences. Phylogenetic reconstruction placed all the Rana-derived strains in the Bd-GPL clade. In contrast, the A. japonicus strain fell within the Bd-Asia3 clade, expanding the range of this clade and generating additional genomic data to confirm its placement. 
The retrieved ITS locus matched public barcoding data from wild A. japonicus and Bd infections found on other amphibians in India and China, suggesting that this uncultured clade is widespread across Asia. Our study underscores the importance of recognizing and characterizing the hidden diversity of fastidious strains in order to reconstruct the spatiotemporal and evolutionary history of Bd. The success of the sequence capture approach highlights the utility of directly sequencing pathogen DNA from host tissue to characterize cryptic diversity that is missed by culture-reliant approaches.
Affiliation(s)
- Kevin P Mulder
  - Wildlife Health Ghent, Faculty of Veterinary Medicine, Ghent University, Merelbeke, Belgium; Center for Conservation Genomics, Smithsonian National Zoo and Conservation Biology Institute, Washington, DC, USA
- Anna E Savage
  - Department of Biology, University of Central Florida, Orlando, FL, USA
- Brian Gratwicke
  - Smithsonian's National Zoo and Conservation Biology Institute, Washington, DC, USA
- Joyce E Longcore
  - School of Biology and Ecology, University of Maine, Orono, ME, USA
- Ed Bronikowski
  - Smithsonian's National Zoo and Conservation Biology Institute, Washington, DC, USA
- Matthew Evans
  - Smithsonian's National Zoo and Conservation Biology Institute, Washington, DC, USA
- Ana V Longo
  - Department of Biology, University of Florida, Gainesville, FL, USA
- Naoko P Kurata
  - Center for Conservation Genomics, Smithsonian National Zoo and Conservation Biology Institute, Washington, DC, USA; Department of Natural Resources and the Environment, Cornell University, Ithaca, NY, USA; Department of Ichthyology, American Museum of Natural History, New York, NY, USA
- Tim Walsh
  - Smithsonian's National Zoo and Conservation Biology Institute, Washington, DC, USA
- Frank Pasmans
  - Wildlife Health Ghent, Faculty of Veterinary Medicine, Ghent University, Merelbeke, Belgium
- Nancy McInerney
  - Center for Conservation Genomics, Smithsonian National Zoo and Conservation Biology Institute, Washington, DC, USA
- Suzan Murray
  - Smithsonian's National Zoo and Conservation Biology Institute, Washington, DC, USA
- An Martel
  - Wildlife Health Ghent, Faculty of Veterinary Medicine, Ghent University, Merelbeke, Belgium
- Robert C Fleischer
  - Center for Conservation Genomics, Smithsonian National Zoo and Conservation Biology Institute, Washington, DC, USA
6
Rivas-González I, Schierup MH, Wakeley J, Hobolth A. TRAILS: Tree reconstruction of ancestry using incomplete lineage sorting. PLoS Genet 2024; 20:e1010836. [PMID: 38330138 PMCID: PMC10880969 DOI: 10.1371/journal.pgen.1010836] [Received: 06/21/2023] [Revised: 02/21/2024] [Accepted: 01/22/2024] [Indexed: 02/10/2024]
Abstract
Genome-wide genealogies of multiple species carry detailed information about demographic and selection processes on individual branches of the phylogeny. Here, we introduce TRAILS, a hidden Markov model that accurately infers time-resolved population genetics parameters, such as ancestral effective population sizes and speciation times, for ancestral branches using a multi-species alignment of three species and an outgroup. TRAILS leverages the information contained in incomplete lineage sorting fragments by modelling genealogies along the genome as rooted three-leaved trees, each with a topology and two coalescent events happening in discretized time intervals within the phylogeny. Posterior decoding of the hidden Markov model can be used to infer the ancestral recombination graph for the alignment and details on demographic changes within a branch. Since TRAILS performs posterior decoding at the base-pair level, genome-wide scans based on the posterior probabilities can be devised to detect deviations from neutrality. Using TRAILS on a human-chimp-gorilla-orangutan alignment, we recover speciation parameters and extract information about the topology and coalescent times at high resolution.
Affiliation(s)
- Mikkel H. Schierup
  - Bioinformatics Research Center (BiRC), Aarhus University, Aarhus, Denmark
- John Wakeley
  - Department of Organismic and Evolutionary Biology, Harvard University, Massachusetts, United States of America
- Asger Hobolth
  - Department of Mathematics, Aarhus University, Aarhus, Denmark
7
Piwczyński M, Granjon L, Trzeciak P, Carlos Brito J, Oana Popa M, Daba Dinka M, Johnston NP, Boratyński Z. Unraveling phylogenetic relationships and species boundaries in the arid adapted Gerbillus rodents (Muridae: Gerbillinae) by RAD-seq data. Mol Phylogenet Evol 2023; 189:107913. [PMID: 37659480 DOI: 10.1016/j.ympev.2023.107913] [Received: 02/20/2023] [Revised: 08/25/2023] [Accepted: 08/28/2023] [Indexed: 09/04/2023]
Abstract
Gerbillus is one of the most speciose genera among rodents, with ca. 51 recognized species. Previous attempts to reconstruct the evolutionary history of Gerbillus mainly relied on the mitochondrial cyt-b marker as a source of phylogenetic information. In this study, we utilize RAD-seq genomic data from 37 specimens representing 11 species to reconstruct the phylogenetic tree for Gerbillus, applying concatenation and coalescence methods. We identified four highly supported clades corresponding to the traditionally recognized subgenera: Dipodillus, Gerbillus, Hendecapleura and Monodia. Only two uncertain branches were detected in the resulting trees, with one leading to diversification of the main lineages in the genus, recognized by quartet sampling analysis as uncertain due to possible introgression. We also examined species boundaries for four pairs of sister taxa, including potentially new species from Morocco, using SNAPP. The results strongly supported a speciation model in which all taxa are treated as separate species. The dating analyses confirmed the Plio-Pleistocene diversification of the genus, with the uncertain branch coinciding with the beginning of aridification of the Sahara at the Plio-Pleistocene boundary. This study aligns well with the earlier analyses based on the cyt-b marker, reaffirming its suitability as an adequate marker for estimating genetic diversity in Gerbillus.
Affiliation(s)
- Marcin Piwczyński
  - Department of Ecology and Biogeography, Nicolaus Copernicus University in Toruń, Lwowska 1, PL-87-100 Toruń, Poland
- Laurent Granjon
  - CBGP, IRD, CIRAD, INRAE, Institut Agro, Université de Montpellier, Montpellier, France
- Paulina Trzeciak
  - Department of Ecology and Biogeography, Nicolaus Copernicus University in Toruń, Lwowska 1, PL-87-100 Toruń, Poland
- José Carlos Brito
  - CIBIO-InBio, Research Center in Biodiversity and Genetic Resources, University of Porto, Campus de Vairão, Rua Padre Armando Quintas 7, 4485-661 Vairão, Portugal; BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Campus de Vairão, Vairão, Portugal; Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
- Madalina Oana Popa
  - Department of Ecology and Biogeography, Nicolaus Copernicus University in Toruń, Lwowska 1, PL-87-100 Toruń, Poland; "Stejarul" Research Centre for Biological Sciences, National Institute of Research and Development for Biological Sciences, Alexandru cel Bun 6, RO-610004, Piatra Neamţ, Romania
- Mergi Daba Dinka
  - Department of Ecology and Biogeography, Nicolaus Copernicus University in Toruń, Lwowska 1, PL-87-100 Toruń, Poland
- Nikolas P Johnston
  - School of Life Sciences, University of Technology Sydney, 15 Broadway, Ultimo, NSW 2007, Australia; Centre for Sustainable Ecosystem Solutions, School of Earth, Atmospheric and Life Sciences, University of Wollongong, Northfields Ave, Wollongong, NSW 2500, Australia
- Zbyszek Boratyński
  - CIBIO-InBio, Research Center in Biodiversity and Genetic Resources, University of Porto, Campus de Vairão, Rua Padre Armando Quintas 7, 4485-661 Vairão, Portugal; BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Campus de Vairão, Vairão, Portugal
8
Erlenbach T, Haynes L, Fish O, Beveridge J, Giambrone S, Reed LK, Dyer KA, Scott Chialvo CH. Investigating the phylogenetic history of toxin tolerance in mushroom-feeding Drosophila. Ecol Evol 2023; 13:e10736. [PMID: 38099137 PMCID: PMC10719611 DOI: 10.1002/ece3.10736] [Received: 10/26/2023] [Revised: 11/01/2023] [Accepted: 11/02/2023] [Indexed: 12/17/2023]
Abstract
Understanding how and when key novel adaptations evolved is a central goal of evolutionary biology. Within the immigrans-tripunctata radiation of Drosophila, many mushroom-feeding species are tolerant of host toxins, such as cyclopeptides, that are lethal to nearly all other eukaryotes. In this study, we used phylogenetic and functional approaches to investigate the evolution of cyclopeptide tolerance in the immigrans-tripunctata radiation of Drosophila. First, we inferred the evolutionary relationships among 48 species in this radiation using 978 single copy orthologs. Our results resolved previous incongruities within species groups across the phylogeny. Second, we expanded on previous studies of toxin tolerance by assaying 16 of these species for tolerance to α-amanitin and found that six of them could develop on diet with toxin. Finally, we asked how α-amanitin tolerance might have evolved across the immigrans-tripunctata radiation, and inferred that toxin tolerance was ancestral in mushroom-feeding Drosophila and subsequently lost multiple times. Our findings expand our understanding of toxin tolerance across the immigrans-tripunctata radiation and emphasize the uniqueness of toxin tolerance in this adaptive radiation and the complexity of biochemical adaptations.
Affiliation(s)
- Lauren Haynes
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, Alabama, USA
- Olivia Fish
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, Alabama, USA
- Jordan Beveridge
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, Alabama, USA
- Laura K. Reed
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, Alabama, USA
- Kelly A. Dyer
  - Department of Genetics, University of Georgia, Athens, Georgia, USA
- Clare H. Scott Chialvo
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, Alabama, USA
  - Department of Biology, Appalachian State University, Boone, North Carolina, USA
9
Han Y, Molloy EK. Quartets enable statistically consistent estimation of cell lineage trees under an unbiased error and missingness model. Algorithms Mol Biol 2023; 18:19. [PMID: 38041123 PMCID: PMC10691101 DOI: 10.1186/s13015-023-00248-w] [Received: 10/31/2023] [Accepted: 11/19/2023] [Indexed: 12/03/2023]
Abstract
Cancer progression and treatment can be informed by reconstructing its evolutionary history from tumor cells. Although many methods exist to estimate evolutionary trees (called phylogenies) from molecular sequences, traditional approaches assume the input data are error-free and the output tree is fully resolved. These assumptions are challenged in tumor phylogenetics because single-cell sequencing produces sparse, error-ridden data and because tumors evolve clonally. Here, we study the theoretical utility of methods based on quartets (four-leaf, unrooted phylogenetic trees) in light of these barriers. We consider a popular tumor phylogenetics model, in which mutations arise on a (highly unresolved) tree and then (unbiased) errors and missing values are introduced. Quartets are then implied by mutations present in two cells and absent from two cells. Our main result is that the most probable quartet identifies the unrooted model tree on four cells. This motivates seeking a tree such that the number of quartets shared between it and the input mutations is maximized. We prove an optimal solution to this problem is a consistent estimator of the unrooted cell lineage tree; this guarantee includes the case where the model tree is highly unresolved, with error defined as the number of false negative branches. Lastly, we outline how quartet-based methods might be employed when there are copy number aberrations and other challenges specific to tumor phylogenetics.
Affiliation(s)
- Yunheng Han
  - Department of Computer Science, University of Maryland, College Park, MD, USA
- Erin K Molloy
  - Department of Computer Science, University of Maryland, College Park, MD, USA
  - University of Maryland Institute for Advanced Computer Studies, College Park, MD, USA
10
Erlenbach T, Haynes L, Fish O, Beveridge J, Bingolo E, Giambrone SA, Kropelin G, Rudisill S, Chialvo P, Reed LK, Dyer KA, Chialvo CS. Investigating the phylogenetic history of toxin tolerance in mushroom-feeding Drosophila. bioRxiv 2023:2023.08.03.551872. [PMID: 37577671 PMCID: PMC10418198 DOI: 10.1101/2023.08.03.551872] [Indexed: 08/15/2023]
Abstract
Understanding how and when key novel adaptations evolved is a central goal of evolutionary biology. Within the immigrans-tripunctata radiation of Drosophila, many mushroom-feeding species are tolerant of host toxins, such as cyclopeptides, that are lethal to nearly all other eukaryotes. In this study, we used phylogenetic and functional approaches to investigate the evolution of cyclopeptide tolerance in the immigrans-tripunctata radiation of Drosophila. We first inferred the evolutionary relationships among 48 species in this radiation using 978 single copy orthologs. Our results resolved previous incongruities within species groups across the phylogeny. Second, we expanded on previous studies of toxin tolerance by assaying 16 of these species for tolerance to α-amanitin and found that six of these species could develop on diet with toxin. Third, we examined fly development on a diet containing a natural mix of toxins extracted from the Death Cap Amanita phalloides mushroom. Both tolerant and susceptible species developed on diet with this mix, though tolerant species survived at significantly higher concentrations. Finally, we asked how cyclopeptide tolerance might have evolved across the immigrans-tripunctata radiation and inferred that toxin tolerance was ancestral and subsequently lost multiple times. Our results suggest the evolutionary history of cyclopeptide tolerance is complex, and simply describing this trait as present or absent does not fully capture the occurrence or impact on this adaptive radiation. More broadly, the evolution of novelty can be more complex than previously thought, and accurate descriptions of such novelties are critical in studies examining their evolution.
11
Pardo-De la Hoz CJ, Magain N, Piatkowski B, Cornet L, Dal Forno M, Carbone I, Miadlikowska J, Lutzoni F. Ancient Rapid Radiation Explains Most Conflicts Among Gene Trees and Well-Supported Phylogenomic Trees of Nostocalean Cyanobacteria. Syst Biol 2023; 72:694-712. [PMID: 36827095 DOI: 10.1093/sysbio/syad008] [Received: 03/18/2022] [Revised: 02/12/2023] [Accepted: 02/22/2023] [Indexed: 02/25/2023]
Abstract
Prokaryotic genomes are often considered to be mosaics of genes that do not necessarily share the same evolutionary history due to widespread horizontal gene transfers (HGTs). Consequently, representing evolutionary relationships of prokaryotes as bifurcating trees has long been controversial. However, studies reporting conflicts among gene trees derived from phylogenomic data sets have shown that these conflicts can be the result of artifacts or evolutionary processes other than HGT, such as incomplete lineage sorting, low phylogenetic signal, and systematic errors due to substitution model misspecification. Here, we present the results of an extensive exploration of phylogenetic conflicts in the cyanobacterial order Nostocales, for which previous studies have inferred strongly supported conflicting relationships when using different concatenated phylogenomic data sets. We found that most of these conflicts are concentrated in deep clusters of short internodes of the Nostocales phylogeny, where the great majority of individual genes have low resolving power. We then inferred phylogenetic networks to detect HGT events while also accounting for incomplete lineage sorting. Our results indicate that most conflicts among gene trees are likely due to incomplete lineage sorting linked to an ancient rapid radiation, rather than to HGTs. Moreover, the short internodes of this radiation fit the expectations of the anomaly zone, i.e., a region of the tree parameter space where a species tree is discordant with its most likely gene tree. We demonstrated that concatenation of different sets of loci can recover up to 17 distinct and well-supported relationships within the putative anomaly zone of Nostocales, corresponding to the observed conflicts among well-supported trees based on concatenated data sets from previous studies. 
Our findings highlight the important role of rapid radiations as a potential cause of strongly conflicting phylogenetic relationships when using phylogenomic data sets of bacteria. We propose that polytomies may be the most appropriate phylogenetic representation of these rapid radiations that are part of anomaly zones, especially when all possible genomic markers have been considered to infer these phylogenies. [Anomaly zone; bacteria; horizontal gene transfer; incomplete lineage sorting; Nostocales; phylogenomic conflict; rapid radiation; Rhizonema.].
Affiliation(s)
- Nicolas Magain
  - Evolution and Conservation Biology, InBioS Research Center, Université de Liège, Liège 4000, Belgium
- Bryan Piatkowski
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
- Luc Cornet
  - Evolution and Conservation Biology, InBioS Research Center, Université de Liège, Liège 4000, Belgium
  - BCCM/IHEM, Mycology and Aerobiology, Sciensano, Brussels, Belgium
- Ignazio Carbone
  - Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC 27606, USA
12
Zhao M, Kurtis SM, White ND, Moncrieff AE, Leite RN, Brumfield RT, Braun EL, Kimball RT. Exploring Conflicts in Whole Genome Phylogenetics: A Case Study Within Manakins (Aves: Pipridae). Syst Biol 2023; 72:161-178. [PMID: 36130303 PMCID: PMC10452962 DOI: 10.1093/sysbio/syac062] [Received: 10/20/2021] [Revised: 09/03/2022] [Accepted: 09/06/2022] [Indexed: 11/13/2022]
Abstract
Some phylogenetic problems remain unresolved even when large amounts of sequence data are analyzed and methods that accommodate processes such as incomplete lineage sorting are employed. In addition to investigating biological sources of phylogenetic incongruence, it is also important to reduce noise in the phylogenomic dataset by using an appropriate filtering approach that addresses gene tree estimation errors. We present the results of a case study in manakins, focusing on the very difficult clade comprising the genera Antilophia and Chiroxiphia. Previous studies suggest that Antilophia is nested within Chiroxiphia, though relationships among Antilophia+Chiroxiphia species have been highly unstable. We extracted more than 11,000 loci (ultra-conserved elements and introns) from whole genomes and conducted analyses using concatenation and multispecies coalescent methods. Topologies resulting from analyses using all loci differed depending on the data type and analytical method, with 2 clades (Antilophia+Chiroxiphia and Manacus+Pipra+Machaeropterus) in the manakin tree showing incongruent results. We hypothesized that gene trees that conflicted with a long coalescent branch (e.g., the branch uniting Antilophia+Chiroxiphia) might be enriched for cases of gene tree estimation error, so we conducted analyses that either constrained those gene trees to include monophyly of Antilophia+Chiroxiphia or excluded these loci. While constraining trees reduced some incongruence, excluding the trees led to completely congruent species trees, regardless of the data type or model of sequence evolution used. We found that a suite of gene metrics (most importantly the number of informative sites and likelihood of intralocus recombination) collectively explained the loci that resulted in non-monophyly of Antilophia+Chiroxiphia. 
We also found evidence for introgression that may have contributed to the discordant topologies we observe in Antilophia+Chiroxiphia and led to deviations from expectations given the multispecies coalescent model. Our study highlights the importance of identifying factors that can obscure phylogenetic signal when dealing with recalcitrant phylogenetic problems, such as gene tree estimation error, incomplete lineage sorting, and reticulation events. [Birds; c-gene; data type; gene estimation error; model fit; multispecies coalescent; phylogenomics; reticulation].
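The locus-exclusion step described in this abstract can be illustrated with a minimal sketch. It assumes rooted gene trees written as nested Python tuples and uses made-up taxon and locus names; it is not the authors' pipeline, which the abstract does not specify.

```python
# Hypothetical sketch: keep only loci whose gene tree recovers a focal clade.
# Trees are rooted nested tuples; leaves are taxon-name strings (an assumption).

def leaves(tree):
    """Return the set of leaf names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def clades(tree):
    """Return the leaf sets of all internal nodes of a rooted tree."""
    if isinstance(tree, str):
        return set()
    found = {leaves(tree)}
    for child in tree:
        found |= clades(child)
    return found

def is_monophyletic(tree, focal):
    """True if the focal taxa present in the tree form a clade.
    Trees with fewer than two focal taxa are treated as uninformative."""
    present = frozenset(focal) & leaves(tree)
    if len(present) < 2:
        return True
    return present in clades(tree)

# Illustrative taxon labels and loci (not from the study).
focal = {"Antilophia", "Chiroxiphia_a", "Chiroxiphia_b"}
loci = {
    "locus1": ((("Antilophia", "Chiroxiphia_a"), "Chiroxiphia_b"), ("Manacus", "Pipra")),
    "locus2": ((("Antilophia", "Manacus"), "Chiroxiphia_a"), ("Chiroxiphia_b", "Pipra")),
}
kept = {name: t for name, t in loci.items() if is_monophyletic(t, focal)}
print(sorted(kept))
```

Because monophyly depends on rooting, a real analysis would need to root or otherwise orient each gene tree consistently before applying such a filter.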
Affiliation(s)
- Min Zhao
  - Department of Biology, University of Florida, Gainesville, FL 32611, USA
- Sarah M Kurtis
  - Department of Biology, University of Florida, Gainesville, FL 32611, USA
- Noor D White
  - Neurobiology-Neurodegeneration and Repair Laboratory, National Eye Institute, Bethesda, MD 20892, USA
  - Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC 20560, USA
- Andre E Moncrieff
  - Department of Biological Sciences and Museum of Natural Science, Louisiana State University, Baton Rouge, LA 70803, USA
- Rafael N Leite
  - Graduate Program in Ecology, National Institute of Amazonian Research, Manaus, AM, Brazil
- Robb T Brumfield
  - Department of Biological Sciences and Museum of Natural Science, Louisiana State University, Baton Rouge, LA 70803, USA
- Edward L Braun
  - Department of Biology, University of Florida, Gainesville, FL 32611, USA
- Rebecca T Kimball
  - Department of Biology, University of Florida, Gainesville, FL 32611, USA
|
13
|
Yusuf LH, Tyukmaeva V, Hoikkala A, Ritchie MG. Divergence and introgression among the virilis group of Drosophila. Evol Lett 2022; 6:537-551. [PMID: 36579165 PMCID: PMC9783487 DOI: 10.1002/evl3.301] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/10/2022] [Revised: 09/23/2022] [Accepted: 10/12/2022] [Indexed: 12/03/2022] Open
Abstract
Speciation with gene flow is now widely regarded as common. However, the frequency of introgression between recently diverged species and the evolutionary consequences of gene flow are still poorly understood. The virilis group of Drosophila contains 12 species that are geographically widespread and show varying levels of prezygotic and postzygotic isolation. Here, we use de novo genome assemblies and whole-genome sequencing data to resolve phylogenetic relationships and describe patterns of introgression and divergence across the group. We suggest that the virilis group consists of three, rather than the traditional two, subgroups. Some genes undergoing rapid sequence divergence across the group were involved in chemical communication and desiccation tolerance, and may be related to the evolution of sexual isolation and adaptation. We found evidence of pervasive phylogenetic discordance caused by ancient introgression events between distant lineages within the group, and more recent gene flow between closely related species. When assessing patterns of genome-wide divergence in species pairs across the group, we found no consistent genomic evidence of a disproportionate role for the X chromosome as has been found in other systems. Our results show how ancient and recent introgressions confuse phylogenetic reconstruction, but may play an important role during early radiation of a group.
Affiliation(s)
- Leeban H. Yusuf
  - Centre for Biological Diversity, School of Biology, University of St Andrews, St Andrews KY16 9TH, United Kingdom
- Venera Tyukmaeva
  - Centre for Biological Diversity, School of Biology, University of St Andrews, St Andrews KY16 9TH, United Kingdom
  - Department of Evolution, Ecology and Behaviour, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Anneli Hoikkala
  - Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä 40014, Finland
- Michael G. Ritchie
  - Centre for Biological Diversity, School of Biology, University of St Andrews, St Andrews KY16 9TH, United Kingdom
|
14
|
Thureborn O, Razafimandimbison SG, Wikström N, Rydin C. Target capture data resolve recalcitrant relationships in the coffee family (Rubioideae, Rubiaceae). Front Plant Sci 2022; 13:967456. [PMID: 36160958 PMCID: PMC9493367 DOI: 10.3389/fpls.2022.967456] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/12/2022] [Accepted: 08/03/2022] [Indexed: 06/16/2023]
Abstract
Subfamily Rubioideae is the largest of the main lineages in the coffee family (Rubiaceae), with over 8,000 species and 29 tribes. Phylogenetic relationships among tribes and other major clades within this group of plants are still only partly resolved despite considerable efforts. While previous studies have mainly utilized data from the organellar genomes and nuclear ribosomal DNA, we here use a large number of low-copy nuclear genes obtained via a target capture approach to infer phylogenetic relationships within Rubioideae. We included 101 Rubioideae species representing all but two (the monogeneric tribes Foonchewieae and Aitchinsonieae) of the currently recognized tribes, and all but one non-monogeneric tribe were represented by more than one genus. Using data from the 353 genes targeted with the universal Angiosperms353 probe set, we investigated the impact of data type, analytical approach, and potential paralogs on phylogenetic reconstruction. We inferred a robust phylogenetic hypothesis of Rubioideae with the vast majority of (or all) nodes being highly supported across all analyses and datasets and few incongruences between the inferred topologies. The results were similar to those of previous studies, but novel relationships were also identified. We found that supercontigs [coding sequence (CDS) + non-coding sequence] clearly outperformed CDS data in levels of support and gene tree congruence. The full datasets (353 genes) outperformed the datasets with potentially paralogous genes removed (186 genes) in levels of support but increased gene tree incongruence slightly. The pattern of gene tree conflict at short internal branches was often consistent with high levels of incomplete lineage sorting (ILS) due to rapid speciation in the group. While concatenation- and coalescence-based trees mainly agreed, the observed phylogenetic discordance between the two approaches may be best explained by their differences in accounting for ILS.
The use of target capture data greatly improved our confidence and understanding of the Rubioideae phylogeny, highlighted by the increased support for previously uncertain relationships and the increased possibility to explore sources of underlying phylogenetic discordance.
Affiliation(s)
- Olle Thureborn
  - Department of Ecology, Environment and Plant Sciences, Stockholm University, Stockholm, Sweden
- Niklas Wikström
  - Department of Ecology, Environment and Plant Sciences, Stockholm University, Stockholm, Sweden
  - Bergius Foundation, Royal Swedish Academy of Sciences, Stockholm, Sweden
- Catarina Rydin
  - Department of Ecology, Environment and Plant Sciences, Stockholm University, Stockholm, Sweden
  - Bergius Foundation, Royal Swedish Academy of Sciences, Stockholm, Sweden
|
15
|
Kück P, Romahn J, Meusemann K. Pitfalls of the site-concordance factor (sCF) as measure of phylogenetic branch support. NAR Genom Bioinform 2022; 4:lqac064. [PMID: 36128424 PMCID: PMC9477076 DOI: 10.1093/nargab/lqac064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2022] [Revised: 08/10/2022] [Accepted: 08/17/2022] [Indexed: 12/01/2022] Open
Abstract
Confidence measures of branch reliability play an important role in phylogenetics, as these measures make it possible to identify trees or parts of a tree that are well supported by the data and thus adequate to serve as a basis for evolutionary inference of biological systems. Unreliable branch relationships in phylogenetic analyses are of concern because of their potential to represent incorrect relationships of interest among more reliable branch relationships. The site-concordance factor implemented in the IQ-TREE package is a recently introduced heuristic solution to the problem of identifying unreliable branch relationships on the basis of quartets. We test the performance of the site-concordance measure with simple examples based on simulated data and designed to study its behaviour in branch support estimates related to different degrees of branch length heterogeneity in a ten-sequence tree. Our results show that, particularly in cases of relationships with heterogeneous branch lengths, site-concordance measures may be misleading. We therefore argue that the maximum parsimony optimality criterion currently used by the site-concordance measure may sometimes be poorly suited to evaluate branch support and that the scores reported by the site-concordance factor should not be considered as reliable.
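A quartet-based, parsimony-style site-concordance score of the kind discussed in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not IQ-TREE's implementation: it treats a site as decisive when it is parsimony-informative for the quartet, enumerates all quartets instead of sampling them, and returns a proportion rather than a percentage. The function names and sequences are invented for the example.

```python
from itertools import product

VALID = set("ACGT")

def quartet_site_support(a, b, c, d):
    """Count decisive sites supporting ab|cd, ac|bd and ad|bc.
    Assumption: a decisive site has exactly two states, each shared by
    two of the four aligned sequences; gaps and ambiguity codes are skipped."""
    counts = [0, 0, 0]
    for w, x, y, z in zip(a, b, c, d):
        if not {w, x, y, z} <= VALID:
            continue
        if w == x and y == z and w != y:
            counts[0] += 1
        elif w == y and x == z and w != x:
            counts[1] += 1
        elif w == z and x == y and w != x:
            counts[2] += 1
    return tuple(counts)

def site_concordance_factor(group_a, group_b, group_c, group_d):
    """Mean fraction of decisive sites supporting the branch that separates
    groups A and B from groups C and D, over all quartets taking one
    sequence from each group."""
    values = []
    for a, b, c, d in product(group_a, group_b, group_c, group_d):
        support = quartet_site_support(a, b, c, d)
        total = sum(support)
        if total:
            values.append(support[0] / total)
    return sum(values) / len(values) if values else float("nan")

# Toy aligned sequences (illustrative only).
a, b, c, d = "AACAG", "AACCG", "GGCAT", "GGTCT"
print(quartet_site_support(a, b, c, d))
print(site_concordance_factor([a], [b], [c], [d]))
```

Because each site is scored by a simple pattern match with no model of multiple substitutions, long branches can inflate support for a wrong resolution, which is consistent with the concern the abstract raises about heterogeneous branch lengths.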
Affiliation(s)
- Patrick Kück
  - Centre for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change, Adenauerallee 160, 53113 Bonn, Germany
- Juliane Romahn
  - Centre for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change, Adenauerallee 160, 53113 Bonn, Germany
  - LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Senckenberganlage 25, 60325 Frankfurt am Main, Germany
  - Senckenberg Society for Nature Research, Senckenberganlage 25, 60325 Frankfurt am Main, Germany
- Karen Meusemann
  - Directorate, Leibniz Institute for the Analysis of Biodiversity Change, Adenauerallee 160, 53113 Bonn, Germany
|
16
|
Smith ML, Vanderpool D, Hahn MW. Using all gene families vastly expands data available for phylogenomic inference. Mol Biol Evol 2022; 39:6596367. [PMID: 35642314 PMCID: PMC9178227 DOI: 10.1093/molbev/msac112] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/25/2022] Open
Abstract
Traditionally, single-copy orthologs have been the gold standard in phylogenomics. Most phylogenomic studies identify putative single-copy orthologs using clustering approaches and retain families with a single sequence per species. This limits the amount of data available by excluding larger families. Recent advances have suggested several ways to include data from larger families. For instance, tree-based decomposition methods facilitate the extraction of orthologs from large families. Additionally, several methods for species tree inference are robust to the inclusion of paralogs and could use all of the data from larger families. Here, we explore the effects of using all families for phylogenetic inference by examining relationships among 26 primate species in detail and by analyzing five additional data sets. We compare single-copy families, orthologs extracted using tree-based decomposition approaches, and all families with all data. We explore several species tree inference methods, finding that identical trees are returned across nearly all subsets of the data and methods for primates. The relationships among Platyrrhini remain contentious; however, the species tree inference method matters more than the subset of data used. Using data from larger gene families drastically increases the number of genes available and leads to consistent estimates of branch lengths, nodal certainty and concordance, and inferences of introgression in primates. For the other data sets, topological inferences are consistent whether single-copy families or orthologs extracted using decomposition approaches are analyzed. Using larger gene families is a promising approach to include more data in phylogenomics without sacrificing accuracy, at least when high-quality genomes are available.
Affiliation(s)
- Megan L Smith
  - Department of Biology and Department of Computer Science, Indiana University, Bloomington, Indiana, USA
- Dan Vanderpool
  - Department of Biology and Department of Computer Science, Indiana University, Bloomington, Indiana, USA
- Matthew W Hahn
  - Department of Biology and Department of Computer Science, Indiana University, Bloomington, Indiana, USA
|
17
|
Giaretta A, Murphy B, Maurin O, Mazine FF, Sano P, Lucas E. Phylogenetic Relationships Within the Hyper-Diverse Genus Eugenia (Myrtaceae: Myrteae) Based on Target Enrichment Sequencing. Front Plant Sci 2022; 12:759460. [PMID: 35185945 PMCID: PMC8855041 DOI: 10.3389/fpls.2021.759460] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/16/2021] [Accepted: 11/29/2021] [Indexed: 06/14/2023]
Abstract
Eugenia is one of the most taxonomically challenging lineages of flowering plants, in which morphological delimitation has changed over the last few years as a result of recent phylogenetic studies based on molecular data. Efforts, until now, have been limited to Sanger sequencing of mostly plastid markers. These phylogenetic studies indicate 11 clades formalized as infrageneric groups. However, relationships among these clades are poorly supported at key nodes and inconsistent between studies, particularly along the backbone and within Eugenia sect. Umbellatae, which encompasses ca. 700 species. To resolve and better understand systematic discordance, 54 Eugenia taxa were subjected to phylogenomic Hyb-Seq using 353 low-copy nuclear genes. Twenty species trees based on coding and non-coding loci of nuclear and plastid datasets were recovered using coalescent and concatenated approaches. Concordant and conflicting topologies were assessed by comparing tree landscapes, topology tests, and gene and site concordance factors. The topologies are similar except between nuclear and plastid datasets. The coalescent trees better accommodate disparity in the intron dataset, which contains more parsimony informative sites, while concatenated trees recover more conservative topologies, as they have a narrower distribution in the tree landscape. This suggests that highly supported phylogenetic relationships determined in previous studies do not necessarily indicate overwhelming concordant signal. Congruence must be interpreted carefully, especially in concatenated datasets. Despite this, the congruence between the multi-species coalescent (MSC) approach and concatenated tree topologies found here is notable. Our analysis does not support Eugenia subg. Pseudeugenia or sect. Pilothecium, as currently circumscribed, suggesting necessary taxonomic reassessment. Five clades are further discussed within Eugenia sect. Umbellatae as progress toward its division into workable clades.
While targeted sequencing provides a massive quantity of data that improves phylogenetic resolution in Eugenia, uncertainty still remains in Eugenia sect. Umbellatae. The general pattern of higher site concordance factor (CF) than gene CF in the backbone of Eugenia suggests stochastic error from limited signal. Tree landscapes in combination with concordance factor scores, as implemented here, provide a comprehensive approach that incorporates several phylogenetic hypotheses. We believe the protocols employed here will be of use for future investigations on the evolutionary history of Myrtaceae.
Affiliation(s)
- Augusto Giaretta
  - Faculdade de Ciências Biológicas e Ambientais, Universidade Federal da Grande Dourados, Unidade II, Dourados, Brazil
  - Laboratório de Sistemática Vegetal, Departamento de Botânica, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
- Bruce Murphy
  - Jodrell Laboratory, Royal Botanic Gardens, Kew, Surrey, United Kingdom
  - Department of Life Sciences, Imperial College, London, United Kingdom
- Olivier Maurin
  - Jodrell Laboratory, Royal Botanic Gardens, Kew, Surrey, United Kingdom
- Fiorella F. Mazine
  - Centro de Ciências e Tecnologias para a Sustentabilidade, Universidade Federal de São Carlos, Campus Sorocaba, Sorocaba, Brazil
- Paulo Sano
  - Laboratório de Sistemática Vegetal, Departamento de Botânica, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
- Eve Lucas
  - Herbarium, Royal Botanic Gardens, Kew, Surrey, United Kingdom
|
18
|
Utilizing museomics to trace the complex history and species boundaries in an avian-study system of conservation concern. Heredity (Edinb) 2022; 128:159-168. [PMID: 35082388 PMCID: PMC8897408 DOI: 10.1038/s41437-022-00499-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2021] [Revised: 12/23/2021] [Accepted: 01/05/2022] [Indexed: 11/08/2022] Open
Abstract
A taxonomic classification that accurately captures evolutionary history is essential for conservation. Genomics provides powerful tools for delimiting species and understanding their evolutionary relationships. This allows for a more accurate and detailed view on conservation status compared with other, traditionally used, methods. However, from a practical and ethical perspective, gathering sufficient samples for endangered taxa may be difficult. Here, we use museum specimens to trace the evolutionary history and species boundaries in an Asian oriole clade. The endangered silver oriole has long been recognized as a distinct species based on its unique coloration, but a recent study suggested that it might be nested within the maroon oriole-species complex. To evaluate species designation, population connectivity, and the corresponding conservation implications, we assembled a de novo genome and used whole-genome resequencing of historical specimens. Our results show that the silver orioles form a monophyletic lineage within the maroon oriole complex and that maroon and silver forms continued to interbreed after initial divergence, but do not show signs of recent gene flow. Using a genome scan, we identified genes that may form the basis for color divergence and act as reproductive barriers. Taken together, our results confirm the species status of the silver oriole and highlight that taxonomic revision of the maroon forms is urgently needed. Our study demonstrates how genomics and Natural History Collections (NHC) can be utilized to shed light on the taxonomy and evolutionary history of natural populations and how such insights can directly benefit conservation practitioners when assessing wild populations.
|
19
|
Morel B, Schade P, Lutteropp S, Williams TA, Szöllősi GJ, Stamatakis A. SpeciesRax: A tool for maximum likelihood species tree inference from gene family trees under duplication, transfer, and loss. Mol Biol Evol 2022; 39:6503503. [PMID: 35021210 PMCID: PMC8826479 DOI: 10.1093/molbev/msab365] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 11/17/2022] Open
Abstract
Species tree inference from gene family trees is becoming increasingly popular because it can account for discordance between the species tree and the corresponding gene family trees. In particular, methods that can account for multiple-copy gene families exhibit potential to leverage paralogy as informative signal. At present, there does not exist any widely adopted inference method for this purpose. Here, we present SpeciesRax, the first maximum likelihood method that can infer a rooted species tree from a set of gene family trees and can account for gene duplication, loss, and transfer events. By explicitly modeling events by which gene trees can depart from the species tree, SpeciesRax leverages the phylogenetic rooting signal in gene trees. SpeciesRax infers species tree branch lengths in units of expected substitutions per site and branch support values via paralogy-aware quartets extracted from the gene family trees. Using both empirical and simulated data sets we show that SpeciesRax is at least as accurate as the best competing methods while being one order of magnitude faster on large data sets at the same time. We used SpeciesRax to infer a biologically plausible rooted phylogeny of the vertebrates comprising 188 species from 31,612 gene families in 1 h using 40 cores. SpeciesRax is available under GNU GPL at https://github.com/BenoitMorel/GeneRax and on BioConda.
Affiliation(s)
- Benoit Morel
  - Computational Molecular Evolution group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Paul Schade
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Sarah Lutteropp
  - Computational Molecular Evolution group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Tom A Williams
  - School of Biological Sciences, University of Bristol, Bristol, UK
- Gergely J Szöllősi
  - ELTE-MTA "Lendület" Evolutionary Genomics Research Group, Pázmány P. stny. 1A., H-1117 Budapest, Hungary
  - Dept. Biological Physics, Eötvös University, Pázmány P. stny. 1A., H-1117 Budapest, Hungary
  - Institute of Evolution, Centre for Ecological Research, Konkoly-Thege M. út 29-33. H-1121 Budapest, Hungary
- Alexandros Stamatakis
  - Computational Molecular Evolution group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
|
20
|
Doronina L, Feigin CY, Schmitz J. OUP accepted manuscript. Syst Biol 2022; 71:1045-1053. [PMID: 35289914 PMCID: PMC9366447 DOI: 10.1093/sysbio/syac025] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/22/2021] [Revised: 03/09/2022] [Accepted: 03/11/2022] [Indexed: 11/29/2022] Open
Abstract
Although first posited to be of a single origin, the two superfamilies of phalangeriform marsupial possums (Phalangeroidea: brushtail possums and cuscuses and Petauroidea: possums and gliders) have long been considered, based on multiple sequencing studies, to have evolved from two separate origins. However, previous data from these sequence analyses suggested a variety of conflicting trees. Therefore, we reinvestigated these relationships by screening ~200,000 orthologous short interspersed element (SINE) loci across the newly available whole-genome sequences of phalangeriform species and their relatives. Compared to sequence data, SINE presence/absence patterns are evolutionarily almost neutral molecular markers of the phylogenetic history of species. Their random and highly complex genomic insertion ensures their virtually homoplasy-free nature and enables one to compare hundreds of shared unique orthologous events to determine the true species tree. Here, we identify 106 highly reliable phylogenetic SINE markers whose presence/absence patterns within multiple Australasian possum genomes unexpectedly provide the first significant evidence for the reunification of Australasian possums into one monophyletic group. Together, our findings indicate that nucleotide homoplasy and ancestral incomplete lineage sorting have most likely driven the conflicting signal distributions seen in previous sequence-based studies. [Ancestral incomplete lineage sorting; possum genomes; possum monophyly; retrophylogenomics; SINE presence/absence.]
Affiliation(s)
- Liliya Doronina
  - Institute of Experimental Pathology (ZMBE), University of Münster, Von-Esmarch-Str. 56, D-48149 Münster, Germany
- Charles Y Feigin
  - Department of Molecular Biology, Princeton University, 119 Lewis Thomas Laboratory, Washington Road, Princeton, NJ 08544-1014, USA
  - School of BioSciences, The University of Melbourne, BioSciences 4, Royal Pde, Parkville, VIC 3010, Australia
- Jürgen Schmitz (corresponding author)
  - Institute of Experimental Pathology (ZMBE), University of Münster, Von-Esmarch-Str. 56, D-48149 Münster, Germany
|
21
|
Schull JK, Turakhia Y, Hemker JA, Dally WJ, Bejerano G. OUP accepted manuscript. Genome Biol Evol 2022; 14:6529394. [PMID: 35171243 PMCID: PMC8920512 DOI: 10.1093/gbe/evac013] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 01/10/2022] [Indexed: 11/14/2022] Open
Abstract
We present Champagne, a whole-genome method for generating character matrices for phylogenomic analysis using large genomic indel events. By rigorously picking orthologous genes and locating large insertion and deletion events, Champagne delivers a character matrix that considerably reduces homoplasy compared with morphological and nucleotide-based matrices, on both established phylogenies and difficult-to-resolve nodes in the mammalian tree. Champagne provides ample evidence in the form of genomic structural variation to support incomplete lineage sorting and possible introgression in Paenungulata and human–chimp–gorilla which were previously inferred primarily through matrices composed of aligned single-nucleotide characters. Champagne also offers further evidence for Myomorpha as sister to Sciuridae and Hystricomorpha in the rodent tree. Champagne harbors distinct theoretical advantages as an automated method that produces nearly homoplasy-free character matrices on the whole-genome scale.
Affiliation(s)
- James K Schull
  - Department of Computer Science, Stanford University, USA
- Yatish Turakhia
  - Department of Electrical and Computer Engineering, University of California San Diego, USA
- James A Hemker
  - Department of Computer Science, Stanford University, USA
- William J Dally
  - Department of Computer Science, Stanford University, USA
  - NVIDIA, Santa Clara, California, USA
  - Department of Electrical Engineering, Stanford University, USA
- Gill Bejerano (corresponding author)
  - Department of Computer Science, Stanford University, USA
  - Department of Developmental Biology, Stanford University, USA
  - Department of Biomedical Data Science, Stanford University, USA
  - Department of Pediatrics, Stanford University, USA
|
22
|
Jofre GI, Singh A, Mavengere H, Sundar G, D'Agostino E, Chowdhary A, Matute DR. An Indian lineage of Histoplasma with strong signatures of differentiation and selection. Fungal Genet Biol 2022; 158:103654. [PMID: 34942368 DOI: 10.1016/j.fgb.2021.103654] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/02/2021] [Revised: 12/06/2021] [Accepted: 12/11/2021] [Indexed: 01/04/2023]
Abstract
Histoplasma, a genus of dimorphic fungi, is the etiological agent of histoplasmosis, a pulmonary disease widespread across the globe. Whole genome sequencing has revealed that the genus harbors a previously unrecognized diversity of cryptic species. To date, studies have focused on Histoplasma isolates collected in the Americas, with little knowledge of the genomic variation from other localities. Here, we report the existence of a well-differentiated lineage of Histoplasma occurring in the Indian subcontinent. The group is differentiated enough to satisfy the requirements of a phylogenetic species, as it shows extensive genetic differentiation along the whole genome and has little evidence of gene exchange with other Histoplasma species. Next, we leverage this genetic differentiation to identify genetic changes that are unique to this group and that have putatively evolved through rapid positive selection. We found that none of the previously known virulence factors have evolved rapidly in the Indian lineage, but we find evidence of strong signatures of selection on other alleles potentially involved in clinically important phenotypes. Our work serves as an example of the importance of correctly identifying species boundaries to understand the extent of selection in the evolution of pathogenic lineages. IMPORTANCE: Whole genome sequencing has revolutionized our understanding of microbial diversity, including human pathogens. In the case of fungal pathogens, a limiting factor in understanding the extent of their genetic diversity has been the lack of systematic sampling. In this piece, we show the results of a collection in the Indian subcontinent of the pathogenic fungus Histoplasma, the causal agent of a systemic mycosis. We find that Indian samples of Histoplasma form a distinct clade which is highly differentiated from other Histoplasma species. We also show that the genome of this lineage shows unique signals of natural selection.
This work exemplifies how the combination of robust sampling with population genetics and phylogenetics can reveal the precise genetic changes that differentiate lineages of fungal pathogens.
Affiliation(s)
- Gaston I Jofre
  - Department of Biology, University of North Carolina, Chapel Hill, NC, United States
- Ashutosh Singh
  - National Reference Laboratory for Antimicrobial Resistance in Fungal Pathogens, Medical Mycology Unit, Department of Microbiology, Vallabhbhai Patel Chest Institute, University of Delhi, Delhi, India
- Heidi Mavengere
  - Department of Biology, University of North Carolina, Chapel Hill, NC, United States
- Gandhi Sundar
  - National Reference Laboratory for Antimicrobial Resistance in Fungal Pathogens, Medical Mycology Unit, Department of Microbiology, Vallabhbhai Patel Chest Institute, University of Delhi, Delhi, India
- Emmanuel D'Agostino
  - Department of Biology, University of North Carolina, Chapel Hill, NC, United States
- Anuradha Chowdhary
  - National Reference Laboratory for Antimicrobial Resistance in Fungal Pathogens, Medical Mycology Unit, Department of Microbiology, Vallabhbhai Patel Chest Institute, University of Delhi, Delhi, India
- Daniel R Matute
  - Department of Biology, University of North Carolina, Chapel Hill, NC, United States
|
23
|
Smith ML, Hahn MW. The Frequency and Topology of Pseudoorthologs. Syst Biol 2021; 71:649-659. [PMID: 34951639 DOI: 10.1093/sysbio/syab097] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/10/2021] [Revised: 12/15/2021] [Accepted: 12/17/2021] [Indexed: 11/12/2022] Open
Abstract
Phylogenetics has long relied on the use of orthologs, or genes related through speciation events, to infer species relationships. However, identifying orthologs is difficult because gene duplication can obscure relationships among genes. Researchers have been particularly concerned with the insidious effects of pseudoorthologs: duplicated genes that are mistaken for orthologs because they are present in a single copy in each sampled species. Because gene tree topologies of pseudoorthologs may differ from the species tree topology, they have often been invoked as the cause of counterintuitive results in phylogenetics. Despite these perceived problems, no previous work has calculated the probabilities of pseudoortholog topologies, or has been able to circumscribe the regions of parameter space in which pseudoorthologs are most likely to occur. Here, we introduce a model for calculating the probabilities and branch lengths of orthologs and pseudoorthologs, including concordant and discordant pseudoortholog topologies, on a rooted three-taxon species tree. We show that the probability of orthologs is high relative to the probability of pseudoorthologs across reasonable regions of parameter space. Furthermore, the probabilities of the two discordant topologies are equal and never exceed that of the concordant topology, generally being much lower. We describe the species tree topologies most prone to generating pseudoorthologs, finding that they are likely to present problems to phylogenetic inference irrespective of the presence of pseudoorthologs. Overall, our results suggest that pseudoorthologs are unlikely to mislead inferences of species relationships under the biological scenarios considered here.
Affiliation(s)
- Megan L Smith
  - Department of Biology and Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
- Matthew W Hahn
  - Department of Biology and Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
|
24
|
How challenging RADseq data turned out to favor coalescent-based species tree inference. A case study in Aichryson (Crassulaceae). Mol Phylogenet Evol 2021; 167:107342. [PMID: 34785384 DOI: 10.1016/j.ympev.2021.107342] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/03/2020] [Revised: 07/05/2021] [Accepted: 10/29/2021] [Indexed: 12/24/2022]
Abstract
Analysing multiple genomic regions while incorporating detection and qualification of discordance among regions has become standard for understanding phylogenetic relationships. In plants, which usually have comparatively large genomes, this is feasible by combining reduced-representation library (RRL) methods with high-throughput sequencing, which enables the cost-effective acquisition of genomic data for thousands of loci from hundreds of samples. One popular RRL method is RADseq. A major disadvantage of established RADseq approaches is the rather short fragment and sequencing range, leading to loci of little individual phylogenetic information. This issue hampers the application of coalescent-based species tree inference. The modified RADseq protocol presented here, which targets ca. 5,000 loci of 300-600 nt length sequenced with the latest short-read-sequencing (SRS) technology, has the potential to overcome this drawback. To illustrate the advantages of this approach we use the study group Aichryson Webb & Berthelott (Crassulaceae), a plant genus that diversified on the Canary Islands. The data analysis approach used here aims at a careful quality control of the long loci dataset. It involves an informed selection of thresholds for accurate clustering, a thorough exploration of locus properties, such as locus length, coverage and variability, to identify potentially biased data, and a comparative phylogenetic inference of filtered datasets, accompanied by an evaluation of resulting BS support, gene and site concordance factor values, to improve overall resolution of the resulting phylogenetic trees. The final dataset contains variable loci with an average length of 373 nt and facilitates species tree estimation using a coalescent-based summary approach. Additional improvements brought by the approach are critically discussed.
|
25
|
Hamilton CA, Winiger N, Rubin JJ, Breinholt J, Rougerie R, Kitching IJ, Barber JR, Kawahara AY. Hidden phylogenomic signal helps elucidate arsenurine silkmoth phylogeny and the evolution of body size and wing shape trade-offs. Syst Biol 2021; 71:859-874. [PMID: 34791485 DOI: 10.1093/sysbio/syab090] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/12/2020] [Revised: 10/29/2021] [Accepted: 11/01/2021] [Indexed: 11/13/2022]
Abstract
One of the key objectives in biological research is understanding how evolutionary processes have produced Earth's diversity. A critical step towards revealing these processes is an investigation of evolutionary trade-offs, that is, the opposing pressures of multiple selective forces. For millennia, nocturnal moths have had to balance successful flight, as they search for mates or host plants, with evading bat predators. However, the potential for evolutionary trade-offs between wing shape and body size is poorly understood. In this study, we used phylogenomics and geometric morphometrics to examine the evolution of wing shape in the wild silkmoth subfamily Arsenurinae (Saturniidae) and evaluate potential evolutionary relationships between body size and wing shape. The phylogeny was inferred based on 782 loci from target capture data of 42 arsenurine species representing all 10 recognized genera. After detecting in our data one of the most vexing problems in phylogenetic inference, a region of a tree that possesses short branches and no "support" for relationships (i.e., a polytomy), we looked for hidden phylogenomic signal (i.e., inspecting differing phylogenetic inferences, alternative support values, quartets, and phylogenetic networks) to better illuminate the most probable generic relationships within the subfamily. We found there are putative evolutionary trade-offs between wing shape, body size, and the interaction of fore- and hindwing shape. Namely, body size tends to decrease with increasing hindwing length but increases as forewing shape becomes more complex. Additionally, the type of hindwing (i.e., tail or no tail) a lineage possesses has a significant effect on the complexity of forewing shape. We outline possible selective forces driving the complex hindwing shapes that make Arsenurinae, and silkmoths as a whole, so charismatic.
Affiliation(s)
- Chris A Hamilton
  - Florida Museum of Natural History, McGuire Center for Lepidoptera and Biodiversity, University of Florida, Gainesville, FL 32611 USA
  - Department of Entomology, Plant Pathology & Nematology, University of Idaho, Moscow, ID, 83844 USA
- Nathalie Winiger
  - Florida Museum of Natural History, McGuire Center for Lepidoptera and Biodiversity, University of Florida, Gainesville, FL 32611 USA
  - Wildlife Ecology and Management, Albert-Ludwigs-Universität Freiburg, 79106 Freiburg, Germany
- Juliette J Rubin
  - Florida Museum of Natural History, McGuire Center for Lepidoptera and Biodiversity, University of Florida, Gainesville, FL 32611 USA
- Jesse Breinholt
  - Florida Museum of Natural History, McGuire Center for Lepidoptera and Biodiversity, University of Florida, Gainesville, FL 32611 USA
  - Division of Bioinformatics, Intermountain Healthcare, Precision Genomics, St. George, UT 84790 USA
- Rodolphe Rougerie
  - Institut de Systématique, Evolution, Biodiversité (ISYEB), Muséum national d'Histoire naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, Paris, France
- Ian J Kitching
  - Department of Life Sciences, Natural History Museum, Cromwell Road, London SW7 5BD, UK
- Jesse R Barber
  - Department of Biological Sciences, Boise State University, Boise, ID, 83725 USA
- Akito Y Kawahara
  - Florida Museum of Natural History, McGuire Center for Lepidoptera and Biodiversity, University of Florida, Gainesville, FL 32611 USA
|
26
|
Hibbins MS, Hahn MW. Phylogenomic approaches to detecting and characterizing introgression. Genetics 2021; 220:6425633. [PMID: 34788444 PMCID: PMC9208645 DOI: 10.1093/genetics/iyab173] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Received: 07/16/2021] [Accepted: 10/02/2021] [Indexed: 12/26/2022]
Abstract
Phylogenomics has revealed the remarkable frequency with which introgression occurs across the tree of life. These discoveries have been enabled by the rapid growth of methods designed to detect and characterize introgression from whole-genome sequencing data. A large class of phylogenomic methods makes use of data across species to infer and characterize introgression based on expectations from the multispecies coalescent. These methods range from simple tests, such as the D-statistic, to model-based approaches for inferring phylogenetic networks. Here, we provide a detailed overview of the various signals that different modes of introgression are expected to leave in the genome, and how current methods are designed to detect them. We discuss the strengths and pitfalls of these approaches and identify areas for future development, highlighting the different signals of introgression, and the power of each method to detect them. We conclude with a discussion of current challenges in inferring introgression and how they could potentially be addressed.
Affiliation(s)
- Mark S Hibbins
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Matthew W Hahn
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
  - Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
|
27
|
Molloy EK, Gatesy J, Springer MS. Theoretical and practical considerations when using retroelement insertions to estimate species trees in the anomaly zone. Syst Biol 2021; 71:721-740. [PMID: 34677617 DOI: 10.1093/sysbio/syab086] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/02/2020] [Accepted: 10/11/2021] [Indexed: 11/13/2022]
Abstract
A potential shortcoming of concatenation methods for species tree estimation is their failure to account for incomplete lineage sorting. Coalescent methods address this problem but make various assumptions that, if violated, can result in worse performance than concatenation. Given the challenges of analyzing DNA sequences with both concatenation and coalescent methods, retroelement insertions (RIs) have emerged as powerful phylogenomic markers for species tree estimation. Here, we show that two recently proposed quartet-based methods, SDPquartets and ASTRAL_BP, are statistically consistent estimators of the unrooted species tree topology under the coalescent when RIs follow a neutral infinite-sites model of mutation and the expected number of new RIs per generation is constant across the species tree. The accuracy of these (and other) methods for inferring species trees from RIs has yet to be assessed on simulated data sets, where the true species tree topology is known. Therefore, we evaluated eight methods given RIs simulated from four model species trees, all of which have short branches and at least three of which are in the anomaly zone. In our simulation study, ASTRAL_BP and SDPquartets always recovered the correct species tree topology when given a sufficiently large number of RIs, as predicted. A distance-based method (ASTRID_BP) and Dollo parsimony also performed well in recovering the species tree topology. In contrast, unordered, polymorphism, and Camin-Sokal parsimony typically fail to recover the correct species tree topology in anomaly zone situations with more than four ingroup taxa. Of the methods studied, only ASTRAL_BP automatically estimates internal branch lengths (in coalescent units) and support values (i.e. local posterior probabilities). We examined the accuracy of branch length estimation, finding that estimated lengths were accurate for short branches but upwardly biased otherwise. 
This led us to derive the maximum likelihood (branch length) estimate for when RIs are given as input instead of binary gene trees; this corrected formula produced accurate estimates of branch lengths in our simulation study, provided that a sufficiently large number of RIs were given as input. Lastly, we evaluated the impact of data quantity on species tree estimation by repeating the above experiments with input sizes varying from 100 to 100 000 parsimony-informative RIs. We found that, when given just 1 000 parsimony-informative RIs as input, ASTRAL_BP successfully reconstructed major clades (i.e. clades separated by branches > 0.3 CUs) with high support and identified rapid radiations (i.e. shorter connected branches), although not their precise branching order. The local posterior probability was effective for controlling false positive branches in these scenarios.
Affiliation(s)
- Erin K Molloy
  - Department of Computer Science, University of Maryland, College Park, College Park, 20742, USA
- John Gatesy
  - Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, 10024, USA
- Mark S Springer
  - Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, Riverside, 92521, USA
|
28
|
Nesi N, Tsagkogeorga G, Tsang SM, Nicolas V, Lalis A, Scanlon AT, Riesle-Sbarbaro SA, Wiantoro S, Hitch AT, Juste J, Pinzari CA, Bonaccorso FJ, Todd CM, Lim BK, Simmons NB, McGowen MR, Rossiter SJ. Interrogating Phylogenetic Discordance Resolves Deep Splits in the Rapid Radiation of Old World Fruit Bats (Chiroptera: Pteropodidae). Syst Biol 2021; 70:1077-1089. [PMID: 33693838 PMCID: PMC8513763 DOI: 10.1093/sysbio/syab013] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/12/2019] [Revised: 04/27/2021] [Accepted: 03/03/2021] [Indexed: 11/14/2022]
Abstract
The family Pteropodidae (Old World fruit bats) comprises >200 species distributed across the Old World tropics and subtropics. Most pteropodids feed on fruit, suggesting an early origin of frugivory, although several lineages have shifted to nectar-based diets. Pteropodids are of exceptional conservation concern with >50% of species considered threatened, yet the systematics of this group has long been debated, with uncertainty surrounding early splits attributed to an ancient rapid diversification. Resolving the relationships among the main pteropodid lineages is essential if we are to fully understand their evolutionary distinctiveness, and the extent to which these bats have transitioned to nectar-feeding. Here we generated orthologous sequences for >1400 nuclear protein-coding genes (2.8 million base pairs) across 114 species from 43 genera of Old World fruit bats (57% and 96% of extant species- and genus-level diversity, respectively), and combined phylogenomic inference with filtering by information content to resolve systematic relationships among the major lineages. Concatenation and coalescent-based methods recovered three distinct backbone topologies that were not able to be reconciled by filtering via phylogenetic information content. Concordance analysis and gene genealogy interrogation show that one topology is consistently the best supported, and that observed phylogenetic conflicts arise from both gene tree error and deep incomplete lineage sorting. In addition to resolving long-standing inconsistencies in the reported relationships among major lineages, we show that Old World fruit bats have likely undergone at least seven independent dietary transitions from frugivory to nectarivory. Finally, we use this phylogeny to identify and describe one new genus. [Chiroptera; coalescence; concordance; incomplete lineage sorting; nectar feeder; species tree; target enrichment.]
Affiliation(s)
- Nicolas Nesi
  - School of Biological and Chemical Sciences, Queen Mary University of London, Mile End Road, London E1 4NS, UK
- Georgia Tsagkogeorga
  - School of Biological and Chemical Sciences, Queen Mary University of London, Mile End Road, London E1 4NS, UK
- Susan M Tsang
  - Department of Mammalogy, Division of Vertebrate Zoology, American Museum of Natural History, New York, USA
  - Zoology Section, National Museum of Natural History, Manila, Philippines
- Violaine Nicolas
  - Institut de Systématique, Evolution, Biodiversité (ISYEB), Muséum national d’Histoire naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, Paris, France
- Aude Lalis
  - Institut de Systématique, Evolution, Biodiversité (ISYEB), Muséum national d’Histoire naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, Paris, France
- Annette T Scanlon
  - School of Natural and Built Environments, University of South Australia, Mawson Lakes, SA, Australia
- Silke A Riesle-Sbarbaro
  - Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
  - Institute of Zoology, Zoological Society of London, London, UK
  - Centre for Biological Threats and Special Pathogens, Robert Koch Institute, Berlin, Germany
- Sigit Wiantoro
  - Museum Zoologicum Bogoriense, Research Center for Biology, Indonesian Institute of Sciences, Cibinong, Indonesia
- Alan T Hitch
  - Department of Wildlife, Fish, and Conservation Biology, University of California Davis, CA, USA
- Javier Juste
  - Estación Biológica de Doñana (CSIC), Avda. Américo Vespucio, Sevilla, Spain
- Christopher M Todd
  - The Hawkesbury institute for the Environment, Western Sydney University, Australia
- Burton K Lim
  - Royal Ontario Museum, Toronto, ON M5S 2C6, Canada
- Nancy B Simmons
  - Department of Mammalogy, Division of Vertebrate Zoology, American Museum of Natural History, New York, USA
- Michael R McGowen
  - Department of Vertebrate Zoology, Smithsonian National Museum of Natural History, Washington, DC, USA
- Stephen J Rossiter
  - School of Biological and Chemical Sciences, Queen Mary University of London, Mile End Road, London E1 4NS, UK
|
29
|
Blotto BL, Lyra ML, Cardoso MCS, Trefaut Rodrigues M, R Dias I, Marciano-Jr E, Dal Vechio F, Orrico VGD, Brandão RA, Lopes de Assis C, Lantyer-Silva ASF, Rutherford MG, Gagliardi-Urrutia G, Solé M, Baldo D, Nunes I, Cajade R, Torres A, Grant T, Jungfer KH, da Silva HR, Haddad CFB, Faivovich J. The phylogeny of the Casque-headed Treefrogs (Hylidae: Hylinae: Lophyohylini). Cladistics 2021; 37:36-72. [PMID: 34478174 DOI: 10.1111/cla.12409] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Accepted: 12/21/2019] [Indexed: 12/24/2022]
Abstract
The South American and West Indian Casque-headed Treefrogs (Hylidae: Hylinae: Lophyohylini) include 85 species. These are notably diverse in morphology (e.g. disparate levels of cranial hyperossification) and life history (e.g. different reproductive modes, chemical defences), have a wide distribution, and occupy habitats from the tropical rainforests to semiarid scrubland. In this paper, we present a phylogenetic analysis of this hylid tribe based on sequence fragments of up to five mitochondrial (12S, 16S, ND1, COI, Cytb) and six nuclear genes (POMC, RAG-1, RHOD, SIAH, TNS3, TYR). We included most of its species (> 96%), in addition to a number of new species. Our results indicate: (i) the paraphyly of Trachycephalus with respect to Aparasphenodon venezolanus; (ii) the nonmonophyly of Aparasphenodon, with Argenteohyla siemersi, Corythomantis galeata and Nyctimantis rugiceps nested within it, and Ap. venezolanus nested within Trachycephalus; (iii) the polyphyly of Corythomantis; (iv) the nonmonophyly of the recognized species groups of Phyllodytes; and (v) a pervasive low support for the deep relationships among the major clades of Lophyohylini, including C. greeningi and the monotypic genera Itapotihyla and Phytotriades. To remedy the nonmonophyly of Aparasphenodon, Corythomantis, and Trachycephalus, we redefined Nyctimantis to include Aparasphenodon (with the exception of Ap. venezolanus, which we transferred to Trachycephalus), Argenteohyla, and C. galeata. Additionally, our results indicate the need for taxonomic work in the following clades: (i) Trachycephalus dibernardoi and Tr. imitatrix; (ii) Tr. atlas, Tr. mambaiensis and Tr. nigromaculatus; and (iii) Phyllodytes. On the basis of our phylogenetic results, we analyzed the evolution of skull hyperossification and reproductive biology, with emphasis on the multiple independent origins of phytotelm breeding, in the context of Anura. 
We also analyzed the inter-related aspects of chemical defences, venom delivery, phragmotic behaviour, co-ossification, and prevention of evaporative water loss.
Affiliation(s)
- Boris L Blotto
  - Departamento de Biodiversidade and Centro de Aquicultura, Instituto de Biociências, Universidade Estadual Paulista, Av. 24A 1515, 13506-900, Rio Claro, São Paulo, Brazil
  - Departamento de Zoologia, Instituto de Biociências, Universidade de São Paulo, 05508-090, São Paulo, São Paulo, Brazil
- Mariana L Lyra
  - Departamento de Biodiversidade and Centro de Aquicultura, Instituto de Biociências, Universidade Estadual Paulista, Av. 24A 1515, 13506-900, Rio Claro, São Paulo, Brazil
- Monica C S Cardoso
  - Setor de Herpetologia, Departamento de Vertebrados, Museu Nacional, Universidade Federal do Rio de Janeiro, Quinta da Boa Vista, CEP 20940-040, Rio de Janeiro, Rio de Janeiro, Brazil
- Miguel Trefaut Rodrigues
  - Departamento de Zoologia, Instituto de Biociências, Universidade de São Paulo, 05508-090, São Paulo, São Paulo, Brazil
- Iuri R Dias
  - Tropical Herpetology Laboratory, Departamento de Ciências Biológicas, Universidade Estadual de Santa Cruz, Rodovia Jorge Amado, km 16, CEP 45662-900, Ilhéus, Bahia, Brazil
- Euvaldo Marciano-Jr
  - Tropical Herpetology Laboratory, Departamento de Ciências Biológicas, Universidade Estadual de Santa Cruz, Rodovia Jorge Amado, km 16, CEP 45662-900, Ilhéus, Bahia, Brazil
- Francisco Dal Vechio
  - Departamento de Zoologia, Instituto de Biociências, Universidade de São Paulo, 05508-090, São Paulo, São Paulo, Brazil
- Victor G D Orrico
  - Tropical Herpetology Laboratory, Departamento de Ciências Biológicas, Universidade Estadual de Santa Cruz, Rodovia Jorge Amado, km 16, CEP 45662-900, Ilhéus, Bahia, Brazil
- Reuber A Brandão
  - Laboratório de Fauna e Unidades de Conservação, Departamento de Engenharia Florestal, Universidade de Brasília, 70910-900, Brasília, Distrito Federal, Brazil
- Clodoaldo Lopes de Assis
  - Museu de Zoologia João Moojen, Departamento de Biologia Animal, Universidade Federal de Viçosa, 36570-900, Viçosa, Minas Gerais, Brazil
- Amanda S F Lantyer-Silva
  - Departamento de Biodiversidade and Centro de Aquicultura, Instituto de Biociências, Universidade Estadual Paulista, Av. 24A 1515, 13506-900, Rio Claro, São Paulo, Brazil
- Mike G Rutherford
  - Department of Life Sciences, The University of The West Indies Zoology Museum, The University of The West Indies, St. Augustine, Trinidad & Tobago
- Giussepe Gagliardi-Urrutia
  - Laboratorio de Sistemática de Vertebrados, Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), Av. Ipiranga, 6681, Prédio 40, sala 110, 90619-900, Porto Alegre, Rio Grande do Sul, Brazil
- Mirco Solé
  - Tropical Herpetology Laboratory, Departamento de Ciências Biológicas, Universidade Estadual de Santa Cruz, Rodovia Jorge Amado, km 16, CEP 45662-900, Ilhéus, Bahia, Brazil
- Diego Baldo
  - Laboratorio de Genetica Evolutiva "Claudio Juan Bidau", Instituto de Biología Subtropical (CONICET-UNaM), Félix de Azara, 1552, CPA N3300LQF Posadas, Misiones, Argentina
- Ivan Nunes
  - Laboratório de Herpetologia, Instituto de Biociências, Universidade Estadual Paulista, Campus do Litoral Paulista, CEP 11330-900, São Vicente, São Paulo, Brazil
- Rodrigo Cajade
  - Laboratorio de Herpetología, Departamento de Biología, Facultad de Ciencias Exactas y Naturales y Agrimensura, CONICET, Universidad Nacional del Nordeste, Av. Libertad 5470, 3400, Corrientes, Argentina
- Ambrosio Torres
  - Unidad Ejecutora Lillo, CONICET - Fundación Miguel Lillo, Miguel Lillo 251, 4000, San Miguel de Tucumán, Argentina
- Taran Grant
  - Departamento de Zoologia, Instituto de Biociências, Universidade de São Paulo, 05508-090, São Paulo, São Paulo, Brazil
- Karl-Heinz Jungfer
  - Department of Biology, Institute of Integrated Sciences, University of Koblenz-Landau, Universitätsstr. 1, 56070, Koblenz, Germany
- Helio R da Silva
  - Departamento de Biologia Animal, Instituto de Biologia, Universidade Federal Rural do Rio de Janeiro, Caixa Postal 74524, 23851-970, Seropédica, Rio de Janeiro, Brazil
- Célio F B Haddad
  - Departamento de Biodiversidade and Centro de Aquicultura, Instituto de Biociências, Universidade Estadual Paulista, Av. 24A 1515, 13506-900, Rio Claro, São Paulo, Brazil
- Julián Faivovich
  - División Herpetología, Museo Argentino de Ciencias Naturales "Bernardino Rivadavia"-CONICET, Angel Gallardo 470, C1405DJR, Buenos Aires, Argentina
  - Departamento de Biodiversidad y Biología Experimental, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
|
30
|
Ferrer Obiol J, James HF, Chesser RT, Bretagnolle V, González-Solís J, Rozas J, Riutort M, Welch AJ. Integrating Sequence Capture and Restriction Site-Associated DNA Sequencing to Resolve Recent Radiations of Pelagic Seabirds. Syst Biol 2021; 70:976-996. [PMID: 33512506 PMCID: PMC8357341 DOI: 10.1093/sysbio/syaa101] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/12/2020] [Revised: 11/13/2020] [Accepted: 12/15/2020] [Indexed: 01/01/2023]
Abstract
The diversification of modern birds has been shaped by a number of radiations. Rapid diversification events make reconstructing the evolutionary relationships among taxa challenging due to the convoluted effects of incomplete lineage sorting (ILS) and introgression. Phylogenomic data sets have the potential to detect patterns of phylogenetic incongruence, and to address their causes. However, the footprints of ILS and introgression on sequence data can vary between different phylogenomic markers at different phylogenetic scales depending on factors such as their evolutionary rates or their selection pressures. We show that combining phylogenomic markers that evolve at different rates, such as paired-end double-digest restriction site-associated DNA (PE-ddRAD) and ultraconserved elements (UCEs), allows a comprehensive exploration of the causes of phylogenetic discordance associated with short internodes at different timescales. We used thousands of UCE and PE-ddRAD markers to produce the first well-resolved phylogeny of shearwaters, a group of medium-sized pelagic seabirds that are among the most phylogenetically controversial and endangered bird groups. We found that phylogenomic conflict was mainly derived from high levels of ILS due to rapid speciation events. We also documented a case of introgression, despite the high philopatry of shearwaters to their breeding sites, which typically limits gene flow. We integrated state-of-the-art concatenated and coalescent-based approaches to expand on previous comparisons of UCE and RAD-Seq data sets for phylogenetics, divergence time estimation, and inference of introgression, and we propose a strategy to optimize RAD-Seq data for phylogenetic analyses. Our results highlight the usefulness of combining phylogenomic markers evolving at different rates to understand the causes of phylogenetic discordance at different timescales. 
[Aves; incomplete lineage sorting; introgression; PE-ddRAD-Seq; phylogenomics; radiations; shearwaters; UCEs.]
Affiliation(s)
- Joan Ferrer Obiol
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona, Barcelona, Catalonia, Spain
  - Institut de Recerca de la Biodiversitat (IRBio), Barcelona, Catalonia, Spain
- Helen F James
  - Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- R Terry Chesser
  - Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
  - U.S. Geological Survey, Patuxent Wildlife Research Center, Laurel, MD, USA
- Vincent Bretagnolle
  - Centre d’Études Biologiques de Chizé, CNRS & La Rochelle Université, 79360, Villiers en Bois, France
- Jacob González-Solís
  - Institut de Recerca de la Biodiversitat (IRBio), Barcelona, Catalonia, Spain
  - Departament de Biologia Evolutiva, Ecologia i Ciències Ambientals, Facultat de Biologia, Universitat de Barcelona, Barcelona, Catalonia, Spain
- Julio Rozas
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona, Barcelona, Catalonia, Spain
  - Institut de Recerca de la Biodiversitat (IRBio), Barcelona, Catalonia, Spain
- Marta Riutort
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona, Barcelona, Catalonia, Spain
  - Institut de Recerca de la Biodiversitat (IRBio), Barcelona, Catalonia, Spain
|
31
|
Torres A, Goloboff PA, Catalano SA. Parsimony analysis of phylogenomic datasets (I): scripts and guidelines for using TNT (Tree Analysis using New Technology). Cladistics 2021; 38:103-125. [DOI: 10.1111/cla.12477] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Accepted: 06/01/2021] [Indexed: 12/15/2022]
Affiliation(s)
- Ambrosio Torres
  - Unidad Ejecutora Lillo Consejo Nacional de Investigaciones Científicas y Técnicas ‐ Fundación Miguel Lillo Miguel Lillo 251 S. M. de Tucumán Tucumán 4000 Argentina
- Pablo A. Goloboff
  - Unidad Ejecutora Lillo Consejo Nacional de Investigaciones Científicas y Técnicas ‐ Fundación Miguel Lillo Miguel Lillo 251 S. M. de Tucumán Tucumán 4000 Argentina
  - American Museum of Natural History 200 Central Park West New York NY 10024 USA
- Santiago A. Catalano
  - Unidad Ejecutora Lillo Consejo Nacional de Investigaciones Científicas y Técnicas ‐ Fundación Miguel Lillo Miguel Lillo 251 S. M. de Tucumán Tucumán 4000 Argentina
  - Facultad de Ciencias Naturales e Instituto Miguel Lillo Universidad Nacional de Tucumán Miguel Lillo 205 S. M. de Tucumán Tucumán 4000 Argentina
|
32
|
Ogilvie HA, Mendes FK, Vaughan TG, Matzke NJ, Stadler T, Welch D, Drummond AJ. Novel Integrative Modeling of Molecules and Morphology across Evolutionary Timescales. Syst Biol 2021; 71:208-220. [PMID: 34228807 PMCID: PMC8677526 DOI: 10.1093/sysbio/syab054] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/19/2021] [Revised: 06/23/2021] [Accepted: 06/29/2021] [Indexed: 11/13/2022]
Abstract
Evolutionary models account for either population- or species-level processes but usually not both. We introduce a new model, the FBD-MSC, which makes it possible for the first time to integrate both the genealogical and fossilization phenomena, by means of the multispecies coalescent (MSC) and the fossilized birth–death (FBD) processes. Using this model, we reconstruct the phylogeny representing all extant and many fossil Caninae, recovering both the relative and absolute time of speciation events. We quantify known inaccuracy issues with divergence time estimates using the popular strategy of concatenating molecular alignments and show that the FBD-MSC solves them. Our new integrative method and empirical results advance the paradigm and practice of probabilistic total evidence analyses in evolutionary biology. [Caninae; fossilized birth–death; molecular clock; multispecies coalescent; phylogenetics; species trees.]
Collapse
Affiliation(s)
- Huw A Ogilvie
- Department of Computer Science, Rice University, Houston TX, 77005, USA
| | - Fábio K Mendes
- Centre for Computational Evolution, The University of Auckland, Auckland, 1010, New Zealand.,School of Biological Sciences, The University of Auckland, Auckland, 1010, New Zealand
| | - Timothy G Vaughan
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, 4058, Switzerland.,SIB Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
| | - Nicholas J Matzke
- Centre for Computational Evolution, The University of Auckland, Auckland, 1010, New Zealand.,School of Biological Sciences, The University of Auckland, Auckland, 1010, New Zealand
| | - Tanja Stadler
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, 4058, Switzerland.,SIB Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
| | - David Welch
- Centre for Computational Evolution, The University of Auckland, Auckland, 1010, New Zealand.,School of Computer Science, The University of Auckland, Auckland, 1010, New Zealand
| | - Alexei J Drummond
- Centre for Computational Evolution, The University of Auckland, Auckland, 1010, New Zealand.,School of Computer Science, The University of Auckland, Auckland, 1010, New Zealand.,School of Biological Sciences, The University of Auckland, Auckland, 1010, New Zealand
| |
|
33
|
Sun C, Huang J, Wang Y, Zhao X, Su L, Thomas GWC, Zhao M, Zhang X, Jungreis I, Kellis M, Vicario S, Sharakhov IV, Bondarenko SM, Hasselmann M, Kim CN, Paten B, Penso-Dolfin L, Wang L, Chang Y, Gao Q, Ma L, Ma L, Zhang Z, Zhang H, Zhang H, Ruzzante L, Robertson HM, Zhu Y, Liu Y, Yang H, Ding L, Wang Q, Ma D, Xu W, Liang C, Itgen MW, Mee L, Cao G, Zhang Z, Sadd BM, Hahn MW, Schaack S, Barribeau SM, Williams PH, Waterhouse RM, Mueller RL. Genus-Wide Characterization of Bumblebee Genomes Provides Insights into Their Evolution and Variation in Ecological and Behavioral Traits. Mol Biol Evol 2021; 38:486-501. [PMID: 32946576 PMCID: PMC7826183 DOI: 10.1093/molbev/msaa240] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Indexed: 12/14/2022]
Abstract
Bumblebees are a diverse group of globally important pollinators in natural ecosystems and for agricultural food production. With both eusocial and solitary life-cycle phases, and some social parasite species, they are especially interesting models to understand social evolution, behavior, and ecology. Reports of many species in decline point to pathogen transmission, habitat loss, pesticide usage, and global climate change, as interconnected causes. These threats to bumblebee diversity make our reliance on a handful of well-studied species for agricultural pollination particularly precarious. To broadly sample bumblebee genomic and phenotypic diversity, we de novo sequenced and assembled the genomes of 17 species, representing all 15 subgenera, producing the first genus-wide quantification of genetic and genomic variation potentially underlying key ecological and behavioral traits. The species phylogeny resolves subgenera relationships, whereas incomplete lineage sorting likely drives high levels of gene tree discordance. Five chromosome-level assemblies show a stable 18-chromosome karyotype, with major rearrangements creating 25 chromosomes in social parasites. Differential transposable element activity drives changes in genome sizes, with putative domestications of repetitive sequences influencing gene coding and regulatory potential. Dynamically evolving gene families and signatures of positive selection point to genus-wide variation in processes linked to foraging, diet and metabolism, immunity and detoxification, as well as adaptations for life at high altitudes. Our study reveals how bumblebee genes and genomes have evolved across the Bombus phylogeny and identifies variations potentially linked to key ecological and behavioral traits of these important pollinators.
Affiliation(s)
- Cheng Sun
- Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Jiaxing Huang
- Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Yun Wang
- School of Life Sciences, Chongqing University, Chongqing, China
| | - Xiaomeng Zhao
- Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Long Su
- Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Gregg W C Thomas
- Division of Biological Sciences, University of Montana, Missoula, MT
| | - Mengya Zhao
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, China
| | - Xingtan Zhang
- Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Irwin Jungreis
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA.,Broad Institute of MIT and Harvard, Cambridge, MA
| | - Manolis Kellis
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA.,Broad Institute of MIT and Harvard, Cambridge, MA
| | - Saverio Vicario
- Institute of Atmospheric Pollution Research-Italian National Research Council C/O Department of Physics, University of Bari, Bari, Italy
| | - Igor V Sharakhov
- Department of Entomology, Virginia Polytechnic and State University, Blacksburg, VA.,Department of Cytology and Genetics, Tomsk State University, Tomsk, Russian Federation
| | - Semen M Bondarenko
- Department of Entomology, Virginia Polytechnic and State University, Blacksburg, VA
| | - Martin Hasselmann
- Department of Livestock Population Genomics, Institute of Animal Science, University of Hohenheim, Stuttgart, Germany
| | - Chang N Kim
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA
| | | | - Li Wang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Yuxiao Chang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Qiang Gao
- BGI Genomics, BGI-Shenzhen, Shenzhen, China
| | - Ling Ma
- BGI Genomics, BGI-Shenzhen, Shenzhen, China
| | - Lina Ma
- China National Center for Bioinformation & Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
| | - Zhang Zhang
- China National Center for Bioinformation & Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
| | - Hongbo Zhang
- School of Life Sciences, Chongqing University, Chongqing, China
| | - Huahao Zhang
- College of Pharmacy and Life Science, Jiujiang University, Jiujiang, China
| | - Livio Ruzzante
- Department of Ecology and Evolution, University of Lausanne, and Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Hugh M Robertson
- Department of Entomology, University of Illinois at Urbana-Champaign, Champaign, IL
| | - Yihui Zhu
- Department of Medical Microbiology and Immunology, Genome Center, and MIND Institute, University of California Davis, Davis, CA
| | - Yanjie Liu
- Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Huipeng Yang
- Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Lele Ding
- Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Quangui Wang
- Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Dongna Ma
- Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Weilin Xu
- Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Cheng Liang
- Institute of Sericultural and Apiculture, Yunnan Academy of Agricultural Sciences, Mengzi, China
| | - Michael W Itgen
- Department of Biology, Colorado State University, Fort Collins, CO
| | - Lauren Mee
- Department of Ecology, Evolution and Behaviour, Institute of Integrative Biology, University of Liverpool, Liverpool, United Kingdom
| | - Gang Cao
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, China
| | - Ze Zhang
- School of Life Sciences, Chongqing University, Chongqing, China
| | - Ben M Sadd
- School of Biological Sciences, Illinois State University, Normal, IL
| | - Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, IN.,Department of Computer Science, Indiana University, Bloomington, IN
| | | | - Seth M Barribeau
- Department of Ecology, Evolution and Behaviour, Institute of Integrative Biology, University of Liverpool, Liverpool, United Kingdom
| | - Paul H Williams
- Department of Life Sciences, Natural History Museum, London, United Kingdom
| | - Robert M Waterhouse
- Department of Ecology and Evolution, University of Lausanne, and Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | | |
|
34
|
Talavera G, Lukhtanov V, Pierce NE, Vila R. DNA barcodes combined with multi-locus data of representative taxa can generate reliable higher-level phylogenies. Syst Biol 2021; 71:382-395. [PMID: 34022059 PMCID: PMC8830075 DOI: 10.1093/sysbio/syab038] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 07/30/2019] [Revised: 05/13/2021] [Accepted: 05/25/2021] [Indexed: 12/04/2022]
Abstract
Taxa are frequently labeled incertae sedis when their placement is debated at ranks above the species level, such as their subgeneric, generic, or subtribal placement. This is a pervasive problem in groups with complex systematics due to difficulties in identifying suitable synapomorphies. In this study, we propose combining DNA barcodes with a multilocus backbone phylogeny in order to assign taxa to genus or other higher-level categories. This sampling strategy generates molecular matrices containing large amounts of missing data that are not distributed randomly: barcodes are sampled for all representatives, and additional markers are sampled only for a small percentage. We investigate the effects of the degree and randomness of missing data on phylogenetic accuracy using simulations for up to 100 markers in 1000-tip trees, as well as a real case: the subtribe Polyommatina (Lepidoptera: Lycaenidae), a large group including numerous species with unresolved taxonomy. Our simulation tests show that when a strategic and representative selection of species for higher-level categories has been made for multigene sequencing (approximately one per simulated genus), the addition of this multigene backbone DNA data for as few as 5–10% of the specimens in the total data set can produce high-quality phylogenies, comparable to those resulting from 100% multigene sampling. In contrast, trees based exclusively on barcodes performed poorly. Applying this approach to a 1365-specimen data set of Polyommatina (including ca. 80% of described species), with nearly 8% of representative species included in the multigene backbone and the remaining 92% represented only by mitochondrial COI barcodes, we generated a phylogeny that highlighted potential misplacements, unrecognized major clades, and placements for incertae sedis taxa. We use this information to make systematic rearrangements within Polyommatina, and to describe two new genera.
Finally, we propose a systematic workflow to assess higher-level taxonomy in hyperdiverse groups. This research identifies an additional, enhanced value of DNA barcodes for improvements in higher-level systematics using large data sets. [Birabiro; DNA barcoding; incertae sedis; Kipepeo; Lycaenidae; missing data; phylogenomic; phylogeny; Polyommatina; supermatrix; systematics; taxonomy]
Affiliation(s)
- Gerard Talavera
- Institut Botànic de Barcelona (IBB, CSIC-Ajuntament de Barcelona), Passeig del Migdia s/n, 08038 Barcelona, Catalonia, Spain.,Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, United States
| | - Vladimir Lukhtanov
- Department of Karyosystematics, Zoological Institute of Russian Academy of Sciences, Universitetskaya nab. 1, 199034 St. Petersburg, Russia
| | - Naomi E Pierce
- Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, United States
| | - Roger Vila
- Institut de Biologia Evolutiva (CSIC-UPF), Passeig Marítim de la Barceloneta, 08003 Barcelona, Catalonia, Spain
| |
|
35
|
Harrington RC, Friedman M, Miya M, Near TJ, Campbell MA. Phylogenomic resolution of the monotypic and enigmatic Amarsipus, the Bagless Glassfish (Teleostei, Amarsipidae). Zool Scr 2021. [DOI: 10.1111/zsc.12477] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/18/2022]
Affiliation(s)
| | - Matt Friedman
- Museum of Paleontology and Department of Earth and Environmental Sciences University of Michigan Ann Arbor MI USA
| | - Masaki Miya
- Natural History Museum and Institute, Chiba Chiba Japan
| | - Thomas J. Near
- Department of Ecology and Evolutionary Biology Yale University New Haven CT USA
- Peabody Museum Yale University New Haven CT USA
| | | |
|
36
|
Minh BQ, Hahn MW, Lanfear R. New Methods to Calculate Concordance Factors for Phylogenomic Datasets. Mol Biol Evol 2021; 37:2727-2733. [PMID: 32365179 PMCID: PMC7475031 DOI: 10.1093/molbev/msaa106] [Citation(s) in RCA: 237] [Impact Index Per Article: 79.0] [Indexed: 12/16/2022]
Abstract
We implement two measures for quantifying genealogical concordance in phylogenomic data sets: the gene concordance factor (gCF) and the novel site concordance factor (sCF). For every branch of a reference tree, gCF is defined as the percentage of "decisive" gene trees containing that branch. This measure is already in wide usage, but here we introduce a package that calculates it while accounting for variable taxon coverage among gene trees. sCF is a new measure defined as the percentage of decisive sites supporting a branch in the reference tree. gCF and sCF complement classical measures of branch support in phylogenetics by providing a full description of underlying disagreement among loci and sites. An easy to use implementation and tutorial is freely available in the IQ-TREE software package (http://www.iqtree.org/doc/Concordance-Factor, last accessed May 13, 2020).
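The gCF definition in this abstract can be illustrated with a small sketch. It is not the IQ-TREE implementation. It assumes that a gene tree counts as "decisive" for a branch when it samples at least one taxon from each of the four clades flanking that branch, and that trees are supplied as pre-computed sets of bipartitions. The function names, the data layout, and the toy gene trees are all hypothetical.

```python
from typing import FrozenSet, Iterable, Optional, Sequence, Set, Tuple

# A bipartition (branch) is stored as an unordered pair of taxon sets.
Bipartition = FrozenSet[FrozenSet[str]]
# A gene tree is reduced to (sampled taxa, set of internal bipartitions).
GeneTree = Tuple[Iterable[str], Set[Bipartition]]


def bipartition(side: Iterable[str], taxa: Iterable[str]) -> Bipartition:
    """Canonical form of the branch splitting `side` from the remaining taxa."""
    side = frozenset(side)
    return frozenset({side, frozenset(taxa) - side})


def gene_concordance_factor(
    flanking_clades: Sequence[Iterable[str]],
    gene_trees: Iterable[GeneTree],
) -> Optional[float]:
    """Percentage of decisive gene trees containing the reference branch.

    `flanking_clades` holds the four clades (A, B, C, D) around the
    reference branch AB|CD. Assumption: a gene tree is decisive when it
    samples at least one taxon from each of the four clades.
    """
    a, b, c, d = (frozenset(clade) for clade in flanking_clades)
    decisive = concordant = 0
    for taxa, splits in gene_trees:
        taxa = frozenset(taxa)
        if not all(clade & taxa for clade in (a, b, c, d)):
            continue  # not decisive: some flanking clade is unsampled
        decisive += 1
        # Restrict the reference branch to the taxa this gene tree sampled.
        if bipartition((a | b) & taxa, taxa) in splits:
            concordant += 1
    return 100.0 * concordant / decisive if decisive else None


# Toy reference branch: (a1, a2, b) | (c, d), with flanking clades A, B, C, D.
clades = [{"a1", "a2"}, {"b"}, {"c"}, {"d"}]
all_taxa = {"a1", "a2", "b", "c", "d"}
gene_trees = [
    # Full sampling, contains the branch: concordant.
    (all_taxa, {bipartition({"a1", "a2"}, all_taxa),
                bipartition({"a1", "a2", "b"}, all_taxa)}),
    # a2 missing but still decisive, contains the restricted branch.
    ({"a1", "b", "c", "d"}, {bipartition({"a1", "b"}, {"a1", "b", "c", "d"})}),
    # Full sampling, groups c with A instead of b: discordant.
    (all_taxa, {bipartition({"a1", "a2"}, all_taxa),
                bipartition({"a1", "a2", "c"}, all_taxa)}),
    # Clade B unsampled: not decisive, ignored.
    ({"a1", "a2", "c", "d"}, {bipartition({"a1", "a2"}, {"a1", "a2", "c", "d"})}),
]
example_gcf = gene_concordance_factor(clades, gene_trees)
```

Restricting the reference branch to the taxa each gene tree actually sampled is one way to accommodate variable taxon coverage; in this toy data set two of the three decisive gene trees contain the branch.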
Affiliation(s)
- Bui Quang Minh
- Research School of Computer Science, Australian National University, Canberra, ACT, Australia.,Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, ACT, Australia
| | - Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, IN.,Department of Computer Science, Indiana University, Bloomington, IN
| | - Robert Lanfear
- Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, ACT, Australia
| |
|
37
|
Assessing topological congruence among concatenation-based phylogenomic approaches in empirical datasets. Mol Phylogenet Evol 2021; 161:107086. [PMID: 33609710 DOI: 10.1016/j.ympev.2021.107086] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/17/2020] [Revised: 09/25/2020] [Accepted: 01/22/2021] [Indexed: 10/22/2022]
Abstract
Assessing the effect of methodological decisions on the resulting hypotheses is critical in phylogenetics. Recent studies have focused on evaluating how model selection, orthology definition and confounding factors affect phylogenomic results. Here, we compare the results of three concatenated phylogenetic methods (Maximum Likelihood, ML; Bayesian Inference, BI; Maximum Parsimony, MP) in 157 empirical phylogenomic datasets. The resulting trees were very similar, with 96.7% of all nodes shared between BI and ML (90.6% for ML-MP and 89.1% for BI-MP). Differing nodes were predominantly those of lower support. The main conclusions of most of the studies agreed for the three phylogenetic methods and the discordance involved nodes considered as recalcitrant problems in systematics. The differences between methods were proportionally larger in datasets that analyze the relationships at higher taxonomic levels (particularly phyla and kingdoms), and independent of the number of characters included in the datasets. Note: a Spanish version of this article is available in the Supplementary material online.
|
38
|
Pavón-Vázquez CJ, Brennan IG, Keogh JS. A Comprehensive Approach to Detect Hybridization Sheds Light on the Evolution of Earth's Largest Lizards. Syst Biol 2021; 70:877-890. [PMID: 33512509 DOI: 10.1093/sysbio/syaa102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2020] [Revised: 12/20/2020] [Accepted: 12/21/2020] [Indexed: 11/14/2022]
Abstract
Hybridization between species occurs more frequently in vertebrates than traditionally thought but distinguishing ancient hybridization from other phenomena that generate similar evolutionary patterns remains challenging. Here, we used a comprehensive workflow to discover evidence of ancient hybridization between the Komodo dragon (Varanus komodoensis) from Indonesia and a common ancestor of an Australian group of monitor lizards known colloquially as sand monitors. Our data comprises >300 nuclear loci, mitochondrial genomes, phenotypic data, fossil and contemporary records, and past/present climatic data. We show that the four sand monitor species share more nuclear alleles with V. komodoensis than expected given a bifurcating phylogeny, likely as a result of hybridization between the latter species and a common ancestor of sand monitors. Sand monitors display phenotypes that are intermediate between their closest relatives and V. komodoensis. Biogeographic analyses suggest that V. komodoensis and ancestral sand monitors co-occurred in northern Australia. In agreement with the fossil record, this provides further evidence that the Komodo dragon once inhabited the Australian continent. Our study shows how different sources of evidence can be used to thoroughly characterize evolutionary histories that deviate from a treelike pattern, that hybridization can have long-lasting effects on phenotypes and that detecting hybridization can improve our understanding of evolutionary and biogeographic patterns.
Affiliation(s)
- Carlos J Pavón-Vázquez
- Division of Ecology and Evolution, Research School of Biology, The Australian National University, Canberra, Australian Capital Territory 2601, Australia
| | - Ian G Brennan
- Division of Ecology and Evolution, Research School of Biology, The Australian National University, Canberra, Australian Capital Territory 2601, Australia
| | - J Scott Keogh
- Division of Ecology and Evolution, Research School of Biology, The Australian National University, Canberra, Australian Capital Territory 2601, Australia
| |
|
39
|
Mavengere H, Mattox K, Teixeira MM, Sepúlveda VE, Gomez OM, Hernandez O, McEwen J, Matute DR. Paracoccidioides Genomes Reflect High Levels of Species Divergence and Little Interspecific Gene Flow. mBio 2020; 11:e01999-20. [PMID: 33443110 PMCID: PMC8534288 DOI: 10.1128/mbio.01999-20] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/20/2020] [Accepted: 10/27/2020] [Indexed: 12/30/2022]
Abstract
The fungus Paracoccidioides is a prevalent human pathogen endemic to South America. The genus is composed of five species. In this report, we use 37 whole-genome sequences to study the allocation of genetic variation in Paracoccidioides. We tested three genome-wide predictions of advanced speciation, namely, that all species should be reciprocally monophyletic, that species pairs should be highly differentiated along the whole genome, and that there should be low rates of interspecific gene exchange. We find support for these three hypotheses. Species pairs with older divergences show no evidence of gene exchange, while more recently diverged species pairs show evidence of modest rates of introgression. Our results indicate that as divergence progresses, species boundaries become less porous among Paracoccidioides species. Our results suggest that species in Paracoccidioides are at different stages along the divergence continuum. IMPORTANCE: Paracoccidioides is the causal agent of a systemic mycosis in Latin America. Most of the inference of the evolutionary history of Paracoccidioides has used only a few molecular markers. In this report, we evaluate the extent of genome divergence among Paracoccidioides species and study the possibility of interspecific gene exchange. We find that all species are highly differentiated. We also find that the amount of gene flow between species is low and in some cases is even completely absent in spite of geographic overlap. Our study constitutes a systematic effort to identify species boundaries in fungal pathogens and to determine the extent of gene exchange among fungal species.
Affiliation(s)
- Heidi Mavengere
- Biology Department, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Kathleen Mattox
- Biology Department, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Marcus M Teixeira
- Núcleo de Medicina Tropical, Faculdade de Medicina, University of Brasília, Brasília, Brazil
| | - Victoria E Sepúlveda
- Department of Microbiology and Immunology, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Oscar M Gomez
- Cellular and Molecular Biology Unit, Corporación para Investigaciones Biológicas, Medellín, Colombia
| | - Orville Hernandez
- Cellular and Molecular Biology Unit, Corporación para Investigaciones Biológicas, Medellín, Colombia
- MICROBA Research Group, School of Microbiology, Universidad de Antioquia, Medellín, Colombia
| | - Juan McEwen
- Cellular and Molecular Biology Unit, Corporación para Investigaciones Biológicas, Medellín, Colombia
- School of Medicine, Universidad de Antioquia, Medellín, Colombia
| | - Daniel R Matute
- Biology Department, University of North Carolina, Chapel Hill, North Carolina, USA
| |
|
40
|
Chan KO, Hutter CR, Wood PL, Grismer LL, Brown RM. Target-capture phylogenomics provide insights on gene and species tree discordances in Old World treefrogs (Anura: Rhacophoridae). Proc Biol Sci 2020; 287:20202102. [PMID: 33290680 PMCID: PMC7739936 DOI: 10.1098/rspb.2020.2102] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 08/27/2020] [Accepted: 11/13/2020] [Indexed: 11/12/2022]
Abstract
Genome-scale data have greatly facilitated the resolution of recalcitrant nodes that Sanger-based datasets have been unable to resolve. However, phylogenomic studies continue to use traditional methods such as bootstrapping to estimate branch support; and high bootstrap values are still interpreted as providing strong support for the correct topology. Furthermore, relatively little attention has been given to assessing discordances between gene and species trees, and the underlying processes that produce phylogenetic conflict. We generated novel genomic datasets to characterize and determine the causes of discordance in Old World treefrogs (Family: Rhacophoridae), a group that is fraught with conflicting and poorly supported topologies among major clades. Additionally, a suite of data filtering strategies and analytical methods were applied to assess their impact on phylogenetic inference. We showed that incomplete lineage sorting was detected at all nodes that exhibited high levels of discordance. Those nodes were also associated with extremely short internal branches. We also clearly demonstrate that bootstrap values do not reflect uncertainty or confidence for the correct topology and, hence, should not be used as a measure of branch support in phylogenomic datasets. Overall, we showed that phylogenetic discordances in Old World treefrogs resulted from incomplete lineage sorting and that species tree inference can be improved using a multi-faceted, total-evidence approach, which uses the largest amount of data and considers results from different analytical methods and datasets.
Affiliation(s)
- Kin Onn Chan
- Lee Kong Chian Natural History Museum, National University of Singapore, 2 Conservatory Drive, Singapore 117377, Republic of Singapore
| | - Carl R. Hutter
- Museum of Natural Sciences and Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Perry L. Wood
- Department of Biological Sciences and Museum of Natural History, Auburn University, Auburn, AL 36849, USA
| | - L. Lee Grismer
- Herpetology Laboratory, Department of Biology, La Sierra University, Riverside, CA 92505, USA
| | - Rafe M. Brown
- Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS 66045, USA
| |
|
41
|
Primate phylogenomics uncovers multiple rapid radiations and ancient interspecific introgression. PLoS Biol 2020; 18:e3000954. [PMID: 33270638 PMCID: PMC7738166 DOI: 10.1371/journal.pbio.3000954] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 04/15/2020] [Revised: 12/15/2020] [Accepted: 11/02/2020] [Indexed: 12/17/2022]
Abstract
Our understanding of the evolutionary history of primates is undergoing continual revision due to ongoing genome sequencing efforts. Bolstered by growing fossil evidence, these data have led to increased acceptance of once controversial hypotheses regarding phylogenetic relationships, hybridization and introgression, and the biogeographical history of primate groups. Among these findings is a pattern of recent introgression between species within all major primate groups examined to date, though little is known about introgression deeper in time. To address this and other phylogenetic questions, here, we present new reference genome assemblies for 3 Old World monkey (OWM) species: Colobus angolensis ssp. palliatus (the black and white colobus), Macaca nemestrina (southern pig-tailed macaque), and Mandrillus leucophaeus (the drill). We combine these data with 23 additional primate genomes to estimate both the species tree and individual gene trees using thousands of loci. While our species tree is largely consistent with previous phylogenetic hypotheses, the gene trees reveal high levels of genealogical discordance associated with multiple primate radiations. We use strongly asymmetric patterns of gene tree discordance around specific branches to identify multiple instances of introgression between ancestral primate lineages. In addition, we exploit recent fossil evidence to perform fossil-calibrated molecular dating analyses across the tree. Taken together, our genome-wide data help to resolve multiple contentious sets of relationships among primates, while also providing insight into the biological processes and technical artifacts that led to the disagreements in the first place. Combining three newly sequenced primate genomes with other published genomes, this study adapts a little-known method for detecting ancient introgression to genome-scale data, revealing multiple previously unknown examples of hybridization between primate species.
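The asymmetry argument in this abstract rests on a standard expectation: under incomplete lineage sorting alone, the two discordant topologies around a branch should appear at roughly equal frequencies, so a strong excess of one of them suggests introgression. The abstract does not state which statistic the authors used; the sketch below assumes a simple two-sided exact binomial test, and the function name and gene tree counts are hypothetical.

```python
from math import comb


def discordance_asymmetry_pvalue(n_minor1: int, n_minor2: int) -> float:
    """Two-sided exact binomial test of equal minor-topology frequencies.

    n_minor1 and n_minor2 are the numbers of gene trees supporting each of
    the two discordant topologies around one species-tree branch. The null
    hypothesis (ILS only) gives each topology probability 0.5.
    """
    n = n_minor1 + n_minor2
    if n == 0:
        return 1.0  # no discordant gene trees, nothing to test
    k = max(n_minor1, n_minor2)
    # Upper-tail probability P(X >= k) for X ~ Binomial(n, 0.5).
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # The null distribution is symmetric, so double the tail (capped at 1).
    return min(1.0, 2.0 * upper_tail)


# Hypothetical counts: balanced discordance versus a strong excess.
p_balanced = discordance_asymmetry_pvalue(10, 10)
p_skewed = discordance_asymmetry_pvalue(20, 0)
```

A small p-value for a branch only flags asymmetry; it does not by itself identify the direction or timing of gene flow, and testing many branches would call for a multiple-testing correction.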
|
42
|
Phylogenomics of manakins (Aves: Pipridae) using alternative locus filtering strategies based on informativeness. Mol Phylogenet Evol 2020; 155:107013. [PMID: 33217578 DOI: 10.1016/j.ympev.2020.107013] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 08/13/2019] [Revised: 11/07/2020] [Accepted: 11/11/2020] [Indexed: 01/11/2023]
Abstract
Target capture sequencing effectively generates molecular marker arrays useful for molecular systematics. These extensive data sets are advantageous where previous studies using a few loci have failed to resolve relationships confidently. Moreover, target capture is well-suited to fragmented source DNA, allowing data collection from species that lack fresh tissues. Herein we use target capture to generate data for a phylogeny of the avian family Pipridae (manakins), a group that has been the subject of many behavioral and ecological studies. Most manakin species feature lek mating systems, where males exhibit complex behavioral displays including mechanical and vocal sounds, coordinated movements of multiple males, and high speed movements. We analyzed thousands of ultraconserved element (UCE) loci along with a smaller number of coding exons and their flanking regions from all but one species of Pipridae. We examined three different methods of phylogenetic estimation (concatenation and two multispecies coalescent methods). Phylogenetic inferences using UCE data yielded strongly supported estimates of phylogeny regardless of analytical method. Exon probes had limited capability to capture sequence data and resulted in phylogeny estimates with reduced support and modest topological differences relative to the UCE trees, although these conflicts had limited support. Two genera were paraphyletic among all analyses and data sets, with Antilophia nested within Chiroxiphia and Tyranneutes nested within Neopelma. The Chiroxiphia-Antilophia clade was an exception to the generally high support we observed; the topology of this clade differed among analyses, even those based on UCE data. To further explore relationships within this group, we employed two filtering strategies to remove low-information loci. Those analyses resulted in distinct topologies, suggesting that the relationships we identified within Chiroxiphia-Antilophia should be interpreted with caution. 
Despite the existence of a few continuing uncertainties, our analyses resulted in a robust phylogenetic hypothesis of the family Pipridae that provides a comparative framework for future ecomorphological and behavioral studies.
|
43
|
Chan KO, Hutter CR, Wood PL, Grismer LL, Das I, Brown RM. Gene flow creates a mirage of cryptic species in a Southeast Asian spotted stream frog complex. Mol Ecol 2020; 29:3970-3987. [PMID: 32808335 DOI: 10.1111/mec.15603] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 11/06/2019] [Revised: 07/29/2020] [Accepted: 08/13/2020] [Indexed: 02/06/2023]
Abstract
Most new cryptic species are described using conventional tree- and distance-based species delimitation methods (SDMs), which rely on phylogenetic arrangements and measures of genetic divergence. However, although numerous factors such as population structure and gene flow are known to confound phylogenetic inference and species delimitation, the influence of these processes is not frequently evaluated. Using large numbers of exons, introns, and ultraconserved elements obtained using the FrogCap sequence-capture protocol, we compared conventional SDMs with more robust genomic analyses that assess population structure and gene flow to characterize species boundaries in a Southeast Asian frog complex (Pulchrana picturata). Our results showed that gene flow and introgression can produce phylogenetic patterns and levels of divergence that resemble distinct species (up to 10% divergence in mitochondrial DNA). Hybrid populations were inferred as independent (singleton) clades that were highly divergent from adjacent populations (7%-10%) and unusually similar (<3%) to allopatric populations. Such anomalous patterns are not uncommon in Southeast Asian amphibians, which brings into question whether the high levels of cryptic diversity observed in other amphibian groups reflect distinct cryptic species or, instead, highly admixed and structured metapopulation lineages. Our results also provide an alternative explanation to the conundrum of divergent (sometimes nonsister) sympatric lineages, a pattern that has been celebrated as indicative of true cryptic speciation. Based on these findings, we recommend that species delimitation of continuously distributed "cryptic" groups should not rely solely on conventional SDMs, but should necessarily examine population structure and gene flow to avoid taxonomic inflation.
Affiliation(s)
- Kin O Chan
- Lee Kong Chian Natural History Museum, Faculty of Science, National University of Singapore, Singapore
| | - Carl R Hutter
- Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, USA; Museum of Natural Sciences and Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
| | - Perry L Wood
- Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, USA; Department of Biological Sciences & Museum of Natural History, Auburn University, Auburn, AL, USA
| | - L L Grismer
- Herpetology Laboratory, Department of Biology, La Sierra University, Riverside, CA, USA
| | - Indraneil Das
- Institute of Biodiversity and Environmental Conservation, Universiti Malaysia Sarawak, Kota Samarahan, Sarawak, Malaysia
| | - Rafe M Brown
- Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, USA
| |
|
44
|
Gaboriau T, Mendes FK, Joly S, Silvestro D, Salamin N. A multi‐platform package for the analysis of intra‐ and interspecific trait evolution. Methods Ecol Evol 2020. [DOI: 10.1111/2041-210x.13458] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/30/2022]
Affiliation(s)
- Théo Gaboriau
- Department of Computational Biology University of Lausanne Lausanne Switzerland
| | - Fábio K. Mendes
- School of Computer Science The University of Auckland Auckland New Zealand
- School of Biological Sciences The University of Auckland Auckland New Zealand
| | - Simon Joly
- Institut Recherche en Biologie Végétale Montréal QC Canada
- Montreal Botanical Garden Montreal QC Canada
| | - Daniele Silvestro
- Department of Biology University of Fribourg Fribourg Switzerland
- Department of Biological and Environmental Sciences University of Gothenburg and Global Gothenburg Biodiversity Centre Gothenburg Sweden
| | - Nicolas Salamin
- Department of Computational Biology University of Lausanne Lausanne Switzerland
| |
|
45
|
Chan KO, Hutter CR, Wood PL, Grismer LL, Brown RM. Larger, unfiltered datasets are more effective at resolving phylogenetic conflict: Introns, exons, and UCEs resolve ambiguities in Golden-backed frogs (Anura: Ranidae; genus Hylarana). Mol Phylogenet Evol 2020; 151:106899. [PMID: 32590046 DOI: 10.1016/j.ympev.2020.106899] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 03/03/2020] [Revised: 05/18/2020] [Accepted: 06/17/2020] [Indexed: 01/01/2023]
Abstract
Using FrogCap, a recently-developed sequence-capture protocol, we obtained >12,000 highly informative exons, introns, and ultraconserved elements (UCEs), which we used to illustrate variation in evolutionary histories of these classes of markers, and to resolve long-standing systematic problems in Southeast Asian Golden-backed frogs of the genus-complex Hylarana. We also performed a comprehensive suite of analyses to assess the relative performance of different genetic markers, data filtering strategies, tree inference methods, and different measures of branch support. To reduce gene tree estimation error, we filtered the data using different thresholds of taxon completeness (missing data) and parsimony informative sites (PIS). We then estimated species trees using concatenated datasets and Maximum Likelihood (IQ-TREE) in addition to summary (ASTRAL-III), distance-based (ASTRID), and site-based (SVDQuartets) multispecies coalescent methods. Topological congruence and branch support were examined using traditional bootstrap, local posterior probabilities, gene concordance factors, quartet frequencies, and quartet scores. Our results did not yield a single concordant topology. Instead, introns, exons, and UCEs clearly possessed different phylogenetic signals, resulting in conflicting, yet strongly-supported phylogenetic estimates. However, a combined analysis comprising the most informative introns, exons, and UCEs converged on a similar topology across all analyses, with the exception of SVDQuartets. Bootstrap values were consistently high despite high levels of incongruence and high proportions of gene trees supporting conflicting topologies. Although low bootstrap values did indicate low heuristic support, high bootstrap support did not necessarily reflect congruence or support for the correct topology. 
This study reiterates findings of some previous studies, which demonstrated that traditional bootstrap values can produce positively misleading measures of support in large phylogenomic datasets. We also showed a remarkably strong positive relationship between branch length and topological congruence across all datasets, implying that very short internodes remain a challenge to resolve, even with orders of magnitude more data than ever before. Overall, our results demonstrate that more data from unfiltered or combined datasets produced superior results. Although data filtering reduced gene tree incongruence, decreased amounts of data also biased phylogenetic estimation. A point of diminishing returns was evident, at which higher congruence (from more stringent filtering) at the expense of amount of data led to topological error as assessed by comparison to more complete datasets across different genomic markers. Additionally, we showed that applying a parameter-rich model to a partitioned analysis of concatenated data produces better results compared to unpartitioned, or even partitioned analysis using model selection. Despite some lingering uncertainties, a combined analysis of our genomic data and sequences supplemented from GenBank (on the basis of a few gene regions) revealed highly supported novel systematic arrangements. Based on these new findings, we transfer Amnirana nicobariensis into the genus Indosylvirana; and I. milleti and Hylarana celebensis to the genus Papurana. We also provisionally place H. attigua in the genus Papurana pending verification from positively identified (voucher substantiated) samples.
Affiliation(s)
- Kin Onn Chan
- Lee Kong Chian Natural History Museum, Faculty of Science, National University of Singapore, 2 Conservatory Drive, 117377, Singapore.
| | - Carl R Hutter
- Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS 66045, USA; Museum of Natural Sciences and Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Perry L Wood
- Museum of Natural Sciences and Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA; Department of Biological Sciences & Museum of Natural History, Auburn University, Auburn, AL 36849, USA
| | - L Lee Grismer
- Herpetology Laboratory, Department of Biology, La Sierra University, 4500 Riverwalk Parkway, Riverside, CA 92505, USA
| | - Rafe M Brown
- Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS 66045, USA
| |
|
46
|
Reyes-Velasco J, Adams RH, Boissinot S, Parkinson CL, Campbell JA, Castoe TA, Smith EN. Genome-wide SNPs clarify lineage diversity confused by coloration in coralsnakes of the Micrurus diastema species complex (Serpentes: Elapidae). Mol Phylogenet Evol 2020; 147:106770. [PMID: 32084510 DOI: 10.1016/j.ympev.2020.106770] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/22/2019] [Revised: 02/06/2020] [Accepted: 02/14/2020] [Indexed: 01/04/2023]
Abstract
New World coralsnakes of the genus Micrurus are a diverse radiation of highly venomous and brightly colored snakes that range from North Carolina to Argentina. Species in this group have played central roles in developing and testing hypotheses about the evolution of mimicry and aposematism. Despite their diversity and prominence as model systems, surprisingly little is known about species boundaries and phylogenetic relationships within Micrurus, which has substantially hindered meaningful analyses of their evolutionary history. Here we use mitochondrial genes together with thousands of nuclear genomic loci obtained via ddRADseq to study the phylogenetic relationships and population genomics of a subclade of the genus Micrurus: the M. diastema species complex. Our results indicate that prior species and species-group inferences based on morphology and color pattern have grossly misguided taxonomy, and that the M. diastema complex is not monophyletic. Based on our analyses of molecular data, we infer the phylogenetic relationships among species and populations, and provide a revised taxonomy for the group. Two non-sister species-complexes with similar color patterns are recognized, the M. distans and the M. diastema complexes, the first being basal to the monadal Micrurus and the second encompassing most North American monadal taxa. We examined all 13 species, and their respective subspecies, for a total of 24 recognized taxa in the M. diastema species complex. Our analyses suggest a reduction to 10 species, with no subspecific designations warranted, to be a more likely estimate of species diversity, namely, M. apiatus, M. browni, M. diastema, M. distans, M. ephippifer, M. fulvius, M. michoacanensis, M. oliveri, M. tener, and one undescribed species.
Affiliation(s)
- Jacobo Reyes-Velasco
- Department of Biology, University of Texas at Arlington, 501 S. Nedderman Drive, 337 Life Science, Arlington, TX 76010, USA; New York University Abu Dhabi, Saadiyat Island, Abu Dhabi, United Arab Emirates
| | - Richard H Adams
- Department of Biology, University of Texas at Arlington, 501 S. Nedderman Drive, 337 Life Science, Arlington, TX 76010, USA
| | - Stephane Boissinot
- New York University Abu Dhabi, Saadiyat Island, Abu Dhabi, United Arab Emirates
| | - Christopher L Parkinson
- Department of Biological Sciences and Department of Forestry and Environmental Conservation, Clemson University, 190 Collins St., Clemson, SC 29634, USA
| | - Jonathan A Campbell
- Department of Biology, University of Texas at Arlington, 501 S. Nedderman Drive, 337 Life Science, Arlington, TX 76010, USA
| | - Todd A Castoe
- Department of Biology, University of Texas at Arlington, 501 S. Nedderman Drive, 337 Life Science, Arlington, TX 76010, USA
| | - Eric N Smith
- Department of Biology, University of Texas at Arlington, 501 S. Nedderman Drive, 337 Life Science, Arlington, TX 76010, USA.
| |
|
47
|
Prasanna AN, Gerber D, Kijpornyongpan T, Aime MC, Doyle VP, Nagy LG. Model Choice, Missing Data, and Taxon Sampling Impact Phylogenomic Inference of Deep Basidiomycota Relationships. Syst Biol 2020; 69:17-37. [PMID: 31062852 DOI: 10.1093/sysbio/syz029] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 10/19/2017] [Revised: 04/21/2019] [Accepted: 04/26/2019] [Indexed: 11/12/2022] Open
Abstract
Resolving deep divergences in the tree of life is challenging even for analyses of genome-scale phylogenetic data sets. Relationships between Basidiomycota subphyla, the rusts and allies (Pucciniomycotina), smuts and allies (Ustilaginomycotina), and mushroom-forming fungi and allies (Agaricomycotina) were found particularly recalcitrant both to traditional multigene and genome-scale phylogenetics. Here, we address basal Basidiomycota relationships using concatenated and gene tree-based analyses of various phylogenomic data sets to examine the contribution of several potential sources of bias. We evaluate the contribution of biological causes (hard polytomy, incomplete lineage sorting) versus unmodeled evolutionary processes and factors that exacerbate their effects (e.g., fast-evolving sites and long-branch taxa) to inferences of basal Basidiomycota relationships. Bayesian Markov Chain Monte Carlo and likelihood mapping analyses reject the hard polytomy with confidence. In concatenated analyses, fast-evolving sites and oversimplified models of amino acid substitution favored the grouping of smuts with mushroom-forming fungi, often leading to maximal bootstrap support in both concatenation and coalescent analyses. On the contrary, the most conserved data subsets grouped rusts and allies with mushroom-forming fungi, although this relationship proved labile, sensitive to model choice, to different data subsets and to missing data. Excluding putative long-branch taxa, genes with high proportions of missing data and/or with strong signal failed to reveal a consistent trend toward one or the other topology, suggesting that additional sources of conflict are at play. 
While concatenated analyses yielded strong but conflicting support, individual gene trees mostly provided poor support for any resolution of rusts, smuts, and mushroom-forming fungi, suggesting that the true Basidiomycota tree might be in a part of tree space that is difficult to access using both concatenation and gene tree-based approaches. Inference-based assessments of absolute model fit strongly reject best-fit models for the vast majority of genes, indicating a poor fit of even the most commonly used models. While this is consistent with previous assessments of site-homogenous models of amino acid evolution, this does not appear to be the sole source of confounding signal. Our analyses suggest that topologies uniting smuts with mushroom-forming fungi can arise as a result of inappropriate modeling of amino acid sites that might be prone to systematic bias. We speculate that improved models of sequence evolution could shed more light on basal splits in the Basidiomycota, which, for now, remain unresolved despite the use of whole genome data.
Affiliation(s)
- Arun N Prasanna
- Synthetic and Systems Biology Unit, Institute of Biochemistry, BRC-HAS, Szeged 6726, Hungary
| | - Daniel Gerber
- Synthetic and Systems Biology Unit, Institute of Biochemistry, BRC-HAS, Szeged 6726, Hungary; Institute of Archaeology, Research Centre for the Humanities, Hungarian Academy of Sciences, Budapest 1097, Hungary
| | | | - M Catherine Aime
- Department of Botany and Plant Pathology, Purdue University, West Lafayette, IN 47907, USA
| | - Vinson P Doyle
- Department of Plant Pathology and Crop Physiology, Louisiana State University AgCenter, Baton Rouge, LA 70803, USA
| | - Laszlo G Nagy
- Synthetic and Systems Biology Unit, Institute of Biochemistry, BRC-HAS, Szeged 6726, Hungary
| |
|
48
|
Gatesy J, Sloan DB, Warren JM, Baker RH, Simmons MP, Springer MS. Partitioned coalescence support reveals biases in species-tree methods and detects gene trees that determine phylogenomic conflicts. Mol Phylogenet Evol 2019; 139:106539. [DOI: 10.1016/j.ympev.2019.106539] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 11/03/2018] [Revised: 06/10/2019] [Accepted: 06/17/2019] [Indexed: 12/26/2022]
|
49
|
Hamilton CA, St Laurent RA, Dexter K, Kitching IJ, Breinholt JW, Zwick A, Timmermans MJTN, Barber JR, Kawahara AY. Phylogenomics resolves major relationships and reveals significant diversification rate shifts in the evolution of silk moths and relatives. BMC Evol Biol 2019; 19:182. [PMID: 31533606 PMCID: PMC6751749 DOI: 10.1186/s12862-019-1505-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 04/18/2019] [Accepted: 08/29/2019] [Indexed: 03/13/2023] Open
Abstract
BACKGROUND: Silkmoths and their relatives constitute the ecologically and taxonomically diverse superfamily Bombycoidea, which includes some of the most charismatic species of Lepidoptera. Despite displaying spectacular forms and diverse ecological traits, relatively little attention has been given to understanding their evolution and drivers of their diversity. To begin to address this problem, we created a new Bombycoidea-specific Anchored Hybrid Enrichment (AHE) probe set and sampled up to 571 loci for 117 taxa across all major lineages of the Bombycoidea, with a newly developed DNA extraction protocol that allows Lepidoptera specimens to be readily sequenced from pinned natural history collections. RESULTS: The well-supported tree was overall consistent with prior morphological and molecular studies, although some taxa were misplaced. The bombycid Arotros Schaus was formally transferred to Apatelodidae. We identified important evolutionary patterns (e.g., morphology, biogeography, and differences in speciation and extinction), and our analysis of diversification rates highlights the stark increases that exist within the Sphingidae (hawkmoths) and Saturniidae (wild silkmoths). CONCLUSIONS: Our study establishes a backbone for future evolutionary, comparative, and taxonomic studies of Bombycoidea. We postulate that the rate shifts identified are due to the well-documented bat-moth "arms race". Our research highlights the flexibility of AHE to generate genomic data from a wide range of museum specimens, both age and preservation method, and will allow researchers to tap into the wealth of biological data residing in natural history collections around the globe.
Affiliation(s)
- C A Hamilton
- Florida Museum of Natural History, University of Florida, Gainesville, FL, 32611, USA.
- Department of Entomology, Plant Pathology & Nematology, University of Idaho, Moscow, ID, 83844, USA.
| | - R A St Laurent
- Florida Museum of Natural History, University of Florida, Gainesville, FL, 32611, USA
| | - K Dexter
- Florida Museum of Natural History, University of Florida, Gainesville, FL, 32611, USA
| | - I J Kitching
- Department of Life Sciences, Natural History Museum, Cromwell Road, London, SW7 5BD, UK
| | - J W Breinholt
- Florida Museum of Natural History, University of Florida, Gainesville, FL, 32611, USA
- RAPiD Genomics, 747 SW 2nd Avenue #314, Gainesville, FL, 32601, USA
| | - A Zwick
- Australian National Insect Collection, CSIRO, Clunies Ross St, Acton, ACT, Canberra, 2601, Australia
| | - M J T N Timmermans
- Department of Natural Sciences, Middlesex University, The Burroughs, London, NW4 4BT, UK
| | - J R Barber
- Department of Biological Sciences, Boise State University, Boise, ID, 83725, USA
| | - A Y Kawahara
- Florida Museum of Natural History, University of Florida, Gainesville, FL, 32611, USA.
| |
|
50
|
Roycroft EJ, Moussalli A, Rowe KC. Phylogenomics Uncovers Confidence and Conflict in the Rapid Radiation of Australo-Papuan Rodents. Syst Biol 2019; 69:431-444. [DOI: 10.1093/sysbio/syz044] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 08/18/2018] [Accepted: 06/12/2019] [Indexed: 11/13/2022] Open
Abstract
The estimation of robust and accurate measures of branch support has proven challenging in the era of phylogenomics. In data sets of potentially millions of sites, bootstrap support for bifurcating relationships around very short internal branches can be inappropriately inflated. Such overestimation of branch support may be particularly problematic in rapid radiations, where phylogenetic signal is low and incomplete lineage sorting severe. Here, we explore this issue by comparing various branch support estimates under both concatenated and coalescent frameworks, in the recent radiation of Australo-Papuan murine rodents (Muridae: Hydromyini). Using nucleotide sequence data from 1245 independent loci and several phylogenomic inference methods, we unequivocally resolve the majority of genus-level relationships within Hydromyini. However, at four nodes we recover inconsistency in branch support estimates both within and among concatenated and coalescent approaches. In most cases, concatenated likelihood approaches using standard fast bootstrap algorithms did not detect any uncertainty at these four nodes, regardless of partitioning strategy. However, we found this could be overcome with two-stage resampling, that is, across genes and sites within genes (using -bsam GENESITE in IQ-TREE). In addition, low confidence at recalcitrant nodes was recovered using UFBoot2, a recent revision to the bootstrap protocol in IQ-TREE, but this depended on partitioning strategy. Summary coalescent approaches also failed to detect uncertainty under some circumstances. For each of four recalcitrant nodes, an equivalent (or close to equivalent) number of genes were in strong support (>75% bootstrap) of both the primary and at least one alternative topological hypothesis, suggesting notable phylogenetic conflict among loci not detected using some standard branch support metrics.
Recent debate has focused on the appropriateness of concatenated versus multigenealogical approaches to resolving species relationships, but less so on accurately estimating uncertainty in large data sets. Our results demonstrate the importance of employing multiple approaches when assessing confidence and highlight the need for greater attention to the development of robust measures of uncertainty in the era of phylogenomics.
Affiliation(s)
- Emily J Roycroft
- School of BioSciences, The University of Melbourne, Parkville, VIC 3010, Australia
- Department of Science, Museums Victoria, GPO Box 666, Melbourne, VIC 3001, Australia
| | - Adnan Moussalli
- School of BioSciences, The University of Melbourne, Parkville, VIC 3010, Australia
- Department of Science, Museums Victoria, GPO Box 666, Melbourne, VIC 3001, Australia
| | - Kevin C Rowe
- School of BioSciences, The University of Melbourne, Parkville, VIC 3010, Australia
- Department of Science, Museums Victoria, GPO Box 666, Melbourne, VIC 3001, Australia
| |
|