1
|
Naranjo AA, Edwards CE, Gitzendanner MA, Soltis DE, Soltis PS. Abundant incongruence in a clade endemic to a biodiversity hotspot: Phylogenetics of the scrub mint clade (Lamiaceae). Mol Phylogenet Evol 2024; 192:108014. [PMID: 38199595 DOI: 10.1016/j.ympev.2024.108014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Revised: 12/26/2023] [Accepted: 01/06/2024] [Indexed: 01/12/2024]
Abstract
The Scrub Mint clade (Lamiaceae) provides a unique system for investigating the evolutionary processes driving diversification in the North American Coastal Plain from both a systematic and biogeographic context. The clade comprises Dicerandra, Conradina, Piloblephis, Stachydeoma, and four species of the broadly defined genus Clinopodium (Mentheae; Lamiaceae), almost all of which are endemic to the North American Eastern Coastal Plain. Most species of this clade are threatened or endangered and restricted to sandhill or a mosaic of scrub habitats. We analyzed relationships in this clade to understand the evolution of the group and identify evolutionary mechanisms acting on the clade, with important implications for conservation. We used a target-capture method to sequence and analyze 238 nuclear loci across all species of scrub mints, reconstructed the phylogeny, and calculated gene tree concordance, gene tree estimation error, and reticulation indices for every node in the tree using ML methods. Phylogenetic networks were used to determine reticulation events. Our nuclear phylogenetic estimates were consistent with previous results, while greatly increasing the robustness of taxon sampling. The phylogeny resolved the full relationship between Dicerandra and Conradina and the less-studied members of the clade (Piloblephis, Stachydeoma, Clinopodium spp.). We found hotspots of gene tree discordance and reticulation throughout the tree, especially in perennial Dicerandra. Several instances of reticulation events were uncovered between annual and perennial Dicerandra, and within the Conradina + allies clade. Incomplete lineage sorting also likely contributed to phylogenetic discordance. These results clarify phylogenetic relationships in the clade and provide insight into important evolutionary drivers in the clade, such as hybridization.
General relationships in the group were confirmed, while the large amount of gene tree discordance is likely due to reticulation across the phylogeny.
Affiliation(s)
- Andre A Naranjo
  - Institute of Environment, Department of Biological Sciences, Florida International University, 11200 SW 8th ST, Miami, FL 33199, USA
  - Florida Museum of Natural History, University of Florida, 1659 Museum Road, PO Box 117800, Gainesville, FL 32611-7800, USA
- Matthew A Gitzendanner
  - Department of Biology, University of Florida, PO Box 118526, Gainesville, FL 32611-8526, USA
- Douglas E Soltis
  - Florida Museum of Natural History, University of Florida, 1659 Museum Road, PO Box 117800, Gainesville, FL 32611-7800, USA
  - Department of Biology, University of Florida, PO Box 118526, Gainesville, FL 32611-8526, USA
- Pamela S Soltis
  - Florida Museum of Natural History, University of Florida, 1659 Museum Road, PO Box 117800, Gainesville, FL 32611-7800, USA
|
2
|
Lopez Fang L, Peede D, Ortega-Del Vecchyo D, McTavish EJ, Huerta-Sánchez E. Leveraging shared ancestral variation to detect local introgression. PLoS Genet 2024; 20:e1010155. [PMID: 38190420 PMCID: PMC10798638 DOI: 10.1371/journal.pgen.1010155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2022] [Revised: 01/19/2024] [Accepted: 12/04/2023] [Indexed: 01/10/2024] Open
Abstract
Introgression is a common evolutionary phenomenon that results in shared genetic material across non-sister taxa. Existing statistical methods such as Patterson's D statistic can detect introgression by measuring an excess of shared derived alleles between populations. The D statistic is effective at detecting genome-wide patterns of introgression but can give spurious inferences of introgression when applied to local regions. We propose a new statistic, D+, that leverages both shared ancestral and derived alleles to infer local introgressed regions. Incorporating both shared derived and ancestral alleles increases the number of informative sites per region, improving our ability to identify local introgression. We use a coalescent framework to derive the expected value of this statistic as a function of different demographic parameters under an instantaneous admixture model and use coalescent simulations to compute the power and precision of D+. While the power of D and D+ is comparable, D+ has better precision than D. We apply D+ to empirical data from the 1000 Genomes Project and Heliconius butterflies to infer local targets of introgression in humans and in butterflies.
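As a rough illustration of the two statistics named in this abstract, the sketch below counts biallelic site patterns for a four-taxon tree (P1, P2, P3, outgroup) and computes Patterson's D together with a D+-style statistic. It is a minimal sketch under stated assumptions, not the authors' implementation: alleles are assumed to be polarized already (0 = ancestral, 1 = derived), each taxon is represented by a single haplotype, and the D+ formula used here (adding the BAAA − ABAA contrast to the ABBA − BABA contrast) is our reading of the verbal description. The function names are hypothetical.

```python
from collections import Counter


def site_pattern_counts(sites):
    """Count site patterns for (P1, P2, P3, outgroup).

    Each site is a 4-tuple of 0/1 alleles (0 = ancestral, 1 = derived).
    Sites where the outgroup carries the derived allele are skipped,
    because the outgroup is assumed to define the ancestral state.
    """
    counts = Counter()
    for p1, p2, p3, out in sites:
        if out != 0:
            continue
        pattern = "".join("B" if allele else "A" for allele in (p1, p2, p3, out))
        counts[pattern] += 1
    return counts


def patterson_d(counts):
    """Patterson's D: excess of shared derived alleles (ABBA vs. BABA)."""
    num = counts["ABBA"] - counts["BABA"]
    den = counts["ABBA"] + counts["BABA"]
    return num / den if den else float("nan")


def d_plus(counts):
    """D+-style statistic (assumed form): adds shared ancestral alleles.

    BAAA sites have P2 and P3 sharing the ancestral allele;
    ABAA sites have P1 and P3 sharing the ancestral allele.
    """
    num = (counts["ABBA"] - counts["BABA"]) + (counts["BAAA"] - counts["ABAA"])
    den = counts["ABBA"] + counts["BABA"] + counts["BAAA"] + counts["ABAA"]
    return num / den if den else float("nan")


# Toy region: 6 ABBA, 2 BABA, 5 BAAA and 3 ABAA sites.
toy_sites = (
    [(0, 1, 1, 0)] * 6
    + [(1, 0, 1, 0)] * 2
    + [(1, 0, 0, 0)] * 5
    + [(0, 1, 0, 0)] * 3
)
toy_counts = site_pattern_counts(toy_sites)
print(patterson_d(toy_counts), d_plus(toy_counts))
```

In the toy region, D uses only 8 informative sites while the D+-style statistic uses 16, which is the increase in informative sites per region that the abstract refers to.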
Affiliation(s)
- Lesly Lopez Fang
  - Department of Life & Environmental Sciences, University of California, Merced, Merced, California, United States of America
  - Quantitative & Systems Biology Graduate Group, University of California, Merced, Merced, California, United States of America
- David Peede
  - Department of Ecology, Evolution and Organismal Biology, Brown University, Providence, Rhode Island, United States of America
  - Center for Computational Biology, Brown University, Providence, Rhode Island, United States of America
  - Institute at Brown for Environment and Society, Brown University, Providence, Rhode Island, United States of America
- Diego Ortega-Del Vecchyo
  - Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Santiago de Querétaro, Querétaro, México
- Emily Jane McTavish
  - Department of Life & Environmental Sciences, University of California, Merced, Merced, California, United States of America
  - Quantitative & Systems Biology Graduate Group, University of California, Merced, Merced, California, United States of America
- Emilia Huerta-Sánchez
  - Department of Ecology, Evolution and Organismal Biology, Brown University, Providence, Rhode Island, United States of America
  - Center for Computational Biology, Brown University, Providence, Rhode Island, United States of America
|
3
|
Zhang Y, Zhu Q, Shao Y, Jiang Y, Ouyang Y, Zhang L, Zhang W. Inferring Historical Introgression with Deep Learning. Syst Biol 2023; 72:1013-1038. [PMID: 37257491 DOI: 10.1093/sysbio/syad033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Revised: 05/28/2023] [Accepted: 05/30/2023] [Indexed: 06/02/2023] Open
Abstract
Resolving phylogenetic relationships among taxa remains a challenge in the era of big data due to the presence of genetic admixture in a wide range of organisms. Rapidly developing sequencing technologies and statistical tests enable evolutionary relationships to be disentangled at a genome-wide level, yet many of these tests are computationally intensive and rely on phased genotypes, large sample sizes, restricted phylogenetic topologies, or hypothesis testing. To overcome these difficulties, we developed a deep learning-based approach, named ERICA, for inferring genome-wide evolutionary relationships and local introgressed regions from sequence data. ERICA accepts sequence alignments of both population genomic data and multiple genome assemblies, and efficiently identifies discordant genealogy patterns and exchanged regions across genomes when compared with other methods. We further tested ERICA using real population genomic data from Heliconius butterflies that have undergone adaptive radiation and frequent hybridization. Finally, we applied ERICA to characterize hybridization and introgression in wild and cultivated rice, revealing the important role of introgression in rice domestication and adaptation. Taken together, our findings demonstrate that ERICA provides an effective method for teasing apart evolutionary relationships using whole genome data, which can ultimately facilitate evolutionary studies on hybridization and introgression.
Affiliation(s)
- Yubo Zhang
  - State Key Laboratory of Protein and Plant Gene Research, Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Qingjie Zhu
  - Chinese Institute for Brain Research, Beijing 102206, China
- Yi Shao
  - Chinese Institute for Brain Research, Beijing 102206, China
- Yanchen Jiang
  - State Key Laboratory of Protein and Plant Gene Research, Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, China
- Yidan Ouyang
  - National Key Laboratory of Crop Genetic Improvement and National Centre of Plant Gene Research (Wuhan), Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan 430070, China
- Li Zhang
  - Chinese Institute for Brain Research, Beijing 102206, China
- Wei Zhang
  - State Key Laboratory of Protein and Plant Gene Research, Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, China
|
4
|
Vernygora OV, Campbell EO, Grishin NV, Sperling FA, Dupuis JR. Gauging ages of tiger swallowtail butterflies using alternate SNP analyses. Mol Phylogenet Evol 2022; 171:107465. [DOI: 10.1016/j.ympev.2022.107465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2022] [Revised: 02/26/2022] [Accepted: 03/15/2022] [Indexed: 10/18/2022]
|
5
|
Tanoeiro L, Oleastro M, Nunes A, Marques AT, Duarte SV, Gomes JP, Matos APA, Vítor JMB, Vale FF. Cryptic Prophages Contribution for Campylobacter jejuni and Campylobacter coli Introgression. Microorganisms 2022; 10:microorganisms10030516. [PMID: 35336092 PMCID: PMC8955182 DOI: 10.3390/microorganisms10030516] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/12/2022] [Revised: 02/22/2022] [Accepted: 02/23/2022] [Indexed: 11/23/2022] Open
Abstract
Campylobacter coli and C. jejuni, the causative agents of campylobacteriosis, are described as undergoing introgression events, i.e., the transfer of genetic material between different species, with some isolates sharing almost a quarter of their genome. The participation of phages in introgression events and the consequent impact on host ecology and evolution remain elusive. Three distinct prophages, named C. jejuni integrated elements 1, 2, and 4 (CJIE1, CJIE2, and CJIE4), are described in C. jejuni. Here, we identified two unreported prophages, Campylobacter coli integrated elements 1 and 2 (CCIE1 and CCIE2 prophages), which are C. coli homologues of CJIE1 and CJIE2, respectively. No induction was achieved for either prophage. Conversely, induction assays on CJIE1 and CJIE2 point towards the inducibility of these prophages. CCIE2-, CJIE1-, and CJIE4-like prophages were identified in a Campylobacter spp. population of 840 genomes, and phylogenetic analysis revealed clustering in three major groups: CJIE1-CCIE1, CJIE2-CCIE2, and CJIE4, clearly segregating prophages from C. jejuni and C. coli, but not from human- and nonhuman-derived isolates, corroborating the flow between animals and humans in the agricultural context. Punctual bacteriophage host-jumps were observed in the context of C. jejuni and C. coli, and although random chance cannot be fully discarded, these observations seem to implicate prophages in evolutionary introgression events that are modulating the hybridization of C. jejuni and C. coli species.
Affiliation(s)
- Luís Tanoeiro
  - Pathogen Genome Bioinformatics and Computational Biology, Research Institute for Medicines (iMed-ULisboa), Faculty of Pharmacy, Universidade de Lisboa, 1649-003 Lisboa, Portugal
- Mónica Oleastro
  - National Reference Laboratory for Gastrointestinal Infections, Department of Infectious Diseases, National Institute of Health Dr. Ricardo Jorge, 1600-609 Lisboa, Portugal
- Alexandra Nunes
  - Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Dr. Ricardo Jorge, 1600-609 Lisboa, Portugal
- Andreia T. Marques
  - Pathogen Genome Bioinformatics and Computational Biology, Research Institute for Medicines (iMed-ULisboa), Faculty of Pharmacy, Universidade de Lisboa, 1649-003 Lisboa, Portugal
- Sílvia Vaz Duarte
  - Innovation and Technology Unit, Department of Human Genetics, National Institute of Health Dr. Ricardo Jorge, 1600-609 Lisboa, Portugal
- João Paulo Gomes
  - Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Dr. Ricardo Jorge, 1600-609 Lisboa, Portugal
- António Pedro Alves Matos
  - Centro de Investigação Interdisciplinar Egas Moniz (CiiEM), Cooperativa de Ensino Superior Egas Moniz, Quinta da Granja, 2829-511 Caparica, Portugal
- Jorge M. B. Vítor
  - Pathogen Genome Bioinformatics and Computational Biology, Research Institute for Medicines (iMed-ULisboa), Faculty of Pharmacy, Universidade de Lisboa, 1649-003 Lisboa, Portugal
- Filipa F. Vale
  - Pathogen Genome Bioinformatics and Computational Biology, Research Institute for Medicines (iMed-ULisboa), Faculty of Pharmacy, Universidade de Lisboa, 1649-003 Lisboa, Portugal
|
6
|
Dittberner H, Tellier A, de Meaux J. Approximate Bayesian computation untangles signatures of contemporary and historical hybridization between two endangered species. Mol Biol Evol 2022; 39:6516021. [PMID: 35084503 PMCID: PMC8826969 DOI: 10.1093/molbev/msac015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Contemporary gene flow, when resumed after a period of isolation, can have crucial consequences for endangered species, as it can both increase the supply of adaptive alleles and erode local adaptation. Determining the history of gene flow and thus the importance of contemporary hybridization, however, is notoriously difficult. Here, we focus on two endangered plant species, Arabis nemorensis and A. sagittata, which hybridize naturally in a sympatric population located on the banks of the Rhine. Using reduced genome sequencing, we determined the phylogeography of the two taxa but report only a unique sympatric population. Molecular variation in chloroplast DNA indicated that A. sagittata is the principal receiver of gene flow. Applying classical D-statistics and its derivatives to whole-genome data of 35 accessions, we detect gene flow not only in the sympatric population but also among allopatric populations. Using an Approximate Bayesian computation approach, we identify the model that best describes the history of gene flow between these taxa. This model shows that low levels of gene flow have persisted long after speciation. Around 10 000 years ago, gene flow stopped and a period of complete isolation began. Eventually, a hotspot of contemporary hybridization was formed in the unique sympatric population. Occasional sympatry may have helped protect these lineages from extinction in spite of their extremely low diversity.
Affiliation(s)
- Hannes Dittberner
  - Institute of Plant Sciences, University of Cologne, Zülpicher str. 47b, Germany
- Aurelien Tellier
  - Department of Life Science Systems, Technical University of Munich, Freising, Germany
- Juliette de Meaux
  - Institute of Plant Sciences, University of Cologne, Zülpicher str. 47b, Germany
|
7
|
Schull JK, Turakhia Y, Hemker JA, Dally WJ, Bejerano G. OUP accepted manuscript. Genome Biol Evol 2022; 14:6529394. [PMID: 35171243 PMCID: PMC8920512 DOI: 10.1093/gbe/evac013] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 01/10/2022] [Indexed: 11/14/2022] Open
Abstract
We present Champagne, a whole-genome method for generating character matrices for phylogenomic analysis using large genomic indel events. By rigorously picking orthologous genes and locating large insertion and deletion events, Champagne delivers a character matrix that considerably reduces homoplasy compared with morphological and nucleotide-based matrices, on both established phylogenies and difficult-to-resolve nodes in the mammalian tree. Champagne provides ample evidence in the form of genomic structural variation to support incomplete lineage sorting and possible introgression in Paenungulata and human–chimp–gorilla which were previously inferred primarily through matrices composed of aligned single-nucleotide characters. Champagne also offers further evidence for Myomorpha as sister to Sciuridae and Hystricomorpha in the rodent tree. Champagne harbors distinct theoretical advantages as an automated method that produces nearly homoplasy-free character matrices on the whole-genome scale.
Affiliation(s)
- James K Schull
  - Department of Computer Science, Stanford University, USA
- Yatish Turakhia
  - Department of Electrical and Computer Engineering, University of California San Diego, USA
- James A Hemker
  - Department of Computer Science, Stanford University, USA
- William J Dally
  - Department of Computer Science, Stanford University, USA
  - NVIDIA, Santa Clara, California, USA
  - Department of Electrical Engineering, Stanford University, USA
- Gill Bejerano
  - Department of Computer Science, Stanford University, USA
  - Department of Developmental Biology, Stanford University, USA
  - Department of Biomedical Data Science, Stanford University, USA
  - Department of Pediatrics, Stanford University, USA
|
8
|
Hibbins MS, Hahn MW. Phylogenomic approaches to detecting and characterizing introgression. Genetics 2021; 220:6425633. [PMID: 34788444 PMCID: PMC9208645 DOI: 10.1093/genetics/iyab173] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Received: 07/16/2021] [Accepted: 10/02/2021] [Indexed: 12/26/2022] Open
Abstract
Phylogenomics has revealed the remarkable frequency with which introgression occurs across the tree of life. These discoveries have been enabled by the rapid growth of methods designed to detect and characterize introgression from whole-genome sequencing data. A large class of phylogenomic methods makes use of data across species to infer and characterize introgression based on expectations from the multispecies coalescent. These methods range from simple tests, such as the D-statistic, to model-based approaches for inferring phylogenetic networks. Here, we provide a detailed overview of the various signals that different modes of introgression are expected to leave in the genome, and how current methods are designed to detect them. We discuss the strengths and pitfalls of these approaches and identify areas for future development, highlighting the different signals of introgression, and the power of each method to detect them. We conclude with a discussion of current challenges in inferring introgression and how they could potentially be addressed.
Affiliation(s)
- Mark S Hibbins
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Matthew W Hahn
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA
  - Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
|
9
|
Hibbins MS, Hahn MW. The effects of introgression across thousands of quantitative traits revealed by gene expression in wild tomatoes. PLoS Genet 2021; 17:e1009892. [PMID: 34748547 PMCID: PMC8601620 DOI: 10.1371/journal.pgen.1009892] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/27/2021] [Revised: 11/18/2021] [Accepted: 10/18/2021] [Indexed: 01/13/2023] Open
Abstract
It is now understood that introgression can serve as a powerful evolutionary force, providing genetic variation that can shape the course of trait evolution. Introgression also induces a shared evolutionary history that is not captured by the species phylogeny, potentially complicating evolutionary analyses that use a species tree. Such analyses are often carried out on gene expression data across species, where the measurement of thousands of trait values allows for powerful inferences while controlling for shared phylogeny. Here, we present a Brownian motion model for quantitative trait evolution under the multispecies network coalescent framework, demonstrating that introgression can generate apparently convergent patterns of evolution when averaged across thousands of quantitative traits. We test our theoretical predictions using whole-transcriptome expression data from ovules in the wild tomato genus Solanum. Examining two sub-clades that both have evidence for post-speciation introgression, but that differ substantially in its magnitude, we find patterns of evolution that are consistent with histories of introgression in both the sign and magnitude of ovule gene expression. Additionally, in the sub-clade with a higher rate of introgression, we observe a correlation between local gene tree topology and expression similarity, implicating a role for introgressed cis-regulatory variation in generating these broad-scale patterns. Our results reveal a general role for introgression in shaping patterns of variation across many thousands of quantitative traits, and provide a framework for testing for these effects using simple model-informed predictions.

It is now known from studying large genetic datasets that species often hybridize and cross with each other over many generations, a phenomenon known as introgression. Introgression introduces new genetic variation into a population, and this variation can cause traits to be shared among the introgressing species. When researchers study the evolution of trait variation among species, this source of trait sharing is rarely accounted for. Here, we present a statistical model of the effects of introgression on trait variation. This model predicts that, when averaged across many thousands of traits, introgressing species are consistently more similar than expected from standard approaches. Researchers studying gene expression often consider the expression of many thousands of genes, making this a case where the expected effects of introgression are likely to manifest. We tested our model prediction using ovule gene expression data from the wild tomato genus Solanum, in two groups of species with evidence of historical introgression. We found that patterns of expression similarity in both groups are consistent with their histories of introgression and the predictions from our model. Our results highlight the importance of accounting for introgression as a source of trait variation among species.
Affiliation(s)
- Mark S. Hibbins
  - Department of Biology, Indiana University, Bloomington, Indiana, United States of America
- Matthew W. Hahn
  - Department of Biology, Indiana University, Bloomington, Indiana, United States of America
  - Department of Computer Science, Indiana University, Bloomington, Indiana, United States of America
|
10
|
Blischak PD, Barker MS, Gutenkunst RN. Chromosome-scale inference of hybrid speciation and admixture with convolutional neural networks. Mol Ecol Resour 2021; 21:2676-2688. [PMID: 33682305 PMCID: PMC8675098 DOI: 10.1111/1755-0998.13355] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/30/2020] [Revised: 01/26/2021] [Accepted: 02/05/2021] [Indexed: 11/30/2022]
Abstract
Inferring the frequency and mode of hybridization among closely related organisms is an important step for understanding the process of speciation and can help to uncover reticulated patterns of phylogeny more generally. Phylogenomic methods to test for the presence of hybridization come in many varieties and typically operate by leveraging expected patterns of genealogical discordance in the absence of hybridization. An important assumption made by these tests is that the data (genes or SNPs) are independent given the species tree. However, when the data are closely linked, it is especially important to consider their nonindependence. Recently, deep learning techniques such as convolutional neural networks (CNNs) have been used to perform population genetic inferences with linked SNPs coded as binary images. Here, we use CNNs for selecting among candidate hybridization scenarios using the tree topology (((P1, P2), P3), Out) and a matrix of pairwise nucleotide divergence (dXY) calculated in windows across the genome. Using coalescent simulations to train and independently test a neural network showed that our method, HyDe-CNN, was able to accurately perform model selection for hybridization scenarios across a wide breadth of parameter space. We then used HyDe-CNN to test models of admixture in Heliconius butterflies, as well as comparing it to phylogeny-based introgression statistics. Given the flexibility of our approach, the dropping cost of long-read sequencing and the continued improvement of CNN architectures, we anticipate that inferences of hybridization using deep learning methods like ours will help researchers to better understand patterns of admixture in their study organisms.
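To make the input described in this abstract more concrete, the sketch below computes pairwise nucleotide divergence (dXY) between two populations in non-overlapping windows. It is a minimal sketch under stated assumptions, not code from HyDe-CNN: haplotypes are assumed to be equal-length aligned strings with no missing data, dXY is taken as the mean proportion of differing sites over all between-population haplotype pairs, and the function names and toy sequences are hypothetical.

```python
def dxy(pop_x, pop_y):
    """Mean proportion of differing sites over all X-Y haplotype pairs."""
    length = len(pop_x[0])
    total_diffs = sum(
        sum(a != b for a, b in zip(hap_x, hap_y))
        for hap_x in pop_x
        for hap_y in pop_y
    )
    return total_diffs / (len(pop_x) * len(pop_y) * length)


def windowed_dxy(pop_x, pop_y, window):
    """dXY in consecutive non-overlapping windows of `window` sites."""
    length = len(pop_x[0])
    values = []
    for start in range(0, length, window):
        end = start + window
        values.append(
            dxy([h[start:end] for h in pop_x], [h[start:end] for h in pop_y])
        )
    return values


# Toy alignment: two haplotypes per population, eight sites, windows of four.
toy_x = ["AAAAAAAA", "AAAATAAA"]
toy_y = ["AAAATTAA", "CAAATTAA"]
print(windowed_dxy(toy_x, toy_y, window=4))
```

Repeating this for every pair among P1, P2, P3 and the outgroup would give one divergence matrix per window, which is the kind of windowed summary the abstract describes as input to the network.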
Affiliation(s)
- Paul D. Blischak
  - Department of Ecology & Evolutionary Biology, University of Arizona, Tucson, AZ, 85721, USA
  - Department of Molecular & Cellular Biology, University of Arizona, Tucson, AZ, 85721, USA
- Michael S. Barker
  - Department of Ecology & Evolutionary Biology, University of Arizona, Tucson, AZ, 85721, USA
- Ryan N. Gutenkunst
  - Department of Molecular & Cellular Biology, University of Arizona, Tucson, AZ, 85721, USA
|
11
|
Esquerré D, Keogh JS, Demangel D, Morando M, Avila LJ, Sites JW, Ferri-Yáñez F, Leaché AD. Rapid radiation and rampant reticulation: Phylogenomics of South American Liolaemus lizards. Syst Biol 2021; 71:286-300. [PMID: 34259868 DOI: 10.1093/sysbio/syab058] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 10/23/2019] [Revised: 06/25/2021] [Accepted: 06/30/2021] [Indexed: 01/09/2023] Open
Abstract
Understanding the factors that cause heterogeneity among gene trees can increase the accuracy of species trees. Discordant signals across the genome are commonly produced by incomplete lineage sorting (ILS) and introgression, which in turn can result in reticulate evolution. Species tree inference using the multispecies coalescent is designed to deal with ILS and is robust to low levels of introgression, but extensive introgression violates the fundamental assumption that relationships are strictly bifurcating. In this study, we explore the phylogenomics of the iconic Liolaemus subgenus of South American lizards, a group of over 100 species mostly distributed in and around the Andes mountains. Using mitochondrial DNA (mtDNA) and genome-wide restriction-site associated DNA sequencing (RADseq; nDNA hereafter), we inferred a time-calibrated mtDNA gene tree, nDNA species trees, and phylogenetic networks. We found high levels of discordance between mtDNA and nDNA, which we attribute in part to extensive ILS resulting from rapid diversification. These data also reveal extensive and deep introgression, which combined with rapid diversification, explain the high level of phylogenetic discordance. We discuss these findings in the context of Andean orogeny and glacial cycles that fragmented, expanded, and contracted species distributions. Finally, we use the new phylogeny to resolve long-standing taxonomic issues in one of the most studied lizard groups in the New World.
Affiliation(s)
- Damien Esquerré
  - Division of Ecology and Evolution, Research School of Biology, The Australian National University, Canberra, ACT, Australia
- J Scott Keogh
  - Division of Ecology and Evolution, Research School of Biology, The Australian National University, Canberra, ACT, Australia
- Mariana Morando
  - Instituto Patagónico para el Estudio de los Ecosistemas Continentales (IPEEC-CONICET), Puerto Madryn, Chubut, Argentina
- Luciano J Avila
  - Instituto Patagónico para el Estudio de los Ecosistemas Continentales (IPEEC-CONICET), Puerto Madryn, Chubut, Argentina
- Jack W Sites
  - Department of Biology and M.L. Bean Life Science Museum, Brigham Young University, Provo, Utah, USA
- Francisco Ferri-Yáñez
  - Departamento de Biogeografía y Cambio Global, Museo Nacional de Ciencias Naturales, CSIC & Laboratorio Internacional en Cambio Global CSIC-PUC (LINCGlobal), Calle José Gutiérrez Abascal, 2, 28006, Madrid, Spain
- Adam D Leaché
  - Department of Biology & Burke Museum of Natural History and Culture, University of Washington, Seattle, Washington, USA
|
12
|
Introgression is widespread in the radiation of carnivorous Nepenthes pitcher plants. Mol Phylogenet Evol 2021; 163:107214. [PMID: 34052438 DOI: 10.1016/j.ympev.2021.107214] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/24/2021] [Revised: 05/14/2021] [Accepted: 05/25/2021] [Indexed: 11/23/2022]
Abstract
Introgression and hybridization are important processes in plant evolution, but they are difficult to study from a phylogenetic perspective, because they conflict with the bifurcating evolutionary history typically depicted in phylogenetic models. The role of hybridization in plant evolution is best documented in the form of allo-polyploidizations. In contrast, homoploid hybridization and introgression are less explored, although they may be crucial in adaptive radiations. Here we employ genome-wide data (ddRAD-seq, transcriptomes) to investigate the evolutionary history of Nepenthes, a radiation of c. 160 species of iconic carnivorous plants mainly from tropical Asia. Our data indicate that the main radiation is only c. 5 million years old, and confirm previous bifurcating phylogenies. However, due to a greatly expanded number of loci, we were able to test for the first time the long-standing hypotheses of introgression and historical hybridization. The genus presents one very clear case of organellar capture between two distantly related but sympatric groups. Furthermore, all Nepenthes species show introgression signals in their nuclear genomes, as uncovered by a general survey of ABBA-BABA-like statistics. The ancestor of the rapid main radiation shows ancestry from two deeply diverged lineages, as indicated by phylogenetic network analyses. All major clades of the main radiation show further introgression both within and between each other, as suggested by admixture graphs. Our study supports the hypothesis that rapid adaptive radiations are hotspots of introgression in the tree of life, and highlights the need to consider non-treelike processes in evolutionary studies of Nepenthes in particular.
13
|
Kong S, Kubatko LS. Comparative Performance of Popular Methods for Hybrid Detection using Genomic Data. Syst Biol 2021; 70:891-907. [PMID: 33404632 DOI: 10.1093/sysbio/syaa092] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 06/28/2020] [Accepted: 11/13/2020] [Indexed: 11/13/2022] Open
Abstract
Interspecific hybridization is an important evolutionary phenomenon that generates genetic variability in a population and fosters species diversity in nature. The availability of large genome scale datasets has revolutionized hybridization studies to shift from the observation of the presence or absence of hybrids to the investigation of the genomic constitution of hybrids and their genome-specific evolutionary dynamics. Although a handful of methods have been proposed in an attempt to identify hybrids, accurate detection of hybridization from genomic data remains a challenging task. In addition to methods that infer phylogenetic networks or that utilize pairwise divergence, site pattern frequency based and population genetic clustering approaches are popularly used in practice, though the performance of these methods under different hybridization scenarios has not been extensively examined. Here, we use simulated data to comparatively evaluate the performance of four tools that are commonly used to infer hybridization events: the site pattern frequency based methods HyDe and the D-statistic (i.e., the ABBA-BABA test) and the population clustering approaches structure and ADMIXTURE. We consider single hybridization scenarios that vary in the time of hybridization and the amount of incomplete lineage sorting (ILS) for different proportions of parental contributions (γ); introgressive hybridization; multiple hybridization scenarios; and a mixture of ancestral and recent hybridization scenarios. We focus on the statistical power to detect hybridization and the false discovery rate (FDR) for comparisons of the D-statistic and HyDe, and the accuracy of the estimates of γ as measured by the mean squared error for HyDe, structure, and ADMIXTURE. Both HyDe and the D-statistic are powerful for detecting hybridization in all scenarios except those with high ILS, although the D-statistic often has an unacceptably high FDR. 
The estimates of γ in HyDe are impressively robust and accurate whereas structure and ADMIXTURE sometimes fail to identify hybrids, particularly when the proportional parental contributions are asymmetric (i.e., when γ is close to 0). Moreover, the posterior distribution estimated using structure exhibits multimodality in many scenarios, making interpretation difficult. Our results provide guidance in selecting appropriate methods for identifying hybrid populations from genomic data.
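For readers unfamiliar with the D-statistic evaluated in this abstract, here is a minimal Python sketch of the standard ABBA-BABA count for a four-taxon tree ((P1, P2), P3), O. It is an illustration only and not the simulation pipeline used in the study; the function name `d_statistic` and the toy sequences are hypothetical, and no significance test (such as a block jackknife) is included.

```python
def d_statistic(p1: str, p2: str, p3: str, outgroup: str) -> float:
    """Return D = (nABBA - nBABA) / (nABBA + nBABA) for aligned sequences.

    The assumed species tree is ((P1, P2), P3), O. Only biallelic sites with
    unambiguous bases are counted; the outgroup allele is treated as ancestral.
    """
    abba = 0
    baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        site = (a, b, c, o)
        # Skip gaps, ambiguity codes, and sites that are not biallelic.
        if any(base not in "ACGT" for base in site) or len(set(site)) != 2:
            continue
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele
    total = abba + baba
    # With no informative sites, report 0.0 rather than dividing by zero.
    return (abba - baba) / total if total else 0.0


# Toy alignment (hypothetical): two ABBA sites and one BABA site.
p1 = "AAGAA"
p2 = "GGAAA"
p3 = "GGGAG"
og = "AAAAA"
print(d_statistic(p1, p2, p3, og))
```

Under incomplete lineage sorting alone, ABBA and BABA sites are expected in equal numbers, so D is expected to be near zero; a consistent excess of one pattern is what the test reads as a signal of gene flow.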
Affiliation(s)
- Sungsik Kong
- Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, USA
- Laura S Kubatko
- Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, USA
- Department of Statistics, The Ohio State University, Columbus, OH, USA
14
|
Hibbins MS, Gibson MJS, Hahn MW. Determining the probability of hemiplasy in the presence of incomplete lineage sorting and introgression. eLife 2020; 9:e63753. [PMID: 33345772 PMCID: PMC7800383 DOI: 10.7554/elife.63753] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 10/06/2020] [Accepted: 12/18/2020] [Indexed: 12/11/2022] Open
Abstract
The incongruence of character states with phylogenetic relationships is often interpreted as evidence of convergent evolution. However, trait evolution along discordant gene trees can also generate these incongruences - a phenomenon known as hemiplasy. Classic comparative methods do not account for discordance, resulting in incorrect inferences about the number, timing, and direction of trait transitions. Biological sources of discordance include incomplete lineage sorting (ILS) and introgression, but only ILS has received theoretical consideration in the context of hemiplasy. Here, we present a model that shows introgression makes hemiplasy more likely, such that methods that account for ILS alone will be conservative. We also present a method and software (HeIST) for making statistical inferences about the probability of hemiplasy and homoplasy in large datasets that contain both ILS and introgression. We apply our methods to two empirical datasets, finding that hemiplasy is likely to contribute to the observed trait incongruences in both.
Affiliation(s)
- Mark S Hibbins
- Department of Biology, Indiana University, Bloomington, United States
- Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, United States
- Department of Computer Science, Indiana University, Bloomington, United States
15
|
Criado Ruiz D, Villa Machío I, Herrero Nieto A, Nieto Feliner G. Hybridization and cryptic speciation in the Iberian endemic plant genus Phalacrocarpum (Asteraceae-Anthemideae). Mol Phylogenet Evol 2020; 156:107024. [PMID: 33271372 DOI: 10.1016/j.ympev.2020.107024] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/18/2020] [Revised: 11/18/2020] [Accepted: 11/24/2020] [Indexed: 01/28/2023]
Abstract
Understanding the role and impact of reticulation in phylogenetic inquiry has improved with extended use of high throughput sequencing data. Yet, due to the dynamism of genomes over evolutionary time, disentangling old hybridization events remains a serious challenge. Phalacrocarpum (DC.) Willk. is one of the 27 Iberian endemic plant genera, currently considered monotypic but including three subspecies. Its uncertain phylogenetic relationships within tribe Anthemideae (Asteraceae) point to an Early Miocene divergence from its sister group, and its persistent taxonomic instability has been proposed to be due to hybridization. We aim to understand the evolutionary history of this genus using SNPs called from a genotyping-by-sequencing (GBS) analysis, Sanger sequences (from three plastid DNA regions [psbJ-petA, petB-petD, trnH-psbA] and the cloned nuclear ribosomal ITS regions), and a leaf morphometric multivariate analysis. SNP data and Sanger sequences strongly support the unforeseen existence of a cryptic species in the eastern populations of P. oppositifolium subsp. anomalum. Broad molecular and morphometric patterns of variation found in conflictive populations from the Sanabria Valley region convincingly identify a recent previously undocumented hybrid zone. By contrast, evidence is less conclusive on relationships between subspecies hoffmannseggii, oppositifolium and a second conflictive group distributed along the Galician-Portuguese border (Orense massifs). Although genetic clustering analysis of SNP data suggests that the former subspecies was the maternal progenitor in hybridization events that gave rise to the other two groups, we found considerable uniqueness of ITS ribotypes and plastid haplotypes in them. This result, in the context of Pleistocene climatically-driven range shifts in NW Iberian Peninsula, can be due to periods of isolation, genetic bottlenecks and drift superimposed on old hybridization events.
Our study confirms the idea that unravelling old hybridization events may be compromised by the suite of evolutionary processes accumulated subsequently, particularly in areas with a history of climatic instability.
Affiliation(s)
- David Criado Ruiz
- Real Jardín Botánico (RJB-CSIC), Plaza de Murillo 2, 28014 Madrid, Spain
- Irene Villa Machío
- Real Jardín Botánico (RJB-CSIC), Plaza de Murillo 2, 28014 Madrid, Spain
16
|
Cai L, Xi Z, Lemmon EM, Lemmon AR, Mast A, Buddenhagen CE, Liu L, Davis CC. The Perfect Storm: Gene Tree Estimation Error, Incomplete Lineage Sorting, and Ancient Gene Flow Explain the Most Recalcitrant Ancient Angiosperm Clade, Malpighiales. Syst Biol 2020; 70:491-507. [PMID: 33169797 DOI: 10.1093/sysbio/syaa083] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 03/19/2019] [Revised: 10/20/2020] [Accepted: 10/28/2020] [Indexed: 12/20/2022] Open
Abstract
The genomic revolution offers renewed hope of resolving rapid radiations in the Tree of Life. The development of the multispecies coalescent model and improved gene tree estimation methods can better accommodate gene tree heterogeneity caused by incomplete lineage sorting (ILS) and gene tree estimation error stemming from the short internal branches. However, the relative influence of these factors in species tree inference is not well understood. Using anchored hybrid enrichment, we generated a data set including 423 single-copy loci from 64 taxa representing 39 families to infer the species tree of the flowering plant order Malpighiales. This order includes 9 of the top 10 most unstable nodes in angiosperms, which have been hypothesized to arise from the rapid radiation during the Cretaceous. Here, we show that coalescent-based methods do not resolve the backbone of Malpighiales and concatenation methods yield inconsistent estimations, providing evidence that gene tree heterogeneity is high in this clade. Despite high levels of ILS and gene tree estimation error, our simulations demonstrate that these two factors alone are insufficient to explain the lack of resolution in this order. To explore this further, we examined triplet frequencies among empirical gene trees and discovered some of them deviated significantly from those attributed to ILS and estimation error, suggesting gene flow as an additional and previously unappreciated phenomenon promoting gene tree variation in Malpighiales. Finally, we applied a novel method to quantify the relative contribution of these three primary sources of gene tree heterogeneity and demonstrated that ILS, gene tree estimation error, and gene flow contributed to 10.0%, 34.8%, and 21.4% of the variation, respectively.
Together, our results suggest that a perfect storm of factors likely influence this lack of resolution, and further indicate that recalcitrant phylogenetic relationships like the backbone of Malpighiales may be better represented as phylogenetic networks. Thus, reducing such groups solely to existing models that adhere strictly to bifurcating trees greatly oversimplifies reality, and obscures our ability to more clearly discern the process of evolution. [Coalescent; concatenation; flanking region; hybrid enrichment; introgression; phylogenomics; rapid radiation; triplet frequency.]
Affiliation(s)
- Liming Cai
- Department of Organismic and Evolutionary Biology, Harvard University Herbaria, Cambridge, MA 02138, USA
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Zhenxiang Xi
- Department of Organismic and Evolutionary Biology, Harvard University Herbaria, Cambridge, MA 02138, USA
- Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Emily Moriarty Lemmon
- Department of Biological Sciences, Florida State University, Tallahassee, FL 32306, USA
- Alan R Lemmon
- Department of Scientific Computing, Florida State University, Tallahassee, FL 32306, USA
- Austin Mast
- Department of Biological Sciences, Florida State University, Tallahassee, FL 32306, USA
- Christopher E Buddenhagen
- Department of Biological Sciences, Florida State University, Tallahassee, FL 32306, USA
- AgResearch, 10 Bisley Road, Hamilton 3214, New Zealand
- Liang Liu
- Department of Statistics and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Charles C Davis
- Department of Organismic and Evolutionary Biology, Harvard University Herbaria, Cambridge, MA 02138, USA
17
|
Forsythe ES, Nelson ADL, Beilstein MA. Biased Gene Retention in the Face of Introgression Obscures Species Relationships. Genome Biol Evol 2020; 12:1646-1663. [PMID: 33011798 PMCID: PMC7533067 DOI: 10.1093/gbe/evaa149] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Accepted: 07/10/2020] [Indexed: 12/13/2022] Open
Abstract
Phylogenomic analyses are recovering previously hidden histories of hybridization, revealing the genomic consequences of these events on the architecture of extant genomes. We applied phylogenomic techniques and several complementary statistical tests to show that introgressive hybridization appears to have occurred between close relatives of Arabidopsis, resulting in cytonuclear discordance and impacting our understanding of species relationships in the group. The composition of introgressed and retained genes indicates that selection against incompatible cytonuclear and nuclear-nuclear interactions likely acted during introgression, whereas linkage also contributed to genome composition through the retention of ancient haplotype blocks. We also applied divergence-based tests to determine the species branching order and distinguish donor from recipient lineages. Surprisingly, these analyses suggest that cytonuclear discordance arose via extensive nuclear, rather than cytoplasmic, introgression. If true, this would mean that most of the nuclear genome was displaced during introgression whereas only a small proportion of native alleles were retained.
18
|
Pfeifer B, Alachiotis N, Pavlidis P, Schimek MG. Genome scans for selection and introgression based on k-nearest neighbour techniques. Mol Ecol Resour 2020; 20:1597-1609. [PMID: 32639602 PMCID: PMC7689739 DOI: 10.1111/1755-0998.13221] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/24/2019] [Revised: 06/22/2020] [Accepted: 06/29/2020] [Indexed: 12/27/2022]
Abstract
In recent years, genome-scan methods have been extensively used to detect local signatures of selection and introgression. Most of these methods are either designed for one or the other case, which may impair the study of combined cases. Here, we introduce a series of versatile genome-scan methods applicable for both cases, the detection of selection and introgression. The proposed approaches are based on nonparametric k-nearest neighbour (kNN) techniques, while incorporating pairwise Fixation Index (FST) and pairwise nucleotide differences (dxy) as features. We benchmark our methods using a wide range of simulation scenarios, with varying parameters, such as recombination rates, population background histories, selection strengths, the proportion of introgression and the time of gene flow. We find that kNN-based methods perform remarkably well compared with the state-of-the-art. Finally, we demonstrate how to perform kNN-based genome scans on real-world genomic data using the population genomics R package PopGenome.
Affiliation(s)
- Bastian Pfeifer
- Research Unit of Statistical Bioinformatics, Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Graz, Austria
- Pavlos Pavlidis
- Institute of Computer Science, Foundation for Research and Technology-Hellas, Crete, Greece
- Michael G Schimek
- Research Unit of Statistical Bioinformatics, Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Graz, Austria
19
|
Teixeira MDM, Cattana ME, Matute DR, Muñoz JF, Arechavala A, Isbell K, Schipper R, Santiso G, Tracogna F, Sosa MDLÁ, Cech N, Alvarado P, Barreto L, Chacón Y, Ortellado J, Lima CMD, Chang MR, Niño-Vega G, Yasuda MAS, Felipe MSS, Negroni R, Cuomo CA, Barker B, Giusiano G. Genomic diversity of the human pathogen Paracoccidioides across the South American continent. Fungal Genet Biol 2020; 140:103395. [PMID: 32325168 PMCID: PMC7385733 DOI: 10.1016/j.fgb.2020.103395] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 12/15/2019] [Revised: 02/27/2020] [Accepted: 04/16/2020] [Indexed: 12/30/2022]
Abstract
Paracoccidioidomycosis (PCM) is a life-threatening systemic mycosis widely reported in the Gran Chaco ecosystem. The disease is caused by different species from the genus Paracoccidioides, which are all endemic to South and Central America. Here, we sequenced and analyzed 31 isolates of Paracoccidioides across South America, with particular focus on isolates from Argentina and Paraguay. The de novo sequenced isolates were compared with publicly available genomes. Phylogenetics and population genomics revealed that PCM in Argentina and Paraguay is caused by three distinct Paracoccidioides genotypes, P. brasiliensis (S1a and S1b) and P. restrepiensis (PS3). P. brasiliensis S1a isolates from Argentina are frequently associated with chronic forms of the disease. Our results suggest the existence of extensive molecular polymorphism among Paracoccidioides species, and provide a framework to begin to dissect the connection between genotypic differences in the pathogen and the clinical outcomes of the disease.
Affiliation(s)
- Marcus de Melo Teixeira
- Northern Arizona University, Flagstaff, AZ, USA; Universidade de Brasília, Brasilia, Brazil
- Maria Emilia Cattana
- Northern Arizona University, Flagstaff, AZ, USA; Hospital Dr. Julio C. Perrando, Resistencia, Chaco, Argentina
- Daniel R Matute
- Biology Department, University of North Carolina, Chapel Hill, NC, USA
- José F Muñoz
- Broad Institute of MIT and Harvard, Cambridge, USA
- Kristin Isbell
- Biology Department, University of North Carolina, Chapel Hill, NC, USA
- Primavera Alvarado
- Servicio Autónomo Instituto de Biomedicina Dr. Jacinto Convit, Caracas, Venezuela
- Laura Barreto
- Instituto Superior de Formación Docente Salome Ureña, Santo Domingo, Dominican Republic
- Yone Chacón
- Hospital Señor del Milagro, Salta, Argentina
- Gustavo Giusiano
- Universidad Nacional del Nordeste, Resistencia, Chaco, Argentina
20
|
Abstract
Introgressive hybridization results in the transfer of genetic material between species, often with fitness implications for the recipient species. The development of statistical methods for detecting the signatures of historical introgression in whole-genome data has been a major area of focus. Although existing techniques are able to identify the taxa that exchanged genes during introgression using a four-taxon system, most methods do not explicitly distinguish which taxon served as donor and which as recipient during introgression (i.e., polarization of introgression directionality). Existing methods that do polarize introgression are often only able to do so when there is a fifth taxon available and that taxon is sister to one of the taxa involved in introgression. Here, we present divergence-based introgression polarization (DIP), a method for polarizing introgression using patterns of sequence divergence across whole genomes, which operates in a four-taxon context. Thus, DIP can be applied to infer the directionality of introgression when additional taxa are not available. We use simulations to show that DIP can polarize introgression and identify potential sources of bias in the assignment of directionality, and we apply DIP to a well-described hominin introgression event.
Affiliation(s)
- Evan S Forsythe
- Department of Biology, Colorado State University
- School of Plant Sciences, University of Arizona
21
|
Hamlin JAP, Hibbins MS, Moyle LC. Assessing biological factors affecting postspeciation introgression. Evol Lett 2020; 4:137-154. [PMID: 32313689 PMCID: PMC7156103 DOI: 10.1002/evl3.159] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 01/23/2019] [Revised: 11/26/2019] [Accepted: 01/12/2020] [Indexed: 12/14/2022] Open
Abstract
An increasing number of phylogenomic studies have documented a clear “footprint” of postspeciation introgression among closely related species. Nonetheless, systematic genome‐wide studies of factors that determine the likelihood of introgression remain rare. Here, we propose an a priori hypothesis‐testing framework that uses introgression statistics—including a new metric of estimated introgression, Dp—to evaluate general patterns of introgression prevalence and direction across multiple closely related species. We demonstrate this approach using whole genome sequences from 32 lineages in 11 wild tomato species to assess the effect of three factors on introgression—genetic relatedness, geographical proximity, and mating system differences—based on multiple trios within the “ABBA–BABA” test. Our analyses suggest each factor affects the prevalence of introgression, although our power to detect these is limited by the number of comparisons currently available. We find that of 14 species pairs with geographically “proximate” versus “distant” population comparisons, 13 showed evidence of introgression; in 10 of these cases, this was more prevalent between geographically closer populations. We also find modest evidence that introgression declines with increasing genetic divergence between lineages, is more prevalent between lineages that share the same mating system, and—when it does occur between mating systems—tends to involve gene flow from more inbreeding to more outbreeding lineages. Although our analysis indicates that recent postspeciation introgression is frequent in this group—detected in 15 of 17 tested trios—estimated levels of genetic exchange are modest (0.2–2.5% of the genome), so the relative importance of hybridization in shaping the evolutionary trajectories of these species could be limited. 
Regardless, similar clade‐wide analyses of genomic introgression would be valuable for disentangling the major ecological, reproductive, and historical determinants of postspeciation gene flow, and for assessing the relative contribution of introgression as a source of genetic variation.
Affiliation(s)
- Mark S Hibbins
- Department of Biology, Indiana University, Bloomington, Indiana 47405
- Leonie C Moyle
- Department of Biology, Indiana University, Bloomington, Indiana 47405
22
|
Springer MS, Foley NM, Brady PL, Gatesy J, Murphy WJ. Evolutionary Models for the Diversification of Placental Mammals Across the KPg Boundary. Front Genet 2019; 10:1241. [PMID: 31850081 PMCID: PMC6896846 DOI: 10.3389/fgene.2019.01241] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 06/28/2019] [Accepted: 11/08/2019] [Indexed: 01/29/2023] Open
Abstract
Deciphering the timing of the placental mammal radiation is a longstanding problem in evolutionary biology, but consensus on the tempo and mode of placental diversification remains elusive. Nevertheless, an accurate timetree is essential for understanding the role of important events in Earth history (e.g., Cretaceous Terrestrial Revolution, KPg mass extinction) in promoting the taxonomic and ecomorphological diversification of Placentalia. Archibald and Deutschman described three competing models for the diversification of placental mammals, which are the Explosive, Long Fuse, and Short Fuse Models. More recently, the Soft Explosive Model and Trans-KPg Model have emerged as additional hypotheses for the placental radiation. Here, we review molecular and paleontological evidence for each of these five models including the identification of general problems that can negatively impact divergence time estimates. The Long Fuse Model has received more support from relaxed clock studies than any of the other models, but this model is not supported by morphological cladistic studies that position Cretaceous eutherians outside of crown Placentalia. At the same time, morphological cladistics has a poor track record of reconstructing higher-level relationships among the orders of placental mammals including the results of new pseudoextinction analyses that we performed on the largest available morphological data set for mammals (4,541 characters). We also examine the strengths and weaknesses of different timetree methods (node dating, tip dating, and fossilized birth-death dating) that may now be applied to estimate the timing of the placental radiation. While new methods such as tip dating are promising, they also have problems that must be addressed if these methods are to effectively discriminate among competing hypotheses for placental diversification. 
Finally, we discuss the complexities of timetree estimation when the signal of speciation times is impacted by incomplete lineage sorting (ILS) and hybridization. Not accounting for ILS results in dates that are older than speciation events. Hybridization, in turn, can result in dates that are younger or older than speciation dates. Disregarding this potential variation in "gene" history across the genome can distort phylogenetic branch lengths and divergence estimates when multiple unlinked genomic loci are combined together in a timetree analysis.
Affiliation(s)
- Mark S. Springer
- Department of Evolution, Ecology, and Evolutionary Biology, University of California, Riverside, Riverside, CA, United States
- Nicole M. Foley
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, United States
- Peggy L. Brady
- Department of Evolution, Ecology, and Evolutionary Biology, University of California, Riverside, Riverside, CA, United States
- John Gatesy
- Division of Vertebrate Zoology, American Museum of Natural History, New York, NY, United States
- William J. Murphy
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, United States
23
|
Abstract
Many methods exist for detecting introgression between nonsister species, but the most commonly used require either a single sequence from four or more taxa or multiple sequences from each of three taxa. Here, we present a test for introgression that uses only a single sequence from three taxa. This test, denoted D3, uses similar logic as the standard D-test for introgression, but by using pairwise distances instead of site patterns it is able to detect the same signal of introgression with fewer species. We use simulations to show that D3 has statistical power almost equal to D, demonstrating its use on a data set of wild bananas (Musa). The new test is easy to apply and easy to interpret, and should find wide use among currently available data sets.
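The distance-based logic described in this abstract can be sketched in a few lines of Python. This is a hedged illustration, not the authors' software: it assumes that A and B are sister taxa, that C is the third taxon, and that the statistic takes the form D3 = (d_BC - d_AC) / (d_BC + d_AC), with d an uncorrected pairwise distance. The function names and toy sequences are hypothetical, and no significance test is included.

```python
def p_distance(x: str, y: str) -> float:
    """Proportion of differing sites among positions where both bases are A, C, G, or T."""
    compared = 0
    diffs = 0
    for a, b in zip(x, y):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return diffs / compared


def d3_statistic(seq_a: str, seq_b: str, seq_c: str) -> float:
    """Assumed form: D3 = (d_BC - d_AC) / (d_BC + d_AC), with A and B as sisters."""
    d_ac = p_distance(seq_a, seq_c)
    d_bc = p_distance(seq_b, seq_c)
    total = d_ac + d_bc
    # If both distances are zero there is no signal either way.
    return (d_bc - d_ac) / total if total else 0.0


# Toy sequences (hypothetical): B is closer to C than A is, so D3 is negative here.
seq_a = "ACGTACGT"
seq_b = "ACGTACGA"
seq_c = "ACGAACGA"
print(d3_statistic(seq_a, seq_b, seq_c))
```

Without gene flow, the two sister taxa are expected to be equally distant from C, so the statistic is expected to be near zero; under the sign convention assumed here, a positive value points to exchange between A and C and a negative value to exchange between B and C.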
Affiliation(s)
- Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, IN
- Department of Computer Science, Indiana University, Bloomington, IN
- Mark S Hibbins
- Department of Biology, Indiana University, Bloomington, IN
24
|
Bravo GA, Antonelli A, Bacon CD, Bartoszek K, Blom MPK, Huynh S, Jones G, Knowles LL, Lamichhaney S, Marcussen T, Morlon H, Nakhleh LK, Oxelman B, Pfeil B, Schliep A, Wahlberg N, Werneck FP, Wiedenhoeft J, Willows-Munro S, Edwards SV. Embracing heterogeneity: coalescing the Tree of Life and the future of phylogenomics. PeerJ 2019; 7:e6399. [PMID: 30783571 PMCID: PMC6378093 DOI: 10.7717/peerj.6399] [Citation(s) in RCA: 67] [Impact Index Per Article: 13.4] [Received: 02/05/2018] [Accepted: 01/07/2019] [Indexed: 12/23/2022] Open
Abstract
Building the Tree of Life (ToL) is a major challenge of modern biology, requiring advances in cyberinfrastructure, data collection, theory, and more. Here, we argue that phylogenomics stands to benefit by embracing the many heterogeneous genomic signals emerging from the first decade of large-scale phylogenetic analysis spawned by high-throughput sequencing (HTS). Such signals include those most commonly encountered in phylogenomic datasets, such as incomplete lineage sorting, but also those reticulate processes emerging with greater frequency, such as recombination and introgression. Here we focus specifically on how phylogenetic methods can accommodate the heterogeneity incurred by such population genetic processes; we do not discuss phylogenetic methods that ignore such processes, such as concatenation or supermatrix approaches or supertrees. We suggest that methods of data acquisition and the types of markers used in phylogenomics will remain restricted until a posteriori methods of marker choice are made possible with routine whole-genome sequencing of taxa of interest. We discuss limitations and potential extensions of a model supporting innovation in phylogenomics today, the multispecies coalescent model (MSC). Macroevolutionary models that use phylogenies, such as character mapping, often ignore the heterogeneity on which building phylogenies increasingly relies, and we suggest that assimilating such heterogeneity is an important goal moving forward. Finally, we argue that an integrative cyberinfrastructure linking all steps of the process of building the ToL, from specimen acquisition in the field to publication and tracking of phylogenomic data, as well as a culture that values contributors at each step, are essential for progress.
Affiliation(s)
- Gustavo A. Bravo
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
- Alexandre Antonelli
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
- Gothenburg Global Biodiversity Centre, Göteborg, Sweden
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
- Gothenburg Botanical Garden, Göteborg, Sweden
- Christine D. Bacon
- Gothenburg Global Biodiversity Centre, Göteborg, Sweden
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
- Krzysztof Bartoszek
- Department of Computer and Information Science, Linköping University, Linköping, Sweden
- Mozes P. K. Blom
- Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
- Stella Huynh
- Institut de Biologie, Université de Neuchâtel, Neuchâtel, Switzerland
- Graham Jones
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
- L. Lacey Knowles
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
- Sangeet Lamichhaney
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
- Thomas Marcussen
- Centre for Ecological and Evolutionary Synthesis, University of Oslo, Oslo, Norway
- Hélène Morlon
- Institut de Biologie, Ecole Normale Supérieure de Paris, Paris, France
- Luay K. Nakhleh
- Department of Computer Science, Rice University, Houston, TX, USA
- Bengt Oxelman
- Gothenburg Global Biodiversity Centre, Göteborg, Sweden
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
- Bernard Pfeil
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
- Alexander Schliep
- Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
- Fernanda P. Werneck
- Coordenação de Biodiversidade, Programa de Coleções Científicas Biológicas, Instituto Nacional de Pesquisa da Amazônia, Manaus, AM, Brazil
- John Wiedenhoeft
- Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
- Department of Computer Science, Rutgers University, Piscataway, NJ, USA
- Sandi Willows-Munro
- School of Life Sciences, University of Kwazulu-Natal, Pietermaritzburg, South Africa
- Scott V. Edwards
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
- Gothenburg Centre for Advanced Studies in Science and Technology, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden