1
Zhang R, Drummond AJ, Mendes FK. Fast Bayesian Inference of Phylogenies from Multiple Continuous Characters. Syst Biol 2024; 73:102-124. [PMID: 38085256 PMCID: PMC11129596 DOI: 10.1093/sysbio/syad067] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/06/2023] [Revised: 03/23/2023] [Accepted: 11/07/2023] [Indexed: 05/28/2024] Open
Abstract
Time-scaled phylogenetic trees are an ultimate goal of evolutionary biology and a necessary ingredient in comparative studies. The accumulation of genomic data has resolved the tree of life to a great extent, yet timing evolutionary events remains challenging if not impossible without external information such as fossil ages and morphological characters. Methods for incorporating morphology in tree estimation have lagged behind their molecular counterparts, especially in the case of continuous characters. Despite recent advances, such tools are still direly needed as we approach the limits of what molecules can teach us. Here, we implement a suite of state-of-the-art methods for leveraging continuous morphology in phylogenetics, and by conducting extensive simulation studies we thoroughly validate and explore our methods' properties. While retaining model generality and scalability, we make it possible to estimate absolute and relative divergence times from multiple continuous characters while accounting for uncertainty. We compile and analyze one of the most data-type diverse data sets to date, comprised of contemporaneous and ancient molecular sequences, and discrete and continuous morphological characters from living and extinct Carnivora taxa. We conclude by synthesizing lessons about our method's behavior, and suggest future research avenues.
Affiliation(s)
- Rong Zhang
  - Programme in Emerging Infectious Diseases, Duke-NUS Medical School 169857, Singapore
- Alexei J Drummond
  - Centre for Computational Evolution, The University of Auckland, Auckland 1010, New Zealand
  - School of Biological Sciences, The University of Auckland, Auckland 1010, New Zealand
- Fábio K Mendes
  - Department of Biology, Washington University in St. Louis, St. Louis, MO 63130, USA
2
Rurik I, Melichárková A, Gbúrová Štubová E, Kučera J, Kochjarová J, Paun O, Vďačný P, Slovák M. Homoplastic versus xenoplastic evolution: exploring the emergence of key intrinsic and extrinsic traits in the montane genus Soldanella (Primulaceae). The Plant Journal 2024; 118:753-765. [PMID: 38217489 DOI: 10.1111/tpj.16630] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 12/02/2023] [Accepted: 12/27/2023] [Indexed: 01/15/2024]
Abstract
Specific ecological conditions in the high mountain environment exert a selective pressure that often leads to convergent trait evolution. Reticulations induced by incomplete lineage sorting and introgression can lead to discordant trait patterns among gene and species trees (hemiplasy/xenoplasy), providing a false illusion that the traits under study are homoplastic. Using phylogenetic species networks, we explored the effect of gene exchange on trait evolution in Soldanella, a genus profoundly influenced by historical introgression. At least three features evolved independently multiple times: the single-flowered dwarf phenotype, dysploid cytotype, and ecological generalism. The present analyses also indicated that the recurring occurrence of stoloniferous growth might have been prompted by an introgression event between an ancestral lineage and a still extant species, although its emergence via convergent evolution cannot be completely ruled out. Phylogenetic regression suggested that the independent evolution of larger genomes in snowbells is most likely a result of the interplay between hybridization events of dysploid and euploid taxa and hostile environments at the range margins of the genus. The emergence of key intrinsic and extrinsic traits in snowbells has been significantly impacted not only by convergent evolution but also by historical and recent introgression events.
Affiliation(s)
- Ivan Rurik
  - Department of Zoology, Comenius University Bratislava, Ilkovičova 6, 842 15, Bratislava, Slovak Republic
- Andrea Melichárková
  - Institute of Botany, Plant Science and Biodiversity Centre, Slovak Academy of Sciences, Dúbravská cesta 9, 845 23, Bratislava, Slovak Republic
- Eliška Gbúrová Štubová
  - Institute of Botany, Plant Science and Biodiversity Centre, Slovak Academy of Sciences, Dúbravská cesta 9, 845 23, Bratislava, Slovak Republic
  - Slovak National Museum, Natural History Museum, Vajanského nábrežie 2, 810 06, Bratislava, Slovak Republic
- Jaromír Kučera
  - Institute of Botany, Plant Science and Biodiversity Centre, Slovak Academy of Sciences, Dúbravská cesta 9, 845 23, Bratislava, Slovak Republic
- Judita Kochjarová
  - Department of Phytology, Faculty of Forestry, Technical University Zvolen, Masarykova 24, 960 53, Zvolen, Slovak Republic
- Ovidiu Paun
  - Department of Botany and Biodiversity Research, University of Vienna, Rennweg 14, 1030, Vienna, Austria
- Peter Vďačný
  - Department of Zoology, Comenius University Bratislava, Ilkovičova 6, 842 15, Bratislava, Slovak Republic
- Marek Slovák
  - Institute of Botany, Plant Science and Biodiversity Centre, Slovak Academy of Sciences, Dúbravská cesta 9, 845 23, Bratislava, Slovak Republic
  - Department of Botany, Charles University, Benátská 2, 128 01, Prague, Czech Republic
3
Wicke K, Haque MR, Kubatko L. Implications of gene tree heterogeneity on downstream phylogenetic analyses: A case study employing the Fair Proportion index. PLoS One 2024; 19:e0300900. [PMID: 38662751 PMCID: PMC11045071 DOI: 10.1371/journal.pone.0300900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Accepted: 03/01/2024] [Indexed: 04/28/2024] Open
Abstract
Many questions in evolutionary biology require the specification of a phylogeny for downstream phylogenetic analyses. However, with the increasingly widespread availability of genomic data, phylogenetic studies are often confronted with conflicting signal in the form of genomic heterogeneity and incongruence between gene trees and the species tree. This raises the question of determining what data and phylogeny should be used in downstream analyses, and to what extent the choice of phylogeny (e.g., gene trees versus species trees) impacts the analyses and their outcomes. In this paper, we study this question in the realm of phylogenetic diversity indices, which provide ways to prioritize species for conservation based on their relative evolutionary isolation on a phylogeny, and are thus one example of downstream phylogenetic analyses. We use the Fair Proportion (FP) index, also known as the evolutionary distinctiveness score, and explore the variability in species rankings based on gene trees as compared to the species tree for several empirical data sets. Our results indicate that prioritization rankings among species vary greatly depending on the underlying phylogeny, suggesting that the choice of phylogeny is a major influence in assessing phylogenetic diversity in a conservation setting. While we use phylogenetic diversity conservation as an example, we suspect that other types of downstream phylogenetic analyses such as ancestral state reconstruction are similarly affected by genomic heterogeneity and incongruence. Our aim is thus to raise awareness of this issue and inspire new research on which evolutionary information (species trees, gene trees, or a combination of both) should form the basis for analyses in these settings.
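As a rough illustration of the Fair Proportion index named in this abstract, the sketch below uses the standard textbook definition: each branch length on the path from a leaf to the root is divided by the number of leaves descending from that branch, and the shares are summed. The tree encoding (a `parent` map and a `length` map) and the function name `fair_proportion` are assumptions made for this example; this is not the authors' code.

```python
from collections import defaultdict


def fair_proportion(parent, length):
    """Fair Proportion (evolutionary distinctiveness) score of every leaf.

    parent: dict mapping each node to its parent (the root maps to None).
    length: dict mapping each non-root node to the length of the branch above it.
    Both encodings are hypothetical conventions chosen for this sketch.
    """
    # Build the child lists so that leaves can be identified.
    children = defaultdict(list)
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)
    leaves = [node for node in parent if node not in children]

    # Count the leaves below each node by walking from every leaf to the root.
    n_desc = defaultdict(int)
    for leaf in leaves:
        node = leaf
        while node is not None:
            n_desc[node] += 1
            node = parent[node]

    # Each branch is shared equally among the leaves that descend from it.
    scores = {}
    for leaf in leaves:
        score, node = 0.0, leaf
        while parent[node] is not None:
            score += length[node] / n_desc[node]
            node = parent[node]
        scores[leaf] = score
    return scores


# Toy tree ((A:1,B:1):1,C:2), with the internal node named "AB".
toy_parent = {"root": None, "AB": "root", "A": "AB", "B": "AB", "C": "root"}
toy_length = {"AB": 1.0, "A": 1.0, "B": 1.0, "C": 2.0}
print(fair_proportion(toy_parent, toy_length))
```

On the toy tree, A and B each score 1.5 and C scores 2.0; the scores sum to the total tree length of 5, which is a defining property of the index. Running the same function on a gene tree with a different topology would give different scores, which is the kind of ranking variability the abstract describes.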
Affiliation(s)
- Kristina Wicke
  - Department of Mathematical Sciences, New Jersey Institute of Technology, Newark, NJ, United States of America
- Md. Rejuan Haque
  - Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, OH, United States of America
- Laura Kubatko
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, United States of America
  - Department of Statistics, The Ohio State University, Columbus, OH, United States of America
4
Schraiber JG, Edge MD, Pennell M. Unifying approaches from statistical genetics and phylogenetics for mapping phenotypes in structured populations. bioRxiv 2024:2024.02.10.579721. [PMID: 38496530 PMCID: PMC10942266 DOI: 10.1101/2024.02.10.579721] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
In both statistical genetics and phylogenetics, a major goal is to identify correlations between genetic loci or other aspects of the phenotype or environment and a focal trait. In these two fields, there are sophisticated but disparate statistical traditions aimed at these tasks. The disconnect between their respective approaches is becoming untenable as questions in medicine, conservation biology, and evolutionary biology increasingly rely on integrating data from within and among species, and once-clear conceptual divisions are becoming increasingly blurred. To help bridge this divide, we derive a general model describing the covariance between the genetic contributions to the quantitative phenotypes of different individuals. Taking this approach shows that standard models in both statistical genetics (e.g., Genome-Wide Association Studies; GWAS) and phylogenetic comparative biology (e.g., phylogenetic regression) can be interpreted as special cases of this more general quantitative-genetic model. The fact that these models share the same core architecture means that we can build a unified understanding of the strengths and limitations of different methods for controlling for genetic structure when testing for associations. We develop intuition for why and when spurious correlations may occur using analytical theory and conduct population-genetic and phylogenetic simulations of quantitative traits. The structural similarity of problems in statistical genetics and phylogenetics enables us to take methodological advances from one field and apply them in the other. We demonstrate this by showing how a standard GWAS technique, namely including both the genetic relatedness matrix (GRM) and its leading eigenvectors (which correspond to the principal components of the genotype matrix) in a regression model, can mitigate spurious correlations in phylogenetic analyses. As a case study of this, we re-examine an analysis testing for co-evolution of expression levels between genes across a fungal phylogeny, and show that including covariance matrix eigenvectors as covariates decreases the false positive rate while simultaneously increasing the true positive rate. More generally, this work provides a foundation for more integrative approaches for understanding the genetic architecture of phenotypes and how evolutionary processes shape it.
5
Mendes FK, Landis MJ. PhyloJunction: a computational framework for simulating, developing, and teaching evolutionary models. bioRxiv 2023:2023.12.15.571907. [PMID: 38168278 PMCID: PMC10760140 DOI: 10.1101/2023.12.15.571907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
We introduce PhyloJunction, a computational framework designed to facilitate the prototyping, testing, and characterization of evolutionary models. PhyloJunction is distributed as an open-source Python library that can be used to implement a variety of models, through its flexible graphical modeling architecture and dedicated model specification language. Model design and use are exposed to users via command-line and graphical interfaces, which integrate the steps of simulating, summarizing, and visualizing data. This paper describes the features of PhyloJunction, which include, but are not limited to, a general implementation of a popular family of phylogenetic diversification models, and how, moving forward, it may be expanded not only to include new models but also to become a platform for conducting and teaching statistical learning.
Affiliation(s)
- Fábio K. Mendes
  - Department of Biology, Washington University in St. Louis, St. Louis, MO
- Michael J. Landis
  - Department of Biology, Washington University in St. Louis, St. Louis, MO
6
Dimayacyac JR, Wu S, Jiang D, Pennell M. Evaluating the Performance of Widely Used Phylogenetic Models for Gene Expression Evolution. Genome Biol Evol 2023; 15:evad211. [PMID: 38000902 PMCID: PMC10709115 DOI: 10.1093/gbe/evad211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2023] [Revised: 11/09/2023] [Accepted: 11/17/2023] [Indexed: 11/26/2023] Open
Abstract
Phylogenetic comparative methods are increasingly used to test hypotheses about the evolutionary processes that drive divergence in gene expression among species. However, it is unknown whether the distributional assumptions of phylogenetic models designed for quantitative phenotypic traits are realistic for expression data and, importantly, the reliability of conclusions of phylogenetic comparative studies of gene expression may depend on whether the data is well described by the chosen model. To evaluate this, we first fit several phylogenetic models of trait evolution to 8 previously published comparative expression datasets, comprising a total of 54,774 genes with 145,927 unique gene-tissue combinations. Using a previously developed approach, we then assessed how well the best model of the set described the data in an absolute (not just relative) sense. First, we find that Ornstein-Uhlenbeck models, in which expression values are constrained around an optimum, were the preferred models for 66% of gene-tissue combinations. Second, we find that for 61% of gene-tissue combinations, the best-fit model of the set was found to perform well; the rest were found to be performing poorly by at least one of the test statistics we examined. Third, we find that when simple models do not perform well, this appears to be typically a consequence of failing to fully account for heterogeneity in the rate of evolution. We advocate that assessment of model performance should become a routine component of phylogenetic comparative expression studies; doing so can improve the reliability of inferences and inspire the development of novel models.
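For reference, the sense in which an Ornstein-Uhlenbeck (OU) model constrains values around an optimum can be written out against Brownian motion (BM) using the standard single-optimum parameterization. The symbols α (strength of pull), θ (optimum), and σ (rate) are introduced here for illustration and do not appear in the abstract; the specific models fit in the study may be parameterized differently.

```latex
\begin{aligned}
\text{BM:}\quad & dX_t = \sigma\, dW_t, &
  \operatorname{Var}[X_t \mid X_0] &= \sigma^2 t \\
\text{OU:}\quad & dX_t = \alpha(\theta - X_t)\,dt + \sigma\, dW_t, &
  \operatorname{Var}[X_t \mid X_0] &= \frac{\sigma^2}{2\alpha}\left(1 - e^{-2\alpha t}\right)
\end{aligned}
```

As α approaches 0 the OU variance reduces to σ²t, so BM is the limiting case. For α > 0 the variance saturates at σ²/(2α) instead of growing without bound, which is what keeps trait values near θ.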
Affiliation(s)
- Jose Rafael Dimayacyac
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- Shanyun Wu
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Department of Developmental Biology, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
- Daohan Jiang
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Matt Pennell
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
  - Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
7
Thomas GWC, Hughes JJ, Kumon T, Berv JS, Nordgren CE, Lampson M, Levine M, Searle JB, Good JM. The genomic landscape, causes, and consequences of extensive phylogenomic discordance in Old World mice and rats. bioRxiv 2023:2023.08.28.555178. [PMID: 37693498 PMCID: PMC10491188 DOI: 10.1101/2023.08.28.555178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2023]
Abstract
A species tree is a central concept in evolutionary biology whereby a single branching phylogeny reflects relationships among species. However, the phylogenies of different genomic regions often differ from the species tree. Although tree discordance is often widespread in phylogenomic studies, we still lack a clear understanding of how variation in phylogenetic patterns is shaped by genome biology or the extent to which discordance may compromise comparative studies. We characterized patterns of phylogenomic discordance across the murine rodents (Old World mice and rats), a large and ecologically diverse group that gave rise to the mouse and rat model systems. Combining new linked-read genome assemblies for seven murine species with eleven published rodent genomes, we first used ultra-conserved elements (UCEs) to infer a robust species tree. We then used whole genomes to examine finer-scale patterns of discordance and found that proximate chromosomal regions had similar phylogenies. However, there was no relationship between tree similarity and local recombination rates in house mice, suggesting that genetic linkage influences phylogenetic patterns over deeper timescales. This signal may be independent of contemporary recombination landscapes. We also detected a strong influence of linked selection whereby purifying selection at UCEs led to less discordance, while genes experiencing positive selection showed more discordant and variable phylogenetic signals. Finally, we show that assuming a single species tree can result in high error rates when testing for positive selection under different models. Collectively, our results highlight the complex relationship between phylogenetic inference and genome biology and underscore how failure to account for this complexity can mislead comparative genomic studies.
Affiliation(s)
- Gregg W. C. Thomas
  - Division of Biological Sciences, University of Montana, Missoula, MT, 59801
  - Informatics Group, Harvard University, Cambridge, MA, 02138
- Jonathan J. Hughes
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, 14853
  - Department of Evolution, Ecology, and Organismal Biology, University of California Riverside, Riverside, CA, 92521
- Tomohiro Kumon
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, 19104
- Jacob S. Berv
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, 14853
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, 48109
- C. Erik Nordgren
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, 19104
- Michael Lampson
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, 19104
- Mia Levine
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, 19104
- Jeremy B. Searle
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, 14853
- Jeffrey M. Good
  - Division of Biological Sciences, University of Montana, Missoula, MT, 59801
8
Dimayacyac JR, Wu S, Jiang D, Pennell M. Evaluating the Performance of Widely Used Phylogenetic Models for Gene Expression Evolution. bioRxiv 2023:2023.02.09.527893. [PMID: 37645857 PMCID: PMC10461906 DOI: 10.1101/2023.02.09.527893] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/31/2023]
Abstract
Phylogenetic comparative methods are increasingly used to test hypotheses about the evolutionary processes that drive divergence in gene expression among species. However, it is unknown whether the distributional assumptions of phylogenetic models designed for quantitative phenotypic traits are realistic for expression data and, importantly, the reliability of conclusions of phylogenetic comparative studies of gene expression may depend on whether the data is well-described by the chosen model. To evaluate this, we first fit several phylogenetic models of trait evolution to 8 previously published comparative expression datasets, comprising a total of 54,774 genes with 145,927 unique gene-tissue combinations. Using a previously developed approach, we then assessed how well the best model of the set described the data in an absolute (not just relative) sense. First, we find that Ornstein-Uhlenbeck models, in which expression values are constrained around an optimum, were the preferred model for 66% of gene-tissue combinations. Second, we find that for 61% of gene-tissue combinations, the best fit model of the set was found to perform well; the rest were found to be performing poorly by at least one of the test statistics we examined. Third, we find that when simple models do not perform well, this appears to be typically a consequence of failing to fully account for heterogeneity in the rate of evolution. We advocate that assessment of model performance should become a routine component of phylogenetic comparative expression studies; doing so can improve the reliability of inferences and inspire the development of novel models.
Affiliation(s)
- Jose Rafael Dimayacyac
  - Department of Zoology, University of British Columbia, Canada
  - Michael Smith Laboratories, University of British Columbia, Canada
- Shanyun Wu
  - Department of Zoology, University of British Columbia, Canada
  - Department of Genetics, Washington University School of Medicine, USA
- Daohan Jiang
  - Department of Quantitative and Computational Biology, University of Southern California, USA
- Matt Pennell
  - Department of Zoology, University of British Columbia, Canada
  - Department of Quantitative and Computational Biology, University of Southern California, USA
  - Department of Biological Sciences, University of Southern California, USA
9
Hibbins MS, Breithaupt LC, Hahn MW. Phylogenomic comparative methods: Accurate evolutionary inferences in the presence of gene tree discordance. Proc Natl Acad Sci U S A 2023; 120:e2220389120. [PMID: 37216509 PMCID: PMC10235958 DOI: 10.1073/pnas.2220389120] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/30/2022] [Accepted: 04/24/2023] [Indexed: 05/24/2023] Open
Abstract
Phylogenetic comparative methods have long been a mainstay of evolutionary biology, allowing for the study of trait evolution across species while accounting for their common ancestry. These analyses typically assume a single, bifurcating phylogenetic tree describing the shared history among species. However, modern phylogenomic analyses have shown that genomes are often composed of mosaic histories that can disagree both with the species tree and with each other (so-called discordant gene trees). These gene trees describe shared histories that are not captured by the species tree, and that are therefore unaccounted for in classic comparative approaches. The application of standard comparative methods to species histories containing discordance leads to incorrect inferences about the timing, direction, and rate of evolution. Here, we develop two approaches for incorporating gene tree histories into comparative methods: one that constructs an updated phylogenetic variance-covariance matrix from gene trees, and another that applies Felsenstein's pruning algorithm over a set of gene trees to calculate trait histories and likelihoods. Using simulation, we demonstrate that our approaches generate much more accurate estimates of tree-wide rates of trait evolution than standard methods. We apply our methods to two clades of the wild tomato genus Solanum with varying rates of discordance, demonstrating the contribution of gene tree discordance to variation in a set of floral traits. Our approaches have the potential to be applied to a broad range of classic inference problems in phylogenetics, including ancestral state reconstruction and the inference of lineage-specific rate shifts.
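To make the idea of a variance-covariance matrix built from gene trees more concrete, here is a minimal sketch. It assumes the usual Brownian motion convention that the covariance between two species is proportional to the branch length they share from the root, and it assumes that matrices from different trees are combined as a frequency-weighted average. The tree encoding and the names `shared_path_matrix` and `averaged_matrix` are hypothetical; the exact construction used in the paper may differ and should be taken from the paper itself.

```python
def shared_path_matrix(parent, length, leaves):
    """Shared root-to-tip branch length for every pair of leaves in one tree.

    parent: dict mapping each node to its parent (the root maps to None).
    length: dict mapping each non-root node to the length of the branch above it.
    leaves: list fixing the row and column order of the returned matrix.
    """
    def branches_to_root(node):
        # Collect the branches (identified by their child node) above `node`.
        found = set()
        while parent[node] is not None:
            found.add(node)
            node = parent[node]
        return found

    paths = {leaf: branches_to_root(leaf) for leaf in leaves}
    return [
        [sum(length[b] for b in paths[a] & paths[c]) for c in leaves]
        for a in leaves
    ]


def averaged_matrix(trees, weights, leaves):
    """Weighted average of per-tree matrices (assumed weights: tree frequencies)."""
    total = sum(weights)
    size = len(leaves)
    avg = [[0.0] * size for _ in range(size)]
    for (parent, length), weight in zip(trees, weights):
        matrix = shared_path_matrix(parent, length, leaves)
        for i in range(size):
            for j in range(size):
                avg[i][j] += (weight / total) * matrix[i][j]
    return avg


# Toy example: tree ((A,B),C) at frequency 0.75 and tree ((B,C),A) at 0.25.
concordant = (
    {"root": None, "AB": "root", "A": "AB", "B": "AB", "C": "root"},
    {"AB": 1.0, "A": 1.0, "B": 1.0, "C": 2.0},
)
discordant = (
    {"root": None, "BC": "root", "B": "BC", "C": "BC", "A": "root"},
    {"BC": 1.0, "B": 1.0, "C": 1.0, "A": 2.0},
)
for row in averaged_matrix([concordant, discordant], [0.75, 0.25], ["A", "B", "C"]):
    print(row)
```

In the toy example the averaged matrix gives B and C a covariance of 0.25, whereas the concordant tree alone would give 0. That nonzero entry is the shared history a single species tree leaves out.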
Affiliation(s)
- Mark S. Hibbins
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON M5S 3B2, Canada
  - Department of Biology, Indiana University, Bloomington, IN 47405
- Lara C. Breithaupt
  - Department of Biology, Indiana University, Bloomington, IN 47405
  - Department of Computer Science, Duke University, Durham, NC 27710
- Matthew W. Hahn
  - Department of Biology, Indiana University, Bloomington, IN 47405
  - Department of Computer Science, Indiana University, Bloomington, IN 47405
10
Bertram J, Fulton B, Tourigny JP, Peña-Garcia Y, Moyle LC, Hahn MW. CAGEE: Computational Analysis of Gene Expression Evolution. Mol Biol Evol 2023; 40:msad106. [PMID: 37158385 PMCID: PMC10195155 DOI: 10.1093/molbev/msad106] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/02/2022] [Revised: 04/26/2023] [Accepted: 05/01/2023] [Indexed: 05/10/2023] Open
Abstract
Despite the increasing abundance of whole transcriptome data, few methods are available to analyze global gene expression across phylogenies. Here, we present a new software package (Computational Analysis of Gene Expression Evolution [CAGEE]) for inferring patterns of increases and decreases in gene expression across a phylogenetic tree, as well as the rate at which these changes occur. In contrast to previous methods that treat each gene independently, CAGEE can calculate genome-wide rates of gene expression, along with ancestral states for each gene. The statistical approach developed here makes it possible to infer lineage-specific shifts in rates of evolution across the genome, in addition to possible differences in rates among multiple tissues sampled from the same species. We demonstrate the accuracy and robustness of our method on simulated data and apply it to a data set of ovule gene expression collected from multiple self-compatible and self-incompatible species in the genus Solanum to test hypotheses about the evolutionary forces acting during mating system shifts. These comparisons allow us to highlight the power of CAGEE, demonstrating its utility for use in any empirical system and for the analysis of most morphological traits. Our software is available at https://github.com/hahnlab/CAGEE/.
Affiliation(s)
- Jason Bertram
  - Department of Biology, Indiana University, Bloomington, IN
  - Department of Mathematics, Western University, London, ON, Canada
- Ben Fulton
  - Department of Biology, Indiana University, Bloomington, IN
  - University Information Technology Services, Indiana University, Bloomington, IN
- Jason P Tourigny
  - Department of Biology, Indiana University, Bloomington, IN
  - Department of Computer Science, Indiana University, Bloomington, IN
- Leonie C Moyle
  - Department of Biology, Indiana University, Bloomington, IN
- Matthew W Hahn
  - Department of Biology, Indiana University, Bloomington, IN
  - Department of Computer Science, Indiana University, Bloomington, IN
11
Hibbins MS, Hahn MW. The effects of introgression across thousands of quantitative traits revealed by gene expression in wild tomatoes. PLoS Genet 2021; 17:e1009892. [PMID: 34748547 PMCID: PMC8601620 DOI: 10.1371/journal.pgen.1009892] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/27/2021] [Revised: 11/18/2021] [Accepted: 10/18/2021] [Indexed: 01/13/2023] Open
Abstract
It is now understood that introgression can serve as a powerful evolutionary force, providing genetic variation that can shape the course of trait evolution. Introgression also induces a shared evolutionary history that is not captured by the species phylogeny, potentially complicating evolutionary analyses that use a species tree. Such analyses are often carried out on gene expression data across species, where the measurement of thousands of trait values allows for powerful inferences while controlling for shared phylogeny. Here, we present a Brownian motion model for quantitative trait evolution under the multispecies network coalescent framework, demonstrating that introgression can generate apparently convergent patterns of evolution when averaged across thousands of quantitative traits. We test our theoretical predictions using whole-transcriptome expression data from ovules in the wild tomato genus Solanum. Examining two sub-clades that both have evidence for post-speciation introgression, but that differ substantially in its magnitude, we find patterns of evolution that are consistent with histories of introgression in both the sign and magnitude of ovule gene expression. Additionally, in the sub-clade with a higher rate of introgression, we observe a correlation between local gene tree topology and expression similarity, implicating a role for introgressed cis-regulatory variation in generating these broad-scale patterns. Our results reveal a general role for introgression in shaping patterns of variation across many thousands of quantitative traits, and provide a framework for testing for these effects using simple model-informed predictions.
It is now known from studying large genetic datasets that species often hybridize and cross with each other over many generations, a phenomenon known as introgression. Introgression introduces new genetic variation into a population, and this variation can cause traits to be shared among the introgressing species. When researchers study the evolution of trait variation among species, this source of trait sharing is rarely accounted for. Here, we present a statistical model of the effects of introgression on trait variation. This model predicts that, when averaged across many thousands of traits, introgressing species are consistently more similar than expected from standard approaches. Researchers studying gene expression often consider the expression of many thousands of genes, making this a case where the expected effects of introgression are likely to manifest. We tested our model prediction using ovule gene expression data from the wild tomato genus Solanum, in two groups of species with evidence of historical introgression. We found that patterns of expression similarity in both groups are consistent with their histories of introgression and the predictions from our model. Our results highlight the importance of accounting for introgression as a source of trait variation among species.
Affiliation(s)
- Mark S. Hibbins
  - Department of Biology, Indiana University, Bloomington, Indiana, United States of America
- Matthew W. Hahn
  - Department of Biology, Indiana University, Bloomington, Indiana, United States of America
  - Department of Computer Science, Indiana University, Bloomington, Indiana, United States of America
12
Wang Y, Cao Z, Ogilvie HA, Nakhleh L. Phylogenomic assessment of the role of hybridization and introgression in trait evolution. PLoS Genet 2021; 17:e1009701. [PMID: 34407067 PMCID: PMC8405015 DOI: 10.1371/journal.pgen.1009701] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/27/2021] [Revised: 08/30/2021] [Accepted: 07/07/2021] [Indexed: 11/30/2022] Open
Abstract
Trait evolution among a set of species, a central theme in evolutionary biology, has long been understood and analyzed with respect to a species tree. However, the field of phylogenomics, which has been propelled by advances in sequencing technologies, has ushered in the era of species/gene tree incongruence and, consequently, a more nuanced understanding of trait evolution. For a trait whose states are incongruent with the branching patterns in the species tree, the same state could have arisen independently in different species (homoplasy) or followed the branching patterns of gene trees, incongruent with the species tree (hemiplasy). Another evolutionary process whose extent and significance are better revealed by phylogenomic studies is gene flow between different species. In this work, we present a phylogenomic method for assessing the role of hybridization and introgression in the evolution of polymorphic or monomorphic binary traits. We apply the method to simulated evolutionary scenarios to demonstrate the interplay between the parameters of the evolutionary history and the role of introgression in a binary trait's evolution (which we call xenoplasy). Very importantly, we demonstrate, including on a biological data set, that inferring a species tree and using it for trait evolution analysis in the presence of gene flow could lead to misleading hypotheses about trait evolution.
Affiliation(s)
- Yaxuan Wang
- Department of Computer Science, Rice University, Houston, Texas, United States of America
| | - Zhen Cao
- Department of Computer Science, Rice University, Houston, Texas, United States of America
| | - Huw A. Ogilvie
- Department of Computer Science, Rice University, Houston, Texas, United States of America
| | - Luay Nakhleh
- Department of Computer Science, Rice University, Houston, Texas, United States of America
- Department of BioSciences, Rice University, Houston, Texas, United States of America
| |
|
13
|
Ogilvie HA, Mendes FK, Vaughan TG, Matzke NJ, Stadler T, Welch D, Drummond AJ. Novel Integrative Modeling of Molecules and Morphology across Evolutionary Timescales. Syst Biol 2021; 71:208-220. [PMID: 34228807 PMCID: PMC8677526 DOI: 10.1093/sysbio/syab054] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/19/2021] [Revised: 06/23/2021] [Accepted: 06/29/2021] [Indexed: 11/13/2022] Open
Abstract
Evolutionary models account for either population- or species-level processes but usually not both. We introduce a new model, the FBD-MSC, which makes it possible for the first time to integrate both the genealogical and fossilization phenomena, by means of the multispecies coalescent (MSC) and the fossilized birth–death (FBD) processes. Using this model, we reconstruct the phylogeny representing all extant and many fossil Caninae, recovering both the relative and absolute time of speciation events. We quantify known inaccuracy issues with divergence time estimates using the popular strategy of concatenating molecular alignments and show that the FBD-MSC solves them. Our new integrative method and empirical results advance the paradigm and practice of probabilistic total evidence analyses in evolutionary biology. [Caninae; fossilized birth–death; molecular clock; multispecies coalescent; phylogenetics; species trees.]
Affiliation(s)
- Huw A Ogilvie
- Department of Computer Science, Rice University, Houston TX, 77005, USA
| | - Fábio K Mendes
- Centre for Computational Evolution, The University of Auckland, Auckland, 1010, New Zealand.,School of Biological Sciences, The University of Auckland, Auckland, 1010, New Zealand
| | - Timothy G Vaughan
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, 4058, Switzerland.,SIB Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
| | - Nicholas J Matzke
- Centre for Computational Evolution, The University of Auckland, Auckland, 1010, New Zealand.,School of Biological Sciences, The University of Auckland, Auckland, 1010, New Zealand
| | - Tanja Stadler
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, 4058, Switzerland.,SIB Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
| | - David Welch
- Centre for Computational Evolution, The University of Auckland, Auckland, 1010, New Zealand.,School of Computer Science, The University of Auckland, Auckland, 1010, New Zealand
| | - Alexei J Drummond
- Centre for Computational Evolution, The University of Auckland, Auckland, 1010, New Zealand.,School of Computer Science, The University of Auckland, Auckland, 1010, New Zealand.,School of Biological Sciences, The University of Auckland, Auckland, 1010, New Zealand
| |
|
14
|
Adams RH, Blackmon H, DeGiorgio M. Of Traits and Trees: Probabilistic Distances under Continuous Trait Models for Dissecting the Interplay among Phylogeny, Model, and Data. Syst Biol 2021; 70:660-680. [PMID: 33587145 PMCID: PMC8208806 DOI: 10.1093/sysbio/syab009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2019] [Accepted: 02/01/2021] [Indexed: 12/03/2022] Open
Abstract
Stochastic models of character trait evolution have become a cornerstone of evolutionary biology in an array of contexts. While probabilistic models have been used extensively for statistical inference, they have largely been ignored for the purpose of measuring distances between phylogeny-aware models. Recent contributions to the problem of phylogenetic distance computation have highlighted the importance of explicitly considering evolutionary model parameters and their impacts on molecular sequence data when quantifying dissimilarity between trees. By comparing two phylogenies in terms of their induced probability distributions that are functions of many model parameters, these distances can be more informative than traditional approaches that rely strictly on differences in topology or branch lengths alone. Currently, however, these approaches are designed for comparing models of nucleotide substitution and gene tree distributions, and thus, are unable to address other classes of traits and associated models that may be of interest to evolutionary biologists. Here, we expand the principles of probabilistic phylogenetic distances to compute tree distances under models of continuous trait evolution along a phylogeny. By explicitly considering both the degree of relatedness among species and the evolutionary processes that collectively give rise to character traits, these distances provide a foundation for comparing models and their predictions, and for quantifying the impacts of assuming one phylogenetic background over another while studying the evolution of a particular trait. We demonstrate the properties of these approaches using theory, simulations, and several empirical data sets that highlight potential uses of probabilistic distances in many scenarios. 
We also introduce an open-source R package named PRDATR for easy application by the scientific community for computing phylogenetic distances under models of character trait evolution. [Brownian motion; comparative methods; phylogeny; quantitative traits.]
Affiliation(s)
- Richard H Adams
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| | - Heath Blackmon
- Department of Biology, Texas A&M University, College Station, TX 77843, USA
| | - Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| |
|
15
|
Introgression is widespread in the radiation of carnivorous Nepenthes pitcher plants. Mol Phylogenet Evol 2021; 163:107214. [PMID: 34052438 DOI: 10.1016/j.ympev.2021.107214] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/24/2021] [Revised: 05/14/2021] [Accepted: 05/25/2021] [Indexed: 11/23/2022]
Abstract
Introgression and hybridization are important processes in plant evolution, but they are difficult to study from a phylogenetic perspective, because they conflict with the bifurcating evolutionary history typically depicted in phylogenetic models. The role of hybridization in plant evolution is best documented in the form of allo-polyploidizations. In contrast, homoploid hybridization and introgression are less explored, although they may be crucial in adaptive radiations. Here we employ genome-wide data (ddRAD-seq, transcriptomes) to investigate the evolutionary history of Nepenthes, a radiation of c. 160 species of iconic carnivorous plants mainly from tropical Asia. Our data indicate that the main radiation is only c. 5 million years old, and confirm previous bifurcating phylogenies. However, due to a greatly expanded number of loci, we were able to test for the first time the long-standing hypotheses of introgression and historical hybridization. The genus presents one very clear case of organellar capture between two distantly related but sympatric groups. Furthermore, all Nepenthes species show introgression signals in their nuclear genomes, as uncovered by a general survey of ABBA-BABA-like statistics. The ancestor of the rapid main radiation shows ancestry from two deeply diverged lineages, as indicated by phylogenetic network analyses. All major clades of the main radiation show further introgression both within and between each other, as suggested by admixture graphs. Our study supports the hypothesis that rapid adaptive radiations are hotspots of introgression in the tree of life, and highlights the need to consider non-treelike processes in evolutionary studies of Nepenthes in particular.
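For readers unfamiliar with the ABBA-BABA family of statistics mentioned in this abstract, the basic form is Patterson's D, computed from counts of the two discordant site patterns as D = (ABBA − BABA) / (ABBA + BABA). The sketch below is a minimal illustration of that formula only; the function name `patterson_d` and the counts are made up for the example and are not taken from the study, which reports "ABBA-BABA-like" statistics without specifying a variant here.

```python
def patterson_d(abba: int, baba: int) -> float:
    """Patterson's D from counts of ABBA and BABA site patterns.

    Values near 0 are expected when discordance comes from incomplete
    lineage sorting alone; a consistent excess of one pattern is
    commonly read as a signal of introgression.
    """
    total = abba + baba
    if total == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return (abba - baba) / total

# Hypothetical site-pattern counts, for illustration only.
counts = {"ABBA": 620, "BABA": 480}
d = patterson_d(counts["ABBA"], counts["BABA"])
print(round(d, 3))
```

In practice the significance of D is usually assessed with a block jackknife or bootstrap across the genome rather than from the point estimate alone.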
|
16
|
Abstract
Evolutionary biologists have long been fascinated with the episodes of rapid phenotypic innovation that underlie the emergence of major lineages. Although our understanding of the environmental and ecological contexts of such episodes has steadily increased, it has remained unclear how population processes contribute to emergent macroevolutionary patterns. One insight gleaned from phylogenomics is that gene-tree conflict, frequently caused by population-level processes, is often rampant during the origin of major lineages. With the understanding that phylogenomic conflict is often driven by complex population processes, we hypothesized that there may be a direct correspondence between instances of high conflict and elevated rates of phenotypic innovation if both patterns result from the same processes. We evaluated this hypothesis in six clades spanning vertebrates and plants. We found that the most conflict-rich regions of these six clades also tended to experience the highest rates of phenotypic innovation, suggesting that population processes shaping both phenotypic and genomic evolution may leave signatures at deep timescales. Closer examination of the biological significance of phylogenomic conflict may yield improved connections between micro- and macroevolution and increase our understanding of the processes that shape the origin of major lineages across the Tree of Life.
|
17
|
Porto DS, Almeida EAB, Pennell MW. Investigating Morphological Complexes Using Informational Dissonance and Bayes Factors: A Case Study in Corbiculate Bees. Syst Biol 2021; 70:295-306. [PMID: 32722788 PMCID: PMC7882150 DOI: 10.1093/sysbio/syaa059] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/22/2019] [Revised: 07/16/2020] [Accepted: 07/17/2020] [Indexed: 11/22/2022] Open
Abstract
It is widely recognized that different regions of a genome often have different evolutionary histories and that ignoring this variation when estimating phylogenies can be misleading. However, the extent to which this is also true for morphological data is still largely unknown. Discordance among morphological traits might plausibly arise due to either variable convergent selection pressures or else phenomena such as hemiplasy. Here, we investigate patterns of discordance among 282 morphological characters, which we scored for 50 bee species particularly targeting corbiculate bees, a group that includes the well-known eusocial honeybees and bumblebees. As a starting point for selecting the most meaningful partitions in the data, we grouped characters as morphological modules, highly integrated trait complexes that as a result of developmental constraints or coordinated selection we expect to share an evolutionary history and trajectory. In order to assess conflict and coherence across and within these morphological modules, we used recently developed approaches for computing Bayesian phylogenetic information allied with model comparisons using Bayes factors. We found that despite considerable conflict among morphological complexes, accounting for among-character and among-partition rate variation with individual gamma distributions, rate multipliers, and linked branch lengths can lead to coherent phylogenetic inference using morphological data. We suggest that evaluating information content and dissonance among partitions is a useful step in estimating phylogenies from morphological data, just as it is with molecular data. Furthermore, we argue that adopting emerging approaches for investigating dissonance in genomic datasets may provide new insights into the integration and evolution of anatomical complexes. [Apidae; entropy; morphological modules; phenotypic integration; phylogenetic information.].
Affiliation(s)
- Diego S Porto
- Laboratório de Biologia Comparada e Abelhas (LBCA), Departamento de Biologia, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto (FFCLRP), Universidade de São Paulo, 14040-901 Ribeirão Preto, SP, Brazil
- Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver BC V6T 1Z4, Canada
- Department of Biological Sciences, Virginia Polytechnic Institute and State University, 926 West Campus Drive, Blacksburg, VA 24061 USA
| | - Eduardo A B Almeida
- Laboratório de Biologia Comparada e Abelhas (LBCA), Departamento de Biologia, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto (FFCLRP), Universidade de São Paulo, 14040-901 Ribeirão Preto, SP, Brazil
| | - Matthew W Pennell
- Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver BC V6T 1Z4, Canada
| |
|
18
|
Duchen P, Alfaro ML, Rolland J, Salamin N, Silvestro D. On the Effect of Asymmetrical Trait Inheritance on Models of Trait Evolution. Syst Biol 2021; 70:376-388. [PMID: 32681798 PMCID: PMC7875446 DOI: 10.1093/sysbio/syaa055] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/18/2019] [Revised: 06/30/2020] [Accepted: 07/09/2020] [Indexed: 11/25/2022] Open
Abstract
Current phylogenetic comparative methods modeling quantitative trait evolution generally assume that, during speciation, phenotypes are inherited identically between the two daughter species. This, however, neglects the fact that species consist of a set of individuals, each bearing its own trait value. Indeed, because descendent populations after speciation are samples of a parent population, we can expect their mean phenotypes to randomly differ from one another potentially generating a "jump" of mean phenotypes due to asymmetrical trait inheritance at cladogenesis. Here, we aim to clarify the effect of asymmetrical trait inheritance at speciation on macroevolutionary analyses, focusing on model testing and parameter estimation using some of the most common models of quantitative trait evolution. We developed an individual-based simulation framework in which the evolution of phenotypes is determined by trait changes at the individual level accumulating across generations, and cladogenesis occurs then by separation of subsets of the individuals into new lineages. Through simulations, we assess the magnitude of phenotypic jumps at cladogenesis under different modes of trait inheritance at speciation. We show that even small jumps can strongly alter both the results of model selection and parameter estimations, potentially affecting the biological interpretation of the estimated mode of evolution of a trait. Our results call for caution when interpreting analyses of trait evolution, while highlighting the importance of testing a wide range of alternative models. In the light of our findings, we propose that future methodological advances in comparative methods should more explicitly model the intraspecific variability around species mean phenotypes and how it is inherited at speciation.
Affiliation(s)
- Pablo Duchen
- Department of Computational Biology, University of Lausanne, Quartier Sorge, 1015 Lausanne, Switzerland
| | - Michael L Alfaro
- University of California Los Angeles (UCLA). College Life Sciences - Ecology and Evolutionary Biology. Los Angeles, CA, USA
| | - Jonathan Rolland
- Department of Computational Biology, University of Lausanne, Quartier Sorge, 1015 Lausanne, Switzerland
- Department of Zoology, University of British Columbia, #4200-6270 University Blvd, Vancouver, BC, Canada
| | - Nicolas Salamin
- Department of Computational Biology, University of Lausanne, Quartier Sorge, 1015 Lausanne, Switzerland
| | - Daniele Silvestro
- Department of Biology, University of Fribourg, 1700 Fribourg, Switzerland
- Nicolas Salamin and Daniele Silvestro contributed equally to this article
| |
|
19
|
Hibbins MS, Gibson MJS, Hahn MW. Determining the probability of hemiplasy in the presence of incomplete lineage sorting and introgression. eLife 2020; 9:e63753. [PMID: 33345772 PMCID: PMC7800383 DOI: 10.7554/elife.63753] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 10/06/2020] [Accepted: 12/18/2020] [Indexed: 12/11/2022] Open
Abstract
The incongruence of character states with phylogenetic relationships is often interpreted as evidence of convergent evolution. However, trait evolution along discordant gene trees can also generate these incongruences - a phenomenon known as hemiplasy. Classic comparative methods do not account for discordance, resulting in incorrect inferences about the number, timing, and direction of trait transitions. Biological sources of discordance include incomplete lineage sorting (ILS) and introgression, but only ILS has received theoretical consideration in the context of hemiplasy. Here, we present a model that shows introgression makes hemiplasy more likely, such that methods that account for ILS alone will be conservative. We also present a method and software (HeIST) for making statistical inferences about the probability of hemiplasy and homoplasy in large datasets that contain both ILS and introgression. We apply our methods to two empirical datasets, finding that hemiplasy is likely to contribute to the observed trait incongruences in both.
Affiliation(s)
- Mark S Hibbins
- Department of Biology, Indiana University, Bloomington, United States
| | | | - Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, United States
- Department of Computer Science, Indiana University, Bloomington, United States
| |
|
20
|
Cope AL, O'Meara BC, Gilchrist MA. Gene expression of functionally-related genes coevolves across fungal species: detecting coevolution of gene expression using phylogenetic comparative methods. BMC Genomics 2020; 21:370. [PMID: 32434474 PMCID: PMC7240986 DOI: 10.1186/s12864-020-6761-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 01/09/2020] [Accepted: 04/29/2020] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND Researchers often measure changes in gene expression across conditions to better understand the shared functional roles and regulatory mechanisms of different genes. Analogous to this is comparing gene expression across species, which can improve our understanding of the evolutionary processes shaping the evolution of both individual genes and functional pathways. One area of interest is determining genes showing signals of coevolution, which can also indicate potential functional similarity, analogous to co-expression analysis often performed across conditions for a single species. However, as with any trait, comparing gene expression across species can be confounded by the non-independence of species due to shared ancestry, making standard hypothesis testing inappropriate. RESULTS We compared RNA-Seq data across 18 fungal species using a multivariate Brownian Motion phylogenetic comparative method (PCM), which allowed us to quantify coevolution between protein pairs while directly accounting for the shared ancestry of the species. Our work indicates that proteins which physically interact show stronger signals of coevolution than randomly-generated pairs. Interactions with stronger empirical and computational evidence also show stronger signals of coevolution. We examined the effects of number of protein interactions and gene expression levels on coevolution, finding both factors are overall poor predictors of the strength of coevolution between a protein pair. Simulations further demonstrate the potential issues of analyzing gene expression coevolution without accounting for shared ancestry in a standard hypothesis testing framework. Furthermore, our simulations indicate that the use of a randomly-generated null distribution as a means of determining statistical significance for detecting coevolving genes with phylogenetically-uncorrected correlations, as has previously been done, is less accurate than PCMs, although it is a significant improvement over standard hypothesis testing. These methods are further improved by using a phylogenetically-corrected correlation metric. CONCLUSIONS Our work highlights potential benefits of using PCMs to detect gene expression coevolution from high-throughput omics scale data. This framework can be built upon to investigate other evolutionary hypotheses, such as changes in transcription regulatory mechanisms across species.
Affiliation(s)
- Alexander L Cope
- Genome Science and Technology, University of Tennessee, Knoxville, Tennessee, USA
- Chemical Sciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
| | - Brian C O'Meara
- Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, Tennessee, USA
- National Institute of Mathematical and Biological Synthesis, University of Tennessee, Knoxville, Tennessee, USA
| | - Michael A Gilchrist
- Genome Science and Technology, University of Tennessee, Knoxville, Tennessee, USA
- Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, Tennessee, USA
- National Institute of Mathematical and Biological Synthesis, University of Tennessee, Knoxville, Tennessee, USA
| |
|
21
|
Züst T, Strickler SR, Powell AF, Mabry ME, An H, Mirzaei M, York T, Holland CK, Kumar P, Erb M, Petschenka G, Gómez JM, Perfectti F, Müller C, Pires JC, Mueller LA, Jander G. Independent evolution of ancestral and novel defenses in a genus of toxic plants (Erysimum, Brassicaceae). eLife 2020; 9:e51712. [PMID: 32252891 PMCID: PMC7180059 DOI: 10.7554/elife.51712] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 09/08/2019] [Accepted: 03/24/2020] [Indexed: 11/13/2022] Open
Abstract
Phytochemical diversity is thought to result from coevolutionary cycles as specialization in herbivores imposes diversifying selection on plant chemical defenses. Plants in the speciose genus Erysimum (Brassicaceae) produce both ancestral glucosinolates and evolutionarily novel cardenolides as defenses. Here we test macroevolutionary hypotheses on co-expression, co-regulation, and diversification of these potentially redundant defenses across this genus. We sequenced and assembled the genome of E. cheiranthoides and foliar transcriptomes of 47 additional Erysimum species to construct a phylogeny from 9868 orthologous genes, revealing several geographic clades but also high levels of gene discordance. Concentrations, inducibility, and diversity of the two defenses varied independently among species, with no evidence for trade-offs. Closely related, geographically co-occurring species shared similar cardenolide traits, but not glucosinolate traits, likely as a result of specific selective pressures acting on each defense. Ancestral and novel chemical defenses in Erysimum thus appear to provide complementary rather than redundant functions.
Affiliation(s)
- Tobias Züst
- Institute of Plant Sciences, University of Bern, Bern, Switzerland
| | | | | | - Makenzie E Mabry
- Division of Biological Sciences, University of Missouri, Columbia, United States
| | - Hong An
- Division of Biological Sciences, University of Missouri, Columbia, United States
| | | | | | | | | | - Matthias Erb
- Institute of Plant Sciences, University of Bern, Bern, Switzerland
| | - Georg Petschenka
- Institut für Insektenbiotechnologie, Justus-Liebig-Universität Giessen, Giessen, Germany
| | - José-María Gómez
- Department of Functional and Evolutionary Ecology, Estación Experimental de Zonas Áridas (EEZA-CSIC), Almería, Spain
| | - Francisco Perfectti
- Research Unit Modeling Nature, Department of Genetics, University of Granada, Granada, Spain
| | - Caroline Müller
- Department of Chemical Ecology, Bielefeld University, Bielefeld, Germany
| | - J Chris Pires
- Division of Biological Sciences, University of Missouri, Columbia, United States
| | | | | |
|
22
|
Lamichhaney S, Card DC, Grayson P, Tonini JFR, Bravo GA, Näpflin K, Termignoni-Garcia F, Torres C, Burbrink F, Clarke JA, Sackton TB, Edwards SV. Integrating natural history collections and comparative genomics to study the genetic architecture of convergent evolution. Philos Trans R Soc Lond B Biol Sci 2019; 374:20180248. [PMID: 31154982 PMCID: PMC6560268 DOI: 10.1098/rstb.2018.0248] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Accepted: 02/25/2019] [Indexed: 12/20/2022] Open
Abstract
Evolutionary convergence has been long considered primary evidence of adaptation driven by natural selection and provides opportunities to explore evolutionary repeatability and predictability. In recent years, there has been increased interest in exploring the genetic mechanisms underlying convergent evolution, in part, owing to the advent of genomic techniques. However, the current 'genomics gold rush' in studies of convergence has overshadowed the reality that most trait classifications are quite broadly defined, resulting in incomplete or potentially biased interpretations of results. Genomic studies of convergence would be greatly improved by integrating deep 'vertical', natural history knowledge with 'horizontal' knowledge focusing on the breadth of taxonomic diversity. Natural history collections have and continue to be best positioned for increasing our comprehensive understanding of phenotypic diversity, with modern practices of digitization and databasing of morphological traits providing exciting improvements in our ability to evaluate the degree of morphological convergence. Combining more detailed phenotypic data with the well-established field of genomics will enable scientists to make progress on an important goal in biology: to understand the degree to which genetic or molecular convergence is associated with phenotypic convergence. Although the fields of comparative biology or comparative genomics alone can separately reveal important insights into convergent evolution, here we suggest that the synergistic and complementary roles of natural history collection-derived phenomic data and comparative genomics methods can be particularly powerful in together elucidating the genomic basis of convergent evolution among higher taxa. This article is part of the theme issue 'Convergent evolution in the genomics era: new insights and directions'.
Affiliation(s)
- Sangeet Lamichhaney
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
| | - Daren C. Card
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
- Department of Biology, University of Texas Arlington, Arlington, TX 76019, USA
| | - Phil Grayson
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
| | - João F. R. Tonini
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
| | - Gustavo A. Bravo
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
| | - Kathrin Näpflin
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
| | - Flavia Termignoni-Garcia
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
| | - Christopher Torres
- Department of Biology, The University of Texas at Austin, Austin, TX 78712, USA
- Department of Geological Sciences, The University of Texas at Austin, Austin, TX 78712, USA
| | - Frank Burbrink
- Department of Herpetology, The American Museum of Natural History, New York, NY 10024, USA
| | - Julia A. Clarke
- Department of Biology, The University of Texas at Austin, Austin, TX 78712, USA
- Department of Geological Sciences, The University of Texas at Austin, Austin, TX 78712, USA
| | | | - Scott V. Edwards
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, USA
| |
|
23
|
Lee KM, Coop G. Population genomics perspectives on convergent adaptation. Philos Trans R Soc Lond B Biol Sci 2019; 374:20180236. [PMID: 31154979 PMCID: PMC6560269 DOI: 10.1098/rstb.2018.0236] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Accepted: 12/21/2018] [Indexed: 01/12/2023] Open
Abstract
Convergent adaptation is the independent evolution of similar traits conferring a fitness advantage in two or more lineages. Cases of convergent adaptation inform our ideas about the ecological and molecular basis of adaptation. In judging the degree to which putative cases of convergent adaptation provide an independent replication of the process of adaptation, it is necessary to establish the degree to which the evolutionary change is unexpected under null models and to show that selection has repeatedly, independently driven these changes. Here, we discuss the issues that arise from these questions particularly for closely related populations, where gene flow and standing variation add additional layers of complexity. We outline a conceptual framework to guide intuition as to the extent to which evolutionary change represents the independent gain of information owing to selection and show that this is a measure of how surprised we should be by convergence. Additionally, we summarize the ways population and quantitative genetics and genomics may help us address questions related to convergent adaptation, as well as open new questions and avenues of research. This article is part of the theme issue 'Convergent evolution in the genomics era: new insights and directions'.
Affiliation(s)
- Kristin M. Lee
- Center for Population Biology, University of California, Davis, CA 95616, USA
- Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
| | - Graham Coop
- Center for Population Biology, University of California, Davis, CA 95616, USA
- Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
| |
|
24
|
Mendes FK, Livera AP, Hahn MW. The perils of intralocus recombination for inferences of molecular convergence. Philos Trans R Soc Lond B Biol Sci 2019; 374:20180244. [PMID: 31154973 DOI: 10.1098/rstb.2018.0244] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 01/10/2023] Open
Abstract
Accurate inferences of convergence require that the appropriate tree topology be used. If there is a mismatch between the tree a trait has evolved along and the tree used for analysis, then false inferences of convergence ('hemiplasy') can occur. To avoid problems of hemiplasy when there are high levels of gene tree discordance with the species tree, researchers have begun to construct tree topologies from individual loci. However, due to intralocus recombination, even locus-specific trees may contain multiple topologies within them. This implies that the use of individual tree topologies discordant with the species tree can still lead to incorrect inferences about molecular convergence. Here, we examine the frequency with which single exons and single protein-coding genes contain multiple underlying tree topologies, in primates and Drosophila, and quantify the effects of hemiplasy when using trees inferred from individual loci. In both clades, we find that there are most often multiple diagnosable topologies within single exons and whole genes, with 91% of Drosophila protein-coding genes containing multiple topologies. Because of this underlying topological heterogeneity, even using trees inferred from individual protein-coding genes results in 25% and 38% of substitutions falsely labelled as convergent in primates and Drosophila, respectively. While constructing local trees can reduce the problem of hemiplasy, our results suggest that it will be difficult to completely avoid false inferences of convergence. We conclude by suggesting several ways forward in the analysis of convergent evolution, for both molecular and morphological characters. This article is part of the theme issue 'Convergent evolution in the genomics era: new insights and directions'.
Affiliation(s)
- Fábio K Mendes
- 1 Department of Computer Science, The University of Auckland, Auckland 1010, New Zealand; 2 Department of Biology, Indiana University, Bloomington, IN 47405, USA
| | - Andrew P Livera
- 2 Department of Biology, Indiana University, Bloomington, IN 47405, USA
| | - Matthew W Hahn
- 2 Department of Biology, Indiana University, Bloomington, IN 47405, USA; 3 Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
| |
|
25
|
Koch EM. The Effects of Demography and Genetics on the Neutral Distribution of Quantitative Traits. Genetics 2019; 211:1371-1394. [PMID: 30782599 PMCID: PMC6456309 DOI: 10.1534/genetics.118.301839] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/30/2018] [Accepted: 02/15/2019] [Indexed: 11/18/2022] Open
Abstract
Neutral models for quantitative trait evolution are useful for identifying phenotypes under selection. These models often assume normally distributed phenotypes. This assumption may be violated when a trait is affected by relatively few variants or when the effects of those variants arise from skewed or heavy tailed distributions. Molecular phenotypes such as gene expression levels may have these properties. To accommodate deviations from normality, models making fewer assumptions about the underlying genetics and patterns of variation are needed. Here, we develop a general neutral model for quantitative trait variation using a coalescent approach. This model allows interpretation of trait distributions in terms of familiar population genetic parameters because it is based on the coalescent. We show how the normal distribution resulting from the infinitesimal limit, where the number of loci grows large as the effect size per mutation becomes small, depends only on expected pairwise coalescent times. We then demonstrate how deviations from normality depend on demography through the distribution of coalescence times as well as through genetic parameters. In particular, population growth events exacerbate deviations while bottlenecks reduce them. We demonstrate the practical applications of this model by showing how to sample from the neutral distribution of [Formula: see text], the ratio of the variance between subpopulations to that in the overall population. We further show it is likely impossible to distinguish sparsity from skewed or heavy tailed mutational effects using only sampled trait values. The model analyzed here greatly expands the parameter space for neutral trait models.
Affiliation(s)
- Evan M Koch
- Department of Ecology and Evolution, University of Chicago, Illinois 60637
| |
|
26
|
Bravo GA, Antonelli A, Bacon CD, Bartoszek K, Blom MPK, Huynh S, Jones G, Knowles LL, Lamichhaney S, Marcussen T, Morlon H, Nakhleh LK, Oxelman B, Pfeil B, Schliep A, Wahlberg N, Werneck FP, Wiedenhoeft J, Willows-Munro S, Edwards SV. Embracing heterogeneity: coalescing the Tree of Life and the future of phylogenomics. PeerJ 2019; 7:e6399. [PMID: 30783571 PMCID: PMC6378093 DOI: 10.7717/peerj.6399] [Citation(s) in RCA: 67] [Impact Index Per Article: 13.4] [Received: 02/05/2018] [Accepted: 01/07/2019] [Indexed: 12/23/2022] Open
Abstract
Building the Tree of Life (ToL) is a major challenge of modern biology, requiring advances in cyberinfrastructure, data collection, theory, and more. Here, we argue that phylogenomics stands to benefit by embracing the many heterogeneous genomic signals emerging from the first decade of large-scale phylogenetic analysis spawned by high-throughput sequencing (HTS). Such signals include those most commonly encountered in phylogenomic datasets, such as incomplete lineage sorting, but also those reticulate processes emerging with greater frequency, such as recombination and introgression. Here we focus specifically on how phylogenetic methods can accommodate the heterogeneity incurred by such population genetic processes; we do not discuss phylogenetic methods that ignore such processes, such as concatenation or supermatrix approaches or supertrees. We suggest that methods of data acquisition and the types of markers used in phylogenomics will remain restricted until a posteriori methods of marker choice are made possible with routine whole-genome sequencing of taxa of interest. We discuss limitations and potential extensions of a model supporting innovation in phylogenomics today, the multispecies coalescent model (MSC). Macroevolutionary models that use phylogenies, such as character mapping, often ignore the heterogeneity on which building phylogenies increasingly relies, and we suggest that assimilating such heterogeneity is an important goal moving forward. Finally, we argue that an integrative cyberinfrastructure linking all steps of the process of building the ToL, from specimen acquisition in the field to publication and tracking of phylogenomic data, as well as a culture that values contributors at each step, are essential for progress.
Affiliation(s)
- Gustavo A. Bravo
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
| | - Alexandre Antonelli
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
- Gothenburg Global Biodiversity Centre, Göteborg, Sweden
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
- Gothenburg Botanical Garden, Göteborg, Sweden
| | - Christine D. Bacon
- Gothenburg Global Biodiversity Centre, Göteborg, Sweden
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
| | - Krzysztof Bartoszek
- Department of Computer and Information Science, Linköping University, Linköping, Sweden
| | - Mozes P. K. Blom
- Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
| | - Stella Huynh
- Institut de Biologie, Université de Neuchâtel, Neuchâtel, Switzerland
| | - Graham Jones
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
| | - L. Lacey Knowles
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
| | - Sangeet Lamichhaney
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
| | - Thomas Marcussen
- Centre for Ecological and Evolutionary Synthesis, University of Oslo, Oslo, Norway
| | - Hélène Morlon
- Institut de Biologie, Ecole Normale Supérieure de Paris, Paris, France
| | - Luay K. Nakhleh
- Department of Computer Science, Rice University, Houston, TX, USA
| | - Bengt Oxelman
- Gothenburg Global Biodiversity Centre, Göteborg, Sweden
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
| | - Bernard Pfeil
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
| | - Alexander Schliep
- Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
| | | | - Fernanda P. Werneck
- Coordenação de Biodiversidade, Programa de Coleções Científicas Biológicas, Instituto Nacional de Pesquisa da Amazônia, Manaus, AM, Brazil
| | - John Wiedenhoeft
- Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
- Department of Computer Science, Rutgers University, Piscataway, NJ, USA
| | - Sandi Willows-Munro
- School of Life Sciences, University of Kwazulu-Natal, Pietermaritzburg, South Africa
| | - Scott V. Edwards
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
- Gothenburg Centre for Advanced Studies in Science and Technology, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
| |
|
27
|
Parins-Fukuchi C. Bayesian placement of fossils on phylogenies using quantitative morphometric data. Evolution 2018; 72:1801-1814. [PMID: 29998561 DOI: 10.1111/evo.13516] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 03/02/2018] [Accepted: 05/25/2018] [Indexed: 11/29/2022]
Abstract
Jointly developing a comprehensive tree of life from living and fossil taxa has long been a fundamental goal in evolutionary biology. One major challenge has stemmed from difficulties in merging evidence from extant and extinct organisms. While these efforts have resulted in varying stages of synthesis, they have been hindered by their dependence on qualitative descriptions of morphology. Though rarely applied to phylogenetic inference, traditional and geometric morphometric data can improve these issues by generating more rigorous ways to quantify variation in morphological structures. They may also facilitate the rapid and objective aggregation of large morphological datasets. I describe a new Bayesian method that leverages quantitative trait data to reconstruct the positions of fossil taxa on fixed reference trees composed of extant taxa. Unlike most formulations of phylogenetic Brownian motion models, this method expresses branch lengths in units of morphological disparity, suggesting a new framework through which to construct Bayesian node calibration priors for molecular dating and explore comparative patterns in morphological disparity. I am hopeful that the approach described here will help to facilitate a deeper integration of neo- and paleontological data to move morphological phylogenetics further into the genomic era.
Affiliation(s)
- Caroline Parins-Fukuchi
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan 48109
| |
|