1
Deep Learning and Likelihood Approaches for Viral Phylogeography Converge on the Same Answers Whether the Inference Model Is Right or Wrong. Syst Biol 2024; 73:183-206. [PMID: 38189575 DOI: 10.1093/sysbio/syad074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Revised: 11/22/2023] [Accepted: 01/05/2024] [Indexed: 01/09/2024] Open
Abstract
Analysis of phylogenetic trees has become an essential tool in epidemiology. Likelihood-based methods fit models to phylogenies to draw inferences about the phylodynamics and history of viral transmission. However, these methods are often computationally expensive, which limits the complexity and realism of phylodynamic models and makes them ill-suited for informing policy decisions in real time during rapidly developing outbreaks. Likelihood-free methods using deep learning are pushing the boundaries of inference beyond these constraints. In this paper, we extend, compare, and contrast a recently developed deep learning method for likelihood-free inference from trees. We trained multiple deep neural networks using phylogenies from simulated outbreaks that spread among 5 locations and found they achieve close to the same levels of accuracy as Bayesian inference under the true simulation model. We compared robustness to model misspecification of a trained neural network to that of a Bayesian method. We found that both models had comparable performance, converging on similar biases. We also implemented a method of uncertainty quantification called conformalized quantile regression, whose intervals we demonstrate have patterns of sensitivity to model misspecification similar to those of Bayesian highest posterior density (HPD) intervals and overlap greatly with HPDs, but have lower precision (are more conservative). Finally, we trained and tested a neural network against phylogeographic data from a recent study of the SARS-CoV-2 pandemic in Europe and obtained similar estimates of region-specific epidemiological parameters and the location of the common ancestor in Europe. Along with being as accurate and robust as likelihood-based methods, our trained neural networks are on average over 3 orders of magnitude faster after training. Our results support the notion that neural networks can be trained with simulated data to accurately mimic the good and bad statistical properties of the likelihood functions of generative phylogenetic models.
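The conformalized quantile regression step named in this abstract can be sketched as follows. This is a minimal illustration of the generic split-conformal recipe, not the authors' implementation. The function names, the toy numbers, and the assumption that a trained network already supplies lower and upper quantile predictions on a held-out calibration set are all hypothetical.

```python
import math

def cqr_adjustment(lower_preds, upper_preds, truths, alpha=0.05):
    """Return the conformal correction Q from a calibration set.

    lower_preds / upper_preds: predicted lower and upper quantiles
    (assumed to come from an already trained quantile regressor).
    truths: the true parameter values for the same calibration examples.
    """
    # Conformity score: how far each truth falls outside its predicted
    # interval (negative when the truth lies inside the interval).
    scores = sorted(
        max(lo - y, y - hi)
        for lo, hi, y in zip(lower_preds, upper_preds, truths)
    )
    n = len(scores)
    # Finite-sample rank used by split conformal prediction.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this coverage level
    return scores[k - 1]

def cqr_interval(lower_pred, upper_pred, q):
    """Widen (or shrink, if q < 0) a predicted interval by the correction q."""
    return lower_pred - q, upper_pred + q

# Toy calibration set (hypothetical numbers): three intervals [0, 10].
q = cqr_adjustment([0, 0, 0], [10, 10, 10], [5, 12, -1], alpha=0.5)
interval = cqr_interval(0, 10, q)
```

With these toy values the correction is 1, so a new predicted interval of [0, 10] is widened to [-1, 11]. The correction is the only quantity learned from the calibration set, which is why the adjusted intervals can be more conservative than the raw quantile predictions.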
2
A Diffusion-Based Approach for Simulating Forward-in-Time State-Dependent Speciation and Extinction Dynamics. ARXIV 2024:arXiv:2402.00246v1. [PMID: 38351931 PMCID: PMC10862938] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]
Abstract
We establish a general framework using a diffusion approximation to simulate forward-in-time state counts or frequencies for cladogenetic state-dependent speciation-extinction (ClaSSE) models. We apply the framework to various two- and three-region geographic-state speciation-extinction (GeoSSE) models. We show that the species range state dynamics simulated under tree-based and diffusion-based processes are comparable. We derive a method to infer rate parameters that are compatible with given observed stationary state frequencies and obtain an analytical result to compute stationary state frequencies for a given set of rate parameters. We also describe a procedure to find the time to reach the stationary frequencies of a ClaSSE model using our diffusion-based approach, which we demonstrate using a worked example for a two-region GeoSSE model. Finally, we discuss how the diffusion framework can be applied to formalize relationships between evolutionary patterns and processes under state-dependent diversification scenarios.
3
PhyloJunction: a computational framework for simulating, developing, and teaching evolutionary models. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.12.15.571907. [PMID: 38168278 PMCID: PMC10760140 DOI: 10.1101/2023.12.15.571907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
We introduce PhyloJunction, a computational framework designed to facilitate the prototyping, testing, and characterization of evolutionary models. PhyloJunction is distributed as an open-source Python library that can be used to implement a variety of models, through its flexible graphical modeling architecture and dedicated model specification language. Model design and use are exposed to users via command-line and graphical interfaces, which integrate the steps of simulating, summarizing, and visualizing data. This paper describes the features of PhyloJunction - which include, but are not limited to, a general implementation of a popular family of phylogenetic diversification models - and, moving forward, how it may be expanded to not only include new models, but to also become a platform for conducting and teaching statistical learning.
4
A global phylogeny of butterflies reveals their evolutionary history, ancestral hosts and biogeographic origins. Nat Ecol Evol 2023; 7:903-913. [PMID: 37188966 PMCID: PMC10250192 DOI: 10.1038/s41559-023-02041-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 05/08/2022] [Accepted: 03/16/2023] [Indexed: 05/17/2023]
Abstract
Butterflies are a diverse and charismatic insect group that are thought to have evolved with plants and dispersed throughout the world in response to key geological events. However, these hypotheses have not been extensively tested because a comprehensive phylogenetic framework and datasets for butterfly larval hosts and global distributions are lacking. We sequenced 391 genes from nearly 2,300 butterfly species, sampled from 90 countries and 28 specimen collections, to reconstruct a new phylogenomic tree of butterflies representing 92% of all genera. Our phylogeny has strong support for nearly all nodes and demonstrates that at least 36 butterfly tribes require reclassification. Divergence time analyses imply an origin ~100 million years ago for butterflies and indicate that all but one family were present before the K/Pg extinction event. We aggregated larval host datasets and global distribution records and found that butterflies are likely to have first fed on Fabaceae and originated in what is now the Americas. Soon after the Cretaceous Thermal Maximum, butterflies crossed Beringia and diversified in the Palaeotropics. Our results also reveal that most butterfly species are specialists that feed on only one larval host plant family. However, generalist butterflies that consume two or more plant families usually feed on closely related plants.
5
The build-up of the present-day tropical diversity of tetrapods. Proc Natl Acad Sci U S A 2023; 120:e2220672120. [PMID: 37159475 PMCID: PMC10194011 DOI: 10.1073/pnas.2220672120] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 12/08/2022] [Accepted: 04/04/2023] [Indexed: 05/11/2023] Open
Abstract
The extraordinary number of species in the tropics when compared to the extra-tropics is probably the most prominent and consistent pattern in biogeography, suggesting that overarching processes regulate this diversity gradient. A major challenge to characterizing which processes are at play relies on quantifying how the frequency and determinants of tropical and extra-tropical speciation, extinction, and dispersal events shaped evolutionary radiations. We address this question by developing and applying spatiotemporal phylogenetic and paleontological models of diversification for tetrapod species incorporating paleoenvironmental variation. Our phylogenetic model results show that area, energy, or species richness did not uniformly affect speciation rates across tetrapods and dispute expectations of a latitudinal gradient in speciation rates. Instead, both neontological and fossil evidence coincide in underscoring the role of extra-tropical extinctions and the outflow of tropical species in shaping biodiversity. These diversification dynamics accurately predict present-day levels of species richness across latitudes and uncover temporal idiosyncrasies but spatial generality across the major tetrapod radiations.
6
Genomics expands the mammalverse. Science 2023; 380:358-359. [PMID: 37104595 PMCID: PMC10876211 DOI: 10.1126/science.add2209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/29/2023]
Abstract
Diverse mammal genomes open a new portal to hidden aspects of evolutionary history.
7
Bayesian inference of admixture graphs on Native American and Arctic populations. PLoS Genet 2023; 19:e1010410. [PMID: 36780565 PMCID: PMC9956672 DOI: 10.1371/journal.pgen.1010410] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 09/06/2022] [Revised: 02/24/2023] [Accepted: 01/23/2023] [Indexed: 02/15/2023] Open
Abstract
Admixture graphs are mathematical structures that describe the ancestry of populations in terms of divergence and merging (admixing) of ancestral populations as a graph. An admixture graph consists of a graph topology, branch lengths, and admixture proportions. The branch lengths and admixture proportions can be estimated using numerous numerical optimization methods, but inferring the topology involves a combinatorial search for which no polynomial algorithm is known. In this paper, we present a reversible jump MCMC algorithm for sampling high-probability admixture graphs and show that this approach works well both as a heuristic search for a single best-fitting graph and for summarizing shared features extracted from posterior samples of graphs. We apply the method to 11 Native American and Siberian populations and exploit the shared structure of high-probability graphs to characterize the relationship between Saqqaq, Inuit, Koryaks, and Athabascans. Our analyses show that the Saqqaq is not a good proxy for the previously identified gene flow from Arctic people into the Na-Dene speaking Athabascans.
8
Publisher Correction: Replicated radiation of a plant clade along a cloud forest archipelago. Nat Ecol Evol 2022; 6:1398. [PMID: 35927318 DOI: 10.1038/s41559-022-01864-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
9
Replicated radiation of a plant clade along a cloud forest archipelago. Nat Ecol Evol 2022; 6:1318-1329. [DOI: 10.1038/s41559-022-01823-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2021] [Accepted: 06/08/2022] [Indexed: 11/09/2022]
10
RevGadgets: An R package for visualizing Bayesian phylogenetic analyses from RevBayes. Methods Ecol Evol 2021. [DOI: 10.1111/2041-210x.13750] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/27/2022]
11
Parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics. PeerJ 2021; 9:e12438. [PMID: 34760401 PMCID: PMC8570164 DOI: 10.7717/peerj.12438] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/27/2021] [Accepted: 10/15/2021] [Indexed: 11/30/2022] Open
Abstract
In Bayesian phylogenetic inference, marginal likelihoods can be estimated using several different methods, including the path-sampling or stepping-stone-sampling algorithms. Both algorithms are computationally demanding because they require a series of power posterior Markov chain Monte Carlo (MCMC) simulations. Here we introduce a general parallelization strategy that distributes the power posterior MCMC simulations and the likelihood computations over available CPUs. Our parallelization strategy can easily be applied to any statistical model despite our primary focus on molecular substitution models in this study. Using two phylogenetic example datasets, we demonstrate that the runtime of the marginal likelihood estimation can be reduced significantly even if only two CPUs are available (an average performance increase of 1.96x). The performance increase is nearly linear with the number of available CPUs. We record a performance increase of 13.3x for cluster nodes with 16 CPUs, representing a substantial reduction to the runtime of marginal likelihood estimations. Hence, our parallelization strategy enables the estimation of marginal likelihoods to complete in a feasible amount of time which previously needed days, weeks or even months. The methods described here are implemented in our open-source software RevBayes which is available from http://www.RevBayes.com.
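The stepping-stone estimator mentioned in this abstract can be sketched as follows. This is a minimal illustration of the generic estimator, not the RevBayes implementation: the function names are hypothetical, and the log-likelihood samples are assumed to have been drawn already by one MCMC run per power posterior. Because each run depends only on its own power, the runs are independent of one another, which is what makes it possible to distribute them over CPUs as the abstract describes.

```python
import math

def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))

def stepping_stone_log_ml(betas, loglik_samples):
    """Estimate the log marginal likelihood from power posterior samples.

    betas: increasing powers with betas[0] == 0.0 (prior) and
           betas[-1] == 1.0 (posterior).
    loglik_samples[k]: log-likelihoods of draws from the power posterior
           at betas[k]; only the first len(betas) - 1 entries are used.
    """
    total = 0.0
    for k in range(1, len(betas)):
        step = betas[k] - betas[k - 1]
        # Each stone estimates the ratio of normalizing constants between
        # adjacent powers, using samples from the lower power.
        total += log_mean_exp([step * ll for ll in loglik_samples[k - 1]])
    return total

# Sanity check with a constant log-likelihood (hypothetical toy input):
# the estimate must then equal that constant.
betas = [0.0, 0.25, 0.5, 1.0]
samples = [[-3.0] * 4 for _ in range(3)]
estimate = stepping_stone_log_ml(betas, samples)
```

In the toy check every draw has log-likelihood -3.0, so the three stones contribute -0.75, -0.75, and -1.5, and the estimate is -3.0.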
12
Phylogenetic reconstruction of ancestral ecological networks through time for pierid butterflies and their host plants. Ecol Lett 2021; 24:2134-2145. [PMID: 34297474 DOI: 10.1111/ele.13842] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/08/2021] [Accepted: 06/19/2021] [Indexed: 12/14/2022]
Abstract
The study of herbivorous insects underpins much of the theory that concerns the evolution of species interactions. In particular, Pieridae butterflies and their host plants have served as a model system for studying evolutionary arms races. To learn more about the coevolution of these two clades, we reconstructed ancestral ecological networks using stochastic mappings that were generated by a phylogenetic model of host-repertoire evolution. We then measured if, when, and how two ecologically important structural features of the ancestral networks (modularity and nestedness) evolved over time. Our study shows that as pierids gained new hosts and formed new modules, a subset of them retained or recolonised the ancestral host(s), preserving connectivity to the original modules. Together, host-range expansions and recolonisations promoted a phase transition in network structure. Our results demonstrate the power of combining network analysis with Bayesian inference of host-repertoire evolution to understand changes in complex species interactions over time.
13
Bayesian Inference of Ancestral Host-Parasite Interactions under a Phylogenetic Model of Host Repertoire Evolution. Syst Biol 2020; 69:1149-1162. [PMID: 32191324 PMCID: PMC7584141 DOI: 10.1093/sysbio/syaa019] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 06/19/2019] [Revised: 02/27/2020] [Accepted: 03/15/2020] [Indexed: 11/12/2022] Open
Abstract
Intimate ecological interactions, such as those between parasites and their hosts, may persist over long time spans, coupling the evolutionary histories of the lineages involved. Most methods that reconstruct the coevolutionary history of such interactions make the simplifying assumption that parasites have a single host. Many methods also focus on congruence between host and parasite phylogenies, using cospeciation as the null model. However, there is an increasing body of evidence suggesting that the host ranges of parasites are more complex: that host ranges often include more than one host and evolve via gains and losses of hosts rather than through cospeciation alone. Here, we develop a Bayesian approach for inferring coevolutionary history based on a model accommodating these complexities. Specifically, a parasite is assumed to have a host repertoire, which includes both potential hosts and one or more actual hosts. Over time, potential hosts can be added or lost, and potential hosts can develop into actual hosts or vice versa. Thus, host colonization is modeled as a two-step process that may potentially be influenced by host relatedness. We first explore the statistical behavior of our model by simulating evolution of host-parasite interactions under a range of parameter values. We then use our approach, implemented in the program RevBayes, to infer the coevolutionary history between 34 Nymphalini butterfly species and 25 angiosperm families. Our analysis suggests that host relatedness among angiosperm families influences how easily Nymphalini lineages gain new hosts. [Ancestral hosts; coevolution; herbivorous insects; probabilistic modeling.].
14
Joint Phylogenetic Estimation of Geographic Movements and Biome Shifts during the Global Diversification of Viburnum. Syst Biol 2020; 70:67-85. [PMID: 32267945 DOI: 10.1093/sysbio/syaa027] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 10/21/2019] [Revised: 03/19/2020] [Accepted: 03/30/2020] [Indexed: 11/14/2022] Open
Abstract
Phylogeny, molecular sequences, fossils, biogeography, and biome occupancy are all lines of evidence that reflect the singular evolutionary history of a clade, but they are most often studied separately, by first inferring a fossil-dated molecular phylogeny, then mapping on ancestral ranges and biomes inferred from extant species. Here we jointly model the evolution of biogeographic ranges, biome affinities, and molecular sequences, while incorporating fossils to estimate a dated phylogeny for all of the 163 extant species of the woody plant clade Viburnum (Adoxaceae) that we currently recognize in our ongoing worldwide monographic treatment of the group. Our analyses indicate that while the major Viburnum lineages evolved in the Eocene, the majority of extant species originated since the Miocene. Viburnum radiated first in Asia, in warm, broad-leaved evergreen (lucidophyllous) forests. Within Asia, we infer several early shifts into more tropical forests, and multiple shifts into forests that experience prolonged freezing. From Asia, we infer two early movements into the New World. These two lineages probably first occupied warm temperate forests and adapted later to spreading cold climates. One of these lineages (Porphyrotinus) occupied cloud forests and moved south through the mountains of the Neotropics. Several other movements into North America took place more recently, facilitated by prior adaptations to freezing in the Old World. We also infer four disjunctions between Asia and Europe: the Tinus lineage is the oldest and probably occupied warm forests when it spread, whereas the other three were more recent and in cold-adapted lineages. These results variously contradict published accounts, especially the view that Viburnum radiated initially in cold forests and, accordingly, maintained vessel elements with scalariform perforations. We explored how the location and biome assignments of fossils affected our inference of ancestral areas and biome states. 
Our results are sensitive to, but not entirely dependent upon, the inclusion of fossil biome data. It will be critical to take advantage of all available lines of evidence to decipher events in the distant past. The joint estimation approach developed here provides cautious hope even when fossil evidence is limited. [Biogeography; biome; combined evidence; fossil pollen; phylogeny; Viburnum.].
15
An Evolutionary Insertion in the Mxra8 Receptor-Binding Site Confers Resistance to Alphavirus Infection and Pathogenesis. Cell Host Microbe 2020; 27:428-440.e9. [PMID: 32075743 DOI: 10.1016/j.chom.2020.01.008] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 10/21/2019] [Revised: 12/11/2019] [Accepted: 01/14/2020] [Indexed: 01/08/2023]
Abstract
Alphaviruses are emerging, mosquito-transmitted RNA viruses with poorly understood cellular tropism and species selectivity. Mxra8 is a receptor for multiple alphaviruses including chikungunya virus (CHIKV). We discovered that while expression of mouse, rat, chimpanzee, dog, horse, goat, sheep, and human Mxra8 enables alphavirus infection in cell culture, cattle Mxra8 does not. Cattle Mxra8 encodes a 15-amino acid insertion in its ectodomain that prevents Mxra8 binding to CHIKV. Identical insertions are present in zebu, yak, and the extinct auroch. As other Bovinae lineages contain related Mxra8 sequences, this insertion likely occurred at least 5 million years ago. Removing the Mxra8 insertion in Bovinae enhances alphavirus binding and infection, while introducing the insertion into mouse Mxra8 blocks CHIKV binding, prevents infection by multiple alphaviruses in cells, and mitigates CHIKV-induced pathogenesis in mice. Our studies on how this insertion provides resistance to CHIKV infection could facilitate countermeasures that disrupt Mxra8 interactions with alphaviruses.
16
Interdependent Phenotypic and Biogeographic Evolution Driven by Biotic Interactions. Syst Biol 2019; 69:739-755. [DOI: 10.1093/sysbio/syz082] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 02/25/2019] [Revised: 12/06/2019] [Accepted: 12/10/2019] [Indexed: 11/13/2022] Open
Abstract
Biotic interactions are hypothesized to be one of the main processes shaping trait and biogeographic evolution during lineage diversification. Theoretical and empirical evidence suggests that species with similar ecological requirements either spatially exclude each other, by preventing the colonization of competitors or by driving coexisting populations to extinction, or show niche divergence when in sympatry. However, the extent and generality of the effect of interspecific competition in trait and biogeographic evolution has been limited by a dearth of appropriate process-generating models to directly test the effect of biotic interactions. Here, we formulate a phylogenetic parametric model that allows interdependence between trait and biogeographic evolution, thus enabling a direct test of central hypotheses on how biotic interactions shape these evolutionary processes. We adopt a Bayesian data augmentation approach to estimate the joint posterior distribution of trait histories, range histories, and coevolutionary process parameters under this analytically intractable model. Through simulations, we show that our model is capable of distinguishing alternative scenarios of biotic interactions. We apply our model to the radiation of Darwin’s finches—a classic example of adaptive divergence—and find limited support for in situ trait divergence in beak size, but stronger evidence for convergence in traits such as beak shape and tarsus length and for competitive exclusion throughout their evolutionary history. These findings are more consistent with presympatric, rather than postsympatric, niche divergence. Our modeling framework opens new possibilities for testing more complex hypotheses about the processes underlying lineage diversification. More generally, it provides a robust probabilistic methodology to model correlated evolution of continuous and discrete characters. 
[Bayesian; biotic interactions; competition; data augmentation; historical biogeography; trait evolution.]
17
Sterile marginal flowers increase visitation and fruit set in the hobblebush (Viburnum lantanoides, Adoxaceae) at multiple spatial scales. ANNALS OF BOTANY 2019; 123:381-390. [PMID: 29982369 PMCID: PMC6344212 DOI: 10.1093/aob/mcy117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2018] [Accepted: 06/05/2018] [Indexed: 05/13/2023]
Abstract
BACKGROUND AND AIMS: Enlarged sterile flowers on the periphery of inflorescences increase the attractiveness of floral displays, and previous studies have generally demonstrated that these have positive effects on insect visitation and/or reproductive success. However, experiments have not specifically been designed to examine the benefits of sterile flowers under conditions that reflect the early stages in their evolution, i.e. when plants that produce sterile flowers are at low frequency. METHODS: Over three years, three experiments were performed in natural populations of Viburnum lantanoides, which produces sterile marginal flowers (SMFs). The first experiment established that fruit production in V. lantanoides increases with the receipt of outcross pollen. The second tested the role of SMFs under extant conditions, comparing fruit production in two populations composed entirely of intact plants or entirely of plants with the SMFs removed. The third was designed to mimic the presumed context in which SMFs first evolved; here, SMFs were removed from all but a few plants in a population, and rates of insect visitation and fruit set were compared between plants with intact and denuded SMFs. KEY RESULTS: In comparing whole populations, the presence of SMFs nearly doubled fruit set. Under simulated 'ancestral' conditions within a population, plants with intact SMFs received double the insect visits and produced significantly more fruits than denuded plants. There was no significant effect of the number of inflorescences or fertile flowers on insect visitation or fruit set, indicating that the presence of SMFs accounted for these differences. CONCLUSIONS: The presence of SMFs significantly increased pollinator attraction and female reproductive success both in contemporary and simulated ancestral contexts, indicating that stabilizing selection is responsible for their maintenance, and directional selection likely drove their evolution when they first appeared. This study demonstrates a novel approach to incorporating historically relevant scenarios into experimental studies of floral evolution.
18
Retracing the Hawaiian silversword radiation despite phylogenetic, biogeographic, and paleogeographic uncertainty. Evolution 2018; 72:2343-2359. [PMID: 30198108 DOI: 10.1111/evo.13594] [Citation(s) in RCA: 54] [Impact Index Per Article: 9.0] [Received: 04/27/2018] [Accepted: 08/17/2018] [Indexed: 12/25/2022]
Abstract
The Hawaiian silversword alliance (Asteraceae) is an iconic adaptive radiation. However, like many island plant lineages, no fossils have been assigned to the clade. As a result, the clade's age and diversification rate are not known precisely, making it difficult to test biogeographic hypotheses about the radiation. In lieu of fossils, paleogeographically structured biogeographic processes may inform species divergence times; for example, an island must first exist for a clade to radiate upon it. We date the silversword clade and test biogeographic hypotheses about its radiation across the Hawaiian Archipelago by modeling interactions between species relationships, molecular evolution, biogeographic scenarios, divergence times, and island origination times using the Bayesian phylogenetic framework, RevBayes. The ancestor of living silverswords most likely colonized the modern Hawaiian Islands once from the mainland approximately 5.1 Ma, with the most recent common ancestor of extant silversword lineages first appearing approximately 3.5 Ma. Applying an event-based test of the progression rule of island biogeography, we found strong evidence that the dispersal process favors old-to-young directionality, but strong evidence for diversification continuing unabated into later phases of island ontogeny, particularly for Kaua'i. This work serves as a general example for how diversification studies benefit from incorporating biogeographic and paleogeographic components.
19
20
Biogeographic Dating of Speciation Times Using Paleogeographically Informed Processes. Syst Biol 2017; 66:128-144. [PMID: 27155009 PMCID: PMC5837510 DOI: 10.1093/sysbio/syw040] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 10/08/2015] [Accepted: 04/28/2016] [Indexed: 11/13/2022] Open
Abstract
Standard models of molecular evolution cannot estimate absolute speciation times alone, and require external calibrations to do so, such as fossils. Because fossil calibration methods rely on the incomplete fossil record, a great number of nodes in the tree of life cannot be dated precisely. However, many major paleogeographical events are dated, and since biogeographic processes depend on paleogeographical conditions, biogeographic dating may be used as an alternative or complementary method to fossil dating. I demonstrate how a time-stratified biogeographic stochastic process may be used to estimate absolute divergence times by conditioning on dated paleogeographical events. Informed by the current paleogeographical literature, I construct an empirical dispersal graph using 25 areas and 26 epochs for the past 540 Ma of Earth's history. Simulations indicate biogeographic dating performs well so long as paleogeography imposes constraint on biogeographic character evolution. To gauge whether biogeographic dating may be of practical use, I analyzed the well-studied turtle clade (Testudines) to assess how well biogeographic dating fares when compared to fossil-calibrated dating estimates reported in the literature. Fossil-free biogeographic dating estimated the age of the most recent common ancestor of extant turtles to be from the Late Triassic, which is consistent with fossil-based estimates. Dating precision improves further when including a root node fossil calibration. The described model, paleogeographical dispersal graph, and analysis scripts are available for use with RevBayes.
21
RevBayes: Bayesian Phylogenetic Inference Using Graphical Models and an Interactive Model-Specification Language. Syst Biol 2016; 65:726-36. [PMID: 27235697 PMCID: PMC4911942 DOI: 10.1093/sysbio/syw021] [Citation(s) in RCA: 293] [Impact Index Per Article: 36.6] [Received: 04/01/2015] [Accepted: 03/01/2015] [Indexed: 01/12/2023] Open
Abstract
Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com. [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.]
|
22
|
Abstract
Recent years have seen a rapid expansion of the model space explored in statistical phylogenetics, emphasizing the need for new approaches to statistical model representation and software development. Clear communication and representation of the chosen model is crucial for: (i) reproducibility of an analysis, (ii) model development, and (iii) software design. Moreover, a unified, clear and understandable framework for model representation lowers the barrier for beginners and nonspecialists to grasp complex phylogenetic models, including their assumptions and parameter/variable dependencies. Graphical modeling is a unifying framework that has gained in popularity in the statistical literature in recent years. The core idea is to break complex models into conditionally independent distributions. The strength lies in the comprehensibility, flexibility, and adaptability of this formalism, and the large body of computational work based on it. Graphical models are well-suited to teach statistical models, to facilitate communication among phylogeneticists and in the development of generic software for simulation and statistical inference. Here, we provide an introduction to graphical models for phylogeneticists and extend the standard graphical model representation to the realm of phylogenetics. We introduce a new graphical model component, tree plates, to capture the changing structure of the subgraph corresponding to a phylogenetic tree. We describe a range of phylogenetic models using the graphical model framework and introduce modules to simplify the representation of standard components in large and complex models. Phylogenetic model graphs can be readily used in simulation, maximum likelihood inference, and Bayesian inference using, for example, Metropolis–Hastings or Gibbs sampling of the posterior distribution. [Computation; graphical models; inference; modularization; statistical phylogenetics; tree plate.]
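The factorization idea described in this abstract can be illustrated with a toy example. The sketch below is not taken from the paper or from any phylogenetics package: it assumes a hypothetical two-level model (a rate with an Exponential(1) prior and observations drawn from Exponential(rate)), writes the joint density as a sum of per-node log densities, and samples the posterior with a plain random-walk Metropolis–Hastings step. All function and parameter names are illustrative.

```python
import math
import random


def log_joint(rate, data, prior_rate=1.0):
    """Log joint density, built as a sum over the nodes of the model graph."""
    if rate <= 0.0:
        return -math.inf  # outside the support of the rate parameter
    # Node 1: rate ~ Exponential(prior_rate)
    logp = math.log(prior_rate) - prior_rate * rate
    # Plate over observations: x_i ~ Exponential(rate), independent given rate
    for x in data:
        logp += math.log(rate) - rate * x
    return logp


def metropolis_hastings(data, n_iter, step, rng, init=1.0):
    """Random-walk Metropolis-Hastings on the single stochastic node `rate`."""
    rate = init
    logp = log_joint(rate, data)
    samples = []
    for _ in range(n_iter):
        proposal = rate + rng.gauss(0.0, step)  # symmetric proposal
        logp_new = log_joint(proposal, data)
        # Accept with probability min(1, p_new / p_old)
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            rate, logp = proposal, logp_new
        samples.append(rate)
    return samples


if __name__ == "__main__":
    observations = [0.5, 1.2, 0.3, 0.9, 0.7]  # made-up data for illustration
    draws = metropolis_hastings(observations, 20000, 0.5, random.Random(1))
    kept = draws[2000:]  # discard burn-in
    print("posterior mean of rate:", sum(kept) / len(kept))
```

For this conjugate toy model the exact posterior is Gamma(n + 1, 1 + sum of x), so the sampled mean can be checked against (n + 1) / (1 + sum of x). A phylogenetic model graph would add many more nodes, including the tree itself, but the joint density would still be assembled node by node in the same way.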
|
23
|
Abstract
Summary: Phylowood is a web service that uses JavaScript to generate in-browser animations of biogeographic and phylogeographic histories from annotated phylogenetic input. The animations are interactive, allowing the user to adjust spatial and temporal resolution, and highlight phylogenetic lineages of interest. Availability and implementation: All documentation and source code for Phylowood are freely available at https://github.com/mlandis/phylowood, and a live web application is available at https://mlandis.github.io/phylowood. Contact: mlandis@berkeley.edu
|
24
|
Abstract
Historical biogeography is increasingly studied from an explicitly statistical perspective, using stochastic models to describe the evolution of species range as a continuous-time Markov process of dispersal between and extinction within a set of discrete geographic areas. The main constraint of these methods is the computational limit on the number of areas that can be specified. We propose a Bayesian approach for inferring biogeographic history that extends the application of biogeographic models to the analysis of more realistic problems that involve a large number of areas. Our solution is based on a "data-augmentation" approach, in which we first populate the tree with a history of biogeographic events that is consistent with the observed species ranges at the tips of the tree. We then calculate the likelihood of a given history by adopting a mechanistic interpretation of the instantaneous-rate matrix, which specifies both the exponential waiting times between biogeographic events and the relative probabilities of each biogeographic change. We develop this approach in a Bayesian framework, marginalizing over all possible biogeographic histories using Markov chain Monte Carlo (MCMC). Besides dramatically increasing the number of areas that can be accommodated in a biogeographic analysis, our method allows the parameters of a given biogeographic model to be estimated and different biogeographic models to be objectively compared. Our approach is implemented in the program, BayArea.
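The "mechanistic interpretation" of the rate matrix mentioned in this abstract can be sketched for a generic continuous-time Markov chain. The code below is an illustration under simplifying assumptions, not the BayArea implementation: states are plain integer indices rather than species ranges, a single branch is considered, and the names `simulate_history` and `history_log_likelihood` are hypothetical. Each event contributes an exponential waiting-time density times the relative probability of the chosen change, which simplifies to q_ij * exp(-r_i * dt), where r_i = -q_ii is the total exit rate.

```python
import math
import random


def simulate_history(Q, start, duration, rng):
    """Simulate one history on a branch as a list of (time, new_state) events."""
    t, state, events = 0.0, start, []
    while True:
        exit_rate = -Q[state][state]
        if exit_rate <= 0.0:
            break  # absorbing state: no further events
        t += rng.expovariate(exit_rate)  # exponential waiting time
        if t >= duration:
            break
        targets = [j for j in range(len(Q)) if j != state]
        weights = [Q[state][j] for j in targets]  # relative event probabilities
        state = rng.choices(targets, weights=weights)[0]
        events.append((t, state))
    return events


def history_log_likelihood(Q, start, duration, events):
    """Log-likelihood of a fully specified history given rate matrix Q."""
    logL, t_prev, state = 0.0, 0.0, start
    for t, new_state in events:
        exit_rate = -Q[state][state]
        # waiting-time density times jump probability: q_ij * exp(-r_i * dt)
        logL += math.log(Q[state][new_state]) - exit_rate * (t - t_prev)
        t_prev, state = t, new_state
    # probability that no event occurs in the remaining time on the branch
    logL += Q[state][state] * (duration - t_prev)
    return logL


if __name__ == "__main__":
    Q = [[-1.0, 1.0], [2.0, -2.0]]  # made-up two-state rate matrix
    history = simulate_history(Q, 0, 5.0, random.Random(7))
    print(len(history), history_log_likelihood(Q, 0, 5.0, history))
```

In a data-augmentation scheme, an MCMC sampler would propose changes to such histories and to the rate parameters, and a likelihood of this form would be evaluated for each proposal instead of exponentiating a large rate matrix.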
|
25
|
Phylogenetic analysis using Lévy processes: finding jumps in the evolution of continuous traits. Syst Biol 2012; 62:193-204. [PMID: 23034385 DOI: 10.1093/sysbio/sys086] [Citation(s) in RCA: 93] [Impact Index Per Article: 7.8] [Indexed: 11/14/2022]
Abstract
Gaussian processes, a class of stochastic processes including Brownian motion and the Ornstein-Uhlenbeck process, are widely used to model continuous trait evolution in statistical phylogenetics. Under such processes, observations at the tips of a phylogenetic tree have a multivariate Gaussian distribution, which may lead to suboptimal model specification under certain evolutionary conditions, as supposed in models of punctuated equilibrium or adaptive radiation. To consider non-normally distributed continuous trait evolution, we introduce a method to compute posterior probabilities when modeling continuous trait evolution as a Lévy process. Through data simulation and model testing, we establish that single-rate Brownian motion (BM) and Lévy processes with jumps generate distinct patterns in comparative data. We then analyzed body mass and endocranial volume measurements for 126 primates. We rejected single-rate BM in favor of a Lévy process with jumps for each trait, with the lineage leading to the most recent common ancestor of great apes showing particularly strong evidence against single-rate BM.
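The contrast between single-rate BM and a process with jumps can be made concrete with a small simulation. The sketch below assumes one particular jump model, Brownian motion plus compound Poisson jumps with normally distributed sizes. The abstract does not specify which Lévy processes were used, and the function names and parameter values here are hypothetical. The simulated trait changes along a branch are compared by excess kurtosis, which is near zero for BM and positive when jumps are present.

```python
import math
import random


def jump_diffusion_change(t, sigma, jump_rate, jump_sd, rng):
    """Trait change over a branch of length t: BM plus compound Poisson jumps."""
    # Brownian motion component: Normal(0, sigma^2 * t)
    change = rng.gauss(0.0, sigma * math.sqrt(t))
    if jump_rate > 0.0:
        # Jumps arrive as a Poisson process; each jump size is Normal(0, jump_sd^2)
        elapsed = rng.expovariate(jump_rate)
        while elapsed < t:
            change += rng.gauss(0.0, jump_sd)
            elapsed += rng.expovariate(jump_rate)
    return change


def excess_kurtosis(xs):
    """Sample excess kurtosis (approximately 0 for Gaussian data)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2) - 3.0


if __name__ == "__main__":
    rng = random.Random(3)
    bm = [jump_diffusion_change(1.0, 1.0, 0.0, 1.5, rng) for _ in range(20000)]
    jd = [jump_diffusion_change(1.0, 1.0, 2.0, 1.5, rng) for _ in range(20000)]
    print("BM excess kurtosis:  ", excess_kurtosis(bm))
    print("jump excess kurtosis:", excess_kurtosis(jd))
```

Heavier tails of this kind are one way in which jump processes could leave a signature in comparative data that a single-rate Gaussian model cannot reproduce.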
|