1
Mendes FK, Bouckaert R, Carvalho LM, Drummond AJ. How to Validate a Bayesian Evolutionary Model. Syst Biol 2025; 74:158-175. [PMID: 39506375 PMCID: PMC11809579 DOI: 10.1093/sysbio/syae064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 10/21/2024] [Accepted: 11/03/2024] [Indexed: 11/08/2024] Open
Abstract
Biology has become a highly mathematical discipline in which probabilistic models play a central role. As a result, research in the biological sciences is now dependent on computational tools capable of carrying out complex analyses. These tools must be validated before they can be used, but what is understood as validation varies widely among methodological contributions. This may be a consequence of the still embryonic stage of the literature on statistical software validation for computational biology. Our manuscript aims to advance this literature. Here, we describe, illustrate, and introduce new good practices for assessing the correctness of a model implementation with an emphasis on Bayesian methods. We also introduce a suite of functionalities for automating validation protocols. It is our hope that the guidelines presented here help sharpen the focus of discussions on (as well as elevate) expected standards of statistical software for biology.
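One common way to check a Bayesian implementation is a coverage experiment: draw a parameter from the prior, simulate data, compute the posterior, and count how often the 95% credible interval contains the true value. The sketch below is an illustration only, not the authors' protocol or software. It assumes a toy normal model whose conjugate posterior is known in closed form, and the function name `coverage` and the `buggy` switch are hypothetical.

```python
import random
import statistics

Z_975 = 1.959963984540054  # 97.5% quantile of the standard normal


def coverage(n_reps=2000, n_obs=10, prior_mean=0.0, prior_sd=1.0,
             noise_sd=1.0, seed=1, buggy=False):
    """Fraction of replicates whose 95% credible interval contains the truth."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        # 1. Draw the "true" parameter from the prior.
        mu_true = rng.gauss(prior_mean, prior_sd)
        # 2. Simulate data given that parameter.
        data = [rng.gauss(mu_true, noise_sd) for _ in range(n_obs)]
        # 3. Compute the posterior (normal prior, normal likelihood, known noise).
        prior_prec = 1.0 / prior_sd ** 2
        data_prec = n_obs / noise_sd ** 2
        post_var = 1.0 / (prior_prec + data_prec)
        post_mean = post_var * (prior_prec * prior_mean
                                + data_prec * statistics.fmean(data))
        if buggy:
            # Deliberate error: posterior variance is understated fourfold.
            post_var /= 4.0
        # 4. Check whether the central 95% interval covers the truth.
        half_width = Z_975 * post_var ** 0.5
        if post_mean - half_width <= mu_true <= post_mean + half_width:
            hits += 1
    return hits / n_reps
```

With a correct posterior the returned fraction should sit near 0.95; the `buggy=True` variant narrows the intervals and should cover the truth noticeably less often, which is the signal such a test is meant to expose.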
Affiliation(s)
- Fábio K Mendes
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Remco Bouckaert
  - School of Computer Science, The University of Auckland, Auckland 1010, New Zealand
- Luiz M Carvalho
  - Escola de Matemática Aplicada, Fundação Getulio Vargas, Rio de Janeiro, RJ 22250-900, Brazil
- Alexei J Drummond
  - School of Biological Sciences, The University of Auckland, Auckland 1010, New Zealand
2
Bjornson S, Verbruggen H, Upham NS, Steenwyk JL. Reticulate evolution: Detection and utility in the phylogenomics era. Mol Phylogenet Evol 2024; 201:108197. [PMID: 39270765 DOI: 10.1016/j.ympev.2024.108197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2024] [Revised: 08/13/2024] [Accepted: 09/08/2024] [Indexed: 09/15/2024]
Abstract
Phylogenomics has enriched our understanding that the Tree of Life can have network-like or reticulate structures among some taxa and genes. Two non-vertical modes of evolution - hybridization/introgression and horizontal gene transfer - deviate from a strictly bifurcating tree model, causing non-treelike patterns. However, these reticulate processes can produce similar patterns to incomplete lineage sorting or recombination, potentially leading to ambiguity. Here, we present a brief overview of a phylogenomic workflow for inferring organismal histories and compare methods for distinguishing modes of reticulate evolution. We discuss how the timing of coalescent events can help disentangle introgression from incomplete lineage sorting and how horizontal gene transfer events can help determine the relative timing of speciation events. In doing so, we identify pitfalls of certain methods and discuss how to extend their utility across the Tree of Life. Workflows, methods, and future directions discussed herein underscore the need to embrace reticulate evolutionary patterns for understanding the timing and rates of evolutionary events, providing a clearer view of life's history.
Affiliation(s)
- Saelin Bjornson
  - School of BioSciences, University of Melbourne, Victoria, Australia
- Heroen Verbruggen
  - School of BioSciences, University of Melbourne, Victoria, Australia; CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Campus de Vairão, Universidade do Porto, 4485-661 Vairão, Portugal
- Nathan S Upham
  - School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Jacob L Steenwyk
  - Howard Hughes Medical Institute and the Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
3
Judge C, Vaughan T, Russell T, Abbott S, du Plessis L, Stadler T, Brady O, Hill S. EpiFusion: Joint inference of the effective reproduction number by integrating phylodynamic and epidemiological modelling with particle filtering. PLoS Comput Biol 2024; 20:e1012528. [PMID: 39527637 PMCID: PMC11581393 DOI: 10.1371/journal.pcbi.1012528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2024] [Revised: 11/21/2024] [Accepted: 10/01/2024] [Indexed: 11/16/2024] Open
Abstract
Accurately estimating the effective reproduction number (Rt) of a circulating pathogen is a fundamental challenge in the study of infectious disease. The fields of epidemiology and pathogen phylodynamics both share this goal, but to date, methodologies and data employed by each remain largely distinct. Here we present EpiFusion: a joint approach that can be used to harness the complementary strengths of each field to improve estimation of outbreak dynamics for large and poorly sampled epidemics, such as arboviral or respiratory virus outbreaks, and validate it for retrospective analysis. We propose a model of Rt that estimates outbreak trajectories conditional upon both phylodynamic (time-scaled trees estimated from genetic sequences) and epidemiological (case incidence) data. We simulate stochastic outbreak trajectories that are weighted according to epidemiological and phylodynamic observation models and fit using particle Markov Chain Monte Carlo. To assess performance, we test EpiFusion on simulated outbreaks in which transmission and/or surveillance rapidly changes and find that using EpiFusion to combine epidemiological and phylodynamic data maintains accuracy and increases certainty in trajectory and Rt estimates, compared to when each data type is used alone. We benchmark EpiFusion's performance against existing methods to estimate Rt and demonstrate advances in speed and accuracy. Importantly, our approach scales efficiently with dataset size. Finally, we apply our model to estimate Rt during the 2014 Ebola outbreak in Sierra Leone. EpiFusion is designed to accommodate future extensions that will improve its utility, such as explicitly modelling population structure, accommodations for phylogenetic uncertainty, and the ability to weight the contributions of genomic or case incidence to the inference.
Affiliation(s)
- Ciara Judge
  - Department of Infectious Disease Epidemiology and Dynamics, Faculty of Epidemiology and Public Health, London School of Hygiene and Tropical Medicine, United Kingdom
  - Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, United Kingdom
  - Department of Pathobiology and Population Sciences, Royal Veterinary College, United Kingdom
- Timothy Vaughan
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Timothy Russell
  - Department of Infectious Disease Epidemiology and Dynamics, Faculty of Epidemiology and Public Health, London School of Hygiene and Tropical Medicine, United Kingdom
  - Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, United Kingdom
- Sam Abbott
  - Department of Infectious Disease Epidemiology and Dynamics, Faculty of Epidemiology and Public Health, London School of Hygiene and Tropical Medicine, United Kingdom
  - Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, United Kingdom
- Louis du Plessis
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Tanja Stadler
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Oliver Brady
  - Department of Infectious Disease Epidemiology and Dynamics, Faculty of Epidemiology and Public Health, London School of Hygiene and Tropical Medicine, United Kingdom
  - Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, United Kingdom
- Sarah Hill
  - Department of Pathobiology and Population Sciences, Royal Veterinary College, United Kingdom
4
Capobianco A, Friedman M. Fossils indicate marine dispersal in osteoglossid fishes, a classic example of continental vicariance. Proc Biol Sci 2024; 291:20241293. [PMID: 39137888 PMCID: PMC11321865 DOI: 10.1098/rspb.2024.1293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 07/02/2024] [Accepted: 07/02/2024] [Indexed: 08/15/2024] Open
Abstract
The separation of closely related terrestrial or freshwater species by vast marine barriers represents a biogeographical riddle. Such cases can provide evidence for vicariance, a process whereby ancient geological events like continental rifting divided ancestral geographical ranges. With an evolutionary history extending tens of millions of years, freshwater ecology, and distribution encompassing widely separated southern landmasses, osteoglossid bonytongue fishes are a textbook case of vicariance attributed to Mesozoic fragmentation of the Gondwanan supercontinent. Largely overlooked fossils complicate the clean narrative invoked for extant species by recording occurrences on additional continents and in marine settings. Here, we present a new total-evidence phylogenetic hypothesis for bonytongue fishes combined with quantitative models of range evolution and show that the last common ancestor of extant osteoglossids was likely marine, and that the group colonized freshwater settings at least four times when both extant and extinct lineages are considered. The correspondence between extant osteoglossid relationships and patterns of continental fragmentation therefore represents a striking example of biogeographical pseudocongruence. Contrary to arguments against vicariance hypotheses that rely only on temporal or phylogenetic evidence, these results provide direct palaeontological support for enhanced dispersal ability early in the history of a group with widely separated distributions in the modern day.
Affiliation(s)
- Alessio Capobianco
  - GeoBio-Center LMU, Ludwig-Maximilians-Universität München, Munich, Germany
  - Department of Earth and Environmental Sciences, Palaeontology & Geobiology, Ludwig-Maximilians-Universität München, Munich, Germany
  - Department of Earth and Environmental Sciences, University of Michigan, Ann Arbor, MI, USA
  - Museum of Paleontology, University of Michigan, Ann Arbor, MI, USA
- Matt Friedman
  - Department of Earth and Environmental Sciences, University of Michigan, Ann Arbor, MI, USA
  - Museum of Paleontology, University of Michigan, Ann Arbor, MI, USA
5
Khurana MP, Scheidwasser-Clow N, Penn MJ, Bhatt S, Duchêne DA. The Limits of the Constant-rate Birth-Death Prior for Phylogenetic Tree Topology Inference. Syst Biol 2024; 73:235-246. [PMID: 38153910 PMCID: PMC11129600 DOI: 10.1093/sysbio/syad075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 12/20/2023] [Accepted: 12/27/2023] [Indexed: 12/30/2023] Open
Abstract
Birth-death models are stochastic processes describing speciation and extinction through time and across taxa and are widely used in biology for inference of evolutionary timescales. Previous research has highlighted how the expected trees under the constant-rate birth-death (crBD) model tend to differ from empirical trees, for example, with respect to the amount of phylogenetic imbalance. However, our understanding of how trees differ between the crBD model and the signal in empirical data remains incomplete. In this Point of View, we aim to expose the degree to which the crBD model differs from empirically inferred phylogenies and test the limits of the model in practice. Using a wide range of topology indices to compare crBD expectations against a comprehensive dataset of 1189 empirically estimated trees, we confirm that crBD model trees frequently differ topologically compared with empirical trees. To place this in the context of standard practice in the field, we conducted a meta-analysis for a subset of the empirical studies. When comparing studies that used Bayesian methods and crBD priors with those that used other non-crBD priors and non-Bayesian methods (i.e., maximum likelihood methods), we do not find any significant differences in tree topology inferences. To scrutinize this finding for the case of highly imbalanced trees, we selected the 100 trees with the greatest imbalance from our dataset, simulated sequence data for these tree topologies under various evolutionary rates, and re-inferred the trees under maximum likelihood and using the crBD model in a Bayesian setting. We find that when the substitution rate is low, the crBD prior results in overly balanced trees, but the tendency is negligible when substitution rates are sufficiently high. 
Overall, our findings demonstrate the general robustness of crBD priors across a broad range of phylogenetic inference scenarios but also highlight that empirically observed phylogenetic imbalance is highly improbable under the crBD model, leading to systematic bias in data sets with limited information content.
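To make "phylogenetic imbalance" concrete, the sketch below computes one common topology index, the Colless index (the sum over internal nodes of the absolute difference in tip counts between the two child clades). The abstract does not say which indices were used, so the choice of Colless is an assumption. The sampler also relies on the standard result that a pure-birth (Yule) tree splits n tips into (i, n - i) with i uniform on 1..n-1, a shape distribution that reconstructed constant-rate birth-death trees are generally taken to share. The function names are hypothetical.

```python
import random


def yule_colless(n_tips, rng):
    """Colless index of one random tree shape drawn under the Yule model."""
    total = 0
    stack = [n_tips]  # clade sizes still to be split
    while stack:
        n = stack.pop()
        if n < 2:
            continue  # a single tip has no internal node
        left = rng.randint(1, n - 1)  # uniform split of the clade
        right = n - left
        total += abs(left - right)
        stack.append(left)
        stack.append(right)
    return total


def caterpillar_colless(n_tips):
    """Colless index of the maximally imbalanced (caterpillar) tree."""
    return (n_tips - 1) * (n_tips - 2) // 2


def mean_yule_colless(n_tips, n_trees=200, seed=0):
    """Average Colless index over n_trees simulated Yule shapes."""
    rng = random.Random(seed)
    return sum(yule_colless(n_tips, rng) for _ in range(n_trees)) / n_trees
```

Comparing `mean_yule_colless(50)` with `caterpillar_colless(50)` gives a rough sense of how far the model's expected shapes sit from the most imbalanced tree possible for the same number of tips.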
Affiliation(s)
- Mark P Khurana
  - Section of Epidemiology, Department of Public Health, University of Copenhagen, 1352 Copenhagen, Denmark
- Neil Scheidwasser-Clow
  - Section of Epidemiology, Department of Public Health, University of Copenhagen, 1352 Copenhagen, Denmark
- Matthew J Penn
  - Department of Statistics, University of Oxford, OX1 3LB, Oxford, UK
- Samir Bhatt
  - Section of Epidemiology, Department of Public Health, University of Copenhagen, 1352 Copenhagen, Denmark
  - MRC Centre for Global Infectious Disease Analysis, School of Public Health, Imperial College London, SW7 2AZ, London, UK
- David A Duchêne
  - Centre for Evolutionary Hologenomics, University of Copenhagen, 1352 Copenhagen, Denmark
6
Allen BJ, Volkova Oliveira MV, Stadler T, Vaughan TG, Warnock RCM. Mechanistic phylodynamic models do not provide conclusive evidence that non-avian dinosaurs were in decline before their final extinction. CAMBRIDGE PRISMS. EXTINCTION 2024; 2:e6. [PMID: 40078801 PMCID: PMC11895757 DOI: 10.1017/ext.2024.5] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/03/2022] [Revised: 02/19/2024] [Accepted: 03/05/2024] [Indexed: 03/14/2025]
Abstract
Phylodynamic models can be used to estimate diversification trajectories from time-calibrated phylogenies. Here we apply two such models to phylogenies of non-avian dinosaurs, a clade whose evolutionary history has been widely debated. Although some authors have suggested that the clade experienced a decline in diversity, potentially starting millions of years before the end-Cretaceous mass extinction, others have suggested that the group remained highly diverse right up until the Cretaceous-Paleogene (K-Pg) boundary. Our results show that model assumptions, likely with respect to incomplete sampling, have a large impact on whether dinosaurs appear to have experienced a long-term decline or not. The results are also highly sensitive to the topology and branch lengths of the phylogeny used. Developing comprehensive models of sampling bias, and building larger and more accurate phylogenies, are likely to be necessary steps for us to determine whether dinosaur diversity was or was not in decline before the end-Cretaceous mass extinction.
Affiliation(s)
- Bethany J. Allen
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - Computational Evolution Group, Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Tanja Stadler
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - Computational Evolution Group, Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Timothy G. Vaughan
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - Computational Evolution Group, Swiss Institute of Bioinformatics, Lausanne, Switzerland
7
Yu Y, Zhang C, Xu X. Complex macroevolution of pterosaurs. Curr Biol 2023; 33:770-779.e4. [PMID: 36787747 DOI: 10.1016/j.cub.2023.01.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Revised: 11/13/2022] [Accepted: 01/05/2023] [Indexed: 02/16/2023]
Abstract
Pterosaurs, the earliest flying tetrapods, are the subject of some recent quantitative macroevolutionary analyses from different perspectives [1-2]. Here, we use an integrative approach involving newly assembled phylogenetic and body size datasets, net diversification rates, morphological rates, and morphological disparity to gain a holistic understanding of pterosaur macroevolution. The first two parameters are important in quantitative analyses of macroevolution, but they have been rarely used in previous pterosaur studies [1,3,4,2,5,6,7,8,9,10,11,12]. Our study reveals an ∼115-Ma period (from Early Triassic to Early Cretaceous) of multi-wave increasing net diversification rates and disparity, as well as high morphological rates, followed by an ∼65-Ma period (from Early Cretaceous to the end of the Cretaceous) of mostly negative net diversification rates, decreasing disparity, and relatively low morphological rates in pterosaur evolution. Our study demonstrates the following: (1) body size plays an important role in pterosaur lineage diversification during nearly their whole evolutionary history, and the evolution of locomotion, trophic, and ornamental structures also plays a role in different periods; (2) birds, the other major flying tetrapod group at the time, might have affected pterosaur macroevolution for ∼100 Ma; and (3) different mass extinction events might have affected pterosaur evolution differently. Particularly, the revealed decline in pterosaur biodiversity during the Middle and Late Cretaceous periods provides further support for the possible presence of a biodiversity decline of large-sized terrestrial amniotes starting in the mid-Cretaceous [13,14], which may have been caused by multiple factors including a global land area decrease during these periods.
Affiliation(s)
- Yilun Yu
  - Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Chi Zhang
  - Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences, Beijing, China; Center for Excellence in Life and Paleoenvironment, Chinese Academy of Sciences, Beijing 100044, China
- Xing Xu
  - Centre for Vertebrate Evolutionary Biology, Yunnan University, Kunming, China; Shenyang Normal University, Paleontological Museum of Liaoning, Shenyang, China
8
Beck RMD, de Vries D, Janiak MC, Goodhead IB, Boubli JP. Total evidence phylogeny of platyrrhine primates and a comparison of undated and tip-dating approaches. J Hum Evol 2023; 174:103293. [PMID: 36493598 DOI: 10.1016/j.jhevol.2022.103293] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/21/2021] [Revised: 10/21/2022] [Accepted: 10/21/2022] [Indexed: 12/12/2022]
Abstract
There have been multiple published phylogenetic analyses of platyrrhine primates (New World monkeys) using both morphological and molecular data, but relatively few that have integrated both types of data into a total evidence approach. Here, we present phylogenetic analyses of recent and fossil platyrrhines, based on a total evidence data set of 418 morphological characters and 10.2 kilobases of DNA sequence data from 17 nuclear genes taken from previous studies, using undated and tip-dating approaches in a Bayesian framework. We compare the results of these analyses with molecular scaffold analyses using maximum parsimony and Bayesian approaches, and we use a formal information theoretic approach to identify unstable taxa. After a posteriori pruning of unstable taxa, the undated and tip-dating topologies appear congruent with recent molecular analyses and support largely similar relationships, with strong support for Stirtonia as a stem alouattine, Neosaimiri as a stem saimirine, Cebupithecia as a stem pitheciine, and Lagonimico as a stem callitrichid. Both analyses find three Greater Antillean subfossil platyrrhines (Xenothrix, Antillothrix, and Paralouatta) to form a clade that is related to Callicebus, congruent with a single dispersal event by the ancestor of this clade to the Greater Antilles. They also suggest that the fossil Proteropithecia may not be closely related to pitheciines, and that all known platyrrhines older than the Middle Miocene are stem taxa. Notably, the undated analysis found the Early Miocene Panamacebus (currently recognized as the oldest known cebid) to be unstable, and the tip-dating analysis placed it outside crown Platyrrhini. Our tip-dating analysis supports a late Oligocene or earliest Miocene (20.8-27.0 Ma) age for crown Platyrrhini, congruent with recent molecular clock analyses.
Affiliation(s)
- Robin M D Beck
  - Ecosystems and Environment Research Centre, School of Science, Engineering and Environment, University of Salford, Manchester, UK
- Dorien de Vries
  - Ecosystems and Environment Research Centre, School of Science, Engineering and Environment, University of Salford, Manchester, UK
- Mareike C Janiak
  - Ecosystems and Environment Research Centre, School of Science, Engineering and Environment, University of Salford, Manchester, UK
- Ian B Goodhead
  - Ecosystems and Environment Research Centre, School of Science, Engineering and Environment, University of Salford, Manchester, UK
- Jean P Boubli
  - Ecosystems and Environment Research Centre, School of Science, Engineering and Environment, University of Salford, Manchester, UK
9
Brée B, Condamine FL, Guinot G. Combining palaeontological and neontological data shows a delayed diversification burst of carcharhiniform sharks likely mediated by environmental change. Sci Rep 2022; 12:21906. [PMID: 36535995 PMCID: PMC9763247 DOI: 10.1038/s41598-022-26010-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 08/04/2022] [Accepted: 12/07/2022] [Indexed: 12/23/2022] Open
Abstract
Estimating deep-time species-level diversification processes remains challenging. Both the fossil record and molecular phylogenies allow the estimation of speciation and extinction rates, but each type of data may still provide an incomplete picture of diversification dynamics. Here, we combine species-level palaeontological (fossil occurrences) and neontological (molecular phylogenies) data to estimate deep-time diversity dynamics through process-based birth-death models for Carcharhiniformes, the most speciose shark order today. Despite their abundant fossil record dating back to the Middle Jurassic, only a small fraction of extant carcharhiniform species is recorded as fossils, which impedes relying only on the fossil record to study their recent diversification. Combining fossil and phylogenetic data, we recover a complex evolutionary history for carcharhiniforms, exemplified by several variations in diversification rates with an early low diversity period followed by a Cenozoic radiation. We further reveal a burst of diversification in the last 30 million years, which is partially recorded with fossil data only. We also find that reef expansion and temperature change can explain variations in speciation and extinction through time. These results pinpoint the primordial importance of these environmental variables in the evolution of marine clades. Our study also highlights the benefit of combining the fossil record with phylogenetic data to address macroevolutionary questions.
Affiliation(s)
- Baptiste Brée
  - Institut des Sciences de l’Evolution de Montpellier, CNRS, IRD, EPHE, Université de Montpellier, 34095 Montpellier, France
- Fabien L. Condamine
  - Institut des Sciences de l’Evolution de Montpellier, CNRS, IRD, EPHE, Université de Montpellier, 34095 Montpellier, France
- Guillaume Guinot
  - Institut des Sciences de l’Evolution de Montpellier, CNRS, IRD, EPHE, Université de Montpellier, 34095 Montpellier, France