1
Morel M, Zhukova A, Lemoine F, Gascuel O. Accurate Detection of Convergent Mutations in Large Protein Alignments With ConDor. Genome Biol Evol 2024; 16:evae040. [PMID: 38451738; PMCID: PMC10986858; DOI: 10.1093/gbe/evae040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 01/30/2024] [Accepted: 02/22/2024] [Indexed: 03/09/2024] [Open access]
Abstract
Evolutionary convergences are observed at all levels, from phenotype to DNA and protein sequences, and changes at these different levels tend to be correlated. Notably, convergent mutations can lead to convergent changes in phenotype, such as changes in metabolism, drug resistance, and other adaptations to changing environments. We propose a two-component approach to detect mutations subject to convergent evolution in protein alignments. The "Emergence" component selects mutations that emerge more often than expected, while the "Correlation" component selects mutations that correlate with the convergent phenotype under study. With regard to Emergence, a phylogeny deduced from the alignment is provided by the user and is used to simulate the evolution of each alignment position. These simulations allow us to estimate the expected number of mutations in a neutral model, which is compared to the observed number of mutations in the data studied. In Correlation, a comparative phylogenetic approach is used to measure whether the presence of each of the observed mutations is correlated with the convergent phenotype. Each component can be used on its own, for example Emergence when no phenotype is available. Our method is implemented in a standalone workflow and a webserver, called ConDor. We evaluate the properties of ConDor using simulated data, and we apply it to three real datasets: sedge PEPC proteins, HIV reverse transcriptase, and fish rhodopsin. The results show that the two components of ConDor complement each other, with an overall accuracy that compares favorably to other available tools, especially on large datasets.
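The comparison described for the Emergence component (expected versus observed number of emergences) can be illustrated with a short Python sketch. It assumes that simulations under the neutral model have already produced, for one mutation at one alignment position, a list of emergence counts. The function name `emergence_summary`, its inputs, and the use of an empirical upper-tail proportion are illustrative assumptions, not ConDor's actual implementation or API.

```python
from statistics import mean


def emergence_summary(observed, simulated_counts):
    """Compare an observed emergence count with counts from neutral simulations.

    Hypothetical helper: `simulated_counts` holds one emergence count per
    simulation of the same alignment position on the same tree.
    """
    if not simulated_counts:
        raise ValueError("need at least one simulated count")
    # Expected number of emergences under the (assumed) neutral model.
    expected = mean(simulated_counts)
    # Proportion of simulations with at least as many emergences as observed;
    # the +1 terms keep the estimate away from exactly zero.
    as_extreme = sum(1 for c in simulated_counts if c >= observed)
    p_value = (as_extreme + 1) / (len(simulated_counts) + 1)
    return expected, p_value
```

A mutation whose observed count exceeds every simulated count receives the smallest possible proportion, 1 / (number of simulations + 1), so the resolution of this kind of test depends directly on how many simulations are run.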
Affiliation(s)
- Marie Morel
  - Institut Pasteur, Université Paris Cité, Unité Bioinformatique Evolutive, Paris, France
  - Université Claude Bernard Lyon 1, LBBE, UMR 5558, CNRS, VAS, Villeurbanne, 69100, France
- Anna Zhukova
  - Institut Pasteur, Université Paris Cité, Unité Bioinformatique Evolutive, Paris, France
  - Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, Paris, France
- Frédéric Lemoine
  - Institut Pasteur, Université Paris Cité, Unité Bioinformatique Evolutive, Paris, France
  - Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, Paris, France
  - Institut Pasteur, Université Paris Cité, CNR Virus Des Infections Respiratoires, Paris, France
- Olivier Gascuel
  - Institut Pasteur, Université Paris Cité, Unité Bioinformatique Evolutive, Paris, France
  - Institut de Systématique, Evolution, Biodiversité (UMR 7205, CNRS, Muséum National d’Histoire Naturelle, SU, EPHE, UA), Paris, France
2
Maritz B, Barends JM, Mohamed R, Maritz RA, Alexander GJ. Repeated dietary shifts in elapid snakes (Squamata: Elapidae) revealed by ancestral state reconstruction. Biol J Linn Soc Lond 2021. [DOI: 10.1093/biolinnean/blab115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
Identifying the traits of ancestral organisms can reveal patterns and drivers of organismal diversification. Unfortunately, reconstructing complex multistate traits (such as diet) remains challenging. Adopting a ‘reconstruct, then aggregate’ approach in a maximum likelihood framework, we reconstructed ancestral diets for 298 species of elapid snakes. We tested whether different prey types were correlated with one another, tested for one-way contingency between prey type pairs, and examined the relationship between snake body size and dietary composition. We demonstrate that the evolution of diet was characterized by niche conservation punctuated by repeated dietary shifts. The ancestor of elapids most likely fed on reptiles and possibly amphibians, with deviations from this ancestral diet occurring repeatedly due to shifts into marine environments and changes in body size. Moreover, we demonstrate important patterns of prey use, including one-way dependency—most obviously the inclusion of eggs being dependent on a diet that already included the producers of those eggs. Despite imperfect dietary data, our approach produced a robust overview of dietary evolution. Given the paucity of natural history information for many organisms, our approach has the potential to increase the number of lineages to which ancestral state reconstructions of multistate traits can be robustly applied.
Affiliation(s)
- Bryan Maritz
  - Department of Biodiversity and Conservation Biology, University of the Western Cape, Private Bag X17, Bellville, South Africa
- Jody M Barends
  - Department of Biodiversity and Conservation Biology, University of the Western Cape, Private Bag X17, Bellville, South Africa
- Riaaz Mohamed
  - Department of Biodiversity and Conservation Biology, University of the Western Cape, Private Bag X17, Bellville, South Africa
- Robin A Maritz
  - Department of Biodiversity and Conservation Biology, University of the Western Cape, Private Bag X17, Bellville, South Africa
- Graham J Alexander
  - School of Animal, Plant & Environmental Sciences, University of the Witwatersrand, Johannesburg, PO Wits, South Africa
3
Vega Yon GG, Thomas DC, Morrison J, Mi H, Thomas PD, Marjoram P. Bayesian parameter estimation for automatic annotation of gene functions using observational data and phylogenetic trees. PLoS Comput Biol 2021; 17:e1007948. [PMID: 33600408; PMCID: PMC7924801; DOI: 10.1371/journal.pcbi.1007948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2020] [Revised: 03/02/2021] [Accepted: 12/30/2020] [Indexed: 11/29/2022] [Open access]
Abstract
Gene function annotation is important for a variety of downstream analyses of genetic data. But experimental characterization of function remains costly and slow, making computational prediction an important endeavor. Phylogenetic approaches to prediction have been developed, but implementation of a practical Bayesian framework for parameter estimation remains an outstanding challenge. We have developed a computationally efficient model of evolution of gene annotations using phylogenies based on a Bayesian framework using Markov Chain Monte Carlo for parameter estimation. Unlike previous approaches, our method is able to estimate parameters over many different phylogenetic trees and functions. The resulting parameters agree with biological intuition, such as the increased probability of function change following gene duplication. The method performs well on leave-one-out cross-validation, and we further validated some of the predictions in the experimental scientific literature.
Affiliation(s)
- George G. Vega Yon
  - Division of Biostatistics, Department of Preventive Medicine, University of Southern California, Los Angeles, California, United States of America
- Duncan C. Thomas
  - Division of Biostatistics, Department of Preventive Medicine, University of Southern California, Los Angeles, California, United States of America
- John Morrison
  - Division of Biostatistics, Department of Preventive Medicine, University of Southern California, Los Angeles, California, United States of America
- Huaiyu Mi
  - Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, California, United States of America
- Paul D. Thomas
  - Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, California, United States of America
- Paul Marjoram
  - Division of Biostatistics, Department of Preventive Medicine, University of Southern California, Los Angeles, California, United States of America
4
Roch S, Wang KC. Sufficient condition for root reconstruction by parsimony on binary trees with general weights. Electronic Communications in Probability 2021. [DOI: 10.1214/21-ecp423] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Sebastien Roch
  - Department of Mathematics, University of Wisconsin–Madison, United States of America
5
Boyko JD, Beaulieu JM. Generalized hidden Markov models for phylogenetic comparative datasets. Methods Ecol Evol 2020. [DOI: 10.1111/2041-210x.13534] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 12/01/2022]
Affiliation(s)
- James D. Boyko
  - Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA
- Jeremy M. Beaulieu
  - Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA
6
Gascuel O, Steel M. A Darwinian Uncertainty Principle. Syst Biol 2020; 69:521-529. [PMID: 31432087; PMCID: PMC7188465; DOI: 10.1093/sysbio/syz054] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 01/24/2019] [Accepted: 08/15/2019] [Indexed: 02/04/2023] [Open access]
Abstract
Reconstructing ancestral characters and traits along a phylogenetic tree is central to evolutionary biology. It is the key to understanding morphology changes among species, inferring ancestral biochemical properties of life, or recovering migration routes in phylogeography. The goal is 2-fold: to reconstruct the character state at the tree root (e.g., the region of origin of some species) and to understand the process of state changes along the tree (e.g., species flow between countries). We deal here with discrete characters, which are “unique,” as opposed to sequence characters (nucleotides or amino-acids), where we assume the same model for all the characters (or for large classes of characters with site-dependent models) and thus benefit from multiple information sources. In this framework, we use mathematics and simulations to demonstrate that although each goal can be achieved with high accuracy individually, it is generally impossible to accurately estimate both the root state and the rates of state changes along the tree branches, from the observed data at the tips of the tree. This is because the global rates of state changes along the branches that are optimal for the two estimation tasks have opposite trends, leading to a fundamental trade-off in accuracy. This inherent “Darwinian uncertainty principle” concerning the simultaneous estimation of “patterns” and “processes” governs ancestral reconstructions in biology. For certain tree shapes (typically speciation trees) the uncertainty of simultaneous estimation is reduced when more tips are present; however, for other tree shapes it does not (e.g., coalescent trees used in population genetics).
Affiliation(s)
- Olivier Gascuel
  - Unité Bioinformatique Evolutive, C3BI USR 3756, Institut Pasteur & CNRS, Paris, France
- Mike Steel
  - Biomathematics Research Centre, University of Canterbury, Christchurch, New Zealand
7
Oliva A, Pulicani S, Lefort V, Bréhélin L, Gascuel O, Guindon S. Accounting for ambiguity in ancestral sequence reconstruction. Bioinformatics 2020; 35:4290-4297. [PMID: 30977781; DOI: 10.1093/bioinformatics/btz249] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 09/06/2018] [Revised: 03/29/2019] [Accepted: 04/06/2019] [Indexed: 11/14/2022] [Open access]
Abstract
MOTIVATION The reconstruction of ancestral genetic sequences from the analysis of contemporaneous data is a powerful tool to improve our understanding of molecular evolution. Various statistical criteria defined in a phylogenetic framework can be used to infer nucleotide, amino-acid or codon states at internal nodes of the tree, for every position along the sequence. These criteria generally select the state that maximizes (or minimizes) a given criterion. Although it is perfectly sensible from a statistical perspective, that strategy fails to convey useful information about the level of uncertainty associated to the inference. RESULTS The present study introduces a new criterion for ancestral sequence reconstruction, the minimum posterior expected error (MPEE), that selects a single state whenever the signal conveyed by the data is strong, and a combination of multiple states otherwise. We also assess the performance of a criterion based on the Brier scoring scheme which, like MPEE, does not rely on any tuning parameters. The precision and accuracy of several other criteria that involve arbitrarily set tuning parameters are also evaluated. Large scale simulations demonstrate the benefits of using the MPEE and Brier-based criteria with a substantial increase in the accuracy of the inference of past sequences compared to the standard approach and realistic compromises on the precision of the solutions returned. AVAILABILITY AND IMPLEMENTATION The software package PhyML (https://github.com/stephaneguindon/phyml) provides an implementation of the Maximum A Posteriori (MAP) and MPEE criteria for reconstructing ancestral nucleotide and amino-acid sequences.
Affiliation(s)
- A Oliva
  - Department of Computer Science, LIRMM, CNRS & Université de Montpellier, Montpellier, France
  - Australian Centre for Ancient DNA, Adelaide, Australia
- S Pulicani
  - Department of Computer Science, LIRMM, CNRS & Université de Montpellier, Montpellier, France
- V Lefort
  - Department of Computer Science, LIRMM, CNRS & Université de Montpellier, Montpellier, France
- L Bréhélin
  - Department of Computer Science, LIRMM, CNRS & Université de Montpellier, Montpellier, France
- O Gascuel
  - Unité Bioinformatique Evolutive, C3BI USR 3756, Institut Pasteur & CNRS, Paris, France
- S Guindon
  - Department of Computer Science, LIRMM, CNRS & Université de Montpellier, Montpellier, France
9
Holland BR, Ketelaar-Jones S, O'Mara AR, Woodhams MD, Jordan GJ. Accuracy of ancestral state reconstruction for non-neutral traits. Sci Rep 2020; 10:7644. [PMID: 32376845; PMCID: PMC7203120; DOI: 10.1038/s41598-020-64647-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 12/04/2019] [Accepted: 04/09/2020] [Indexed: 12/05/2022] [Open access]
Abstract
The assumptions underpinning ancestral state reconstruction are violated in many evolutionary systems, especially for traits under directional selection. However, the accuracy of ancestral state reconstruction for non-neutral traits is poorly understood. To investigate the accuracy of ancestral state reconstruction methods, trees and binary characters were simulated under the BiSSE (Binary State Speciation and Extinction) model using a wide range of character-state-dependent rates of speciation, extinction and character-state transition. We used maximum parsimony (MP), BiSSE and two-state Markov (Mk2) models to reconstruct ancestral states. Under each method, error rates increased with node depth, true number of state transitions, and rates of state transition and extinction; exceeding 30% for the deepest 10% of nodes and highest rates of extinction and character-state transition. Where rates of character-state transition were asymmetrical, error rates were greater when the rate away from the ancestral state was largest. Preferential extinction of species with the ancestral character state also led to higher error rates. BiSSE outperformed Mk2 in all scenarios where either speciation or extinction was state dependent and outperformed MP under most conditions. MP outperformed Mk2 in most scenarios except when the rates of character-state transition and/or extinction were highly asymmetrical and the ancestral state was unfavoured.
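The two-state Markov (Mk2) model named in this abstract has a well-known closed-form transition matrix, which the following Python sketch computes for a single branch. The parameter names `q01` and `q10` (rates of change from state 0 to 1 and from 1 to 0) and the function name are our own labels; the sketch does not reproduce BiSSE or any simulation setting used in the study.

```python
import math


def mk2_transition_matrix(q01, q10, t):
    """Transition probabilities of a two-state Markov model over time t.

    q01 is the rate of change 0 -> 1, q10 the rate 1 -> 0 (illustrative
    parameter names). Returns P with P[i][j] = Pr(state j at t | state i at 0).
    """
    total = q01 + q10
    if q01 < 0 or q10 < 0 or total <= 0 or t < 0:
        raise ValueError("rates must be non-negative (not both zero), t >= 0")
    # Fraction of the way from 'no change yet' to the stationary distribution.
    decay = 1.0 - math.exp(-total * t)
    p01 = q01 / total * decay
    p10 = q10 / total * decay
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]
```

With asymmetric rates the rows converge to the stationary distribution (q10, q01) / (q01 + q10) as `t` grows, which is one way to see why deep nodes carry little information about the ancestral state when transition rates are high.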
Affiliation(s)
- Barbara R Holland
  - School of Natural Sciences, University of Tasmania, Private Bag 55, Hobart, Tas, 7001, Australia
- Saan Ketelaar-Jones
  - School of Natural Sciences, University of Tasmania, Private Bag 55, Hobart, Tas, 7001, Australia
- Aidan R O'Mara
  - School of Health Sciences, University of Tasmania, Private Bag 121, Hobart, Tas, 7001, Australia
- Michael D Woodhams
  - School of Natural Sciences, University of Tasmania, Private Bag 55, Hobart, Tas, 7001, Australia
- Gregory J Jordan
  - School of Natural Sciences, University of Tasmania, Private Bag 55, Hobart, Tas, 7001, Australia
10
Ponti R, Arcones A, Vieites DR. Challenges in estimating ancestral state reconstructions: the evolution of migration in Sylvia warblers as a study case. Integr Zool 2020; 15:161-173. [DOI: 10.1111/1749-4877.12418] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/23/2022]
Affiliation(s)
- Raquel Ponti
  - National Museum of Natural Sciences, Madrid, Spain
  - Department of Evolutionary Biology, Ecology and Environmental Sciences, Faculty of Biology, University of Barcelona, Barcelona, Spain
  - Biodiversity Research Institute (IRBIO), University of Barcelona, Barcelona, Spain
- Angel Arcones
  - National Museum of Natural Sciences, Madrid, Spain
  - Department of Evolutionary Biology, Ecology and Environmental Sciences, Faculty of Biology, University of Barcelona, Barcelona, Spain
  - Biodiversity Research Institute (IRBIO), University of Barcelona, Barcelona, Spain
11
Ishikawa SA, Zhukova A, Iwasaki W, Gascuel O. A Fast Likelihood Method to Reconstruct and Visualize Ancestral Scenarios. Mol Biol Evol 2019; 36:2069-2085. [PMID: 31127303; PMCID: PMC6735705; DOI: 10.1093/molbev/msz131] [Citation(s) in RCA: 132] [Impact Index Per Article: 22.0] [Indexed: 12/23/2022] [Open access]
Abstract
The reconstruction of ancestral scenarios is widely used to study the evolution of characters along phylogenetic trees. One commonly uses the marginal posterior probabilities of the character states, or the joint reconstruction of the most likely scenario. However, marginal reconstructions provide users with state probabilities, which are difficult to interpret and visualize, whereas joint reconstructions select a unique state for every tree node and thus do not reflect the uncertainty of inferences. We propose a simple and fast approach, which is in between these two extremes. We use decision-theory concepts (namely, the Brier score) to associate each node in the tree to a set of likely states. A unique state is predicted in tree regions with low uncertainty, whereas several states are predicted in uncertain regions, typically around the tree root. To visualize the results, we cluster the neighboring nodes associated with the same states and use graph visualization tools. The method is implemented in the PastML program and web server. The results on simulated data demonstrate the accuracy and robustness of the approach. PastML was applied to the phylogeography of Dengue serotype 2 (DENV2), and the evolution of drug resistances in a large HIV data set. These analyses took a few minutes and provided convincing results. PastML retrieved the main transmission routes of human DENV2 and showed the uncertainty of the human-sylvatic DENV2 geographic origin. With HIV, the results show that resistance mutations mostly emerge independently under treatment pressure, but resistance clusters are found, corresponding to transmissions among untreated patients.
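The idea of using the Brier score to move between a single predicted state and a set of likely states can be sketched in Python as follows. This is a minimal illustration under our own assumptions: the marginal posteriors of one node are given as a dictionary, candidate predictions spread probability evenly over the k most probable states, and the k with the smallest Brier score is retained. The function name `brier_state_set` is hypothetical, and the exact rule implemented in PastML may differ.

```python
def brier_state_set(posteriors):
    """Pick the set of likely states for one node from marginal posteriors.

    `posteriors` maps state -> marginal posterior probability (summing to 1).
    """
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    best_k, best_score = 1, float("inf")
    for k in range(1, len(ranked) + 1):
        # Brier score between the posteriors and a prediction that spreads
        # probability 1/k evenly over the k most probable states.
        score = sum((p - 1.0 / k) ** 2 for _, p in ranked[:k])
        score += sum(p ** 2 for _, p in ranked[k:])
        if score < best_score - 1e-12:
            best_k, best_score = k, score
    return [state for state, _ in ranked[:best_k]]
```

For posteriors such as 0.9 / 0.05 / 0.05 this keeps a single state, while 0.45 / 0.45 / 0.10 keeps the two leading states, matching the behaviour described in the abstract: one state where uncertainty is low, several where it is high.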
Affiliation(s)
- Sohta A Ishikawa
  - Unité Bioinformatique Evolutive, Institut Pasteur, C3BI USR 3756 IP & CNRS, Paris, France
  - Department of Biological Sciences, The University of Tokyo, Tokyo, Japan
  - Evolutionary Genomics of RNA Viruses, Virology Department, Institut Pasteur, Paris, France
- Anna Zhukova
  - Unité Bioinformatique Evolutive, Institut Pasteur, C3BI USR 3756 IP & CNRS, Paris, France
- Wataru Iwasaki
  - Department of Biological Sciences, The University of Tokyo, Tokyo, Japan
- Olivier Gascuel
  - Unité Bioinformatique Evolutive, Institut Pasteur, C3BI USR 3756 IP & CNRS, Paris, France
12
Chevenet F, Castel G, Jousselin E, Gascuel O. PastView: a user-friendly interface to explore ancestral scenarios. BMC Evol Biol 2019; 19:163. [PMID: 31375065; PMCID: PMC6679476; DOI: 10.1186/s12862-019-1490-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/29/2019] [Accepted: 07/25/2019] [Indexed: 11/10/2022] [Open access]
Abstract
BACKGROUND Ancestral character states computed from the combination of phylogenetic trees with extrinsic traits are used to decipher evolutionary scenarios in various research fields such as phylogeography, epidemiology, and ecology. Despite the existence of powerful methods and software in ancestral character state inference, difficulties may arise when interpreting the outputs of such inferences. The growing complexity of data (trees, annotations), the diversity of optimization criteria for computing trees and ancestral character states, the combinatorial explosion of potential evolutionary scenarios if some ancestral characters states do not stand out clearly from others, requires the design of new methods to explore associations of phylogenetic trees with extrinsic traits, to ease the visualization and interpretation of evolutionary scenarios. RESULT We developed PastView, a user-friendly interface that includes numerical and graphical features to help users to import and/or compute ancestral character states from discrete variables and extract ancestral scenarios as sets of successive transitions of character states from the tree root to its leaves. PastView provides summarized views such as transition maps and integrates comparative tools to highlight agreements or discrepancies between methods of ancestral annotations inference. CONCLUSION The main contribution of PastView is to assemble known numerical and graphical methods into a multi-maps graphical user interface dedicated to the computing, searching and viewing of evolutionary scenarios based on phylogenetic trees and ancestral character states. PastView is available publicly as a standalone software on www.pastview.org .
Affiliation(s)
- François Chevenet
  - MIVEGEC, Université de Montpellier, CNRS, IRD, Montpellier, France
  - LIRMM, Université de Montpellier, CNRS, Montpellier, France
- Guillaume Castel
  - CBGP, INRA, CIRAD, IRD, Montpellier SupAgro, Université de Montpellier, Montpellier, France
- Emmanuelle Jousselin
  - CBGP, INRA, CIRAD, IRD, Montpellier SupAgro, Université de Montpellier, Montpellier, France
- Olivier Gascuel
  - LIRMM, Université de Montpellier, CNRS, Montpellier, France
  - Unité de Bioinformatique Evolutive, C3BI, USR 3756, Institut Pasteur & CNRS, Paris, France
13
Phylogeography of Puumala orthohantavirus in Europe. Viruses 2019; 11:v11080679. [PMID: 31344894; PMCID: PMC6723369; DOI: 10.3390/v11080679] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 05/29/2019] [Revised: 07/12/2019] [Accepted: 07/22/2019] [Indexed: 12/21/2022] [Open access]
Abstract
Puumala virus is an RNA virus hosted by the bank vole (Myodes glareolus) and is today present in most European countries. Whilst it is generally accepted that hantaviruses have been tightly co-evolving with their hosts, the evolutionary history of Puumala virus (PUUV) is still controversial and so far has not been studied at the whole European level. This study attempts to reconstruct the phylogeographical spread of modern PUUV throughout Europe during the last postglacial period in the light of an upgraded dataset of complete PUUV small (S) segment sequences and by using the most recent computational approaches. Taking advantage of the knowledge on the past migrations of its host, we identified at least three potential independent dispersal routes of PUUV during postglacial recolonization of Europe by the bank vole: (1) from the Alpe-Adrian region (Balkan, Austria, and Hungary) to Western European countries (Germany, France, Belgium, and the Netherlands) and South Scandinavia; (2) from the vicinity of the Carpathian Mountains to the Baltic countries and to Poland, Russia, and Finland; and (3) towards Denmark and North Scandinavia, a more hypothetical dissemination that probably involved several independent streams from south and north Fennoscandia.
14
Beaulieu JM, O'Meara BC. Diversity and skepticism are vital for comparative biology: a response to Donoghue and Edwards (2019). American Journal of Botany 2019; 106:613-617. [PMID: 31050366; DOI: 10.1002/ajb2.1278] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 01/31/2019] [Accepted: 04/04/2019] [Indexed: 06/09/2023]
Affiliation(s)
- Jeremy M Beaulieu
  - Department of Biological Sciences, University of Arkansas, Fayetteville, AR, 72701, USA
- Brian C O'Meara
  - Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, TN, 37996-1610, USA
15
Herbst L, Li H, Steel M. Quantifying the accuracy of ancestral state prediction in a phylogenetic tree under maximum parsimony. J Math Biol 2019; 78:1953-1979. [PMID: 30758663; DOI: 10.1007/s00285-019-01330-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 04/26/2018] [Revised: 01/21/2019] [Indexed: 11/26/2022]
Abstract
In phylogenetic studies, biologists often wish to estimate the ancestral discrete character state at an interior vertex v of an evolutionary tree T from the states that are observed at the leaves of the tree. A simple and fast estimation method, maximum parsimony, takes the ancestral state at v to be any state that minimises the number of state changes in T required to explain its evolution on T. In this paper, we investigate the reconstruction accuracy of this estimation method further, under a simple symmetric model of state change, and obtain a number of new results, both for 2-state characters, and r-state characters ([Formula: see text]). Our results rely on establishing new identities and inequalities, based on a coupling argument that involves a simpler 'coin toss' approach to ancestral state reconstruction.
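For the special case where the vertex of interest is the root of a rooted binary tree, the maximum parsimony estimate described above can be computed with Fitch's bottom-up pass. The Python sketch below is an illustration under our own assumptions: the tree is encoded as nested 2-tuples with leaf names as strings, and the function name `fitch_root_states` is hypothetical. It is not code from the paper, and it does not cover arbitrary interior vertices.

```python
def fitch_root_states(tree, leaf_states):
    """Fitch's bottom-up pass on a rooted binary tree.

    `tree` is either a leaf name (str) or a pair (left_subtree, right_subtree);
    `leaf_states` maps leaf name -> observed state.
    Returns (set of most parsimonious root states, minimum number of changes).
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_root_states(left, leaf_states)
    right_set, right_cost = fitch_root_states(right, leaf_states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # The children disagree: keep all candidate states and count one change.
    return left_set | right_set, left_cost + right_cost + 1
```

On the tree `(("A", "B"), ("C", "D"))` with leaf states a, a, a, b the pass returns the single root state a with one change; with states a, b, a, b it returns both states with two changes, which is the kind of tie the abstract refers to when it says "any state".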
Affiliation(s)
- Lina Herbst
  - Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany
- Heyang Li
  - School of Mathematics and Statistics, University of Canterbury, Christchurch, New Zealand
- Mike Steel
  - Biomathematics Research Centre, University of Canterbury, Christchurch, New Zealand
16
Joly S, Lambert F, Alexandre H, Clavel J, Léveillé‐Bourret É, Clark JL. Greater pollination generalization is not associated with reduced constraints on corolla shape in Antillean plants. Evolution 2018; 72:244-260. [DOI: 10.1111/evo.13410] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 04/27/2017] [Accepted: 11/29/2017] [Indexed: 12/12/2022]
Affiliation(s)
- Simon Joly
  - Montreal Botanical Garden, 4101 Sherbrooke East, Montréal, QC, H1X 2B2, Canada
  - Institut de recherche en biologie végétale, Département de sciences biologiques, Université de Montréal, Montréal, Canada
- François Lambert
  - Institut de recherche en biologie végétale, Département de sciences biologiques, Université de Montréal, Montréal, Canada
- Hermine Alexandre
  - Institut de recherche en biologie végétale, Département de sciences biologiques, Université de Montréal, Montréal, Canada
- Julien Clavel
  - École Normale Supérieure, Paris Sciences et Lettres (PSL) Research University, Institut de Biologie de l'École Normale Supérieure (IBENS), CNRS UMR 8197, INSERM U1024, 46 rue d'Ulm, F‐75005 Paris, France
- Étienne Léveillé‐Bourret
  - Institut de recherche en biologie végétale, Département de sciences biologiques, Université de Montréal, Montréal, Canada
  - Current Address: Department of Biology, University of Ottawa, Ottawa, Canada
- John L. Clark
  - Department of Biological Sciences, The University of Alabama, Tuscaloosa, Alabama 35487
  - Science Department, The Lawrenceville School, Lawrenceville, New Jersey, U.S.A.
17
Herbst L, Fischer M. Ancestral Sequence Reconstruction with Maximum Parsimony. Bull Math Biol 2017; 79:2865-2886. [PMID: 28993971; DOI: 10.1007/s11538-017-0354-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2017] [Accepted: 09/23/2017] [Indexed: 10/18/2022]
Abstract
One of the main aims in phylogenetics is the estimation of ancestral sequences based on present-day data like, for instance, DNA alignments. One way to estimate the data of the last common ancestor of a given set of species is to first reconstruct a phylogenetic tree with some tree inference method and then to use some method of ancestral state inference based on that tree. One of the best-known methods both for tree inference and for ancestral sequence inference is Maximum Parsimony (MP). In this manuscript, we focus on this method and on ancestral state inference for fully bifurcating trees. In particular, we investigate a conjecture published by Charleston and Steel in 1995 concerning the number of species which need to have a particular state, say a, at a particular site in order for MP to unambiguously return a as an estimate for the state of the last common ancestor. We prove the conjecture for all even numbers of character states, which is the most relevant case in biology. We also show that the conjecture does not hold in general for odd numbers of character states, but also present some positive results for this case.
Affiliation(s)
- Lina Herbst
  - Institute for Mathematics and Computer Science, Greifswald University, Walther-Rathenau-Str. 47, 17489, Greifswald, Germany
- Mareike Fischer
  - Institute for Mathematics and Computer Science, Greifswald University, Walther-Rathenau-Str. 47, 17489, Greifswald, Germany
18
Abstract
We consider a stochastic evolutionary model for a phenotype developing amongst n related species with unknown phylogeny. The unknown tree is modelled by a Yule process conditioned on n contemporary nodes. The trait value is assumed to evolve along lineages as an Ornstein-Uhlenbeck process. As a result, the trait values of the n species form a sample with dependent observations. We establish three limit theorems for the sample mean corresponding to three domains for the adaptation rate. In the case of fast adaptation, we show that for large n the normalized sample mean is approximately normally distributed. Using these limit theorems, we develop novel confidence interval formulae for the optimal trait value.
|
20
|
Topology and inference for Yule trees with multiple states. J Math Biol 2016; 73:1251-1291. [PMID: 27009067 DOI: 10.1007/s00285-016-0992-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/27/2014] [Revised: 03/10/2016] [Indexed: 10/22/2022]
Abstract
We introduce two models for random trees with multiple states motivated by studies of trait dependence in the evolution of species. Our discrete time model, the multiple state ERM tree, is a generalization of Markov propagation models on a random tree generated by a binary search or 'equal rates Markov' mechanism. Our continuous time model, the multiple state Yule tree, is a generalization of the tree generated by a pure birth or Yule process to the tree generated by multi-type branching processes. We study state dependent topological properties of these two random tree models. We derive asymptotic results that allow one to infer model parameters from data on states at the leaves and at branch-points that are one step away from the leaves.
|
21
|
Bartoszek K, Sagitov S. A consistent estimator of the evolutionary rate. J Theor Biol 2015; 371:69-78. [DOI: 10.1016/j.jtbi.2015.01.019] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 09/16/2014] [Revised: 01/14/2015] [Accepted: 01/18/2015] [Indexed: 11/25/2022]
|
22
|
Mossel E, Steel M. Majority rule has transition ratio 4 on Yule trees under a 2-state symmetric model. J Theor Biol 2014; 360:315-318. [PMID: 25108194 DOI: 10.1016/j.jtbi.2014.07.029] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 04/10/2014] [Revised: 07/03/2014] [Accepted: 07/23/2014] [Indexed: 11/17/2022]
Abstract
Inferring the ancestral state at the root of a phylogenetic tree from states observed at the leaves is a problem arising in evolutionary biology. The simplest technique - majority rule - estimates the root state by the most frequently occurring state at the leaves. Alternative methods - such as maximum parsimony - explicitly take the tree structure into account. Since either method can outperform the other on particular trees, it is useful to consider the accuracy of the methods on trees generated under some evolutionary null model, such as a Yule pure-birth model. In this short note, we answer a recently posed question concerning the performance of majority rule on Yule trees under a symmetric 2-state Markovian substitution model of character state change. We show that majority rule is accurate precisely when the ratio of the birth (speciation) rate of the Yule process to the substitution rate exceeds the value 4. By contrast, maximum parsimony has been shown to be accurate only when this ratio is at least 6. Our proof relies on a second moment calculation, coupling, and a novel application of a reflection principle.
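As a rough illustration of the majority rule estimator described in this abstract, the sketch below returns the most frequent leaf state and ignores the tree entirely. The function name `majority_rule` and the choice to return a set when there are ties are assumptions made here, not details taken from the paper.

```python
from collections import Counter


def majority_rule(leaf_states):
    """Estimate the root state as the most frequent state among the leaves.

    leaf_states: non-empty iterable of observed character states.
    Returns a set so that ties between states are reported explicitly
    (an illustrative convention, not the paper's).
    """
    counts = Counter(leaf_states)
    top = max(counts.values())
    return {state for state, n in counts.items() if n == top}


# Toy example with a two-state character.
estimate = majority_rule(["a", "a", "b", "a"])
print(estimate)
```

Unlike maximum parsimony, this estimator uses only the leaf frequencies, which is why the two methods can disagree on a given tree.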
Affiliation(s)
- Mike Steel, Biomathematics Research Centre, University of Canterbury, Christchurch, New Zealand.
|