1
Morel M, Zhukova A, Lemoine F, Gascuel O. Accurate Detection of Convergent Mutations in Large Protein Alignments With ConDor. Genome Biol Evol 2024; 16:evae040. [PMID: 38451738 PMCID: PMC10986858 DOI: 10.1093/gbe/evae040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 01/30/2024] [Accepted: 02/22/2024] [Indexed: 03/09/2024]
Abstract
Evolutionary convergences are observed at all levels, from phenotype to DNA and protein sequences, and changes at these different levels tend to be correlated. Notably, convergent mutations can lead to convergent changes in phenotype, such as changes in metabolism, drug resistance, and other adaptations to changing environments. We propose a two-component approach to detect mutations subject to convergent evolution in protein alignments. The "Emergence" component selects mutations that emerge more often than expected, while the "Correlation" component selects mutations that correlate with the convergent phenotype under study. With regard to Emergence, a phylogeny deduced from the alignment is provided by the user and is used to simulate the evolution of each alignment position. These simulations allow us to estimate the expected number of mutations in a neutral model, which is compared to the observed number of mutations in the data studied. In Correlation, a comparative phylogenetic approach is used to measure whether the presence of each of the observed mutations is correlated with the convergent phenotype. Each component can be used on its own, for example Emergence when no phenotype is available. Our method is implemented in a standalone workflow and a webserver, called ConDor. We evaluate the properties of ConDor using simulated data, and we apply it to three real datasets: sedge PEPC proteins, HIV reverse transcriptase, and fish rhodopsin. The results show that the two components of ConDor complement each other, with an overall accuracy that compares favorably to other available tools, especially on large datasets.
Affiliation(s)
- Marie Morel
  - Institut Pasteur, Université Paris Cité, Unité Bioinformatique Evolutive, Paris, France
  - Université Claude Bernard Lyon 1, LBBE, UMR 5558, CNRS, VAS, Villeurbanne, 69100, France
- Anna Zhukova
  - Institut Pasteur, Université Paris Cité, Unité Bioinformatique Evolutive, Paris, France
  - Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, Paris, France
- Frédéric Lemoine
  - Institut Pasteur, Université Paris Cité, Unité Bioinformatique Evolutive, Paris, France
  - Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, Paris, France
  - Institut Pasteur, Université Paris Cité, CNR Virus Des Infections Respiratoires, Paris, France
- Olivier Gascuel
  - Institut Pasteur, Université Paris Cité, Unité Bioinformatique Evolutive, Paris, France
  - Institut de Systématique, Evolution, Biodiversité (UMR 7205 - CNRS, Muséum National d’Histoire Naturelle, SU, EPHE, UA), Paris, France
2
Harish A. Protein structures unravel the signatures and patterns of deep time evolution. QRB Discovery 2024; 5:e3. [PMID: 38616890 PMCID: PMC11016368 DOI: 10.1017/qrd.2024.4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 11/13/2023] [Accepted: 12/12/2023] [Indexed: 04/16/2024]
Abstract
The formulation and testing of hypotheses using 'big biology data' often lie at the interface of computational biology and structural biology. The Protein Data Bank (PDB), which was established about 50 years ago, catalogs three-dimensional (3D) shapes of organic macromolecules and showcases a structural view of biology. The comparative analysis of the structures of homologs, particularly of proteins, from different species has significantly improved the in-depth analyses of molecular and cell biological questions. In addition, computational tools that were developed to analyze the 'protein universe' are providing the means for efficient resolution of longstanding debates in cell and molecular evolution. In celebrating the golden jubilee of the PDB, much has been written about the transformative impact of PDB on a broad range of fields of scientific inquiry and how structural biology transformed the study of the fundamental processes of life. Yet, the transforming influence of PDB on one field of inquiry of fundamental interest, the reconstruction of the distant biological past, has gone almost unnoticed. Here, I discuss the recent advances to highlight how insights and tools of structural biology are bearing on the data required for the empirical resolution of vigorously debated and apparently contradicting hypotheses in evolutionary biology. Specifically, I show that evolutionary characters defined by protein structure are superior to conventional sequence characters for reliable, data-driven resolution of competing hypotheses about the origins of the major clades of life and the evolutionary relationships among those clades. Since the better quality data unequivocally support two primary domains of life, it is imperative that the primary classification of life be revised accordingly.
3
Gonçalves C, Harrison MC, Steenwyk JL, Opulente DA, LaBella AL, Wolters JF, Zhou X, Shen XX, Groenewald M, Hittinger CT, Rokas A. Diverse signatures of convergent evolution in cacti-associated yeasts. bioRxiv 2023:2023.09.14.557833. [PMID: 37745407 PMCID: PMC10515907 DOI: 10.1101/2023.09.14.557833] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 09/26/2023]
Abstract
Many distantly related organisms have convergently evolved traits and lifestyles that enable them to live in similar ecological environments. However, the extent of phenotypic convergence evolving through the same or distinct genetic trajectories remains an open question. Here, we leverage a comprehensive dataset of genomic and phenotypic data from 1,049 yeast species in the subphylum Saccharomycotina (Kingdom Fungi, Phylum Ascomycota) to explore signatures of convergent evolution in cactophilic yeasts, ecological specialists associated with cacti. We inferred that the ecological association of yeasts with cacti arose independently ~17 times. Using machine learning, we further found that cactophily can be predicted with 76% accuracy from functional genomic and phenotypic data. The most informative feature for predicting cactophily was thermotolerance, which is likely associated with duplication and altered evolutionary rates of genes impacting the cell envelope in several cactophilic lineages. We also identified horizontal gene transfer and duplication events of plant cell wall-degrading enzymes in distantly related cactophilic clades, suggesting that putatively adaptive traits evolved through disparate molecular mechanisms. Remarkably, multiple cactophilic lineages and their close relatives are emerging human opportunistic pathogens, suggesting that the cactophilic lifestyle (and perhaps, more generally, lifestyles favoring thermotolerance) may preadapt yeasts to cause human disease. This work underscores the potential of a multifaceted approach involving high-throughput genomic and phenotypic data to shed light on ecological adaptation and highlights how convergent evolution to wild environments could facilitate the transition to human pathogenicity.
Affiliation(s)
- Carla Gonçalves
  - Vanderbilt University, Department of Biological Sciences, VU Station B #35-1634, Nashville, TN 37235, United States of America
  - Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
  - Present address: Associate Laboratory i4HB - Institute for Health and Bioeconomy and UCIBIO - Applied Molecular Biosciences Unit, Department of Life Sciences, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
  - Present address: UCIBIO-i4HB, Departamento de Ciências da Vida, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Caparica, Portugal
- Marie-Claire Harrison
  - Vanderbilt University, Department of Biological Sciences, VU Station B #35-1634, Nashville, TN 37235, United States of America
  - Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- Jacob L. Steenwyk
  - Vanderbilt University, Department of Biological Sciences, VU Station B #35-1634, Nashville, TN 37235, United States of America
  - Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
  - Howard Hughes Medical Institute and the Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Dana A. Opulente
  - Laboratory of Genetics, DOE Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726, USA
  - Biology Department, Villanova University, Villanova, PA 19085, USA
- Abigail L. LaBella
  - Vanderbilt University, Department of Biological Sciences, VU Station B #35-1634, Nashville, TN 37235, United States of America
  - Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223
- John F. Wolters
  - Laboratory of Genetics, DOE Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726, USA
- Xiaofan Zhou
  - Vanderbilt University, Department of Biological Sciences, VU Station B #35-1634, Nashville, TN 37235, United States of America
  - Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
  - Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Center, South China Agricultural University, Guangzhou 510642, China
- Xing-Xing Shen
  - Vanderbilt University, Department of Biological Sciences, VU Station B #35-1634, Nashville, TN 37235, United States of America
  - Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
  - College of Agriculture and Biotechnology and Centre for Evolutionary & Organismal Biology, Zhejiang University, Hangzhou 310058, China
- Chris Todd Hittinger
  - Laboratory of Genetics, DOE Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726, USA
- Antonis Rokas
  - Vanderbilt University, Department of Biological Sciences, VU Station B #35-1634, Nashville, TN 37235, United States of America
  - Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
4
Hu Y, Wang X, Xu Y, Yang H, Tong Z, Tian R, Xu S, Yu L, Guo Y, Shi P, Huang S, Yang G, Shi S, Wei F. Molecular mechanisms of adaptive evolution in wild animals and plants. Science China Life Sciences 2023; 66:453-495. [PMID: 36648611 PMCID: PMC9843154 DOI: 10.1007/s11427-022-2233-x] [Citation(s) in RCA: 24] [Impact Index Per Article: 24.0] [Received: 08/07/2022] [Accepted: 08/30/2022] [Indexed: 01/18/2023]
Abstract
Wild animals and plants have developed a variety of adaptive traits driven by adaptive evolution, an important strategy for species survival and persistence. Uncovering the molecular mechanisms of adaptive evolution is the key to understanding species diversification, phenotypic convergence, and inter-species interaction. As the genome sequences of more and more non-model organisms are becoming available, the focus of studies on molecular mechanisms of adaptive evolution has shifted from the candidate gene method to genetic mapping based on genome-wide scanning. In this study, we reviewed the latest research advances in wild animals and plants, focusing on adaptive traits, convergent evolution, and coevolution. Firstly, we focused on the adaptive evolution of morphological, behavioral, and physiological traits. Secondly, we reviewed the phenotypic convergences of life history traits and responding to environmental pressures, and the underlying molecular convergence mechanisms. Thirdly, we summarized the advances of coevolution, including the four main types: mutualism, parasitism, predation and competition. Overall, these latest advances greatly increase our understanding of the underlying molecular mechanisms for diverse adaptive traits and species interaction, demonstrating that the development of evolutionary biology has been greatly accelerated by multi-omics technologies. Finally, we highlighted the emerging trends and future prospects around the above three aspects of adaptive evolution.
Affiliation(s)
- Yibo Hu
  - CAS Key Lab of Animal Ecology and Conservation Biology, Chinese Academy of Sciences, Beijing, 100101, China
- Xiaoping Wang
  - State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, School of Life Sciences, Yunnan University, Kunming, 650091, China
- Yongchao Xu
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
- Hui Yang
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650201, China
- Zeyu Tong
  - Institute of Evolution and Ecology, School of Life Sciences, Central China Normal University, Wuhan, 430079, China
- Ran Tian
  - College of Life Sciences, Nanjing Normal University, Nanjing, 210023, China
- Shaohua Xu
  - State Key Laboratory of Biocontrol, Guangdong Key Lab of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, China
- Li Yu
  - State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, School of Life Sciences, Yunnan University, Kunming, 650091, China
- Yalong Guo
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
- Peng Shi
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650201, China
- Shuangquan Huang
  - Institute of Evolution and Ecology, School of Life Sciences, Central China Normal University, Wuhan, 430079, China
- Guang Yang
  - Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou, 511458, China
  - College of Life Sciences, Nanjing Normal University, Nanjing, 210023, China
- Suhua Shi
  - State Key Laboratory of Biocontrol, Guangdong Key Lab of Plant Resources, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, China
- Fuwen Wei
  - CAS Key Lab of Animal Ecology and Conservation Biology, Chinese Academy of Sciences, Beijing, 100101, China
  - Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou, 511458, China
5
Runthala A, Mbye M, Ayyash M, Xu Y, Kamal-Eldin A. Caseins: Versatility of Their Micellar Organization in Relation to the Functional and Nutritional Properties of Milk. Molecules 2023; 28:2023. [PMID: 36903269 PMCID: PMC10004547 DOI: 10.3390/molecules28052023] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 12/27/2022] [Revised: 02/10/2023] [Accepted: 02/11/2023] [Indexed: 02/24/2023]
Abstract
The milk of mammals is a complex fluid mixture of various proteins, minerals, lipids, and other micronutrients that play a critical role in providing nutrition and immunity to newborns. Casein proteins together with calcium phosphate form large colloidal particles, called casein micelles. Caseins and their micelles have received great scientific interest, but their versatility and role in the functional and nutritional properties of milk from different animal species are not fully understood. Caseins belong to a class of proteins that exhibit open and flexible conformations. Here, we discuss the key features that maintain the structures of the protein sequences in four selected animal species: cow, camel, human, and African elephant. The primary sequences of these proteins and their posttranslational modifications (phosphorylation and glycosylation) that determine their secondary structures have distinctively evolved in these different animal species, leading to differences in their structural, functional, and nutritional properties. The variability in the structures of milk caseins influences the properties of their dairy products, such as cheese and yogurt, as well as their digestibility and allergic properties. Such differences are beneficial to the development of different functionally improved casein molecules with variable biological and industrial utilities.
Affiliation(s)
- Ashish Runthala
  - Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Vijayawada 522302, India
  - Correspondence: (A.R.); (A.K.-E.); Tel.: +971-5-0138-9248 (A.K.-E.)
- Mustapha Mbye
  - Department of Food Science, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Mutamed Ayyash
  - Department of Food Science, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
- Yajun Xu
  - Department of Nutrition and Food Hygiene, School of Public Health, Peking University, Beijing 100871, China
- Afaf Kamal-Eldin
  - Department of Food Science, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
  - Zayed Bin Sultan Center for Health Sciences, United Arab Emirates University, Al Ain P.O. Box 15551, United Arab Emirates
  - Correspondence: (A.R.); (A.K.-E.); Tel.: +971-5-0138-9248 (A.K.-E.)
6
Abstract
Paleoproteomics, the study of ancient proteins, is a rapidly growing field at the intersection of molecular biology, paleontology, archaeology, paleoecology, and history. Paleoproteomics research leverages the longevity and diversity of proteins to explore fundamental questions about the past. While its origins predate the characterization of DNA, it was only with the advent of soft ionization mass spectrometry that the study of ancient proteins became truly feasible. Technological gains over the past 20 years have allowed increasing opportunities to better understand preservation, degradation, and recovery of the rich bioarchive of ancient proteins found in the archaeological and paleontological records. Growing from a handful of studies in the 1990s on individual highly abundant ancient proteins, paleoproteomics today is an expanding field with diverse applications ranging from the taxonomic identification of highly fragmented bones and shells and the phylogenetic resolution of extinct species to the exploration of past cuisines from dental calculus and pottery food crusts and the characterization of past diseases. More broadly, these studies have opened new doors in understanding past human-animal interactions, the reconstruction of past environments and environmental changes, the expansion of the hominin fossil record through large scale screening of nondiagnostic bone fragments, and the phylogenetic resolution of the vertebrate fossil record. Even with these advances, much of the ancient proteomic record still remains unexplored. Here we provide an overview of the history of the field, a summary of the major methods and applications currently in use, and a critical evaluation of current challenges. We conclude by looking to the future, for which innovative solutions and emerging technology will play an important role in enabling us to access the still unexplored "dark" proteome, allowing for a fuller understanding of the role ancient proteins can play in the interpretation of the past.
Affiliation(s)
- Christina Warinner
  - Department of Anthropology, Harvard University, Cambridge, Massachusetts 02138, United States
  - Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
- Kristine Korzow Richter
  - Department of Anthropology, Harvard University, Cambridge, Massachusetts 02138, United States
- Matthew J. Collins
  - Department of Archaeology, Cambridge University, Cambridge CB2 3DZ, United Kingdom
  - Section for Evolutionary Genomics, Globe Institute, University of Copenhagen, Copenhagen 1350, Denmark
7
Sun X, Cheng J. Phylogenetic Signal Dissection of Heterogeneous 28S and 16S rRNA Genes in Spinicaudata (Branchiopoda, Diplostraca). Genes (Basel) 2021; 12:1705. [PMID: 34828311 PMCID: PMC8625258 DOI: 10.3390/genes12111705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2021] [Revised: 10/20/2021] [Accepted: 10/26/2021] [Indexed: 12/05/2022]
Abstract
It is still a challenge to reconstruct the deep phylogenetic relationships within spinicaudatans, and there are several different competing hypotheses regarding the interrelationships among Eocyzicidae, Cyzicidae s. s., Leptestheriidae, and Limnadiidae of the Suborder Spinicaudata. In order to explore the source of the inconsistencies, we focus on the sequence variation and the structure model of two rRNA genes based on extensive taxa sampling. The comparative sequence analysis revealed heterogeneity across species and the existence of conserved motifs in all spinicaudatan species. The level of intraspecific heterogeneity differed among species, which suggested that some species might have undergone a relaxed concerted evolution with respect to the 28S rRNA gene. The Bayesian analyses were performed on nuclear (28S rRNA, EF1α) and mitochondrial (16S rRNA, COI) genes. Further, we investigated compositional heterogeneity between lineages and assessed the potential for phylogenetic noise compared to signal in the combined data set. Reducing the non-phylogenetic signals and application of optimal rRNA model recovered a topology congruent with inference from the transcriptome data, whereby Limnadiidae was placed as a sister group to Leptestheriidae + Eocyzicidae with high support (topology I). Tests of alternative hypotheses provided implicit support for four competing topologies, and topology I was the best.
Affiliation(s)
- Jinhui Cheng
  - State Key Laboratory of Palaeobiology and Stratigraphy, Nanjing Institute of Geology and Palaeontology and Center for Excellence in Life and Palaeoenvironment, Chinese Academy of Sciences, No. 39, Beijing Eastroad, Nanjing 210008, China
8
Ito RK, Harada S, Tabata R, Watanabe K. Molecular evolution and convergence of the rhodopsin gene in Gymnogobius, a goby group having diverged into coastal to freshwater habitats. J Evol Biol 2021; 35:333-346. [PMID: 34689368 DOI: 10.1111/jeb.13955] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/16/2020] [Revised: 10/03/2021] [Accepted: 10/12/2021] [Indexed: 12/20/2022]
Abstract
Adaptive evolution of vision-related genes has been frequently observed in the process of invasion of new environments in a wide range of animal taxa. The typical example is that of the molecular evolution of rhodopsin associated with habitat changes in aquatic animals. However, few studies have investigated rhodopsin evolution during adaptive radiation across various habitats. In the present study, we examined the link between molecular evolutionary patterns in the rhodopsin gene and macroscopic habitat changes in Gymnogobius species (Gobiidae), which have adaptively radiated to diverse aquatic habitats including the sea, brackish waters, rivers and lakes. Analysis of amino acid substitutions in rhodopsin in the phylogenetic framework revealed convergent substitutions in 4-5 amino acids in three groups (four species), including two spectral tuning amino acid sites known to change rhodopsin's absorption wavelength. Positive selection was detected in the basal branches of each of these three groups, suggesting adaptive molecular convergence of rhodopsin. However, no significant correlation was observed between amino acid substitutions and the species' habitat changes, suggesting molecular adaptation to some unidentified micro-ecological environments. Taken together, these results emphasize the importance of considering not only macroscopic habitats but also micro-ecological environments when elucidating the driving forces of adaptive evolution of the visual system.
Affiliation(s)
- Ryosuke K Ito
  - Division of Biological Sciences, Department of Zoology, Graduate School of Science, Kyoto University, Kyoto City, Japan
- Shigeo Harada
  - Resource Management Division, Fisheries Bureau, Agriculture, Forestry and Fisheries Department, Wakayama Prefectural Government, Wakayama City, Japan
- Katsutoshi Watanabe
  - Division of Biological Sciences, Department of Zoology, Graduate School of Science, Kyoto University, Kyoto City, Japan
9
Youssef N, Susko E, Roger AJ, Bielawski JP. Shifts in amino acid preferences as proteins evolve: A synthesis of experimental and theoretical work. Protein Sci 2021; 30:2009-2028. [PMID: 34322924 PMCID: PMC8442975 DOI: 10.1002/pro.4161] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/13/2021] [Revised: 07/19/2021] [Accepted: 07/26/2021] [Indexed: 11/08/2022]
Abstract
Amino acid preferences vary across sites and time. While variation across sites is widely accepted, the extent and frequency of temporal shifts are contentious. Our understanding of the drivers of amino acid preference change is incomplete: To what extent are temporal shifts driven by adaptive versus nonadaptive evolutionary processes? We review phenomena that cause preferences to vary (e.g., evolutionary Stokes shift, contingency, and entrenchment) and clarify how they differ. To determine the extent and prevalence of shifted preferences, we review experimental and theoretical studies. Analyses of natural sequence alignments often detect decreases in homoplasy (convergence and reversions) rates, and variation in replacement rates with time, signals that are consistent with temporally changing preferences. While approaches inferring shifts in preferences from patterns in natural alignments are valuable, they are indirect since multiple mechanisms (both adaptive and nonadaptive) could lead to the observed signal. Alternatively, site-directed mutagenesis experiments allow for a more direct assessment of shifted preferences. They corroborate evidence from multiple sequence alignments, revealing that the preference for an amino acid at a site varies depending on the background sequence. However, shifts in preferences are usually minor in magnitude and sites with significantly shifted preferences are low in frequency. The small yet consistent perturbations in preferences could, nevertheless, jeopardize the accuracy of inference procedures, which assume constant preferences. We conclude by discussing if and how such shifts in preferences might influence widely used time-homogeneous inference procedures and potential ways to mitigate such effects.
Affiliation(s)
- Noor Youssef
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Edward Susko
  - Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
- Andrew J. Roger
  - Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Joseph P. Bielawski
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
  - Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
10
Literman R, Schwartz R. Genome-Scale Profiling Reveals Noncoding Loci Carry Higher Proportions of Concordant Data. Mol Biol Evol 2021; 38:2306-2318. [PMID: 33528497 PMCID: PMC8136493 DOI: 10.1093/molbev/msab026] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/11/2022]
Abstract
Many evolutionary relationships remain controversial despite whole-genome sequencing data. These controversies arise, in part, due to challenges associated with accurately modeling the complex phylogenetic signal coming from genomic regions experiencing distinct evolutionary forces. Here, we examine how different regions of the genome support or contradict well-established relationships among three mammal groups using millions of orthologous parsimony-informative biallelic sites (PIBS) distributed across primate, rodent, and Pecora genomes. We compared PIBS concordance percentages among locus types (e.g., coding sequences (CDS), introns, intergenic regions), and contrasted PIBS utility over evolutionary timescales. Sites derived from noncoding sequences provided more data and proportionally more concordant sites compared with those from CDS in all clades. CDS PIBS were also predominant drivers of tree incongruence in two cases of topological conflict. PIBS derived from most locus types provided surprisingly consistent support for splitting events spread across the timescales we examined, although we find evidence that CDS and intronic PIBS may, respectively and to a limited degree, inform disproportionately about older and younger splits. In this era of accessible whole-genome sequence data, these results 1) suggest benefits to more intentionally focusing on noncoding loci as robust data for tree inference and 2) reinforce the importance of accurate modeling, especially when using CDS data.
Affiliation(s)
- Robert Literman
  - Department of Biological Sciences, University of Rhode Island, South Kingstown, RI, USA
  - Center for Food Safety and Applied Nutrition, Office of Regulatory Science, U.S. Food and Drug Administration, College Park, MD, USA
- Rachel Schwartz
  - Department of Biological Sciences, University of Rhode Island, South Kingstown, RI, USA
11
Merényi Z, Prasanna AN, Wang Z, Kovács K, Hegedüs B, Bálint B, Papp B, Townsend JP, Nagy LG. Unmatched Level of Molecular Convergence among Deeply Divergent Complex Multicellular Fungi. Mol Biol Evol 2021; 37:2228-2240. [PMID: 32191325 PMCID: PMC7403615 DOI: 10.1093/molbev/msaa077] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 02/06/2023]
Abstract
Convergent evolution is pervasive in nature, but it is poorly understood how various constraints and natural selection limit the diversity of evolvable phenotypes. Here, we analyze the transcriptome across fruiting body development to understand the independent evolution of complex multicellularity in the two largest clades of fungi—the Agarico- and Pezizomycotina. Despite >650 My of divergence between these clades, we find that very similar sets of genes have convergently been co-opted for complex multicellularity, followed by expansions of their gene families by duplications. Over 82% of shared multicellularity-related gene families were expanding in both clades, indicating a high prevalence of convergence also at the gene family level. This convergence is coupled with a rich inferred repertoire of multicellularity-related genes in the most recent common ancestor of the Agarico- and Pezizomycotina, consistent with the hypothesis that the coding capacity of ancestral fungal genomes might have promoted the repeated evolution of complex multicellularity. We interpret this repertoire as an indication of evolutionary predisposition of fungal ancestors for evolving complex multicellular fruiting bodies. Our work suggests that evolutionary convergence may happen not only when organisms are closely related or are under similar selection pressures, but also when ancestral genomic repertoires render certain evolutionary trajectories more likely than others, even across large phylogenetic distances.
Affiliation(s)
- Zsolt Merényi
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Center, Szeged, Hungary
- Arun N Prasanna
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Center, Szeged, Hungary
- Zheng Wang
- Department of Biostatistics, Yale University, New Haven, CT
- Károly Kovács
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Center, Szeged, Hungary; Hungarian Centre of Excellence for Molecular Medicine, Metabolic Systems Biology Lab, Szeged, Hungary
- Botond Hegedüs
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Center, Szeged, Hungary
- Balázs Bálint
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Center, Szeged, Hungary
- Balázs Papp
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Center, Szeged, Hungary; Hungarian Centre of Excellence for Molecular Medicine, Metabolic Systems Biology Lab, Szeged, Hungary
- Jeffrey P Townsend
- Department of Biostatistics, Yale University, New Haven, CT; Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT
- László G Nagy
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Center, Szeged, Hungary
|
12
|
Pinney MM, Mokhtari DA, Akiva E, Yabukarski F, Sanchez DM, Liang R, Doukov T, Martinez TJ, Babbitt PC, Herschlag D. Parallel molecular mechanisms for enzyme temperature adaptation. Science 2021; 371:eaay2784. [PMID: 33674467 DOI: 10.1126/science.aay2784] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 06/05/2019] [Revised: 08/23/2020] [Accepted: 01/04/2021] [Indexed: 12/13/2022]
Abstract
The mechanisms that underlie the adaptation of enzyme activities and stabilities to temperature are fundamental to our understanding of molecular evolution and how enzymes work. Here, we investigate the molecular and evolutionary mechanisms of enzyme temperature adaptation, combining deep mechanistic studies with comprehensive sequence analyses of thousands of enzymes. We show that temperature adaptation in ketosteroid isomerase (KSI) arises primarily from one residue change with limited, local epistasis, and we establish the underlying physical mechanisms. This residue change occurs in diverse KSI backgrounds, suggesting parallel adaptation to temperature. We identify residues associated with organismal growth temperature across 1005 diverse bacterial enzyme families, suggesting widespread parallel adaptation to temperature. We assess the residue properties, molecular interactions, and interaction networks that appear to underlie temperature adaptation.
Affiliation(s)
- Margaux M Pinney
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
- Daniel A Mokhtari
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
- Eyal Akiva
- Department of Bioengineering and Therapeutic Sciences and Quantitative Biosciences Institute, University of California, San Francisco, CA 94158, USA
- Filip Yabukarski
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA; Chan Zuckerberg Biohub, San Francisco, CA 94110, USA
- David M Sanchez
- Department of Chemistry, Stanford University, Stanford, CA 94305, USA; Department of Photon Sciences, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
- Ruibin Liang
- Department of Chemistry, Stanford University, Stanford, CA 94305, USA; Department of Photon Sciences, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
- Tzanko Doukov
- Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
- Todd J Martinez
- Department of Chemistry, Stanford University, Stanford, CA 94305, USA; Department of Photon Sciences, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
- Patricia C Babbitt
- Department of Bioengineering and Therapeutic Sciences and Quantitative Biosciences Institute, University of California, San Francisco, CA 94158, USA
- Daniel Herschlag
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA; Department of Chemical Engineering, Stanford University, Stanford, CA 94305, USA; Stanford ChEM-H, Stanford University, Stanford, CA 94305, USA
|
13
|
Lu B, Jin H, Fu J. Molecular convergent and parallel evolution among four high-elevation anuran species from the Tibetan region. BMC Genomics 2020; 21:839. [PMID: 33246413 PMCID: PMC7694343 DOI: 10.1186/s12864-020-07269-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 07/16/2019] [Accepted: 11/23/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: To date, evidence for the relative prevalence or rarity of molecular convergent and parallel evolution is conflicting, and understanding of how these processes contribute to adaptation is limited. We compared four high-elevation anuran species (Bufo tibetanus, Nanorana parkeri, Rana kukunoris and Scutiger boulengeri) from the Tibetan region, and examined convergent and parallel amino acid substitutions between them and how they may have contributed to high-elevation adaptation. RESULTS: Genomic data of the four high-elevation species and eight of their low-elevation close relatives were gathered. A total of 1098 orthologs shared by all species were identified. We first conducted pairwise comparisons using Zhang and Kumar's test. Then, the Rconv index was calculated and convergence/divergence correlation plotting was conducted. Furthermore, genes under positive selection and with elevated evolutionary rate were examined. We detected a large number of amino acid sites with convergent or parallel substitutions. Several pairs of high-elevation species, in particular, R. kukunoris vs N. parkeri and B. tibetanus vs S. boulengeri, had excessive amounts of convergent substitutions compared to neutral expectation. Nevertheless, these sites were mostly concentrated in a small number of genes (3-32), and no genome-wide convergence was detected. Furthermore, the majority of these convergent genes were neither under detectable positive selection nor had elevated evolutionary rates, although functional prediction analysis suggested some of the convergent genes could potentially contribute to high-elevation adaptation. CONCLUSIONS: There is a substantial amount of convergent evolution at the amino-acid level among high-elevation amphibians, although these sites are concentrated in a few genes, not widespread across the genomes. This may be attributable to the fact that all the target species are from the same environment. The relative prevalence of convergent substitutions among high-elevation amphibians provides an excellent opportunity for further study of molecular convergent evolution.
Affiliation(s)
- Bin Lu
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, China
- Hong Jin
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, China; University of the Chinese Academy of Sciences, Beijing, China
- Jinzhong Fu
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, China; Department of Integrative Biology, University of Guelph, Guelph, Canada
|
14
|
Xu S, Wang J, Guo Z, He Z, Shi S. Genomic Convergence in the Adaptation to Extreme Environments. Plant Commun 2020; 1:100117. [PMID: 33367270 PMCID: PMC7747959 DOI: 10.1016/j.xplc.2020.100117] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 08/23/2020] [Revised: 10/12/2020] [Accepted: 10/28/2020] [Indexed: 05/08/2023]
Abstract
Convergent evolution is especially common in plants that have independently adapted to the same extreme environments (i.e., extremophile plants). The recent burst of omics data has alleviated many limitations that have hampered molecular convergence studies of non-model extremophile plants. In this review, we summarize cases of genomic convergence in these taxa to examine the extent and type of genomic convergence during the process of adaptation to extreme environments. Despite being well studied by candidate gene approaches, convergent evolution at individual sites is rare and often has a high false-positive rate when assessed in whole genomes. By contrast, genomic convergence at higher genetic levels has been detected during adaptation to the same extreme environments. Examples include the convergence of biological pathways and changes in gene expression, gene copy number, amino acid usage, and GC content. Higher convergence levels play important roles in the adaptive evolution of extremophiles and may be more frequent and involve more genes. In several cases, multiple types of convergence events have been found to co-occur. However, empirical and theoretical studies of this higher level convergent evolution are still limited. In conclusion, both the development of powerful approaches and the detection of convergence at various genetic levels are needed to further reveal the genetic mechanisms of plant adaptation to extreme environments.
Affiliation(s)
- Shaohua Xu
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong, China
- Jiayan Wang
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong, China
- Zixiao Guo
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong, China
- Ziwen He
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong, China
- Suhua Shi
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong, China
|
15
|
Burskaia V, Naumenko S, Schelkunov M, Bedulina D, Neretina T, Kondrashov A, Yampolsky L, Bazykin GA. Excessive Parallelism in Protein Evolution of Lake Baikal Amphipod Species Flock. Genome Biol Evol 2020; 12:1493-1503. [PMID: 32653919 DOI: 10.1093/gbe/evaa138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/03/2020] [Indexed: 11/12/2022] Open
Abstract
Repeated emergence of similar adaptations is often explained by parallel evolution of underlying genes. However, evidence of parallel evolution at the amino acid level is limited. When the analyzed species are highly divergent, this can be due to epistatic interactions underlying the dynamic nature of the amino acid preferences: the same amino acid substitution may have different phenotypic effects on different genetic backgrounds. Distantly related species also often inhabit radically different environments, which makes the emergence of parallel adaptations less likely. Here, we hypothesize that parallel molecular adaptations are more prevalent between closely related species. We analyze the rate of parallel evolution in genome-size sets of orthologous genes in three groups of species with widely ranging levels of divergence: 46 species of the relatively recent Lake Baikal amphipod radiation, a species flock of very closely related cichlids, and a set of significantly more divergent vertebrates. Strikingly, in genes of amphipods, the rate of parallel substitutions at nonsynonymous sites exceeded that at synonymous sites, suggesting rampant selection driving parallel adaptation. At sites of parallel substitutions, the intraspecies polymorphism is low, suggesting that parallelism has been driven by positive selection and is therefore adaptive. By contrast, in cichlids, the rate of nonsynonymous parallel evolution was similar to that at synonymous sites, whereas in vertebrates, this rate was lower than that at synonymous sites, indicating that in these groups of species, parallel substitutions are mainly fixed by drift.
Affiliation(s)
- Valentina Burskaia
- Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow, Moscow Oblast, Russia
- Sergey Naumenko
- Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevitch Institute), Moscow, Russia
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- Mikhail Schelkunov
- Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow, Moscow Oblast, Russia
- Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevitch Institute), Moscow, Russia
- Daria Bedulina
- Institute of Biology, Irkutsk State University, Russia
- Baikal Research Centre, Irkutsk, Russia
- Tatyana Neretina
- Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevitch Institute), Moscow, Russia
- N.A. Pertsov White Sea Biological Station, Lomonosov Moscow State University, Primorskiy, Russia
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Russia
- Alexey Kondrashov
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Russia
- Department of Ecology and Evolutionary Biology, University of Michigan
- Lev Yampolsky
- Department of Biological Sciences, East Tennessee State University
- Georgii A Bazykin
- Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow, Moscow Oblast, Russia
- Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevitch Institute), Moscow, Russia
|
16
|
Recurrent sequence evolution after independent gene duplication. BMC Evol Biol 2020; 20:98. [PMID: 32770961 PMCID: PMC7414715 DOI: 10.1186/s12862-020-01660-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/17/2020] [Accepted: 07/17/2020] [Indexed: 11/10/2022] Open
Abstract
Background: Convergent and parallel evolution provide unique insights into the mechanisms of natural selection. Some of the most striking convergent and parallel (collectively recurrent) amino acid substitutions in proteins are adaptive, but there are also many that are selectively neutral. Accordingly, genome-wide assessment has shown that recurrent sequence evolution in orthologs is chiefly explained by nearly neutral evolution. For paralogs, more frequent functional change is expected because additional copies are generally not retained if they do not acquire their own niche. Yet, it is unknown to what extent recurrent sequence differentiation is discernible after independent gene duplications in different eukaryotic taxa. Results: We develop a framework that detects patterns of recurrent sequence evolution in duplicated genes. This is used to analyze the genomes of 90 diverse eukaryotes. We find a remarkable number of families with a potentially predictable functional differentiation following gene duplication. In some protein families, more than ten independent duplications show a similar sequence-level differentiation between paralogs. Based on further analysis, the sequence divergence is found to be generally asymmetric. Moreover, about 6% of the recurrent sequence evolution between paralog pairs can be attributed to recurrent differentiation of subcellular localization. Finally, we reveal the specific recurrent patterns for the gene families Hint1/Hint2, Sco1/Sco2 and vma11/vma3. Conclusions: The presented methodology provides a means to study the biochemical underpinning of functional differentiation between paralogs. For instance, two abundantly repeated substitutions are identified between independently derived Sco1 and Sco2 paralogs. Such identified substitutions allow direct experimental testing of the biological role of these residues for the repeated functional differentiation. We also uncover a diverse set of families with recurrent sequence evolution and reveal trends in the functional and evolutionary trajectories of this hitherto understudied phenomenon.
|
17
|
Abstract
Background: Locating the root node of the "tree of life" (ToL) is one of the hardest problems in phylogenetics, given the time depth. The root node, or the universal common ancestor (UCA), groups descendants into organismal clades/domains. Two notable variants of the two-domains ToL (2D-ToL) have gained support recently. One 2D-ToL posits that eukaryotes (organisms with nuclei) and akaryotes (organisms without nuclei) are sister clades that diverged from the UCA, and that Asgard archaea are sister to other archaea. The other 2D-ToL proposes that eukaryotes emerged from within archaea and places Asgard archaea as sister to eukaryotes. Williams et al. (Nature Ecol. Evol. 4: 138-147; 2020) re-evaluated the data and methods that support the competing two-domains proposals and concluded that eukaryotes are the closest relatives of Asgard archaea. Critique: The poor resolution of the archaea in their analysis, despite employing amino acid alignments from thousands of proteins and the best-fitting substitution models, contradicts their conclusions. We argue that they overlooked important aspects of estimating evolutionary relatedness and assessing phylogenetic signal in empirical data. Which 2D-ToL is better supported depends on which kind of molecular features are better for resolving common ancestors at the roots of clades: protein domains or their component amino acids. We focus on phylogenetic character reconstructions necessary to describe the UCA or its closest descendants in the absence of reliable fossils. Clarifications: It is well known that different character types present different perspectives on evolutionary history that relate to different phylogenetic depths. We show that protein structural domains support more reliable phylogenetic reconstructions of deep-diverging clades in the ToL. Accordingly, Eukaryotes and Akaryotes are better supported clades in a 2D-ToL.
Affiliation(s)
- David Morrison
- Department of Organismal Biology, Systematic Biology, Uppsala University, Uppsala, 752 36, Sweden
|
18
|
Miller JB, McKinnon LM, Whiting MF, Ridge PG. Codon use and aversion is largely phylogenetically conserved across the tree of life. Mol Phylogenet Evol 2019; 144:106697. [PMID: 31805345 DOI: 10.1016/j.ympev.2019.106697] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 07/23/2018] [Revised: 04/10/2019] [Accepted: 11/29/2019] [Indexed: 01/11/2023]
Abstract
Using parsimony, we analyzed codon usages across 12,337 species and 25,727 orthologous genes to rank specific genes and codons according to their phylogenetic signal. We examined each codon within each ortholog to determine the codon usage for each species. In total, 890,814 codons were parsimony informative. Next, we compared species that used a codon with species that did not use the codon. We assessed each codon's congruence with species relationships provided in the Open Tree of Life (OTL) and determined the statistical probability of observing these results by random chance. We determined that 25,771 codons had no parallelisms or reversals when mapped to the OTL. Codon usages from orthologous genes spanning many species were 1109× more likely to be congruent with species relationships in the OTL than would be expected by random chance. Using the OTL as a reference, we show that codon usage is phylogenetically conserved within orthologous genes in archaea, bacteria, plants, mammals, and other vertebrates. We also show how to use our provided framework to test different tree hypotheses by confirming the placement of turtles as sister taxa to archosaurs.
Affiliation(s)
- Justin B Miller
- Department of Biology, Brigham Young University, Provo, UT 84602, USA
- Lauren M McKinnon
- Department of Biology, Brigham Young University, Provo, UT 84602, USA
- Michael F Whiting
- Department of Biology, Brigham Young University, Provo, UT 84602, USA; M.L. Bean Museum, Brigham Young University, Provo, UT 84602, USA
- Perry G Ridge
- Department of Biology, Brigham Young University, Provo, UT 84602, USA
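The Miller et al. abstract above counts codons that are "parsimony informative" across species. As a rough illustration only, the sketch below applies the standard textbook criterion to a binary used/not-used character: each state must occur in at least two taxa. The function name `is_parsimony_informative` and the input layout are assumptions made for this example and are not taken from the authors' framework.

```python
def is_parsimony_informative(uses_codon):
    """Decide whether a binary presence/absence character is parsimony informative.

    uses_codon: dict mapping species name -> True if the species uses the
    codon in a given ortholog, False otherwise (hypothetical input layout).
    A binary character is informative when both states occur in at least
    two species, so that it can favour one tree topology over another.
    """
    used = sum(1 for present in uses_codon.values() if present)
    unused = len(uses_codon) - used
    return used >= 2 and unused >= 2


# Toy data: two species use the codon and two do not.
informative = is_parsimony_informative(
    {"sp1": True, "sp2": True, "sp3": False, "sp4": False}
)
# Toy data: only one species uses the codon (a singleton).
singleton = is_parsimony_informative(
    {"sp1": True, "sp2": False, "sp3": False, "sp4": False}
)
```

A singleton tells us nothing about how species group together, which is why such characters are excluded from parsimony-based counts.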
|
19
|
Burbrink FT, Grazziotin FG, Pyron RA, Cundall D, Donnellan S, Irish F, Keogh JS, Kraus F, Murphy RW, Noonan B, Raxworthy CJ, Ruane S, Lemmon AR, Lemmon EM, Zaher H. Interrogating Genomic-Scale Data for Squamata (Lizards, Snakes, and Amphisbaenians) Shows no Support for Key Traditional Morphological Relationships. Syst Biol 2019; 69:502-520. [DOI: 10.1093/sysbio/syz062] [Citation(s) in RCA: 119] [Impact Index Per Article: 23.8] [Received: 01/15/2019] [Revised: 09/05/2019] [Accepted: 09/10/2019] [Indexed: 12/15/2022] Open
Abstract
Genomics is narrowing uncertainty in the phylogenetic structure for many amniote groups. For one of the most diverse and species-rich groups, the squamate reptiles (lizards, snakes, and amphisbaenians), an inverse correlation between the number of taxa and loci sampled still persists across all publications using DNA sequence data, and reaching a consensus on the relationships among them has been highly problematic. In this study, we use high-throughput sequence data from 289 samples covering 75 families of squamates to address phylogenetic affinities, estimate divergence times, and characterize residual topological uncertainty in the presence of genome-scale data. Importantly, we address genomic support for the traditional taxonomic groupings Scleroglossa and Macrostomata using novel machine-learning techniques. We interrogate genes using various metrics inherent to these loci, including parsimony-informative sites (PIS), phylogenetic informativeness, length, gaps, number of substitutions, and site concordance to understand why certain loci fail to find previously well-supported molecular clades and how they fail to support species-tree estimates. We show that both incomplete lineage sorting and poor gene-tree estimation (due to a few undesirable gene properties, such as an insufficient number of PIS) may account for most gene and species-tree discordance. We find overwhelming signal for Toxicofera, and also show that none of the loci included in this study supports Scleroglossa or Macrostomata. We comment on the origins and diversification of Squamata throughout the Mesozoic and underscore remaining uncertainties that persist in both deeper parts of the tree (e.g., relationships between Dibamia, Gekkota, and remaining squamates; among the three toxicoferan clades Iguania, Serpentes, and Anguiformes) and within specific clades (e.g., affinities among gekkotan, pleurodont iguanians, and colubroid families).
Affiliation(s)
- Frank T Burbrink
- Department of Herpetology, The American Museum of Natural History, 79th Street at Central Park West, New York, NY 10024, USA
- Felipe G Grazziotin
- Laboratório de Coleções Zoológicas, Instituto Butantan, Av. Vital Brasil, 1500—Butantã, São Paulo—SP 05503-900, Brazil
- R Alexander Pyron
- Department of Biological Sciences, The George Washington University, Washington, DC 20052, USA
- David Cundall
- Department of Biological Sciences, 1 W. Packer Avenue, Lehigh University, Bethlehem, PA 18015, USA
- Steve Donnellan
- South Australian Museum, North Terrace, Adelaide SA 5000, Australia
- School of Biological Sciences, University of Adelaide, SA 5005 Australia
- Frances Irish
- Department of Biological Sciences, Moravian College, 1200 Main St, Bethlehem, PA 18018, US
- J Scott Keogh
- Division of Ecology and Evolution, Research School of Biology, The Australian National University, Canberra, ACT 2601, Australia
- Fred Kraus
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
- Robert W Murphy
- Department of Natural History, Royal Ontario Museum, 100 Queens Park, Toronto, ON M5S 2C6, Canada
- Brice Noonan
- Department of Biology, University of Mississippi, Oxford, MS 38677, USA
- Christopher J Raxworthy
- Department of Herpetology, The American Museum of Natural History, 79th Street at Central Park West, New York, NY 10024, USA
- Sara Ruane
- Department of Biological Sciences, 206 Boyden Hall, Rutgers University, 195 University Avenue, Newark, NJ 07102, USA
- Alan R Lemmon
- Department of Scientific Computing, Florida State University, Dirac Science Library, Tallahassee, FL 32306-4102, USA
- Emily Moriarty Lemmon
- Department of Biological Science, Florida State University, 319 Stadium Drive, Tallahassee, FL 32306-4295, USA
- Hussam Zaher
- Museu de Zoologia da Universidade de São Paulo, São Paulo, Brazil CEP 04263-000, Brazil
- Centre de Recherche sur la Paléobiodiversité et les Paléoenvironnements (CR2P), UMR 7207 CNRS/MNHN/Sorbonne Université, Muséum national d’Histoire naturelle, 8 rue Buffon, CP 38, 75005 Paris, France
|
20
|
Reply to Jiang and Zhang: Parallel transcriptomic signature of monogamy: What is the null hypothesis anyway? Proc Natl Acad Sci U S A 2019; 116:17629-17630. [PMID: 31431540 DOI: 10.1073/pnas.1911022116] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/18/2022] Open
|
21
|
Mendes FK, Livera AP, Hahn MW. The perils of intralocus recombination for inferences of molecular convergence. Philos Trans R Soc Lond B Biol Sci 2019; 374:20180244. [PMID: 31154973 DOI: 10.1098/rstb.2018.0244] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 01/10/2023] Open
Abstract
Accurate inferences of convergence require that the appropriate tree topology be used. If there is a mismatch between the tree a trait has evolved along and the tree used for analysis, then false inferences of convergence ('hemiplasy') can occur. To avoid problems of hemiplasy when there are high levels of gene tree discordance with the species tree, researchers have begun to construct tree topologies from individual loci. However, due to intralocus recombination, even locus-specific trees may contain multiple topologies within them. This implies that the use of individual tree topologies discordant with the species tree can still lead to incorrect inferences about molecular convergence. Here, we examine the frequency with which single exons and single protein-coding genes contain multiple underlying tree topologies, in primates and Drosophila, and quantify the effects of hemiplasy when using trees inferred from individual loci. In both clades, we find that there are most often multiple diagnosable topologies within single exons and whole genes, with 91% of Drosophila protein-coding genes containing multiple topologies. Because of this underlying topological heterogeneity, even using trees inferred from individual protein-coding genes results in 25% and 38% of substitutions falsely labelled as convergent in primates and Drosophila, respectively. While constructing local trees can reduce the problem of hemiplasy, our results suggest that it will be difficult to completely avoid false inferences of convergence. We conclude by suggesting several ways forward in the analysis of convergent evolution, for both molecular and morphological characters. This article is part of the theme issue 'Convergent evolution in the genomics era: new insights and directions'.
Affiliation(s)
- Fábio K Mendes
- Department of Computer Science, The University of Auckland, Auckland 1010, New Zealand; Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Andrew P Livera
- Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, IN 47405, USA; Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
|
22
|
Crispell J, Balaz D, Gordon SV. HomoplasyFinder: a simple tool to identify homoplasies on a phylogeny. Microb Genom 2019; 5:e000245. [PMID: 30663960 PMCID: PMC6412054 DOI: 10.1099/mgen.0.000245] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 09/14/2018] [Accepted: 11/26/2018] [Indexed: 01/10/2023] Open
Abstract
A homoplasy is a nucleotide identity resulting from a process other than inheritance from a common ancestor. Importantly, by distorting the ancestral relationships between nucleotide sequences, homoplasies can change the structure of the phylogeny. Homoplasies can emerge naturally, especially under high selection pressures and/or high mutation rates, or be created during the generation and processing of sequencing data. Identification of homoplasies is critical, both to understand their influence on the analyses of phylogenetic data and to allow an investigation into how they arose. Here we present HomoplasyFinder, a Java application that can be used as a standalone tool or within the statistical programming environment R. Within R and Java, HomoplasyFinder is shown to be able to automatically, and quickly, identify any homoplasies present in simulated and real phylogenetic data. HomoplasyFinder can easily be incorporated into existing analysis pipelines, either within or outside of R, allowing the user to quickly identify homoplasies to inform downstream analyses and interpretation.
Affiliation(s)
- Joseph Crispell
- School of Veterinary Medicine, College of Health and Agricultural Sciences, University College Dublin, Republic of Ireland
- Daniel Balaz
- Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, Scotland
- Stephen Vincent Gordon
- School of Veterinary Medicine, College of Health and Agricultural Sciences, University College Dublin, Republic of Ireland
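To make the notion of a homoplasy in the abstract above concrete, here is a minimal sketch that flags a site as homoplasious when the Fitch parsimony count of state changes on a tree exceeds the minimum possible (number of observed states minus one). This is a generic textbook procedure and is not HomoplasyFinder's own code; the nested-tuple tree encoding, the restriction to strictly binary trees, and the function names are assumptions made for the example.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one site under Fitch parsimony.

    tree: strictly binary tree as nested 2-tuples of leaf names,
          e.g. (("A", "B"), ("C", "D")) (hypothetical encoding).
    states: dict mapping leaf name -> observed nucleotide at the site.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):  # leaf: its state set is the observed state
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:  # children agree on at least one state: no change needed
            return common
        changes += 1  # children disagree: one change on this subtree
        return left | right

    visit(tree)
    return changes


def is_homoplasious(tree, states):
    """True when the site needs more changes than its state count requires."""
    min_possible = len(set(states.values())) - 1
    return fitch_changes(tree, states) > min_possible


toy_tree = (("A", "B"), ("C", "D"))
# G appears in both halves of the tree, so it must have arisen twice.
repeated = is_homoplasious(toy_tree, {"A": "T", "B": "G", "C": "T", "D": "G"})
# T and G each define one clade, so a single change explains the site.
clean = is_homoplasious(toy_tree, {"A": "T", "B": "T", "C": "G", "D": "G"})
```

The comparison against `min_possible` is equivalent to asking whether the site's consistency index is below 1.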
|
23
|
Bolnick DI, Barrett RD, Oke KB, Rennison DJ, Stuart YE. (Non)Parallel Evolution. Annu Rev Ecol Evol Syst 2018. [DOI: 10.1146/annurev-ecolsys-110617-062240] [Citation(s) in RCA: 155] [Impact Index Per Article: 25.8] [Indexed: 11/09/2022]
Abstract
Parallel evolution across replicate populations has provided evolutionary biologists with iconic examples of adaptation. When multiple populations colonize seemingly similar habitats, they may evolve similar genes, traits, or functions. Yet, replicated evolution in nature or in the laboratory often yields inconsistent outcomes: Some replicate populations evolve along highly similar trajectories, whereas other replicate populations evolve to different extents or in distinct directions. To understand these heterogeneous outcomes, biologists are increasingly treating parallel evolution not as a binary phenomenon but rather as a quantitative continuum ranging from parallel to nonparallel. By measuring replicate populations’ positions along this (non)parallel continuum, we can test hypotheses about evolutionary and ecological factors that influence the extent of repeatable evolution. We review evidence regarding the manifestation of (non)parallel evolution in the laboratory, in natural populations, and in applied contexts such as cancer. We enumerate the many genetic, ecological, and evolutionary processes that contribute to variation in the extent of parallel evolution.
Affiliation(s)
- Daniel I. Bolnick
- Department of Integrative Biology, University of Texas at Austin, Austin, Texas 78712, USA
- Current affiliation: Department of Ecology and Evolution, University of Connecticut, Storrs, Connecticut 06268, USA
- Krista B. Oke
- Redpath Museum, McGill University, Montreal, Quebec H3A 2K6, Canada
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, California 95060, USA
- Diana J. Rennison
- Institute of Ecology and Evolution, University of Bern, 3012 Bern, Switzerland
- Yoel E. Stuart
- Department of Integrative Biology, University of Texas at Austin, Austin, Texas 78712, USA
|
24
|
Harish A. What is an archaeon and are the Archaea really unique? PeerJ 2018; 6:e5770. [PMID: 30357005 PMCID: PMC6196074 DOI: 10.7717/peerj.5770] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 06/15/2018] [Accepted: 09/05/2018] [Indexed: 12/05/2022] Open
Abstract
The recognition of the group Archaea as a major branch of the tree of life (ToL) prompted a new view of the evolution of biodiversity. The genomic representation of archaeal biodiversity has since significantly increased. In addition, advances in phylogenetic modeling of multi-locus datasets have resolved many recalcitrant branches of the ToL. Despite the technical advances and an expanded taxonomic representation, two important aspects of the origins and evolution of the Archaea remain controversial, even as we celebrate the 40th anniversary of the monumental discovery. These issues concern (i) the uniqueness (monophyly) of the Archaea, and (ii) the evolutionary relationships of the Archaea to the Bacteria and the Eukarya; both of these are relevant to the deep structure of the ToL. To explore the causes for this persistent ambiguity, I examine multiple datasets and different phylogenetic approaches that support contradicting conclusions. I find that the uncertainty is primarily due to a scarcity of information in standard datasets (universal core-genes datasets) to reliably resolve the conflicts. These conflicts can be resolved efficiently by comparing patterns of variation in the distribution of functional genomic signatures, which, unlike patterns of primary sequence variation, are less diffuse. Relatively lower heterogeneity in distribution patterns minimizes uncertainties and supports statistically robust phylogenetic inferences, especially of the earliest divergences of life. This case study further highlights the limitations of primary sequence data in resolving difficult phylogenetic problems, and raises questions about evolutionary inferences drawn from the analyses of sequence alignments of a small set of core genes. In particular, the findings of this study corroborate the growing consensus that reversible substitution mutations may not be optimal phylogenetic markers for resolving early divergences in the ToL, nor for determining the polarity of evolutionary transitions across the ToL.
Affiliation(s)
- Ajith Harish
- Department of Cell and Molecular Biology, Program in Molecular Biology, Uppsala University, Uppsala, Sweden
|
25
|
Abstract
In the history of life, some phenotypes have been acquired several times independently, through convergent evolution. Recently, many genome-scale studies have sought to identify nucleotides or amino acids that changed in a convergent manner when the convergent phenotypes evolved. These efforts have had mixed results, probably because of differences in the detection methods, and because of conceptual differences about the definition of a convergent substitution. Some methods contend that substitutions are convergent only if they occur on all branches where the phenotype changed toward the exact same state at a given nucleotide or amino acid position. Others are much looser in their requirements and define a convergent substitution as one that leads the site at which it occurs to prefer a phylogeny in which species with the convergent phenotype group together. Here, we suggest looking for convergent shifts in amino acid preferences instead of convergent substitutions to the exact same amino acid. We define as convergent shifts substitutions that occur on all branches where the phenotype changed and that correspond to a change in the type of amino acid preferred at this position. We implement the corresponding model in a method named PCOC. We show on simulations that PCOC better recovers convergent shifts than existing methods in terms of sensitivity and specificity. We test it on a plant protein alignment where convergent evolution has been studied in detail and find that our method recovers several previously identified convergent substitutions and proposes credible new candidates.
Affiliation(s)
- Carine Rey
- UnivLyon, ENS de Lyon, Univ Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratoire de Biologie et Modélisation de la Cellule, Lyon, France
- Laboratoire de Biométrie et Biologie Évolutive (LBBE), Université de Lyon, Université Lyon 1, CNRS, Villeurbanne, France
- Laurent Guéguen
- Laboratoire de Biométrie et Biologie Évolutive (LBBE), Université de Lyon, Université Lyon 1, CNRS, Villeurbanne, France
- Marie Sémon
- UnivLyon, ENS de Lyon, Univ Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratoire de Biologie et Modélisation de la Cellule, Lyon, France
- Bastien Boussau
- Laboratoire de Biométrie et Biologie Évolutive (LBBE), Université de Lyon, Université Lyon 1, CNRS, Villeurbanne, France
|
26
|
Horn RL, Marques AJD, Manseau M, Golding B, Klütsch CFC, Abraham K, Wilson PJ. Parallel evolution of site-specific changes in divergent caribou lineages. Ecol Evol 2018; 8:6053-6064. [PMID: 29988428 PMCID: PMC6024114 DOI: 10.1002/ece3.4154] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 03/15/2018] [Revised: 04/06/2018] [Accepted: 04/09/2018] [Indexed: 12/15/2022] Open
Abstract
The parallel evolution of phenotypes or traits within or between species provides important insight into the basic mechanisms of evolution. Genetic and genomic advances have allowed investigations into the genetic underpinnings of parallel evolution and the independent evolution of similar traits in sympatric species. Parallel evolution may best be exemplified among species where multiple genetic lineages, descended from a common ancestor, colonized analogous environmental niches, and converged on a genotypic or phenotypic trait. Modern North American caribou (Rangifer tarandus) originated from three ancestral sources separated during the Last Glacial Maximum (LGM): the Beringian-Eurasian lineage (BEL), the North American lineage (NAL), and the High Arctic lineage (HAL). Historical introgression between the NAL and the BEL has been found throughout Ontario and eastern Manitoba. In this study, we first characterized the functional differentiation in the cytochrome-b (cytB) gene by identifying nonsynonymous changes. Second, the caribou lineages were used as a direct means to assess site-specific parallel changes among lineages. There was greater functional diversity within the NAL despite the BEL having greater neutral diversity. The patterns of amino acid substitutions occurring within different lineages supported the parallel evolution of cytB amino acid substitutions suggesting different selective pressures among lineages. This study highlights the independent evolution of identical amino acid substitutions within a wide-ranging mammal species that have diversified from different ancestral haplogroups and where ecological niches can invoke parallel evolution.
Affiliation(s)
- Micheline Manseau
- Science and Technology, Environment and Climate Change Canada, Ottawa, ON, Canada
- Natural Resources Institute, University of Manitoba, Winnipeg, MB, Canada
- Brian Golding
- Department of Biology, McMaster University, Hamilton, ON, Canada
|
27
|
Zou Z, Zhang J. Gene Tree Discordance Does Not Explain Away the Temporal Decline of Convergence in Mammalian Protein Sequence Evolution. Mol Biol Evol 2017; 34:1682-1688. [PMID: 28379570 DOI: 10.1093/molbev/msx109] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 12/28/2022] Open
Abstract
Several authors reported lower frequencies of protein sequence convergence between more distantly related evolutionary lineages and attributed this trend to epistasis, which renders the acceptable amino acids at a site more different and convergence less likely in more divergent lineages. A recent primate study, however, suggested that this trend is at least partially and potentially entirely an artifact of gene tree discordance (GTD). Here, we demonstrate in a genome-wide data set from 17 mammals that the temporal trend remains (1) upon the control of the GTD level, (2) in genes whose genealogies are concordant with the species tree, and (3) for convergent changes, which are extremely unlikely to be caused by GTD. Similar results are observed in a comparable data set of 12 fruit flies in some but not all of these tests. We conclude that, at least in some cases, the temporal decline of convergence is genuine, reflecting an impact of epistasis on protein evolution.
Affiliation(s)
- Zhengting Zou
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI
|
28
|
Thomas GWC, Hahn MW, Hahn Y. The Effects of Increasing the Number of Taxa on Inferences of Molecular Convergence. Genome Biol Evol 2017; 9:213-221. [PMID: 28057728 PMCID: PMC5381636 DOI: 10.1093/gbe/evw306] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Accepted: 01/01/2017] [Indexed: 12/27/2022] Open
Abstract
Convergent evolution provides insight into the link between phenotype and genotype. Recently, large-scale comparative studies of convergent evolution have become possible, but researchers are still trying to determine the best way to design these types of analyses. One aspect of molecular convergence studies that has not yet been investigated is how taxonomic sample size affects inferences of molecular convergence. Here we show that increased sample size decreases the amount of inferred molecular convergence associated with the three convergent transitions to a marine environment in mammals. The sampling of more taxa, both with and without the convergent phenotype, reveals that alleles associated only with marine mammals in small datasets are actually more widespread, or are not shared by all marine species. The sampling of more taxa also allows finer resolution of ancestral substitutions, revealing that they are not in fact on lineages leading to solely marine species. We revisit a previous study on marine mammals and find that only 7 of the reported 43 genes with convergent substitutions still show signs of convergence with a larger number of background species. However, four of those seven genes also showed signs of positive selection in the original analysis and may still be good candidates for adaptive convergence. Though our study is framed around the convergence of marine mammals, we expect our conclusions on taxonomic sampling are generalizable to any study of molecular convergence.
Affiliation(s)
- Gregg W C Thomas
- Department of Biology and School of Informatics and Computing, Indiana University, Bloomington, Indiana
- Matthew W Hahn
- Department of Biology and School of Informatics and Computing, Indiana University, Bloomington, Indiana
- Yoonsoo Hahn
- Department of Life Science, Research Center for Biomolecules and Biosystems, Chung-Ang University, Seoul, Republic of Korea
|
29
|
Xu S, He Z, Guo Z, Zhang Z, Wyckoff GJ, Greenberg A, Wu CI, Shi S. Genome-Wide Convergence during Evolution of Mangroves from Woody Plants. Mol Biol Evol 2017; 34:1008-1015. [PMID: 28087771 DOI: 10.1093/molbev/msw277] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 12/14/2022] Open
Abstract
When living organisms independently invade a new environment, the evolution of similar phenotypic traits is often observed. An interesting but contentious issue is whether the underlying molecular biology also converges in the new habitat. Independent invasions of tropical intertidal zones by woody plants, collectively referred to as mangrove trees, represent some dramatic examples. The high salinity, hypoxia, and other stressors in the new habitat might have affected both genomic features and protein structures. Here, we developed a new method for detecting Convergence at Conservative Sites (CCS) and applied it to the genomic sequences of mangroves. In simulations, the CCS method drastically reduces random convergence at rapidly evolving sites as well as falsely inferred convergence caused by misinference of the ancestral character. In mangrove genomes, we estimated ∼400 genes that have experienced convergence over the background level of convergence in the nonmangrove relatives. The convergent genes are enriched in pathways related to stress response and embryo development, which could be important for mangroves' adaptation to the new habitat.
Affiliation(s)
- Shaohua Xu
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Ziwen He
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Zixiao Guo
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Zhang Zhang
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Gerald J Wyckoff
- Molecular Biology and Biochemistry, University of Missouri-Kansas City, Kansas City, MO
- Chung-I Wu
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China; Department of Ecology and Evolution, University of Chicago, Chicago, IL
- Suhua Shi
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources, Key Laboratory of Biodiversity Dynamics and Conservation of Guangdong Higher Education Institutes, School of Life Sciences, Sun Yat-Sen University, Guangzhou, Guangdong, China
|
30
|
Barlowe S, Coan HB, Youker RT. SubVis: an interactive R package for exploring the effects of multiple substitution matrices on pairwise sequence alignment. PeerJ 2017; 5:e3492. [PMID: 28674656 PMCID: PMC5490468 DOI: 10.7717/peerj.3492] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/16/2017] [Accepted: 05/27/2017] [Indexed: 01/13/2023] Open
Abstract
Understanding how proteins mutate is critical to solving a host of biological problems. Mutations occur when an amino acid is substituted for another in a protein sequence. The set of likelihoods for amino acid substitutions is stored in a matrix and input to alignment algorithms. The quality of the resulting alignment is used to assess the similarity of two or more sequences and can vary according to assumptions modeled by the substitution matrix. Substitution strategies with minor parameter variations are often grouped together in families. For example, the BLOSUM and PAM matrix families are commonly used because they provide a standard, predefined way of modeling substitutions. However, researchers often do not know if a given matrix family or any individual matrix within a family is the most suitable. Furthermore, predefined matrix families may inaccurately reflect a particular hypothesis that a researcher wishes to model or otherwise result in unsatisfactory alignments. In these cases, the ability to compare the effects of one or more custom matrices may be needed. This laborious process is often performed manually because the ability to simultaneously load multiple matrices and then compare their effects on alignments is not readily available in current software tools. This paper presents SubVis, an interactive R package for loading and applying multiple substitution matrices to pairwise alignments. Users can simultaneously explore alignments resulting from multiple predefined and custom substitution matrices. SubVis utilizes several of the alignment functions found in R, a common language among protein scientists. Functions are tied together with the Shiny platform which allows the modification of input parameters. Information regarding alignment quality and individual amino acid substitutions is displayed with the JavaScript language which provides interactive visualizations for revealing both high-level and low-level alignment information.
Affiliation(s)
- Scott Barlowe
- Department of Mathematics and Computer Science, Western Carolina University, Cullowhee, NC, United States of America
- Heather B Coan
- Department of Biology, Western Carolina University, Cullowhee, NC, United States of America
- Robert T Youker
- Department of Biology, Western Carolina University, Cullowhee, NC, United States of America
|
31
|
Dolichol phosphate mannose synthase: a Glycosyltransferase with Unity in molecular diversities. Glycoconj J 2017; 34:467-479. [PMID: 28616799 DOI: 10.1007/s10719-017-9777-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 10/11/2016] [Revised: 04/20/2017] [Accepted: 05/18/2017] [Indexed: 10/19/2022]
Abstract
N-glycans provide structural and functional stability to asparagine-linked (N-linked) glycoproteins, and add flexibility. Glycan biosynthesis is elaborate, multi-compartmental, and involves many glycosyltransferases. Failure to assemble N-glycans leads to phenotypic changes underlying infection, cancer, and congenital disorders of glycosylation (CDGs), among others. Biosynthesis of N-glycans begins at the endoplasmic reticulum (ER) with the assembly of dolichol-linked tetra-decasaccharide (Glc3Man9GlcNAc2-PP-Dol), where dolichol phosphate mannose synthase (DPMS) plays a central role. DPMS is also essential for GPI anchor biosynthesis as well as for O- and C-mannosylation of proteins in yeast and in mammalian cells. DPMS has been purified from several sources and its gene has been cloned from 39 species (e.g., from protozoan parasite to human). It is an inverting GT-A folded enzyme and classified as GT2 by CAZy (carbohydrate active enZyme; http://www.cazy.org). The sequence alignment detects the presence of a metal binding DAD signature in DPMS from all 39 species but finds a cAMP-dependent protein phosphorylation motif (PKA motif) in only 38 species. DPMS also has hydrophobic region(s). Hydropathy analysis of amino acid sequences from bovine, human, S. cerevisiae and A. thaliana DPMS shows that the PKA motif is present between the hydrophobic domains. The locations of the PKA motif and the hydrophobic domain(s) in the DPMS sequence vary from species to species. For example, the domain(s) could be located at the center or more towards the C-terminus. Irrespective of their catalytic similarity, the DNA sequence, the amino acid identity, and the lack of a stretch of hydrophobic amino acid residues at the C-terminus, DPMS is still classified as Type I and Type II enzymes. Because of an apparent bio-sensing ability, extracellular signaling and microenvironment regulate DPMS catalytic activity. In this review, we highlight some important features and the molecular diversities of DPMS.
|
32
|
Yang W, Lu B, Fu J. Molecular Convergent Evolution of the MYBPC2 Gene Among Three High-Elevation Amphibian Species. J Mol Evol 2017; 84:139-143. [PMID: 28220195 DOI: 10.1007/s00239-017-9782-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/06/2016] [Accepted: 02/03/2017] [Indexed: 11/30/2022]
Abstract
We report a strong pattern of molecular-level convergent/parallel evolution of the MYBPC2 gene. Three high-elevation amphibian species, Bufo gargarizans minshanicus, Nanorana pleskei, Rana kukunoris, revealed remarkable numbers of convergent and parallel amino acid substitutions. On the MYBPC2 gene tree of eleven anurans, the three distantly related species formed a strongly supported clade that was away from their respective relatives. Furthermore, we generated both model-based and empirical data-based null distributions for neutral convergent evolution. All three pairwise comparisons among the three species showed significantly more convergent and parallel substitutions than the null distributions. This study adds to the very small roster of clear cases of non-neutral molecular convergent evolution (e.g. prestin, rhodopsin). Molecular convergent evolution has significant implications in biology and detailed case studies will likely provide more insight into its genetic mechanisms.
Affiliation(s)
- Weizhao Yang
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, 610041, China; Department of Biology, Lund University, 223 62, Lund, Sweden
- Bin Lu
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, 610041, China
- Jinzhong Fu
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, 610041, China; Department of Integrative Biology, University of Guelph, Guelph, ON, N1G 2W1, Canada
|
33
|
Differential paralog divergence modulates genome evolution across yeast species. PLoS Genet 2017; 13:e1006585. [PMID: 28196070 PMCID: PMC5308817 DOI: 10.1371/journal.pgen.1006585] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 07/08/2016] [Accepted: 01/13/2017] [Indexed: 11/24/2022] Open
Abstract
Evolutionary outcomes depend not only on the selective forces acting upon a species, but also on the genetic background. However, large timescales and uncertain historical selection pressures can make it difficult to discern such important background differences between species. Experimental evolution is one tool to compare evolutionary potential of known genotypes in a controlled environment. Here we utilized a highly reproducible evolutionary adaptation in Saccharomyces cerevisiae to investigate whether experimental evolution of other yeast species would select for similar adaptive mutations. We evolved populations of S. cerevisiae, S. paradoxus, S. mikatae, S. uvarum, and interspecific hybrids between S. uvarum and S. cerevisiae for ~200–500 generations in sulfate-limited continuous culture. Wild-type S. cerevisiae cultures invariably amplify the high affinity sulfate transporter gene, SUL1. However, while amplification of the SUL1 locus was detected in S. paradoxus and S. mikatae populations, S. uvarum cultures instead selected for amplification of the paralog, SUL2. We measured the relative fitness of strains bearing deletions and amplifications of both SUL genes from different species, confirming that, converse to S. cerevisiae, S. uvarum SUL2 contributes more to fitness in sulfate limitation than S. uvarum SUL1. By measuring the fitness and gene expression of chimeric promoter-ORF constructs, we were able to delineate the cause of this differential fitness effect primarily to the promoter of S. uvarum SUL1. Our data show evidence of differential sub-functionalization among the sulfate transporters across Saccharomyces species through recent changes in noncoding sequence. Furthermore, these results show a clear example of how such background differences due to paralog divergence can drive changes in genome evolution. Both comparative genomics and experimental evolution are powerful tools that can be used to make inferences about evolutionary processes. 
Together, these approaches provide the opportunity to observe evolutionary adaptation over millions of years where selective history is largely unknown, and over short timescales under controlled selective pressures in the laboratory. We have used comparative experimental evolution to observe the evolutionary fate of an adaptive mutation, and determined to what degree the outcome is conditional on the genetic background. We evolved several populations of different yeast species for over 200 generations in sulfate-limited conditions to determine how the differences in genomic context can alter evolutionary routes when challenged with a nutrient limitation selection pressure. We find that the gene encoding a high affinity sulfur transporter becomes amplified in most species of Saccharomyces, except in S. uvarum, in which the amplification of the paralogous sulfate transporter gene SUL2 is recovered. We attribute this change in amplification preference to mutations in the non-coding region of SUL1, likely due to reduced expression of this gene in S. uvarum. We conclude that the adaptive mutations selected for in each organism depend on the genomic context, even when faced with the same environmental condition.
|
34
|
Evolutionary switches between two serine codon sets are driven by selection. Proc Natl Acad Sci U S A 2016; 113:13109-13113. [PMID: 27799560 DOI: 10.1073/pnas.1615832113] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Indexed: 11/18/2022] Open
Abstract
Serine is the only amino acid that is encoded by two disjoint codon sets so that a tandem substitution of two nucleotides is required to switch between the two sets. Previously published evidence suggests that, for the most evolutionarily conserved serines, the codon set switch occurs by simultaneous substitution of two nucleotides. Here we report a genome-wide reconstruction of the evolution of serine codons in triplets of closely related species from diverse prokaryotes and eukaryotes. The results indicate that the great majority of codon set switches proceed by two consecutive nucleotide substitutions, via a threonine or cysteine intermediate, and are driven by selection. These findings imply a strong pressure of purifying selection in protein evolution, which in the case of serine codon set switches occurs via an initial deleterious substitution quickly followed by a second, compensatory substitution. The result is frequent reversal of amino acid replacements and, at short evolutionary distances, pervasive homoplasy.
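The codon arithmetic behind this abstract can be checked directly from the standard genetic code. The sketch below is an illustration, not the authors' analysis; its variable and function names are hypothetical. It builds the standard codon table, separates the serine codons into the TCN and AGY sets, confirms that no single nucleotide substitution connects the two sets, and lists the amino acids encoded by the intermediate codons on the shortest two-step paths.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def hamming(a, b):
    """Number of nucleotide positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


serine = sorted(c for c, aa in CODE.items() if aa == "S")
tcn = [c for c in serine if c.startswith("TC")]   # four-codon set
agy = [c for c in serine if c.startswith("AG")]   # two-codon set

# Smallest number of substitutions needed to move between the two sets.
min_dist = min(hamming(a, b) for a in tcn for b in agy)


def intermediates(a, b):
    """Codons exactly one substitution away from both a and b."""
    return [c for c in CODE if hamming(a, c) == 1 and hamming(c, b) == 1]


# Amino acids encoded by intermediates on the shortest (two-step) paths.
intermediate_aas = {
    CODE[c]
    for a in tcn
    for b in agy
    if hamming(a, b) == min_dist
    for c in intermediates(a, b)
}

print("TCN set:", tcn)
print("AGY set:", agy)
print("minimum substitutions between sets:", min_dist)
print("amino acids at two-step intermediates:", sorted(intermediate_aas))
```

The intermediates come out as threonine (T) and cysteine (C) in one-letter code, matching the two routes named in the abstract.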
|
35
|
Mendes FK, Hahn Y, Hahn MW. Gene Tree Discordance Can Generate Patterns of Diminishing Convergence over Time. Mol Biol Evol 2016; 33:3299-3307. [PMID: 27634870 DOI: 10.1093/molbev/msw197] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Indexed: 11/13/2022] Open
Abstract
Phenotypic convergence is an exciting outcome of adaptive evolution, occurring when different species find similar solutions to the same problem. Unraveling the molecular basis of convergence provides a way to link genotype to adaptive phenotypes, but can also shed light on the extent to which molecular evolution is repeatable and predictable. Many recent genome-wide studies have uncovered a striking pattern of diminishing convergence over time, ascribing this pattern to the presence of intramolecular epistatic interactions. Here, we consider gene tree discordance as an alternative cause of changes in convergence levels over time in a primate dataset. We demonstrate that gene tree discordance can produce patterns of diminishing convergence by itself, and that controlling for discordance as a cause of apparent convergence makes the pattern disappear. We also show that synonymous substitutions, where neither selection nor epistasis should be prevalent, have the same diminishing pattern of molecular convergence in primates. Finally, we demonstrate that even in situations where biological discordance is not possible, discordance due to errors in species tree inference can drive similar patterns. Though intramolecular epistasis could in principle create a pattern of declining convergence over time, our results suggest a possible alternative explanation for this widespread pattern. These results contribute to a growing appreciation not just of the presence of gene tree discordance, but of the unpredictable effects this discordance can have on analyses of molecular evolution.
Affiliation(s)
- Fábio K Mendes
- Department of Biology, Indiana University, Bloomington, IN
- Yoonsoo Hahn
- Department of Life Science, Research Center for Biomolecules and Biosystems, Chung-Ang University, Seoul, Republic of Korea
- Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, IN; School of Informatics and Computing, Indiana University, Bloomington, IN
|
36
|
Zou Z, Zhang J. Morphological and molecular convergences in mammalian phylogenetics. Nat Commun 2016; 7:12758. [PMID: 27585543 PMCID: PMC5025827 DOI: 10.1038/ncomms12758] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 03/02/2016] [Accepted: 07/29/2016] [Indexed: 12/30/2022] Open
Abstract
Phylogenetic trees reconstructed from molecular sequences are often considered more reliable than those reconstructed from morphological characters, in part because convergent evolution, which confounds phylogenetic reconstruction, is believed to be rarer for molecular sequences than for morphologies. However, neither the validity of this belief nor its underlying cause is known. Here comparing thousands of characters of each type that have been used for inferring the phylogeny of mammals, we find that on average morphological characters indeed experience much more convergences than amino acid sites, but this disparity is explained by fewer states per character rather than an intrinsically higher susceptibility to convergence for morphologies than sequences. We show by computer simulation and actual data analysis that a simple method for identifying and removing convergence-prone characters improves phylogenetic accuracy, potentially enabling, when necessary, the inclusion of morphologies and hence fossils for reliable tree inference. Morphological characters are generally thought to have higher rates of convergence than molecular characters. Here, Zou and Zhang provide empirical evidence for this assumption and devise a method to improve the accuracy of phylogenetic reconstruction through identifying and removing convergence-prone characters.
Affiliation(s)
- Zhengting Zou
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
|
37
|
Epistasis and the Dynamics of Reversion in Molecular Evolution. Genetics 2016; 203:1335-51. [PMID: 27194749 DOI: 10.1534/genetics.116.188961] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Received: 03/08/2016] [Accepted: 04/27/2016] [Indexed: 12/27/2022] Open
Abstract
Recent studies of protein evolution contend that the longer an amino acid substitution is present at a site, the less likely it is to revert to the amino acid previously occupying that site. Here we study this phenomenon of decreasing reversion rates rigorously and in a much more general context. We show that, under weak mutation and for arbitrary fitness landscapes, reversion rates decrease with time for any site that is involved in at least one epistatic interaction. Specifically, we prove that, at stationarity, the hazard function of the distribution of waiting times until reversion is strictly decreasing for any such site. Thus, in the presence of epistasis, the longer a particular character has been absent from a site, the less likely the site will revert to its prior state. We also explore several examples of this general result, which share a common pattern whereby the probability of having reverted increases rapidly at short times to some substantial value before becoming almost flat after a few substitutions at other sites. This pattern indicates a characteristic tendency for reversion to occur either almost immediately after the initial substitution or only after a very long time.
|
38
|
Abstract
To what extent is the convergent evolution of protein function attributable to convergent or parallel changes at the amino acid level? The mutations that contribute to adaptive protein evolution may represent a biased subset of all possible beneficial mutations owing to mutation bias and/or variation in the magnitude of deleterious pleiotropy. A key finding is that the fitness effects of amino acid mutations are often conditional on genetic background. This context dependence (epistasis) can reduce the probability of convergence and parallelism because it reduces the number of possible mutations that are unconditionally acceptable in divergent genetic backgrounds. Here, I review factors that influence the probability of replicated evolution at the molecular level.
Affiliation(s)
- Jay F Storz
- School of Biological Sciences, University of Nebraska, Lincoln, Nebraska 68588, USA
|
39
|
Holliday JA, Zhou L, Bawa R, Zhang M, Oubida RW. Evidence for extensive parallelism but divergent genomic architecture of adaptation along altitudinal and latitudinal gradients in Populus trichocarpa. The New Phytologist 2016; 209:1240-51. [PMID: 26372471 DOI: 10.1111/nph.13643] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Received: 04/01/2015] [Accepted: 08/13/2015] [Indexed: 05/10/2023]
Abstract
Adaptation to climate across latitude and altitude reflects shared climatic constraints, which may lead to parallel adaptation. However, theory predicts that higher gene flow should favor more concentrated genomic architectures, which would lead to fewer locally maladapted recombinants. We used exome capture to resequence the gene space along a latitudinal and two altitudinal transects in the model tree Populus trichocarpa. Adaptive trait phenotyping was coupled with FST outlier tests and sliding window analysis to assess the degree of parallel adaptation as well as the genomic distribution of outlier loci. Up to 51% of outlier loci overlapped between transect pairs and up to 15% of these loci overlapped among all three transects. Genomic clustering of adaptive loci was more pronounced for altitudinal than latitudinal transects. In both altitudinal transects, there was a larger number of these 'islands of divergence', which were on average longer and included several of exceptional physical length. Our results suggest that recapitulation of genetic clines over latitude and altitude involves extensive parallelism, but that steep altitudinal clines generate islands of divergence. This suggests that physical proximity of genes in coadapted complexes may buffer against the movement of maladapted alleles from geographically proximal but climatically distinct populations.
Affiliation(s)
- Jason A Holliday
- Department of Forest Resources and Environmental Conservation, Virginia Polytechnic Institute and State University, 304 Cheatham Hall, Blacksburg, VA, 24061, USA
- Lecong Zhou
- Department of Forest Resources and Environmental Conservation, Virginia Polytechnic Institute and State University, 304 Cheatham Hall, Blacksburg, VA, 24061, USA
- Rajesh Bawa
- Department of Forest Resources and Environmental Conservation, Virginia Polytechnic Institute and State University, 304 Cheatham Hall, Blacksburg, VA, 24061, USA
- Man Zhang
- Department of Forest Resources and Environmental Conservation, Virginia Polytechnic Institute and State University, 304 Cheatham Hall, Blacksburg, VA, 24061, USA
- Regis W Oubida
- Department of Forest Resources and Environmental Conservation, Virginia Polytechnic Institute and State University, 304 Cheatham Hall, Blacksburg, VA, 24061, USA
|
40
|
Kurland CG, Harish A. The phylogenomics of protein structures: The backstory. Biochimie 2015; 119:284-302. [DOI: 10.1016/j.biochi.2015.07.027] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 01/11/2015] [Accepted: 07/28/2015] [Indexed: 12/11/2022]
|
41
|
Convergent evolution of SOCS4 between yak and Tibetan antelope in response to high-altitude stress. Gene 2015; 572:298-302. [DOI: 10.1016/j.gene.2015.08.024] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 05/02/2015] [Revised: 07/22/2015] [Accepted: 08/10/2015] [Indexed: 10/24/2022]
|
42
|
Zou Z, Zhang J. Are Convergent and Parallel Amino Acid Substitutions in Protein Evolution More Prevalent Than Neutral Expectations? Mol Biol Evol 2015; 32:2085-96. [PMID: 25862140 DOI: 10.1093/molbev/msv091] [Citation(s) in RCA: 85] [Impact Index Per Article: 9.4] [Indexed: 11/12/2022] Open
Abstract
Convergent and parallel amino acid substitutions in protein evolution, collectively referred to as molecular convergence here, have small probabilities under neutral evolution. For this reason, molecular convergence is commonly viewed as evidence for similar adaptations of different species. The surge in the number of reports of molecular convergence in the last decade raises the intriguing question of whether molecular convergence occurs substantially more frequently than expected under neutral evolution. We here address this question using all one-to-one orthologous proteins encoded by the genomes of 12 fruit fly species and those encoded by 17 mammals. We found that the expected amount of molecular convergence varies greatly depending on the specific neutral substitution model assumed at each amino acid site and that the observed amount of molecular convergence is explainable by neutral models incorporating site-specific information of acceptable amino acids. Interestingly, the total number of convergent and parallel substitutions between two lineages, relative to the neutral expectation, decreases with the genetic distance between the two lineages, regardless of the model used in computing the neutral expectation. We hypothesize that this trend results from differences in the amino acids acceptable at a given site among different clades of a phylogeny, due to prevalent epistasis, and provide simulation as well as empirical evidence for this hypothesis. Together, our study finds no genomic evidence for higher-than-neutral levels of molecular convergence, but suggests the presence of abundant epistasis that decreases the likelihood of molecular convergence between distantly related lineages.
Affiliation(s)
- Zhengting Zou
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor
|
43
|
Goldstein RA, Pollard ST, Shah SD, Pollock DD. Nonadaptive Amino Acid Convergence Rates Decrease over Time. Mol Biol Evol 2015; 32:1373-81. [PMID: 25737491 PMCID: PMC4572784 DOI: 10.1093/molbev/msv041] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Indexed: 01/19/2023] Open
Abstract
Convergence is a central concept in evolutionary studies because it provides strong evidence for adaptation. It also provides information about the nature of the fitness landscape and the repeatability of evolution, and can mislead phylogenetic inference. To understand the role of adaptive convergence, we need to understand the patterns of nonadaptive convergence. Here, we consider the relationship between nonadaptive convergence and divergence in mitochondrial and model proteins. Surprisingly, nonadaptive convergence is much more common than expected in closely related organisms, falling off as organisms diverge. The extent of the convergent drop-off in mitochondrial proteins is well predicted by epistatic or coevolutionary effects in our "evolutionary Stokes shift" models and poorly predicted by conventional evolutionary models. Convergence probabilities decrease dramatically if the ancestral amino acids of branches being compared have diverged, but also drop slowly over evolutionary time even if the ancestral amino acids have not substituted. Convergence probabilities drop-off rapidly for quickly evolving sites, but much more slowly for slowly evolving sites. Furthermore, once sites have diverged their convergence probabilities are extremely low and indistinguishable from convergence levels at randomized sites. These results indicate that we cannot assume that excessive convergence early on is necessarily adaptive. This new understanding should help us to better discriminate adaptive from nonadaptive convergence and develop more relevant evolutionary models with improved validity for phylogenetic inference.
Affiliation(s)
- Richard A Goldstein
- Division of Infection & Immunity, University College London, London, United Kingdom
- Stephen T Pollard
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora
- Seena D Shah
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora
- David D Pollock
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora
|
44
|
Vinogradov AE. Consolidation of slow or fast but not moderately evolving genes at the level of pathways and processes. Gene 2015; 561:30-4. [PMID: 25707747 DOI: 10.1016/j.gene.2015.01.066] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 10/02/2014] [Revised: 01/04/2015] [Accepted: 01/09/2015] [Indexed: 11/15/2022]
Abstract
Conservatism versus innovation is probably the most important dichotomy of all evolving systems. In molecular evolution the distinction between conservative (negative) selection, innovative (positive) selection and unconstrained evolution (drift) is usually ambiguous at the gene level. Only rare cases with the ratio of nonsynonymous to synonymous nucleotide substitutions above unity (dN/dS>1) are thought to be due to positive selection, whereas the lower dN/dS ratio may indicate negative selection in combination with drift. The density of the dN/dS ratio for orthologous genes forms a unimodal distribution where no particular regions can be discerned. Here it is shown that at the level of overrepresented pathways and processes the picture is strikingly different. The distribution is strongly polarized with a wide completely depressed middle part. This three-phase distribution is very robust. It is observed with various substitution models and remains at very low significance of overrepresentation (up to p<0.99). This fact suggests consolidation of either negative or positive selection but not of unconstrained evolution at the level of pathways/processes. The effect is demonstrated for different phylogenetic distances: from human to other primates, mammals and vertebrates. This approach suggests estimating the boundaries for conservative and innovative selection using the pathway/process level. Emphasizing the role of a critical mass of negatively or positively selected genes in a pathway/process, it can elucidate how the bridge between 'tinkering' at the gene level and 'design' at the higher levels is forming.
|
45
|
Abstract
Toothed whales and two groups of bats independently acquired echolocation, the ability to locate and identify objects by reflected sound. Echolocation requires physiologically complex and coordinated vocal, auditory, and neural functions, but the molecular basis of the capacity for echolocation is not well understood. A recent study suggested that convergent amino acid substitutions widespread in the proteins of echolocators underlay the convergent origins of mammalian echolocation. Here, we show that genomic signatures of molecular convergence between echolocating lineages are generally no stronger than those between echolocating and comparable nonecholocating lineages. The same is true for the group of 29 hearing-related proteins claimed to be enriched with molecular convergence. Reexamining the previous selection test reveals several flaws and invalidates the asserted evidence for adaptive convergence. Together, these findings indicate that the reported genomic signatures of convergence largely reflect the background level of sequence convergence unrelated to the origins of echolocation.
Affiliation(s)
- Zhengting Zou
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor
|
46
|
Vogwill T, Kojadinovic M, Furió V, MacLean RC. Testing the role of genetic background in parallel evolution using the comparative experimental evolution of antibiotic resistance. Mol Biol Evol 2014; 31:3314-23. [PMID: 25228081 PMCID: PMC4245821 DOI: 10.1093/molbev/msu262] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Indexed: 01/12/2023] Open
Abstract
Parallel evolution is the independent evolution of the same phenotype or genotype in response to the same selection pressure. There are examples of parallel molecular evolution across divergent genetic backgrounds, suggesting that genetic background may not play an important role in determining the outcome of adaptation. Here, we measure the influence of genetic background on phenotypic and molecular adaptation by combining experimental evolution with comparative analysis. We selected for resistance to the antibiotic rifampicin in eight strains of bacteria from the genus Pseudomonas using a short term selection experiment. Adaptation occurred by 47 mutations at conserved sites in rpoB, the target of rifampicin, and due to the high diversity of possible mutations the probability of within-strain parallel evolution was low. The probability of between-strain parallel evolution was only marginally lower, because different strains substituted similar rpoB mutations. In contrast, we found that more than 30% of the phenotypic variation in the growth rate of evolved clones was attributable to among-strain differences. Parallel molecular evolution across strains resulted in divergent phenotypic evolution because rpoB mutations had different effects on growth rate in different strains. This study shows that genetic divergence between strains constrains parallel phenotypic evolution, but had little detectable impact on the molecular basis of adaptation in this system.
Affiliation(s)
- Tom Vogwill
- Department of Zoology, University of Oxford, Oxford, United Kingdom
- Mila Kojadinovic
- CNRS, Aix-Marseille Université, Laboratoire de Bioénergétique et Ingénierie des Protéines, UMR 7281, IMM, Marseille, France
- Victoria Furió
- Department of Zoology, University of Oxford, Oxford, United Kingdom
- R Craig MacLean
- Department of Zoology, University of Oxford, Oxford, United Kingdom
|
47
|
Usmanova DR, Ferretti L, Povolotskaya IS, Vlasov PK, Kondrashov FA. A model of substitution trajectories in sequence space and long-term protein evolution. Mol Biol Evol 2014; 32:542-54. [PMID: 25415964 DOI: 10.1093/molbev/msu318] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 12/11/2022] Open
Abstract
The nature of factors governing the tempo and mode of protein evolution is a fundamental issue in evolutionary biology. Specifically, whether or not interactions between different sites, or epistasis, are important in directing the course of evolution became one of the central questions. Several recent reports have scrutinized patterns of long-term protein evolution claiming them to be compatible only with an epistatic fitness landscape. However, these claims have not yet been substantiated with a formal model of protein evolution. Here, we formulate a simple covarion-like model of protein evolution focusing on the rate at which the fitness impact of amino acids at a site changes with time. We then apply the model to the data on convergent and divergent protein evolution to test whether or not the incorporation of epistatic interactions is necessary to explain the data. We find that convergent evolution cannot be explained without the incorporation of epistasis and the rate at which an amino acid state switches from being acceptable at a site to being deleterious is faster than the rate of amino acid substitution. Specifically, for proteins that have persisted in modern prokaryotic organisms since the last universal common ancestor for one amino acid substitution approximately ten amino acid states switch from being accessible to being deleterious, or vice versa. Thus, molecular evolution can only be perceived in the context of rapid turnover of which amino acids are available for evolution.
Affiliation(s)
- Dinara R Usmanova
- Moscow Institute of Physics and Technology, Institutskiy Pereulok 9, g.Dolgoprudny, Russia; Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Luca Ferretti
- Systématique, Adaptation et Evolution (UMR 7138), UPMC University Paris 06, CNRS, MNHN, IRD, Paris, France; CIRB, Collège de France, Paris, France
- Inna S Povolotskaya
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Peter K Vlasov
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Fyodor A Kondrashov
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
|
48
|
Morgan CC, Creevey CJ, O'Connell MJ. Mitochondrial data are not suitable for resolving placental mammal phylogeny. Mamm Genome 2014; 25:636-47. [PMID: 25239304 DOI: 10.1007/s00335-014-9544-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 05/14/2014] [Accepted: 09/01/2014] [Indexed: 02/01/2023]
Abstract
Mitochondrial data have traditionally been used in reconstructing a variety of species phylogenies. The low rates of recombination and thorough characterization of mitochondrial data across vertebrate species make it a particularly attractive phylogenetic marker. The relatively low number of fully sequenced mammal genomes and the lack of extensive sampling within Superorders have posed a serious problem for reaching agreement on the placement of mammal species. The use of mitochondrial data sequences from large numbers of mammals could serve to circumvent the taxon-sampling deficit. Here we assess the suitability of mitochondrial data as a phylogenetic marker in mammal phylogenetics. MtDNA datasets of mammal origin have been filtered as follows: (i) we have sampled sparsely across the phylogenetic tree, (ii) we have constrained our sampling to genes with high taxon coverage, (iii) we have categorised rates across sites in a phylogeny-independent manner and have removed fast-evolving sites, and (iv) we have sampled from very shallow divergence times to reduce phylogenetic conflict. However, topologies obtained using these filters are not consistent with previous studies and are discordant across different genes. Individual mitochondrial genes, and indeed all mitochondrial genes analysed as a supermatrix, resulted in poor resolution of the species phylogeny. Overall, our study highlights the limitations of mitochondrial data, not only for resolving deep divergences but also for shallow divergences in the mammal phylogeny.
Affiliation(s)
- Claire C Morgan
- Bioinformatics and Molecular Evolution Group, School of Biotechnology, Dublin City University, Glasnevin, Dublin 9, Ireland
|
49
|
Bloom JD. An experimentally informed evolutionary model improves phylogenetic fit to divergent lactamase homologs. Mol Biol Evol 2014; 31:2753-69. [PMID: 25063439 PMCID: PMC4166927 DOI: 10.1093/molbev/msu220] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Indexed: 12/24/2022] Open
Abstract
Phylogenetic analyses of molecular data require a quantitative model for how sequences evolve. Traditionally, the details of the site-specific selection that governs sequence evolution are not known a priori, making it challenging to create evolutionary models that adequately capture the heterogeneity of selection at different sites. However, recent advances in high-throughput experiments have made it possible to quantify the effects of all single mutations on gene function. I have previously shown that such high-throughput experiments can be combined with knowledge of underlying mutation rates to create a parameter-free evolutionary model that describes the phylogeny of influenza nucleoprotein far better than commonly used existing models. Here, I extend this work by showing that published experimental data on TEM-1 beta-lactamase (Firnberg E, Labonte JW, Gray JJ, Ostermeier M. 2014. A comprehensive, high-resolution map of a gene’s fitness landscape. Mol Biol Evol. 31:1581–1592) can be combined with a few mutation rate parameters to create an evolutionary model that describes beta-lactamase phylogenies much better than most common existing models. This experimentally informed evolutionary model is superior even for homologs that are substantially diverged (about 35% divergence at the protein level) from the TEM-1 parent that was the subject of the experimental study. These results suggest that experimental measurements can inform phylogenetic evolutionary models that are applicable to homologs that span a substantial range of sequence divergence.
Affiliation(s)
- Jesse D Bloom
- Division of Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA
|
50
|
Polzin K, Rokas A. Evaluating rare amino acid substitutions (RGC_CAMs) in a yeast model clade. PLoS One 2014; 9:e92213. [PMID: 24637883 PMCID: PMC3956930 DOI: 10.1371/journal.pone.0092213] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 12/18/2013] [Accepted: 02/20/2014] [Indexed: 12/25/2022] Open
Abstract
When inferring phylogenetic relationships, not all sites in a sequence alignment are equally informative. One recently proposed approach that takes advantage of this inequality relies on sites that contain amino acids whose replacement requires multiple substitutions. Identifying these so-called RGC_CAM substitutions (after Rare Genomic Changes as Conserved Amino acids-Multiple substitutions) requires that, first, at any given site in the amino acid sequence alignment, there must be a minimum of two different amino acids; second, each amino acid must be present in at least two taxa; and third, the amino acids must require a minimum of two nucleotide substitutions to replace each other. Although theory suggests that RGC_CAM substitutions are expected to be rare and less likely to be homoplastic, the informativeness of RGC_CAM substitutions has not been extensively evaluated in biological data sets. We investigated the quality of RGC_CAM substitutions by examining their degree of homoplasy and internode certainty in nearly 2.7 million aligned amino acid sites from 5,261 proteins from five species belonging to the yeast Saccharomyces sensu stricto clade whose phylogeny is well-established. We identified 2,647 sites containing RGC_CAM substitutions, a number that contrasts sharply with the 100,887 sites containing RGC_non-CAM substitutions (i.e., changes between amino acids that require only a single nucleotide substitution). We found that RGC_CAM substitutions had significantly lower homoplasy than RGC_non-CAM ones; specifically RGC_CAM substitutions showed a per-site average homoplasy index of 0.100, whereas RGC_non-CAM substitutions had a homoplasy index of 0.215. Internode certainty values were also higher for sites containing RGC_CAM substitutions than for RGC_non-CAM ones. These results suggest that RGC_CAM substitutions possess a strong phylogenetic signal and are useful markers for phylogenetic inference despite their rarity.
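The three RGC_CAM criteria stated in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names `min_nt_changes` and `is_rgc_cam_site` are hypothetical, the standard genetic code (NCBI translation table 1) is assumed, and any column containing a gap or non-standard residue is simply rejected, a choice the abstract does not specify.

```python
from collections import Counter
from itertools import combinations, product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1) in TCAG order; '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def min_nt_changes(aa1, aa2):
    """Smallest number of nucleotide differences between any codon of aa1 and any codon of aa2."""
    codons1 = [c for c, aa in CODON_TABLE.items() if aa == aa1]
    codons2 = [c for c, aa in CODON_TABLE.items() if aa == aa2]
    return min(
        sum(x != y for x, y in zip(c1, c2)) for c1 in codons1 for c2 in codons2
    )


def is_rgc_cam_site(column):
    """Check one alignment column (one residue per taxon) against the three criteria."""
    counts = Counter(column)
    # Assumption: columns with gaps, stops, or non-standard residues are rejected outright.
    if any(aa == "*" or aa not in CODON_TABLE.values() for aa in counts):
        return False
    # Criterion 1: at least two different amino acids at the site.
    if len(counts) < 2:
        return False
    # Criterion 2: each amino acid is present in at least two taxa.
    if min(counts.values()) < 2:
        return False
    # Criterion 3: every pair of amino acids needs at least two nucleotide substitutions.
    return all(min_nt_changes(a, b) >= 2 for a, b in combinations(counts, 2))
```

For example, a five-taxon column `"MMMWW"` passes, because ATG (Met) and TGG (Trp) differ at two codon positions, while `"LLLFF"` fails, because Leu and Phe codons can differ by a single nucleotide.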
Affiliation(s)
- Kenneth Polzin
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
|