1
Jain A, Begum T, Ahmad S. Analysis and Prediction of Pathogen Nucleic Acid Specificity for Toll-like Receptors in Vertebrates. J Mol Biol 2023; 435:168208. [PMID: 37479078 DOI: 10.1016/j.jmb.2023.168208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 06/20/2023] [Accepted: 07/13/2023] [Indexed: 07/23/2023]
Abstract
Identifying key sequence-, expression- and function-related features of nucleic acid-sensing host proteins is of fundamental importance for understanding the dynamics of pathogen-specific host responses. To meet this objective, we considered toll-like receptors (TLRs), a representative class of membrane-bound sensor proteins, from 17 vertebrate species covering mammals, birds, reptiles, amphibians, and fishes in this comparative study. We identified the molecular signatures of host TLRs that are responsible for sensing pathogen nucleic acids or other pathogen-associated molecular patterns (PAMPs) and that potentially play important roles in host defence mechanisms. Interestingly, our findings reveal that such host-specific features are directly related to the strand (single or double) specificity of nucleic acid from pathogens. However, during host-pathogen interactions, such features were unable to explain the pathogenic PAMP (i.e., DNA, RNA or other) selectivity, suggesting a more complex mechanism. Using these features, we developed a number of machine learning models, of which Random Forest achieved high performance (94.57% accuracy) in predicting the strand specificity of TLRs from protein-derived features. We applied the trained model to propose the strand specificity of some previously uncharacterized, novel fish-specific TLRs (TLR18, TLR23, TLR24, TLR25, TLR27).
Affiliation(s)
- Anuja Jain
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India. https://twitter.com/@Anuja334
- Tina Begum
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India.
- Shandar Ahmad
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India.
2
Padmanabhan S, Manjithaya R. Leaderless secretory proteins of the neurodegenerative diseases via TNTs: a structure-function perspective. Front Mol Neurosci 2023; 16:983108. [PMID: 37396786 PMCID: PMC10308029 DOI: 10.3389/fnmol.2023.983108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Accepted: 05/26/2023] [Indexed: 07/04/2023] Open
Abstract
Neurodegenerative disease-causing proteins such as alpha-synuclein, tau, and huntingtin are known to traverse between cells via exosomes, extracellular vesicles and tunneling nanotubes (TNTs). There seems to be good synergy between exosomes and TNTs in intercellular communication. Interestingly, many of the known major neurodegenerative proteins/proteolytic products are leaderless and are also reported to be secreted out of the cell via unconventional protein secretion. Such classes contain intrinsically disordered proteins and regions (IDRs) within them. The dynamic behavior of these proteins is due to their heterogeneous conformations, which arise from various factors inside the cell. The amino acid sequence, along with chemical modifications, has implications for the functional roles of IDRs inside cells. Proteins that form aggregates resulting in neurodegeneration become resistant to degradation by autophagy and the proteasome system, thus leading to TNT formation. The proteins that traverse TNTs may or may not depend on the autophagy machinery. It is not yet clear whether the conformation of a protein plays a crucial role in its transport from one cell to another without being degraded. Although there is some experimental data, many grey areas need to be revisited. This review provides a different perspective on the structural and functional aspects of these leaderless proteins that are secreted outside the cell. It focuses on the characteristic features that lead to aggregation of leaderless secretory proteins (from a structural-functional perspective), with special emphasis on TNTs.
3
Variables Influencing Differences in Sequence Conservation in the Fission Yeast Schizosaccharomyces pombe. J Mol Evol 2021; 89:601-610. [PMID: 34436628 PMCID: PMC8599406 DOI: 10.1007/s00239-021-10028-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2021] [Accepted: 08/17/2021] [Indexed: 11/17/2022]
Abstract
Which variables determine the constraints on gene sequence evolution is one of the most central questions in molecular evolution. In the fission yeast Schizosaccharomyces pombe, an important model organism, the variables influencing the rate of sequence evolution have yet to be determined. Previous studies in other single celled organisms have generally found gene expression levels to be most significant, with numerous other variables such as gene length and functional importance identified as having a smaller impact. Using publicly available data, we used partial least squares regression, principal components regression, and partial correlations to determine the variables most strongly associated with sequence evolution constraints. We identify centrality in the protein–protein interactions network, amino acid composition, and cellular location as the most important determinants of sequence conservation. However, each factor only explains a small amount of variance, and there are numerous variables having a significant or heterogeneous influence. Our models explain more than half of the variance in dN, raising the possibility that future refined models could quantify the role of stochastics in evolutionary rate variation.
4
McDonough-Goldstein CE, Borziak K, Pitnick S, Dorus S. Drosophila female reproductive tract gene expression reveals coordinated mating responses and rapidly evolving tissue-specific genes. G3 (BETHESDA, MD.) 2021; 11:jkab020. [PMID: 33890615 PMCID: PMC8063083 DOI: 10.1093/g3journal/jkab020] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 10/07/2020] [Accepted: 12/29/2020] [Indexed: 12/13/2022]
Abstract
Sexual reproduction in internally fertilizing species requires complex coordination between female and male reproductive systems and among the diverse tissues of the female reproductive tract (FRT). Here, we report a comprehensive, tissue-specific investigation of Drosophila melanogaster FRT gene expression before and after mating. We identified expression profiles that distinguished each tissue, including major differences between tissues with glandular or primarily nonglandular epithelium. All tissues were enriched for distinct sets of genes possessing secretion signals that exhibited accelerated evolution, as might be expected for genes participating in molecular interactions between the sexes within the FRT extracellular environment. Despite robust transcriptional differences between tissues, postmating responses were dominated by coordinated transient changes indicative of an integrated systems-level functional response. This comprehensive characterization of gene expression throughout the FRT identifies putative female contributions to postcopulatory events critical to reproduction and potentially reproductive isolation, as well as the putative targets of sexual selection and conflict.
Affiliation(s)
- Kirill Borziak
- Center for Reproductive Evolution, Biology Department, Syracuse University, Syracuse, NY, USA
- Scott Pitnick
- Center for Reproductive Evolution, Biology Department, Syracuse University, Syracuse, NY, USA
- Steve Dorus
- Center for Reproductive Evolution, Biology Department, Syracuse University, Syracuse, NY, USA
5
Jain A, Perisa D, Fliedner F, von Haeseler A, Ebersberger I. The Evolutionary Traceability of a Protein. Genome Biol Evol 2019; 11:531-545. [PMID: 30649284 PMCID: PMC6394115 DOI: 10.1093/gbe/evz008] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Accepted: 01/11/2019] [Indexed: 12/12/2022] Open
Abstract
Orthologs document the evolution of genes and metabolic capacities encoded in extant and ancient genomes. However, the similarity between orthologs decays with time, and ultimately it becomes insufficient to infer common ancestry. This leaves ancient gene set reconstructions incomplete and distorted to an unknown extent. Here we introduce the “evolutionary traceability” as a measure that quantifies, for each protein, the evolutionary distance beyond which the sensitivity of the ortholog search becomes limiting. Using yeast, we show that genes that were thought to date back to the last universal common ancestor are of high traceability. Their functions mostly involve catalysis, ion transport, and ribonucleoprotein complex assembly. In turn, the fraction of yeast genes whose traceability is not sufficient to infer their presence in last universal common ancestor is enriched for regulatory functions. Computing the traceabilities of genes that have been experimentally characterized as being essential for a self-replicating cell reveals that many of the genes that lack orthologs outside bacteria have low traceability. This leaves open whether their orthologs in the eukaryotic and archaeal domains have been overlooked. Looking at the example of REC8, a protein essential for chromosome cohesion, we demonstrate how a traceability-informed adjustment of the search sensitivity identifies hitherto missed orthologs in the fast-evolving microsporidia. Taken together, the evolutionary traceability helps to differentiate between true absence and nondetection of orthologs, and thus improves our understanding about the evolutionary conservation of functional protein networks. “protTrace,” a software tool for computing evolutionary traceability, is freely available at https://github.com/BIONF/protTrace.git; last accessed February 10, 2019.
Affiliation(s)
- Arpit Jain
- Applied Bioinformatics Group, Institute of Cell Biology & Neuroscience, Goethe University, Frankfurt, Germany
- Dominik Perisa
- Applied Bioinformatics Group, Institute of Cell Biology & Neuroscience, Goethe University, Frankfurt, Germany
- Fabian Fliedner
- Applied Bioinformatics Group, Institute of Cell Biology & Neuroscience, Goethe University, Frankfurt, Germany
- Arndt von Haeseler
- Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University Vienna, Austria; Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Austria
- Ingo Ebersberger
- Applied Bioinformatics Group, Institute of Cell Biology & Neuroscience, Goethe University, Frankfurt, Germany; Senckenberg Biodiversity and Climate Research Center (BiK-F), Frankfurt, Germany; LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt, Germany
6
Karr TL, Southern H, Rosenow MA, Gossmann TI, Snook RR. The Old and the New: Discovery Proteomics Identifies Putative Novel Seminal Fluid Proteins in Drosophila. Mol Cell Proteomics 2019; 18:S23-S33. [PMID: 30760537 PMCID: PMC6427231 DOI: 10.1074/mcp.ra118.001098] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 09/18/2018] [Revised: 02/11/2019] [Indexed: 12/11/2022] Open
Abstract
Seminal fluid proteins (SFPs), the nonsperm component of male ejaculates produced by male accessory glands, are viewed as central mediators of reproductive fitness. SFPs affect both male and female post-mating functions and show molecular signatures of rapid adaptive evolution. Although Drosophila melanogaster is the dominant insect model for understanding SFP evolution, understanding the evolutionary causes and consequences of SFPs requires additional comparative analyses of closely and distantly related taxa. Although SFP identification was historically challenging, advances in label-free quantitative proteomics expand the scope for studying other systems to further advance the field. Focused studies of SFPs have so far overlooked the proteomes of male reproductive glands and their inherent complex protein networks, for which there is little information on the overall signals of molecular evolution. Here we applied label-free quantitative proteomics to identify the accessory gland proteome and secretome in Drosophila pseudoobscura, a close relative of D. melanogaster, and used the dataset to identify both known and putative novel SFPs. Using this approach, we identified 163 putative SFPs, 32% of which overlapped with previously identified D. melanogaster SFPs, and show that SFPs with known extracellular annotation evolve more rapidly than other proteins produced by or contained within the accessory gland. Our results will further the understanding of the evolution of SFPs and the underlying male accessory gland proteins that mediate reproductive fitness of the sexes.
Affiliation(s)
- Timothy L Karr
- Center for Mechanisms of Evolution, The Biodesign Institute, Arizona State University, Tempe, Arizona.
- Helen Southern
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK
- Toni I Gossmann
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK
- Rhonda R Snook
- Department of Zoology, Stockholm University, Stockholm, Sweden.
7
Homma K, Anbo H, Noguchi T, Fukuchi S. Both Intrinsically Disordered Regions and Structural Domains Evolve Rapidly in Immune-Related Mammalian Proteins. Int J Mol Sci 2018; 19:ijms19123860. [PMID: 30518031 PMCID: PMC6321239 DOI: 10.3390/ijms19123860] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/30/2018] [Revised: 12/01/2018] [Accepted: 12/02/2018] [Indexed: 01/07/2023] Open
Abstract
Eukaryotic proteins consist of structural domains (SDs) and intrinsically disordered regions (IDRs), i.e., regions that by themselves do not assume unique three-dimensional structures. IDRs are generally subject to less constraint and evolve more rapidly than SDs. Proteins with a lower number of protein-to-protein interactions (PPIs) are also less constrained and tend to evolve fast. Extracellular proteins of mammals, especially immune-related extracellular proteins, on average have relatively high evolution rates. This article aims to examine if a high evolution rate in IDRs or that in SDs accounts for the rapid evolution of extracellular proteins. To this end, we classified eukaryotic proteins based on their cellular localizations and analyzed them. Moreover, we divided proteins into SDs and IDRs and calculated the respective evolution rate. Fractional IDR content is positively correlated with evolution rate. For their fractional IDR content, immune-related extracellular proteins show an aberrantly high evolution rate. IDRs evolve more rapidly than SDs in most subcellular localizations. In extracellular proteins, however, the difference is diminished. For immune-related proteins in mammals in particular, the evolution rates in SDs come close to those in IDRs. Thus high evolution rates in both IDRs and SDs account for the rapid evolution of immune-related proteins.
Affiliation(s)
- Keiichi Homma
- Department of Life Science and Informatics, Maebashi Institute of Technology, 460-1 Kamisadori-machi, Maebashi-shi 371-0816, Japan.
- Hiroto Anbo
- Department of Life Science and Informatics, Maebashi Institute of Technology, 460-1 Kamisadori-machi, Maebashi-shi 371-0816, Japan.
- Tamotsu Noguchi
- Pharmaceutical Education Research Center, Meiji Pharmaceutical University, 2-522-1 Noshio, Kiyose-shi, Tokyo 204-8588, Japan.
- Satoshi Fukuchi
- Department of Life Science and Informatics, Maebashi Institute of Technology, 460-1 Kamisadori-machi, Maebashi-shi 371-0816, Japan.
8
Aguilar-Rodríguez J, Wagner A. Metabolic Determinants of Enzyme Evolution in a Genome-Scale Bacterial Metabolic Network. Genome Biol Evol 2018; 10:3076-3088. [PMID: 30351420 PMCID: PMC6257574 DOI: 10.1093/gbe/evy234] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Accepted: 10/22/2018] [Indexed: 11/12/2022] Open
Abstract
Different genes and proteins evolve at very different rates. To identify the factors that explain these differences is an important aspect of research in molecular evolution. One such factor is the role a protein plays in a large molecular network. Here, we analyze the evolutionary rates of enzyme-coding genes in the genome-scale metabolic network of Escherichia coli to find the evolutionary constraints imposed by the structure and function of this complex metabolic system. Central and highly connected enzymes appear to evolve more slowly than less connected enzymes, but we find that they do so as a by-product of their high abundance, and not because of their position in the metabolic network. In contrast, enzymes catalyzing reactions with high metabolic flux (high substrate-to-product conversion rates) evolve slowly even after we account for their abundance. Moreover, enzymes catalyzing reactions that are difficult to by-pass through alternative pathways, such that they are essential in many different genetic backgrounds, also evolve more slowly. Our analyses show that an enzyme's role in the function of a metabolic network affects its evolution more than its place in the network's structure. They highlight the value of a system-level perspective for studies of molecular evolution.
Affiliation(s)
- José Aguilar-Rodríguez
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Department of Biology, Stanford University, Stanford, CA and Department of Chemical and Systems Biology, Stanford University School of Medicine, Stanford, CA
- Andreas Wagner
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- The Santa Fe Institute, Santa Fe, New Mexico
9
Ghiselli F, Iannello M, Puccio G, Chang PL, Plazzi F, Nuzhdin SV, Passamonti M. Comparative Transcriptomics in Two Bivalve Species Offers Different Perspectives on the Evolution of Sex-Biased Genes. Genome Biol Evol 2018; 10:1389-1402. [PMID: 29897459 PMCID: PMC6007409 DOI: 10.1093/gbe/evy082] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Accepted: 04/19/2018] [Indexed: 12/13/2022] Open
Abstract
Comparative genomics has become a central tool for evolutionary biology, and a better knowledge of understudied taxa represents the foundation for future work. In this study, we characterized the transcriptome of male and female mature gonads in the European clam Ruditapes decussatus and compared it with that of the Manila clam Ruditapes philippinarum, providing, for the first time in bivalves, information about transcription dynamics and sequence evolution of sex-biased genes. In both species, we found a relatively low number of sex-biased genes (1,284, corresponding to 41.3% of the orthologous genes between the two species), probably due to the absence of sexual dimorphism, and the transcriptional bias is maintained in only 33% of the orthologs. The dN/dS is generally low, indicating purifying selection, with genes where the female-biased transcription is maintained between the two species showing a significantly higher dN/dS. Genes involved in embryo development, cell proliferation, and maintenance of genome stability show a faster sequence evolution. Finally, we report a lack of clear correlation between transcription level and evolutionary rate in these species, in contrast with studies that reported a negative correlation. We discuss this discrepancy and call into question some methodological approaches and rationales generally used in this type of comparative study.
Affiliation(s)
- Fabrizio Ghiselli
- Department of Biological, Geological, and Environmental Sciences, University of Bologna, Italy
- Mariangela Iannello
- Department of Biological, Geological, and Environmental Sciences, University of Bologna, Italy
- Guglielmo Puccio
- Department of Biological, Geological, and Environmental Sciences, University of Bologna, Italy
- Peter L Chang
- Program in Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, USA
- Federico Plazzi
- Department of Biological, Geological, and Environmental Sciences, University of Bologna, Italy
- Sergey V Nuzhdin
- Program in Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, USA
- Marco Passamonti
- Department of Biological, Geological, and Environmental Sciences, University of Bologna, Italy
10
Hönigschmid P, Bykova N, Schneider R, Ivankov D, Frishman D. Evolutionary Interplay between Symbiotic Relationships and Patterns of Signal Peptide Gain and Loss. Genome Biol Evol 2018; 10:928-938. [PMID: 29608732 PMCID: PMC5952966 DOI: 10.1093/gbe/evy049] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Accepted: 03/02/2018] [Indexed: 01/18/2023] Open
Abstract
Can orthologous proteins differ in terms of their ability to be secreted? To answer this question, we investigated the distribution of signal peptides within the orthologous groups of Enterobacterales. Parsimony analysis and sequence comparisons revealed a large number of signal peptide gain and loss events, in which signal peptides emerge or disappear in the course of evolution. Signal peptide losses prevail over gains, an effect which is especially pronounced in the transition from the free-living or commensal to the endosymbiotic lifestyle. The disproportionate decline in the number of signal peptide-containing proteins in endosymbionts cannot be explained by the overall reduction of their genomes. Signal peptides can be gained and lost either by acquisition/elimination of the corresponding N-terminal regions or by gradual accumulation of mutations. The evolutionary dynamics of signal peptides in bacterial proteins represents a powerful mechanism of functional diversification.
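One common way to count gain and loss events of a binary trait on a tree is Fitch parsimony. The sketch below assumes that algorithm, a rooted binary tree written as nested tuples, and invented taxon names; the abstract does not state which parsimony variant the authors used, so this is an illustration only.

```python
# Fitch parsimony (bottom-up pass) for one binary character:
# 1 = ortholog carries a signal peptide, 0 = it does not.
# Tree shape, taxon names and states are hypothetical.

def fitch(tree, states):
    """Return (possible ancestral states, minimum number of state changes)."""
    if isinstance(tree, str):              # leaf: state is observed
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:                             # children agree: no extra change
        return common, left_cost + right_cost
    # children disagree: keep both options and count one change
    return left_set | right_set, left_cost + right_cost + 1

tree = ((("free_living_1", "free_living_2"),
         ("endosymbiont_1", "endosymbiont_2")),
        "outgroup")
has_signal_peptide = {
    "free_living_1": 1, "free_living_2": 1,
    "endosymbiont_1": 0, "endosymbiont_2": 0,
    "outgroup": 1,
}

root_states, changes = fitch(tree, has_signal_peptide)
print(root_states, changes)
```

The bottom-up pass only gives the minimum number of changes and the candidate root states. Labelling each change as a gain or a loss needs a second, top-down pass that fixes one state per internal node.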
Affiliation(s)
- Peter Hönigschmid
- Department of Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising, Germany
- Nadya Bykova
- Institute for Information Transmission Problems (Kharkevich Institute), RAS, Moscow, Russia
- René Schneider
- Department of Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising, Germany
- Dmitry Ivankov
- Institute of Science and Technology Austria, Klosterneuburg, Austria
- Dmitrij Frishman
- Department of Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising, Germany; Laboratory of Bioinformatics, RASA Research Center, St. Petersburg State Polytechnical University, Russia
11
Alvarez-Ponce D, Feyertag F, Chakraborty S. Position Matters: Network Centrality Considerably Impacts Rates of Protein Evolution in the Human Protein-Protein Interaction Network. Genome Biol Evol 2017; 9:1742-1756. [PMID: 28854629 PMCID: PMC5570066 DOI: 10.1093/gbe/evx117] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Accepted: 07/01/2017] [Indexed: 02/06/2023] Open
Abstract
The proteins of any organism evolve at disparate rates. A long list of factors affecting rates of protein evolution has been identified. However, the relative importance of each factor in determining rates of protein evolution remains unresolved. The prevailing view is that evolutionary rates are dominantly determined by gene expression, and that other factors such as network centrality have only a marginal effect, if any. However, this view is largely based on analyses in yeasts, and accurately measuring the importance of the determinants of rates of protein evolution is complicated by the fact that the different factors are often correlated with each other, and by the relatively poor quality of available functional genomics data sets. Here, we use correlation, partial correlation and principal component regression analyses to measure the contributions of several factors to the variability of the rates of evolution of human proteins. For this purpose, we analyzed the entire human protein–protein interaction data set and the human signal transduction network, a network data set of exceptionally high quality, obtained by manual curation, which is expected to be virtually free from false positives. In contrast with the prevailing view, we observe that network centrality (measured as the number of physical and nonphysical interactions, betweenness, and closeness) has a considerable impact on rates of protein evolution. Surprisingly, the impact of centrality on rates of protein evolution seems to be comparable, or even superior according to some analyses, to that of gene expression. Our observations seem to be independent of potentially confounding factors and of the limitations (biases and errors) of interactomic data sets.
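As a rough illustration of two of the centrality measures named in this abstract, the following sketch computes degree and closeness on a toy undirected interaction graph. The graph, the node names and the function names are hypothetical and are not taken from the paper; betweenness is omitted for brevity, and the authors' actual pipeline may differ.

```python
from collections import deque

# Toy undirected interaction graph (hypothetical): one hub with three partners.
ppi = {
    "HUB": {"A", "B", "C"},
    "A": {"HUB"},
    "B": {"HUB"},
    "C": {"HUB"},
}

def degree(graph):
    """Degree centrality: number of direct interaction partners per node."""
    return {node: len(partners) for node, partners in graph.items()}

def closeness(graph, source):
    """Closeness: reachable nodes divided by the sum of shortest-path lengths."""
    dist = {source: 0}
    queue = deque([source])
    while queue:                           # breadth-first search from source
        node = queue.popleft()
        for partner in graph[node]:
            if partner not in dist:
                dist[partner] = dist[node] + 1
                queue.append(partner)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

print(degree(ppi))
print(closeness(ppi, "HUB"), closeness(ppi, "A"))
```

In an analysis like the one described, such per-protein centrality values would then be correlated with evolutionary rates while controlling for expression.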
12
Schumacher J, Herlyn H. Correlates of evolutionary rates in the murine sperm proteome. BMC Evol Biol 2018; 18:35. [PMID: 29580206 PMCID: PMC5870804 DOI: 10.1186/s12862-018-1157-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 06/14/2017] [Accepted: 03/19/2018] [Indexed: 01/20/2023] Open
Abstract
Background: Protein-coding genes expressed in sperm evolve at different rates. To gain deeper insight into the factors underlying this heterogeneity we examined the relative importance of a diverse set of previously described rate correlates in determining the evolution of murine sperm proteins.
Results: Using partial rank correlations we detected several major rate indicators: phyletic gene age, numbers of protein-protein interactions, and survival essentiality emerged as particularly important rate correlates in murine sperm proteins. Tissue specificity, numbers of paralogs, and untranslated region lengths also correlate significantly with sperm genes’ evolutionary rates, albeit to a lesser extent. Multifunctionality, coding sequence or average intron lengths, and mean expression level have insignificant or virtually no independent effects on evolutionary rates in murine sperm genes. Gene ontology enrichment analyses of three equally sized murine sperm protein groups classified based on their evolutionary rates indicate strongest sperm-specific functional specialization in the most quickly evolving gene class.
Conclusions: We propose a model according to which slowly evolving murine sperm proteins tend to be constrained by factors such as survival essentiality, network connectivity, and/or broad expression. In contrast, evolutionary change may arise especially in less constrained sperm proteins, which might, moreover, be prone to specialize to reproduction-related functions. Our results should be taken into account in future studies on rate variations of reproductive genes.
Electronic supplementary material: The online version of this article (10.1186/s12862-018-1157-6) contains supplementary material, which is available to authorized users.
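The partial rank correlations mentioned in the Results can be sketched as follows, assuming a first-order partial Spearman correlation (a single control variable). The toy values and variable names are made up for illustration; the study itself considered many more variables and may have used different software.

```python
from math import sqrt

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def partial_spearman(x, y, z):
    """Rank correlation of x and y after controlling for z."""
    rxy, rxz, ryz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Hypothetical per-gene values: evolutionary rate, interaction count, expression.
rate = [1, 2, 3, 4, 5]
interactions = [2, 1, 4, 3, 5]
expression = [5, 3, 1, 4, 2]
print(partial_spearman(rate, interactions, expression))
```

With more than one control variable, the same idea is usually applied recursively or by correlating the residuals of rank regressions.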
Affiliation(s)
- Julia Schumacher
- Institute of Organismic and Molecular Evolution, Anthropology, Johannes Gutenberg University, Mainz, Germany.
- Holger Herlyn
- Institute of Organismic and Molecular Evolution, Anthropology, Johannes Gutenberg University, Mainz, Germany.
13
Feyertag F, Alvarez-Ponce D. Disulfide Bonds Enable Accelerated Protein Evolution. Mol Biol Evol 2017; 34:1833-1837. [PMID: 28431018 DOI: 10.1093/molbev/msx135] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 02/07/2023] Open
Abstract
The different proteins of any proteome evolve at enormously different rates. What factors contribute to this variability, and to what extent, is still a largely open question. We hypothesized that disulfide bonds, by increasing protein stability, should make proteins' structures relatively independent of their amino acid sequences, thus acting as buffers of deleterious mutations and enabling accelerated sequence evolution. In agreement with this hypothesis, we observed that membrane proteins with disulfide bonds evolved 88% faster than those without disulfide bonds, and that extracellular proteins with disulfide bonds evolved 49% faster than those without disulfide bonds. In addition, genes encoding proteins with disulfide bonds exhibit an increased likelihood of showing signatures of positive selection. Multivariate analyses indicate that the trend is independent of a number of potentially confounding factors. The effect, however, is not observed among the longest proteins, which can become stabilized by mechanisms other than disulfide bonds.
Affiliation(s)
- Felix Feyertag
- Department of Biology, University of Nevada-Reno, Reno, NV
14
Feyertag F, Berninsone PM, Alvarez-Ponce D. Secreted Proteins Defy the Expression Level-Evolutionary Rate Anticorrelation. Mol Biol Evol 2017; 34:692-706. [PMID: 28007979 DOI: 10.1093/molbev/msw268] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 12/15/2022] Open
Abstract
The rates of evolution of the proteins of any organism vary across orders of magnitude. A primary factor influencing rates of protein evolution is expression. A strong negative correlation between expression levels and evolutionary rates (the so-called E-R anticorrelation) has been observed in virtually all studied organisms. This effect is currently attributed to the abundance-dependent fitness costs of misfolding and unspecific protein-protein interactions, among other factors. Secreted proteins are folded in the endoplasmic reticulum, a compartment where chaperones, folding catalysts, and stringent quality control mechanisms promote their correct folding and may reduce the fitness costs of misfolding. In addition, confinement of secreted proteins to the extracellular space may reduce misinteractions and their deleterious effects. We hypothesize that each of these factors (the secretory pathway quality control and extracellular location) may reduce the strength of the E-R anticorrelation. Indeed, here we show that among human proteins that are secreted to the extracellular space, rates of evolution do not correlate with protein abundances. This trend is robust to controlling for several potentially confounding factors and is also observed when analyzing protein abundance data for 6 human tissues. In addition, analysis of mRNA abundance data for 32 human tissues shows that the E-R correlation is always less negative, and sometimes nonsignificant, in secreted proteins. Similar observations were made in Caenorhabditis elegans and in Escherichia coli, and to a lesser extent in Drosophila melanogaster, Saccharomyces cerevisiae and Arabidopsis thaliana. Our observations contribute to understanding the causes of the E-R anticorrelation.
Affiliation(s)
- Felix Feyertag
- Department of Biology, University of Nevada, Reno, Reno, NV
|
15
|
Sojo V, Dessimoz C, Pomiankowski A, Lane N. Membrane Proteins Are Dramatically Less Conserved than Water-Soluble Proteins across the Tree of Life. Mol Biol Evol 2016; 33:2874-2884. [PMID: 27501943 PMCID: PMC5062322 DOI: 10.1093/molbev/msw164] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Indexed: 12/12/2022] Open
Abstract
Membrane proteins are crucial in transport, signaling, bioenergetics, catalysis, and as drug targets. Here, we show that membrane proteins have dramatically fewer detectable orthologs than water-soluble proteins, less than half in most species analyzed. This sparse distribution could reflect rapid divergence or gene loss. We find that both mechanisms operate. First, membrane proteins evolve faster than water-soluble proteins, particularly in their exterior-facing portions. Second, we demonstrate that predicted ancestral membrane proteins are preferentially lost compared with water-soluble proteins in closely related species of archaea and bacteria. These patterns are consistent across the whole tree of life, and in each of the three domains of archaea, bacteria, and eukaryotes. Our findings point to a fundamental evolutionary principle: membrane proteins evolve faster due to stronger adaptive selection in changing environments, whereas cytosolic proteins are under more stringent purifying selection in the homeostatic interior of the cell. This effect should be strongest in prokaryotes, weaker in unicellular eukaryotes (with intracellular membranes), and weakest in multicellular eukaryotes (with extracellular homeostasis). We demonstrate that this is indeed the case. Similarly, we show that extracellular water-soluble proteins exhibit an even stronger pattern of low homology than membrane proteins. These striking differences in conservation of membrane proteins versus water-soluble proteins have important implications for evolution and medicine.
Affiliation(s)
- Victor Sojo
- CoMPLEX, University College London, London, United Kingdom; Department of Genetics, Evolution and Environment, University College London, London, United Kingdom; Systems Biophysics, Faculty of Physics, Ludwig-Maximilian University of Munich, Munich, Germany
- Christophe Dessimoz
- Department of Genetics, Evolution and Environment, University College London, London, United Kingdom; Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland; Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Andrew Pomiankowski
- CoMPLEX, University College London, London, United Kingdom; Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
- Nick Lane
- CoMPLEX, University College London, London, United Kingdom; Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
|
16
|
Hsu CH, Chiang AWT, Hwang MJ, Liao BY. Proteins with Highly Evolvable Domain Architectures Are Nonessential but Highly Retained. Mol Biol Evol 2016; 33:1219-30. [DOI: 10.1093/molbev/msw006] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 12/13/2022] Open
|
17
|
Mukherjee D, Mukherjee A, Ghosh TC. Evolutionary Rate Heterogeneity of Primary and Secondary Metabolic Pathway Genes in Arabidopsis thaliana. Genome Biol Evol 2015; 8:17-28. [PMID: 26556590 PMCID: PMC4758233 DOI: 10.1093/gbe/evv217] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 12/11/2022] Open
Abstract
Primary metabolism is essential to plants for growth and development, and secondary metabolism helps plants to interact with the environment. Many plant metabolites are industrially important. These metabolites are produced by plants through complex metabolic pathways. Lack of knowledge about these pathways is hindering the successful breeding practices for these metabolites. For a better knowledge of the metabolism in plants as a whole, evolutionary rate variation of primary and secondary metabolic pathway genes is a prerequisite. In this study, evolutionary rate variation of primary and secondary metabolic pathway genes has been analyzed in the model plant Arabidopsis thaliana. Primary metabolic pathway genes were found to be more conserved than secondary metabolic pathway genes. Several factors such as gene structure, expression level, tissue specificity, multifunctionality, and domain number are the key factors behind this evolutionary rate variation. This study will help to better understand the evolutionary dynamics of plant metabolism.
Affiliation(s)
- Dola Mukherjee
- Bioinformatics Centre, Bose Institute, Kolkata, West Bengal, India
- Ashutosh Mukherjee
- Department of Botany, Vivekananda College, Thakurpukur, Kolkata, West Bengal, India
|
18
|
Badet T, Peyraud R, Raffaele S. Common protein sequence signatures associate with Sclerotinia borealis lifestyle and secretion in fungal pathogens of the Sclerotiniaceae. Front Plant Sci 2015; 6:776. [PMID: 26442085 PMCID: PMC4585107 DOI: 10.3389/fpls.2015.00776] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/23/2015] [Accepted: 09/10/2015] [Indexed: 05/25/2023]
Abstract
Fungal plant pathogens produce secreted proteins adapted to function outside fungal cells to facilitate colonization of their hosts. In many cases such as for fungi from the Sclerotiniaceae family the repertoire and function of secreted proteins remains elusive. In the Sclerotiniaceae, whereas Sclerotinia sclerotiorum and Botrytis cinerea are cosmopolitan broad host-range plant pathogens, Sclerotinia borealis has a psychrophilic lifestyle with a low optimal growth temperature, a narrow host range and geographic distribution. To spread successfully, S. borealis must synthesize proteins adapted to function in its specific environment. The search for signatures of adaptation to S. borealis lifestyle may therefore help revealing proteins critical for colonization of the environment by Sclerotiniaceae fungi. Here, we analyzed amino acids usage and intrinsic protein disorder in alignments of groups of orthologous proteins from the three Sclerotiniaceae species. We found that enrichment in Thr, depletion in Glu and Lys, and low disorder frequency in hot loops are significantly associated with S. borealis proteins. We designed an index to report bias in these properties and found that high index proteins were enriched among secreted proteins in the three Sclerotiniaceae fungi. High index proteins were also enriched in function associated with plant colonization in S. borealis, and in in planta-induced genes in S. sclerotiorum. We highlight a novel putative antifreeze protein and a novel putative lytic polysaccharide monooxygenase identified through our pipeline as candidate proteins involved in colonization of the environment. Our findings suggest that similar protein signatures associate with S. borealis lifestyle and with secretion in the Sclerotiniaceae. These signatures may be useful for identifying proteins of interest as targets for the management of plant diseases.
Affiliation(s)
- Thomas Badet
- Laboratoire des Interactions Plantes-Microorganismes, Institut National de la Recherche Agronomique, UMR441 Castanet-Tolosan, France; Laboratoire des Interactions Plantes-Microorganismes, Centre National de la Recherche Scientifique, UMR2594 Castanet-Tolosan, France
- Rémi Peyraud
- Laboratoire des Interactions Plantes-Microorganismes, Institut National de la Recherche Agronomique, UMR441 Castanet-Tolosan, France; Laboratoire des Interactions Plantes-Microorganismes, Centre National de la Recherche Scientifique, UMR2594 Castanet-Tolosan, France
- Sylvain Raffaele
- Laboratoire des Interactions Plantes-Microorganismes, Institut National de la Recherche Agronomique, UMR441 Castanet-Tolosan, France; Laboratoire des Interactions Plantes-Microorganismes, Centre National de la Recherche Scientifique, UMR2594 Castanet-Tolosan, France
|
19
|
Aird SD, Aggarwal S, Villar-Briones A, Tin MMY, Terada K, Mikheyev AS. Snake venoms are integrated systems, but abundant venom proteins evolve more rapidly. BMC Genomics 2015; 16:647. [PMID: 26315097 PMCID: PMC4552096 DOI: 10.1186/s12864-015-1832-6] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Received: 04/01/2015] [Accepted: 08/07/2015] [Indexed: 12/19/2022] Open
Abstract
Background: While many studies have shown that extracellular proteins evolve rapidly, how selection acts on them remains poorly understood. We used snake venoms to understand the interaction between ecology, expression level, and evolutionary rate in secreted protein systems. Venomous snakes employ well-integrated systems of proteins and organic constituents to immobilize prey. Venoms are generally optimized to subdue preferred prey more effectively than non-prey, and many venom protein families manifest positive selection and rapid gene family diversification. Although previous studies have illuminated how individual venom protein families evolve, how selection acts on venoms as integrated systems, is unknown.
Results: Using next-generation transcriptome sequencing and mass spectrometry, we examined microevolution in two pitvipers, allopatrically separated for at least 1.6 million years, and their hybrids. Transcriptomes of parental species had generally similar compositions in regard to protein families, but for a given protein family, the homologs present and concentrations thereof sometimes differed dramatically. For instance, a phospholipase A2 transcript comprising 73.4 % of the Protobothrops elegans transcriptome, was barely present in the P. flavoviridis transcriptome (<0.05 %). Hybrids produced most proteins found in both parental venoms. Protein evolutionary rates were positively correlated with transcriptomic and proteomic abundances, and the most abundant proteins showed positive selection. This pattern holds with the addition of four other published crotaline transcriptomes, from two more genera, and also for the recently published king cobra genome, suggesting that rapid evolution of abundant proteins may be generally true for snake venoms. Looking more broadly at Protobothrops, we show that rapid evolution of the most abundant components is due to positive selection, suggesting an interplay between abundance and adaptation.
Conclusions: Given log-scale differences in toxin abundance, which are likely correlated with biosynthetic costs, we hypothesize that as a result of natural selection, snakes optimize return on energetic investment by producing more of venom proteins that increase their fitness. Natural selection then acts on the additive genetic variance of these components, in proportion to their contributions to overall fitness. Adaptive evolution of venoms may occur most rapidly through changes in expression levels that alter fitness contributions, and thus the strength of selection acting on specific secretome components.
Affiliation(s)
- Steven D Aird
- Okinawa Institute of Science and Technology Graduate University, Tancha 1919-1, Onna-son, Kunigami-gun, Okinawa-ken, 904-0412, Japan.
- Shikha Aggarwal
- Okinawa Institute of Science and Technology Graduate University, Tancha 1919-1, Onna-son, Kunigami-gun, Okinawa-ken, 904-0412, Japan; University School of Environment Management, Guru Gobind Singh Indraprastha University, Sector 16C, Dwarka, New Delhi, 110078, India.
- Alejandro Villar-Briones
- Okinawa Institute of Science and Technology Graduate University, Tancha 1919-1, Onna-son, Kunigami-gun, Okinawa-ken, 904-0412, Japan.
- Mandy Man-Ying Tin
- Okinawa Institute of Science and Technology Graduate University, Tancha 1919-1, Onna-son, Kunigami-gun, Okinawa-ken, 904-0412, Japan.
- Kouki Terada
- Okinawa Prefectural Institute of Health and the Environment, Biology and Ecology Group, 2003 Ozato, Ozato, Nanjo-shi, Okinawa, 901-1202, Japan.
- Alexander S Mikheyev
- Okinawa Institute of Science and Technology Graduate University, Tancha 1919-1, Onna-son, Kunigami-gun, Okinawa-ken, 904-0412, Japan; Research School of Biology, Australian National University, Canberra, ACT 0200, Australia.
|
20
|
Abstract
The rate and mechanism of protein sequence evolution have been central questions in evolutionary biology since the 1960s. Although the rate of protein sequence evolution depends primarily on the level of functional constraint, exactly what determines functional constraint has remained unclear. The increasing availability of genomic data has enabled much needed empirical examinations on the nature of functional constraint. These studies found that the evolutionary rate of a protein is predominantly influenced by its expression level rather than functional importance. A combination of theoretical and empirical analyses has identified multiple mechanisms behind these observations and demonstrated a prominent role in protein evolution of selection against errors in molecular and cellular processes.
Affiliation(s)
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, 830 North University Avenue, Ann Arbor, Michigan 48109, USA
- Jian-Rong Yang
- Department of Ecology and Evolutionary Biology, University of Michigan, 830 North University Avenue, Ann Arbor, Michigan 48109, USA
|
21
|
Echave J, Jackson EL, Wilke CO. Relationship between protein thermodynamic constraints and variation of evolutionary rates among sites. Phys Biol 2015; 12:025002. [PMID: 25787027 PMCID: PMC4391963 DOI: 10.1088/1478-3975/12/2/025002] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Indexed: 01/17/2023]
Abstract
Evolutionary-rate variation among sites within proteins depends on functional and biophysical properties that constrain protein evolution. It is generally accepted that proteins must be able to fold stably in order to function. However, the relationship between stability constraints and among-sites rate variation is not well understood. Here, we present a biophysical model that links the thermodynamic stability changes due to mutations at sites in proteins ([Formula: see text]) to the rate at which mutations accumulate at those sites over evolutionary time. We find that such a 'stability model' generally performs well, displaying correlations between predicted and empirically observed rates of up to 0.75 for some proteins. We further find that our model has comparable predictive power as does an alternative, recently proposed 'stress model' that explains evolutionary-rate variation among sites in terms of the excess energy needed for mutants to adopt the correct active structure ([Formula: see text]). The two models make distinct predictions, though, and for some proteins the stability model outperforms the stress model and vice versa. We conclude that both stability and stress constrain site-specific sequence evolution in proteins.
|
22
|
Mikheyev AS, Linksvayer TA. Genes associated with ant social behavior show distinct transcriptional and evolutionary patterns. eLife 2015; 4:e04775. [PMID: 25621766 PMCID: PMC4383337 DOI: 10.7554/elife.04775] [Citation(s) in RCA: 59] [Impact Index Per Article: 6.6] [Received: 09/15/2014] [Accepted: 01/23/2015] [Indexed: 11/24/2022] Open
Abstract
Studies of the genetic basis and evolution of complex social behavior emphasize either conserved or novel genes. To begin to reconcile these perspectives, we studied how the evolutionary conservation of genes associated with social behavior depends on regulatory context, and whether genes associated with social behavior exist in distinct regulatory and evolutionary contexts. We identified modules of co-expressed genes associated with age-based division of labor between nurses and foragers in the ant Monomorium pharaonis, and we studied the relationship between molecular evolution, connectivity, and expression. Highly connected and expressed genes were more evolutionarily conserved, as expected. However, compared to the rest of the genome, forager-upregulated genes were much more highly connected and conserved, while nurse-upregulated genes were less connected and more evolutionarily labile. Our results indicate that the genetic architecture of social behavior includes both highly connected and conserved components as well as loosely connected and evolutionarily labile components. DOI:http://dx.doi.org/10.7554/eLife.04775.001
Animal species vary widely in their degree of social behavior. Some species live solitarily, and others, such as ants and humans, form large societies. Many researchers have tried to understand the genetic changes underlying the evolution of social behavior. Some researchers suggest that it involves recycling existing genes that also have other conserved functions. Others propose that the evolution of social behavior involves completely new genes that are not found in related but solitary species.
Ants are one of the best-studied social animals. An established colony can contain many 1000s of individuals that live and work together and perform different roles. The queen's job is to lay eggs, while the worker ants do everything else, including collecting food, caring for the young, and protecting the colony. In some species of ant—including the pharaoh ant—a worker's role changes as it ages. Younger workers tend to stay in the nest and nurse the brood, while older workers tend to leave the nest and forage for food.
Mikheyev and Linksvayer asked: which genes are responsible for this age-based division of labor? And how did this aspect of social behavior evolve? First, after observing pharaoh ants from two colonies set up in the laboratory, they confirmed that workers nursing the brood were on average almost a week younger than those seen collecting food. Next Mikheyev and Linksvayer identified which genes were expressed in ants of different ages, or ants engaged in different tasks. Specific sets of genes were expressed more (or ‘up-regulated’) in nurse workers, while others were up-regulated in foraging workers.
Mikheyev and Linksvayer then investigated how rapidly these genes had evolved by comparing them to related genes found in other social insects (fire ants and honey bees). They also determined the ‘connectivity’ of these genes by asking how many other genes showed similar expression patterns. In many organisms, how rapidly a gene evolves depends on how tightly connected its expression is to the expression of other genes; highly connected genes evolve more slowly.
The genes that were expressed more in the older foraging workers were both more highly connected and more evolutionarily conserved in the other social insects. Genes that were up-regulated in the younger nurse workers were more loosely connected and rapidly evolving.
Mikheyev and Linksvayer's findings show that the evolution of social behavior in animals involves both new genes, which tend to be loosely connected, and conserved genes, which tend to be more highly connected. DOI:http://dx.doi.org/10.7554/eLife.04775.002
Affiliation(s)
- Alexander S Mikheyev
- Ecology and Evolution Unit, Okinawa Institute of Science and Technology, Okinawa, Japan
|
24
|
Chuang TJ, Chiang TW. Impacts of pretranscriptional DNA methylation, transcriptional transcription factor, and posttranscriptional microRNA regulations on protein evolutionary rate. Genome Biol Evol 2014; 6:1530-41. [PMID: 24923326 PMCID: PMC4080426 DOI: 10.1093/gbe/evu124] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 12/24/2022] Open
Abstract
Gene expression is largely regulated by DNA methylation, transcription factor (TF), and microRNA (miRNA) before, during, and after transcription, respectively. Although the evolutionary effects of TF/miRNA regulations have been widely studied, evolutionary analysis of simultaneously accounting for DNA methylation, TF, and miRNA regulations and whether promoter methylation and gene body (coding regions) methylation have different effects on the rate of gene evolution remain uninvestigated. Here, we compared human–macaque and human–mouse protein evolutionary rates against experimentally determined single base-resolution DNA methylation data, revealing that promoter methylation level is positively correlated with protein evolutionary rates but negatively correlated with TF/miRNA regulations, whereas the opposite was observed for gene body methylation level. Our results showed that the relative importance of these regulatory factors in determining the rate of mammalian protein evolution is as follows: Promoter methylation ≈ miRNA regulation > gene body methylation > TF regulation, and further indicated that promoter methylation and miRNA regulation have a significant dependent effect on protein evolutionary rates. Although the mechanisms underlying cooperation between DNA methylation and TFs/miRNAs in gene regulation remain unclear, our study helps to not only illuminate the impact of these regulatory factors on mammalian protein evolution but also their intricate interaction within gene regulatory networks.
Affiliation(s)
- Trees-Juen Chuang
- Division of Physical & Computational Genomics, Genomics Research Center, Academia Sinica, Taipei, Taiwan
- Tai-Wei Chiang
- Division of Physical & Computational Genomics, Genomics Research Center, Academia Sinica, Taipei, Taiwan
|
25
|
Chang TY, Liao BY. Flagellated algae protein evolution suggests the prevalence of lineage-specific rules governing evolutionary rates of eukaryotic proteins. Genome Biol Evol 2013; 5:913-22. [PMID: 23563973 PMCID: PMC3673635 DOI: 10.1093/gbe/evt055] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/29/2023] Open
Abstract
Understanding the general rules governing the rate of protein evolution is fundamental to evolutionary biology. However, attempts to address this issue in yeasts and mammals have revealed considerable differences in the relative importance of determinants for protein evolutionary rates. This phenomenon was previously explained by the fact that yeasts and mammals are different in many cellular and genomic properties. Flagellated algae species have several cellular and genomic characteristics that are intermediate between yeasts and mammals. Using partial correlation analyses on the evolution of 6,921 orthologous proteins from Chlamydomonas reinhardtii and Volvox carteri, we examined factors influencing evolutionary rates of proteins in flagellated algae. Previous studies have shown that mRNA abundance and gene compactness are strong determinants for protein evolutionary rates in yeasts and mammals, respectively. We show that both factors also influence algae protein evolution with mRNA abundance having a larger impact than gene compactness on the rates of algae protein evolution. More importantly, among all the factors examined, coding sequence (CDS) length has the strongest (positive) correlation with protein evolutionary rates. This correlation between CDS length and the rates of protein evolution is not due to alignment-related issues or domain density. These results suggest no simple and universal rules governing protein evolutionary rates across different eukaryotic lineages. Instead, gene properties influence the rate of protein evolution in a lineage-specific manner.
Affiliation(s)
- Ting-Yan Chang
- Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Taiwan, Republic of China
|
26
|
Wei W, Zhang T, Lin D, Yang ZJ, Guo FB. Transcriptional abundance is not the single force driving the evolution of bacterial proteins. BMC Evol Biol 2013; 13:162. [PMID: 23914835 PMCID: PMC3734234 DOI: 10.1186/1471-2148-13-162] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 03/19/2013] [Accepted: 08/01/2013] [Indexed: 11/20/2022] Open
Abstract
Background: Despite rapid progress in understanding the mechanisms that shape the evolution of proteins, the relative importance of various factors remain to be elucidated. In this study, we have assessed the effects of 16 different biological features on the evolutionary rates (ERs) of protein-coding sequences in bacterial genomes.
Results: Our analysis of 18 bacterial species revealed new correlations between ERs and constraining factors. Previous studies have suggested that transcriptional abundance overwhelmingly constrains the evolution of yeast protein sequences. This transcriptional abundance leads to selection against misfolding or misinteractions. In this study we found that there was no single factor in determining the evolution of bacterial proteins. Not only transcriptional abundance (codon adaptation index and expression level), but also protein-protein associations (PPAs), essentiality (ESS), subcellular localization of cytoplasmic membrane (SLM), transmembrane helices (TMH) and hydropathicity score (HS) independently and significantly affected the ERs of bacterial proteins. In some species, PPA and ESS demonstrate higher correlations with ER than transcriptional abundance.
Conclusions: Different forces drive the evolution of protein sequences in yeast and bacteria. In bacteria, the constraints are involved in avoiding a build-up of toxic molecules caused by misfolding/misinteraction (transcriptional abundance), while retaining important functions (ESS, PPA) and maintaining the cell membrane (SLM, TMH and HS). Each of these independently contributes to the variation in protein evolution.
Affiliation(s)
- Wen Wei
- Center of Bioinformatics and Key Laboratory for NeuroInformation of the Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, 610054 Chengdu, China
|
27
|
Chen YC, Cheng JH, Tsai ZTY, Tsai HK, Chuang TJ. The impact of trans-regulation on the evolutionary rates of metazoan proteins. Nucleic Acids Res 2013; 41:6371-80. [PMID: 23658220 PMCID: PMC3711421 DOI: 10.1093/nar/gkt349] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/07/2013] [Revised: 04/10/2013] [Accepted: 04/14/2013] [Indexed: 11/13/2022] Open
Abstract
Transcription factor (TF) and microRNA (miRNA) are two crucial trans-regulatory factors that coordinately control gene expression. Understanding the impacts of these two factors on the rate of protein sequence evolution is of great importance in evolutionary biology. While many biological factors associated with evolutionary rate variations have been studied, evolutionary analysis of simultaneously accounting for TF and miRNA regulations across metazoans is still uninvestigated. Here, we provide a series of statistical analyses to assess the influences of TF and miRNA regulations on evolutionary rates across metazoans (human, mouse and fruit fly). Our results reveal that the negative correlations between trans-regulation and evolutionary rates hold well across metazoans, but the strength of TF regulation as a rate indicator becomes weak when the other confounding factors that may affect evolutionary rates are controlled. We show that miRNA regulation tends to be a more essential indicator of evolutionary rates than TF regulation, and the combination of TF and miRNA regulations has a significant dependent effect on protein evolutionary rates. We also show that trans-regulation (especially miRNA regulation) is much more important in human/mouse than in fruit fly in determining protein evolutionary rates, suggesting a considerable variation in rate determinants between vertebrates and invertebrates.
Affiliation(s)
- Yi-Ching Chen
- Institute of Information Science, Academia Sinica, Taipei 115, Taiwan; Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei 115, Taiwan; and Genomic Research Center, Academia Sinica, Taipei 115, Taiwan
- Jen-Hao Cheng
- Institute of Information Science, Academia Sinica, Taipei 115, Taiwan; Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei 115, Taiwan; and Genomic Research Center, Academia Sinica, Taipei 115, Taiwan
- Zing Tsung-Yeh Tsai
- Institute of Information Science, Academia Sinica, Taipei 115, Taiwan; Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei 115, Taiwan; and Genomic Research Center, Academia Sinica, Taipei 115, Taiwan
- Huai-Kuang Tsai
- Institute of Information Science, Academia Sinica, Taipei 115, Taiwan; Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei 115, Taiwan; and Genomic Research Center, Academia Sinica, Taipei 115, Taiwan
- Trees-Juen Chuang
- Institute of Information Science, Academia Sinica, Taipei 115, Taiwan; Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei 115, Taiwan; and Genomic Research Center, Academia Sinica, Taipei 115, Taiwan
28
|
Guo YL. Gene family evolution in green plants with emphasis on the origination and evolution of Arabidopsis thaliana genes. The Plant Journal: for Cell and Molecular Biology 2013; 73:941-51. [PMID: 23216999 DOI: 10.1111/tpj.12089] [Citation(s) in RCA: 88] [Impact Index Per Article: 8.0] [Received: 05/22/2012] [Revised: 11/01/2012] [Accepted: 11/23/2012] [Indexed: 05/04/2023]
Abstract
Gene family size variation is an important mechanism that shapes the natural variation for adaptation in various species. Despite its importance, the pattern of gene family size variation in green plants is still not well understood. In particular, the evolutionary pattern of genes and gene families remains unknown in the model plant Arabidopsis thaliana in the context of green plants. In this study, eight representative genomes of green plants are sampled to study gene family evolution and characterize the origination of A. thaliana genes, respectively. Four important insights gained are that: (i) the rate of gene gains and losses is about 0.001359 per gene every million years, similar to the rate in yeast, Drosophila, and mammals; (ii) some gene families evolved rapidly with extreme expansions or contractions, and 2745 gene families present in all the eight species represent the 'core' proteome of green plants; (iii) 70% of A. thaliana genes could be traced back to 450 million years ago; and (iv) intriguingly, A. thaliana genes with early origination are under stronger purifying selection and more conserved. In summary, the present study provides genome-wide insights into evolutionary history and mechanisms of genes and gene families in green plants and especially in A. thaliana.
Affiliation(s)
- Ya-Long Guo
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China.
29
|
Differential requirements for mRNA folding partially explain why highly expressed proteins evolve slowly. Proc Natl Acad Sci U S A 2013; 110:E678-86. [PMID: 23382244 DOI: 10.1073/pnas.1218066110] [Citation(s) in RCA: 85] [Impact Index Per Article: 7.7] [Indexed: 11/18/2022] Open
Abstract
The cause of the tremendous among-protein variation in the rate of sequence evolution is a central subject of molecular evolution. Expression level has been identified as a leading determinant of this variation among genes encoded in the same genome, but the underlying mechanisms are not fully understood. We here propose and demonstrate that a requirement for stronger folding of more abundant mRNAs results in slower evolution of more highly expressed genes and proteins. Specifically, we show that: (i) the higher the expression level of a gene, the greater the selective pressure for its mRNA to fold; (ii) random mutations are more likely to decrease mRNA folding when occurring in highly expressed genes than in lowly expressed genes; and (iii) amino acid substitution rate is negatively correlated with mRNA folding strength, with or without the control of expression level. Furthermore, synonymous (d(S)) and nonsynonymous (d(N)) nucleotide substitution rates are both negatively correlated with mRNA folding strength. However, counterintuitively, d(S) and d(N) are differentially constrained by selection for mRNA folding, resulting in a significant correlation between mRNA folding strength and d(N)/d(S), even when gene expression level is controlled. The direction and magnitude of this correlation are determined primarily by the G+C frequency at third codon positions. Together, these findings explain why highly expressed genes evolve slowly, demonstrate a major role of natural selection at the mRNA level in constraining protein evolution, and reveal a previously unrecognized and unexpected form of nonprotein-level selection that impacts d(N)/d(S).
30
|
Pérez-Bercoff Å, Hudson CM, Conant GC. A conserved mammalian protein interaction network. PLoS One 2013; 8:e52581. [PMID: 23320073 PMCID: PMC3539715 DOI: 10.1371/journal.pone.0052581] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 06/13/2012] [Accepted: 11/20/2012] [Indexed: 11/19/2022] Open
Abstract
Physical interactions between proteins mediate a variety of biological functions, including signal transduction, physical structuring of the cell and regulation. While extensive catalogs of such interactions are known from model organisms, their evolutionary histories are difficult to study given the lack of interaction data from phylogenetic outgroups. Using phylogenomic approaches, we infer an upper bound on the time of origin for a large set of human protein-protein interactions, showing that most such interactions appear relatively ancient, dating no later than the radiation of placental mammals. By analyzing paired alignments of orthologous and putatively interacting protein-coding genes from eight mammals, we find evidence for weak but significant co-evolution, as measured by relative selective constraint, between pairs of genes with interacting proteins. However, we find no strong evidence for shared instances of directional selection within an interacting pair. Finally, we use a network approach to show that the distribution of selective constraint across the protein interaction network is non-random, with a clear tendency for interacting proteins to share similar selective constraints. Collectively, the results suggest that, on the whole, protein interactions in mammals are under selective constraint, presumably due to their functional roles.
Affiliation(s)
- Åsa Pérez-Bercoff
- Smurfit Institute of Genetics, University of Dublin, Trinity College, Dublin, Ireland
- Corey M. Hudson
- Informatics Institute, University of Missouri, Columbia, Missouri, United States of America
- Gavin C. Conant
- Informatics Institute, University of Missouri, Columbia, Missouri, United States of America
- Division of Animal Sciences, University of Missouri, Columbia, Missouri, United States of America
31
|
Liao BY, Weng MP. Natural selection drives rapid evolution of mouse embryonic heart enhancers. BMC Systems Biology 2012; 6 Suppl 2:S1. [PMID: 23281795 PMCID: PMC3521173 DOI: 10.1186/1752-0509-6-s2-s1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/22/2022]
Abstract
Background: Mouse E11.5 embryonic heart enhancers were found to exhibit exceptionally weak sequence conservation during vertebrate evolution compared to enhancers of other developing organs. However, it is unknown whether this phenomenon is due to elevated mutation rates, or is a consequence of natural selection. Results: In this study, based on the aligned orthologous genomic sequences of mouse and other closely related mammals, the substitution rates of fourfold degenerate sites or intron sequences in neighboring genes were used as neutral references to normalize substitution rates of mouse enhancers. Subsequent comparisons indicated that heart enhancers' evolutionary rates were increased by natural selection. Correspondingly, the results of Fisher's exact tests to examine the differential enrichment of substitutions between enhancers and neutral sequences suggest that both relaxed purifying selection and positive selection caused the rapid evolution of heart enhancers. Analyses on recombination rates and substitution patterns indicated that GC-biased gene conversion does not contribute to evolutionary rate variations among enhancers. In general, pleiotropic enhancers and enhancers in proximity to weakly expressed genes tend to evolve slowly. Although heart enhancers are less pleiotropic and are adjacent to highly expressed genes, these biases do not account for the rapid evolution observed. Conclusions: In combination, the results of the present study suggest that factors associated with functions or characteristics of the tissue may exert direct and profound effects on the intensity and direction of the natural selection applied to regulatory DNAs, such as enhancers.
Affiliation(s)
- Ben-Yang Liao
- Division of Biostatistics & Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan Town, Miaoli County 350, Taiwan, ROC.
32
|
Singh PP, Affeldt S, Cascone I, Selimoglu R, Camonis J, Isambert H. On the expansion of "dangerous" gene repertoires by whole-genome duplications in early vertebrates. Cell Rep 2012; 2:1387-98. [PMID: 23168259 DOI: 10.1016/j.celrep.2012.09.034] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 04/12/2012] [Revised: 09/17/2012] [Accepted: 09/27/2012] [Indexed: 10/27/2022] Open
Abstract
The emergence and evolutionary expansion of gene families implicated in cancers and other severe genetic diseases is an evolutionary oddity from a natural selection perspective. Here, we show that gene families prone to deleterious mutations in the human genome have been preferentially expanded by the retention of "ohnolog" genes from two rounds of whole-genome duplication (WGD) dating back from the onset of jawed vertebrates. We further demonstrate that the retention of many ohnologs suspected to be dosage balanced is in fact indirectly mediated by their susceptibility to deleterious mutations. This enhanced retention of "dangerous" ohnologs, defined as prone to autosomal-dominant deleterious mutations, is shown to be a consequence of WGD-induced speciation and the ensuing purifying selection in post-WGD species. These findings highlight the importance of WGD-induced nonadaptive selection for the emergence of vertebrate complexity, while rationalizing, from an evolutionary perspective, the expansion of gene families frequently implicated in genetic disorders and cancers.
Affiliation(s)
- Param Priya Singh
- CNRS UMR168, UPMC, Institut Curie, Research Center, 26, rue d'Ulm, 75248 Paris, France
33
|
Alvarez-Ponce D. The relationship between the hierarchical position of proteins in the human signal transduction network and their rate of evolution. BMC Evol Biol 2012; 12:192. [PMID: 23020283 PMCID: PMC3527147 DOI: 10.1186/1471-2148-12-192] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 05/25/2012] [Accepted: 09/14/2012] [Indexed: 11/23/2022] Open
Abstract
Background: Proteins evolve at disparate rates, as a result of the action of different types and strengths of evolutionary forces. An open question in evolutionary biology is what factors are responsible for this variability. In general, proteins whose function has a great impact on organisms’ fitness are expected to evolve under stronger selective pressures. In biosynthetic pathways, upstream genes usually evolve under higher levels of selective constraint than those acting at the downstream part, as a result of their higher hierarchical position. Similar observations have been made in transcriptional regulatory networks, whose upstream elements appear to be more essential and subject to selection. Less well understood is, however, how selective pressures distribute along signal transduction pathways.
Results: Here, I combine comparative genomics and directed protein interaction data to study the distribution of evolutionary forces across the human signal transduction network. Surprisingly, no evidence was found for higher levels of selective constraint at the upstream network genes (those occupying more hierarchical positions). On the contrary, purifying selection was found to act more strongly on genes acting at the downstream part of the network, which seems to be due to downstream genes being more highly and broadly expressed, performing certain functions and, in particular, encoding proteins that are more highly connected in the protein–protein interaction network. When the effect of these confounding factors is discounted, upstream and downstream genes evolve at similar rates. The trends found in the overall signaling network are exemplified by analysis of the distribution of purifying selection along the mammalian Ras signaling pathway, showing that upstream and downstream genes evolve at similar rates.
Conclusions: These results indicate that the upstream/downstream position of proteins in the signal transduction network has, in general, no direct effect on their rates of evolution, suggesting that upstream and downstream genes are similarly important for the function of the network. This implies that natural selection differently distributes across signal transduction networks and across biosynthetic and transcriptional regulatory networks, which might reflect fundamental differences in their function and organization.
Affiliation(s)
- David Alvarez-Ponce
- Department of Biology, National University of Ireland Maynooth, Maynooth, County Kildare, Ireland.
34
|
Wu GCT, Chen FC. Determinants of exon-level evolutionary rates in Arabidopsis species. Evol Bioinform Online 2012; 8:389-415. [PMID: 22844194 PMCID: PMC3399485 DOI: 10.4137/ebo.s9743] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/05/2022] Open
Abstract
What causes the variations in evolutionary rates is fundamental to molecular evolution. However, in plants, the causes of within-gene evolutionary rate variations remain underexplored. Here we use principal component regression to examine the contributions of eleven exon features to the within-gene variations in nonsynonymous substitution rate (d(N)), synonymous substitution rate (d(S)), and the d(N)/d(S) ratio in Arabidopsis species. We demonstrate that exon features related to protein structural-functional constraints and mRNA splicing account for the largest proportions of within-gene variations in d(N)/d(S) and d(N). Meanwhile, for d(S), a combination of expression level, exon length, and structural-functional features explains the largest proportion of within-gene variances. Our results suggest that the determinants of within-gene variations differ from those of between-gene variations in evolutionary rates. Furthermore, the relative importance of different exon features also differs between plants and animals. Our study thus may shed new light on the evolution of plant genes.
Affiliation(s)
- Gideon C-T Wu
- Graduate Institute of Life Sciences, National Defense Medical Center, 114 Taiwan
35
|
Chen FC, Liao BY, Pan CL, Lin HY, Chang AYF. Assessing determinants of exonic evolutionary rates in mammals. Mol Biol Evol 2012; 29:3121-9. [PMID: 22504521 DOI: 10.1093/molbev/mss116] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 12/17/2022] Open
Abstract
From studies investigating the differences in evolutionary rates between genes, gene compactness and gene expression level have been identified as important determinants of gene-level protein evolutionary rate, as represented by nonsynonymous to synonymous substitution rate (d(N)/d(S)) ratio. However, the causes of exon-level variances in d(N)/d(S) are less understood. Here, we use principal component regression to examine to what extent 13 exon features explain the variance in d(N), d(S), and the d(N)/d(S) ratio of human-rhesus macaque or human-mouse orthologous exons. The exon features were grouped into six functional categories: expression features, mRNA splicing features, structural-functional features, compactness features, exon duplicability, and other features, including G + C content and exon length. Although expression features are important for determining d(N) and d(N)/d(S) between exons of different genes, structural-functional features and splicing features explained more of the variance for exons of the same genes. Furthermore, we show that compactness features can explain only a relatively small percentage of variance in exon-level d(N) or d(N)/d(S) in either between-gene or within-gene comparison. By contrast, d(S) yielded inconsistent results in the human-mouse comparison and the human-rhesus macaque comparison. This inconsistency may suggest rapid evolutionary changes of the mutation landscape in mammals. Our results suggest that between-gene and within-gene variation in d(N)/d(S) (and d(N)) are driven by different evolutionary forces and that the role of mRNA splicing in causing the variation in evolutionary rates of coding sequences may be underappreciated.
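The principal component regression named in this abstract can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the function name `pcr_fit`, the feature matrix, and the simulated rate are all assumptions introduced here, and the abstract does not state how many components were retained. Features are standardized, rotated onto principal components, and the response is regressed on the leading component scores; because the scores are mutually orthogonal, the explained variance splits cleanly per component.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression (illustrative helper, not from the paper).

    Returns the regression coefficient of each retained component and the
    fraction of the variance in y that each component explains.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize features and center the response.
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    # Rows of Vt are the principal axes of the standardized features.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T
    # Least-squares regression of the response on the component scores.
    coef = np.linalg.lstsq(scores, yc, rcond=None)[0]
    # Orthogonal scores let the explained sum of squares decompose by component.
    explained = coef**2 * (scores**2).sum(axis=0) / (yc**2).sum()
    return coef, explained

# Synthetic stand-in for 13 exon features and one rate measure (hypothetical data).
rng = np.random.default_rng(1)
n_exons, n_features = 500, 13
features = rng.normal(size=(n_exons, n_features))
rate = features[:, 0] - 0.5 * features[:, 3] + rng.normal(size=n_exons)

coef, explained = pcr_fit(features, rate, n_components=5)
_, explained_all = pcr_fit(features, rate, n_components=n_features)
print("variance explained by the first 5 components:", explained.sum())
print("variance explained by all components:", explained_all.sum())
```

With every component retained, the summed explained variance equals the R² of an ordinary least-squares fit on the standardized features, which is a convenient sanity check on the decomposition.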
Affiliation(s)
- Feng-Chi Chen
- Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Miaoli County, Taiwan, Republic of China.
36
|
Protein misinteraction avoidance causes highly expressed proteins to evolve slowly. Proc Natl Acad Sci U S A 2012; 109:E831-40. [PMID: 22416125 DOI: 10.1073/pnas.1117408109] [Citation(s) in RCA: 129] [Impact Index Per Article: 10.8] [Indexed: 11/18/2022] Open
Abstract
The tempo and mode of protein evolution have been central questions in biology. Genomic data have shown a strong influence of the expression level of a protein on its rate of sequence evolution (E-R anticorrelation), which is currently explained by the protein misfolding avoidance hypothesis. Here, we show that this hypothesis does not fully explain the E-R anticorrelation, especially for protein surface residues. We propose that natural selection against protein-protein misinteraction, which wastes functional molecules and is potentially toxic, constrains the evolution of surface residues. Because highly expressed proteins are under stronger pressures to avoid misinteraction, surface residues are expected to show an E-R anticorrelation. Our molecular-level evolutionary simulation and yeast genomic analysis confirm multiple predictions of the hypothesis. These findings show a pluralistic origin of the E-R anticorrelation and reveal the role of protein misinteraction, an inherent property of complex cellular systems, in constraining protein evolution.
37
|
|
38
|
Abstract
Despite our extensive knowledge about the rate of protein sequence evolution for thousands of genes in hundreds of species, the corresponding rate of protein function evolution is virtually unknown, especially at the genomic scale. This lack of knowledge is primarily because of the huge diversity in protein function and the consequent difficulty in gauging and comparing rates of protein function evolution. Nevertheless, most proteins function through interacting with other proteins, and protein-protein interaction (PPI) can be tested by standard assays. Thus, the rate of protein function evolution may be measured by the rate of PPI evolution. Here, we experimentally examine 87 potential interactions between Kluyveromyces waltii proteins, whose one-to-one orthologs in the related budding yeast Saccharomyces cerevisiae have been reported to interact. Combining our results with available data from other eukaryotes, we estimate that the evolutionary rate of protein interaction is (2.6 ± 1.6) × 10^-10 per PPI per year, which is three orders of magnitude lower than the rate of protein sequence evolution measured by the number of amino acid substitutions per protein per year. The extremely slow evolution of protein molecular function may account for the remarkable conservation of life at molecular and cellular levels and allow for studying the mechanistic basis of human disease in much simpler organisms.
39
|
Hudson CM, Conant GC. Expression level, cellular compartment and metabolic network position all influence the average selective constraint on mammalian enzymes. BMC Evol Biol 2011; 11:89. [PMID: 21470417 PMCID: PMC3082228 DOI: 10.1186/1471-2148-11-89] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 11/12/2010] [Accepted: 04/06/2011] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND: A gene's position in regulatory, protein interaction or metabolic networks can be predictive of the strength of purifying selection acting on it, but these relationships are neither universal nor invariably strong. Following work in bacteria, fungi and invertebrate animals, we explore the relationship between selective constraint and metabolic function in mammals. RESULTS: We measure the association between selective constraint, estimated by the ratio of nonsynonymous (Ka) to synonymous (Ks) substitutions, and several, primarily metabolic, measures of gene function. We find significant differences between the selective constraints acting on enzyme-coding genes from different cellular compartments, with the nucleus showing higher constraint than genes from either the cytoplasm or the mitochondria. Among metabolic genes, the centrality of an enzyme in the metabolic network is significantly correlated with Ka/Ks. In contrast to yeasts, gene expression magnitude does not appear to be the primary predictor of selective constraint in these organisms. CONCLUSIONS: Our results imply that the relationship between selective constraint and enzyme centrality is complex: the strength of selective constraint acting on mammalian genes is quite variable and does not appear to exclusively follow patterns seen in other organisms.
Affiliation(s)
- Corey M Hudson
- Informatics Institute, University of Missouri, Columbia, MO, USA.
40
|
Chen SCC, Chuang TJ, Li WH. The relationships among microRNA regulation, intrinsically disordered regions, and other indicators of protein evolutionary rate. Mol Biol Evol 2011; 28:2513-20. [PMID: 21398349 DOI: 10.1093/molbev/msr068] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Indexed: 12/21/2022] Open
Abstract
Many indicators of protein evolutionary rate have been proposed, but some of them are interrelated. The purpose of this study is to disentangle their correlations. We assess the strength of each indicator by controlling for the other indicators under study. We find that the number of microRNA (miRNA) types that regulate a gene is the strongest rate indicator (a negative correlation), followed by disorder content (the percentage of disordered regions in a protein, a positive correlation); the strength of disorder content as a rate indicator is substantially increased after controlling for the number of miRNA types. By dividing proteins into lowly and highly intrinsically disordered proteins (L-IDPs and H-IDPs), we find that proteins interacting with more H-IDPs tend to evolve more slowly, which largely explains the previous observation of a negative correlation between the number of protein-protein interactions and evolutionary rate. Moreover, all of the indicators examined here, except for the number of miRNA types, have different strengths in L-IDPs and in H-IDPs. Finally, the number of phosphorylation sites is weakly correlated with the number of miRNA types, and its strength as a rate indicator is substantially reduced when other indicators are considered. Our study reveals the relative strength of each rate indicator and increases our understanding of protein evolution.
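One common way to assess an indicator "by controlling for the other indicators" is a partial correlation; the abstract does not say which procedure the authors used, so the sketch below is an assumption. The helper `partial_corr` and the synthetic variables `confounder`, `indicator`, and `rate` are hypothetical and carry no data from the paper. Both variables are regressed on the covariates, and the residuals are then correlated.

```python
import numpy as np

def partial_corr(x, y, covariates):
    """Pearson correlation of x and y after removing the linear effect of covariates."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: an intercept column plus the covariates to control for.
    Z = np.column_stack([np.ones(len(x)), np.asarray(covariates, dtype=float)])
    # Residuals of the least-squares fits of x and y on the covariates.
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return float(np.corrcoef(rx, ry)[0, 1])

# Synthetic example: the indicator tracks the rate only through a shared confounder.
rng = np.random.default_rng(0)
n = 2000
confounder = rng.normal(size=n)
indicator = 0.8 * confounder + rng.normal(size=n)
rate = -1.0 * confounder + rng.normal(size=n)

raw = float(np.corrcoef(indicator, rate)[0, 1])
controlled = partial_corr(indicator, rate, confounder)
print("raw correlation:", raw)
print("partial correlation:", controlled)
```

In this constructed case the raw correlation is clearly negative while the partial correlation is close to zero, which mirrors how an apparent rate indicator can weaken once interrelated indicators are taken into account.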
Affiliation(s)
- Sean Chun-Chang Chen
- Institute of BioMedical Informatics, National Yang-Ming University, Taipei, Taiwan
41
|
Yang JR, Zhuang SM, Zhang J. Impact of translational error-induced and error-free misfolding on the rate of protein evolution. Mol Syst Biol 2011; 6:421. [PMID: 20959819 PMCID: PMC2990641 DOI: 10.1038/msb.2010.78] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.9] [Received: 05/11/2010] [Accepted: 08/31/2010] [Indexed: 11/26/2022] Open
Abstract
Theoretical calculations suggest that, in addition to translational error-induced protein misfolding, a non-negligible fraction of misfolded proteins are error free. We propose that the anticorrelation between the expression level of a protein and its rate of sequence evolution be explained by an overarching protein-misfolding-avoidance hypothesis that includes selection against both error-induced and error-free protein misfolding, and verify this model by a molecular-level evolutionary simulation. We provide strong empirical evidence for the protein-misfolding-avoidance hypothesis, including a positive correlation between protein expression level and stability, enrichment of misfolding-minimizing codons and amino acids in highly expressed genes, and stronger evolutionary conservation of residues in which nonsynonymous changes are more likely to increase protein misfolding.
The rate of protein sequence evolution has long been of central interest to molecular evolutionists. Different proteins of the same species evolve at vastly different rates, which is commonly explained by a variation in functional constraint among different proteins (Kimura and Ohta, 1974). However, it is unclear how to quantify the functional constraint of a protein from the knowledge of its function. In the past decade, various types of genomic data from model organisms have been examined to look for the determinants of the rate of protein sequence evolution. The most unexpected discovery was a very strong anticorrelation between the expression level and evolutionary rate of a protein (E–R anticorrelation) (Pal et al, 2001). The prevailing explanation of the E–R anticorrelation is the translational robustness hypothesis (Drummond et al, 2005). This hypothesis posits that mistranslation induces protein misfolding, which is toxic to cells (Figure 1). Consequently, highly expressed proteins are under stronger pressures to be translationally robust and thus are more constrained in sequence evolution. However, the impact of the other source of misfolded proteins, translational error-free proteins (Figure 1), has not been evaluated. By theoretical calculation, computer simulation, and empirical data analysis, we examined the role of selection against both error-induced and error-free protein misfolding in creating the E–R correlation. Our theoretical calculations suggested that a non-negligible fraction of misfolded proteins are error free. We estimated that when a protein is not very stable, on average ∼20% of misfolded molecules are error free. However, when a protein is very stable, this fraction reduces to ∼5%, which is probably a result of natural selection against protein misfolding. 
We conducted a molecular-level evolutionary simulation (Figure 2A) using three different schemes: error-induced misfolding only, error-free misfolding only, and both types of misfolding. As expected, results from the first simulation are similar to those from a previous study that considers only error-induced misfolding (Drummond and Wilke, 2008). Interestingly, the second and third simulations can also generate the same patterns, including a positive correlation between the protein expression level and the unfolding energy (ΔG) of the error-free protein (Figure 2B), a negative correlation between the expression level and the fraction of protein molecules that misfold after being mistranslated (Figure 2C), a negative correlation between ΔG and the evolutionary rate (Figure 2D), and a negative correlation between the expression level and the evolutionary rate (i.e., the E–R anticorrelation) (Figure 2E). Furthermore, we found that selection against protein misfolding is more effective in reducing error-free misfolding than error-induced misfolding. Based on these results, we propose that an overarching protein-misfolding-avoidance hypothesis that includes both sources of misfolding is superior to the prevailing translational robustness hypothesis, which considers only error-induced misfolding.
We tested three key predictions of the protein-misfolding-avoidance hypothesis using yeast data. First, we showed that, consistent with our prediction, a positive correlation exists between the protein expression level and stability, which is measured by the unfolding energy or melting temperature. In addition, protein expression level is negatively correlated with protein aggregation propensity. Second, we found that codons minimizing protein misfolding are used more frequently in highly expressed proteins than in lowly expressed ones. Third, we showed that, within the same protein, amino acid residues in which random nonsynonymous mutations are more likely to increase protein misfolding are evolutionarily more conserved. Together, these results provide unambiguous evidence that avoidance of both error-induced and error-free protein misfolding is a major source of the E–R anticorrelation and that protein stability and mistranslation have important roles in protein evolution.
What determines the rate of protein evolution is a fundamental question in biology. Recent genomic studies revealed a surprisingly strong anticorrelation between the expression level of a protein and its rate of sequence evolution. This observation is currently explained by the translational robustness hypothesis in which the toxicity of translational error-induced protein misfolding selects for higher translational robustness of more abundant proteins, which constrains sequence evolution. However, the impact of error-free protein misfolding has not been evaluated. We estimate that a non-negligible fraction of misfolded proteins are error free and demonstrate by a molecular-level evolutionary simulation that selection against protein misfolding results in a greater reduction of error-free misfolding than error-induced misfolding. Thus, an overarching protein-misfolding-avoidance hypothesis that includes both sources of misfolding is superior to the translational robustness hypothesis. We show that misfolding-minimizing amino acids are preferentially used in highly abundant yeast proteins and that these residues are evolutionarily more conserved than other residues of the same proteins. These findings provide unambiguous support to the role of protein-misfolding-avoidance in determining the rate of protein sequence evolution.
Affiliation(s)
- Jian-Rong Yang
- Key Laboratory of Gene Engineering of the Ministry of Education, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, PR China