1
Potera K, Tomala K. Using yeasts for the studies of nonfunctional factors in protein evolution. Yeast 2024. [PMID: 38895906 DOI: 10.1002/yea.3970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 05/08/2024] [Accepted: 06/06/2024] [Indexed: 06/21/2024] Open
Abstract
The evolution of protein sequence is driven not only by factors directly related to protein function and shape but also by nonfunctional factors. Such factors in protein evolution might be categorized as those connected to energetic costs, synthesis efficiency, and avoidance of misfolding and toxicity. A common approach to studying them is correlational analysis contrasting them with some characteristics of the protein, like amino acid composition, but these features are interdependent. To avoid possible bias, empirical studies are needed, and not enough work has been done to date. In this review, we describe the role of nonfunctional factors in protein evolution and present an experimental approach using yeast as a suitable model organism. The focus of the proposed approach is on the potential negative impact on the fitness of mutations that change protein properties not related to function and the frequency of mutations that change these properties. Experimental results of testing the misfolding avoidance hypothesis as an explanation for why highly expressed proteins evolve slowly are inconsistent with correlational research results. Therefore, more efforts should be made to empirically test the effects of nonfunctional factors in protein evolution and to contrast these results with the results of the correlational analysis approach.
Affiliation(s)
- Katarzyna Potera
- Faculty of Biology, Institute of Environmental Sciences, Jagiellonian University, Krakow, Poland
- Doctoral School of Exact and Natural Sciences, Jagiellonian University, Krakow, Poland
- Katarzyna Tomala
- Faculty of Biology, Institute of Environmental Sciences, Jagiellonian University, Krakow, Poland
2
Jain V, Cope AL. Examining the Effects of Temperature on the Evolution of Bacterial tRNA Pools. Genome Biol Evol 2024; 16:evae116. [PMID: 38805023 PMCID: PMC11166485 DOI: 10.1093/gbe/evae116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 05/13/2024] [Accepted: 05/21/2024] [Indexed: 05/29/2024] Open
Abstract
The genetic code consists of 61 codons coding for 20 amino acids. These codons are recognized by transfer RNAs (tRNAs) that bind to specific codons during protein synthesis. All organisms utilize less than all 61 possible anticodons due to base pair wobble: the ability to have a mismatch with a codon at its third nucleotide. Previous studies observed a correlation between the tRNA pool of bacteria and the temperature of their respective environments. However, it is unclear if these patterns represent biological adaptations to maintain the efficiency and accuracy of protein synthesis in different environments. A mechanistic mathematical model of mRNA translation is used to quantify the expected elongation rates and error rate for each codon based on an organism's tRNA pool. A comparative analysis across a range of bacteria that accounts for covariance due to shared ancestry is performed to quantify the impact of environmental temperature on the evolution of the tRNA pool. We find that thermophiles generally have more anticodons represented in their tRNA pool than mesophiles or psychrophiles. Based on our model, this increased diversity is expected to lead to increased missense errors. The implications of this for protein evolution in thermophiles are discussed.
Affiliation(s)
- Vatsal Jain
- Biotechnology High School, Freehold, NJ, USA
- Alexander L Cope
- Department of Genetics, Rutgers University, Piscataway, NJ, USA
- Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ, USA
- Robert Wood Johnson Medical School, Rutgers University, Piscataway, NJ, USA
3
Xu L, Ren Y, Wu J, Cui T, Dong R, Huang C, Feng Z, Zhang T, Yang P, Yuan J, Xu X, Liu J, Wang J, Chen W, Mi D, Irwin DM, Yan Y, Xu L, Yu X, Li G. Evolution and expression patterns of the neo-sex chromosomes of the crested ibis. Nat Commun 2024; 15:1670. [PMID: 38395916 PMCID: PMC10891136 DOI: 10.1038/s41467-024-46052-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Accepted: 02/08/2024] [Indexed: 02/25/2024] Open
Abstract
Bird sex chromosomes play a unique role in sex-determination, and affect the sexual morphology and behavior of bird species. Core waterbirds, a major clade of birds, share the common characteristics of being sexually monomorphic and having lower levels of inter-sexual conflict, yet their sex chromosome evolution remains poorly understood. Here, we analyse a chromosome-level assembly of a female crested ibis (Nipponia nippon), a typical core waterbird. We identify neo-sex chromosomes resulting from fusion of microchromosomes with ancient sex chromosomes. These fusion events likely occurred following the divergence of Threskiornithidae and Ardeidae. The neo-W chromosome of the crested ibis exhibits the characteristics of slow degradation, which is reflected in its retention of abundant gametologous genes. Neo-W chromosome genes display an apparent ovary-biased gene expression, which is largely driven by genes that are retained on the crested ibis W chromosome but lost in other bird species. These results provide new insights into the evolutionary history and expression patterns for the sex chromosomes of bird species.
Affiliation(s)
- Lulu Xu
- College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Yandong Ren
- College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Jiahong Wu
- MOE Key Laboratory of Freshwater Fish Reproduction and Development, School of Life Sciences, Southwest University, Chongqing, China
- Tingting Cui
- College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Rong Dong
- Research Center for Qinling Giant Panda, Shaanxi Academy of Forestry, Xi'an, China
- Chen Huang
- College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Zhe Feng
- College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Tianmin Zhang
- College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Peng Yang
- College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Jiaqing Yuan
- College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Xiao Xu
- College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Jiao Liu
- MOE Key Laboratory of Freshwater Fish Reproduction and Development, School of Life Sciences, Southwest University, Chongqing, China
- Jinhong Wang
- College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Wu Chen
- Guangzhou Wildlife Research Center, Guangzhou Zoo, Guangzhou, China
- Da Mi
- Xi'an Haorui Genomics Technology Co., LTD, Xi'an, China
- David M Irwin
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, M5S 1A8, Canada
- Yaping Yan
- College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Luohao Xu
- MOE Key Laboratory of Freshwater Fish Reproduction and Development, School of Life Sciences, Southwest University, Chongqing, China
- Xiaoping Yu
- College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Gang Li
- College of Life Sciences, Shaanxi Normal University, Xi'an, China
- Guangzhou Wildlife Research Center, Guangzhou Zoo, Guangzhou, China
4
Liu H, Sun M, Zhang J. Genomic estimates of mutation and substitution rates contradict the evolutionary speed hypothesis of the latitudinal diversity gradient. Proc Biol Sci 2023; 290:20231787. [PMID: 37876195 PMCID: PMC10598419 DOI: 10.1098/rspb.2023.1787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 09/26/2023] [Indexed: 10/26/2023] Open
Abstract
The latitudinal diversity gradient (LDG) refers to a decrease in biodiversity from the equator to the poles. The evolutionary speed hypothesis, backed by the metabolic theory of ecology, asserts that nucleotide mutation and substitution rates per site per year are higher and thereby speciation rates are higher at higher temperatures, generating the LDG. However, prior empirical investigations of the relationship between the temperature and mutation or substitution rate were based on a few genes and the results were mixed. We here revisit this relationship using genomic data. No significant correlation between the temperature and mutation rate is found in 13 prokaryotes or in 107 eukaryotes. An analysis of 234 diverse trios of bacterial taxa indicates that the synonymous substitution rate is not significantly associated with the growth temperature. The same data, however, reveal a significant negative association between the nonsynonymous substitution rate and temperature, which is explainable by a larger fraction of detrimental nonsynonymous mutations at higher temperatures due to a stronger demand for protein stability. We conclude that the evolutionary speed hypothesis of the LDG is unsupported by genomic data and advise that future mechanistic studies of the LDG should focus on other hypotheses.
Affiliation(s)
- Haoxuan Liu
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
- Center for Evolutionary & Organismal Biology and the Fourth Affiliated Hospital of Zhejiang University, Zhejiang University School of Medicine, Hangzhou 310058, People's Republic of China
- Mengyi Sun
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
5
Jain V, Cope AL. Determining the effects of temperature on the evolution of bacterial tRNA pools. bioRxiv 2023:2023.09.26.559538. [PMID: 37873246 PMCID: PMC10592612 DOI: 10.1101/2023.09.26.559538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
The genetic code consists of 61 codons coding for 20 amino acids. These codons are recognized by transfer RNAs (tRNAs) that bind to specific codons during protein synthesis. Most organisms utilize fewer than all 61 possible anticodons due to base pair wobble: the ability to have a mismatch with a codon at its third nucleotide. Previous studies observed a correlation between the tRNA pool of bacteria and the temperature of their respective environments. However, it is unclear if these patterns represent biological adaptations to maintain the efficiency and accuracy of protein synthesis in different environments. A mechanistic mathematical model of mRNA translation is used to quantify the expected elongation rates and error rate for each codon based on an organism's tRNA pool. A comparative analysis across a range of bacteria that accounts for covariance due to shared ancestry is performed to quantify the impact of environmental temperature on the evolution of the tRNA pool. We find that thermophiles generally have more anticodons represented in their tRNA pool than mesophiles or psychrophiles. Based on our model, this increased diversity is expected to lead to increased missense errors. The implications of this for protein evolution in thermophiles are discussed.
Affiliation(s)
- Vatsal Jain
- Biotechnology High School, Freehold, New Jersey
- Alexander L. Cope
- Department of Genetics, Rutgers University, Piscataway, New Jersey
- Human Genetics Institute of New Jersey, Rutgers University, Piscataway, New Jersey
6
Luzuriaga-Neira AR, Ritchie AM, Payne BL, Carrillo-Parramon O, Liberles DA, Alvarez-Ponce D. Highly Abundant Proteins Are Highly Thermostable. Genome Biol Evol 2023; 15:evad112. [PMID: 37399326 DOI: 10.1093/gbe/evad112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/08/2023] [Indexed: 07/05/2023] Open
Abstract
Highly abundant proteins tend to evolve slowly (a trend called E-R anticorrelation), and a number of hypotheses have been proposed to explain this phenomenon. The misfolding avoidance hypothesis attributes the E-R anticorrelation to the abundance-dependent toxic effects of protein misfolding. To avoid these toxic effects, protein sequences (particularly those of highly expressed proteins) would be under selection to fold properly. One prediction of the misfolding avoidance hypothesis is that highly abundant proteins should exhibit high thermostability (i.e., a highly negative free energy of folding, ΔG). Thus far, only a handful of analyses have tested for a relationship between protein abundance and thermostability, producing contradictory results. These analyses have been limited by 1) the scarcity of ΔG data, 2) the fact that these data have been obtained by different laboratories and under different experimental conditions, 3) the problems associated with using proteins' melting temperature (Tm) as a proxy for ΔG, and 4) the difficulty of controlling for potentially confounding variables. Here, we use computational methods to compare the free energy of folding of pairs of human-mouse orthologous proteins with different expression levels. Even though the effect size is limited, the most highly expressed ortholog is often the one with a more negative ΔG of folding, indicating that highly expressed proteins are often more thermostable.
Affiliation(s)
- Andrew M Ritchie
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, Pennsylvania, USA
- David A Liberles
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, Pennsylvania, USA
7
Zhang J. What Has Genomics Taught An Evolutionary Biologist? Genomics Proteomics Bioinformatics 2023; 21:1-12. [PMID: 36720382 PMCID: PMC10373158 DOI: 10.1016/j.gpb.2023.01.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Revised: 01/06/2023] [Accepted: 01/19/2023] [Indexed: 01/30/2023]
Abstract
Genomics, an interdisciplinary field of biology on the structure, function, and evolution of genomes, has revolutionized many subdisciplines of life sciences, including my field of evolutionary biology, by supplying huge data, bringing high-throughput technologies, and offering a new approach to biology. In this review, I describe what I have learned from genomics and highlight the fundamental knowledge and mechanistic insights gained. I focus on three broad topics that are central to evolutionary biology and beyond (variation, interaction, and selection) and use primarily my own research and study subjects as examples. In the next decade or two, I expect that the most important contributions of genomics to evolutionary biology will be to provide genome sequences of nearly all known species on Earth, facilitate high-throughput phenotyping of natural variants and systematically constructed mutants for mapping genotype-phenotype-fitness landscapes, and assist the determination of causality in evolutionary processes using experimental evolution.
Affiliation(s)
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
8
Bédard C, Cisneros AF, Jordan D, Landry CR. Correlation between protein abundance and sequence conservation: what do recent experiments say? Curr Opin Genet Dev 2022; 77:101984. [PMID: 36162152 DOI: 10.1016/j.gde.2022.101984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Revised: 08/23/2022] [Accepted: 08/26/2022] [Indexed: 01/27/2023]
Abstract
Cells evolve in a space of parameter values set by physical and chemical forces. These constraints create associations among cellular properties. A particularly strong association is the negative correlation between the rate of evolution of proteins and their abundance in the cell. Highly expressed proteins evolve slower than lowly expressed ones. Multiple hypotheses have been put forward to explain this relationship, including, for instance, the requirement for higher mRNA stability, misfolding avoidance, and misinteraction avoidance for highly expressed proteins. Here, we review some of these hypotheses, their predictions, and how they are supported to finally discuss recent experiments that have been performed to test these predictions.
Affiliation(s)
- Camille Bédard
- Département de Biologie, Faculté des Sciences et de Génie, Université Laval, G1V 0A6, Canada; Institut de Biologie Intégrative et des Systèmes, Université Laval, G1V 0A6, Canada; PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, G1V 0A6, Canada; Centre de Recherche sur les Données Massives, Université Laval, G1V 0A6, Canada. https://twitter.com/@CamilleBed17
- Angel F Cisneros
- Institut de Biologie Intégrative et des Systèmes, Université Laval, G1V 0A6, Canada; PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, G1V 0A6, Canada; Centre de Recherche sur les Données Massives, Université Laval, G1V 0A6, Canada; Département de Biochimie, de Microbiologie et de Bio-informatique, Faculté des Sciences et de Génie, Université Laval, G1V 0A6, Canada. https://twitter.com/@AngelFCC119
- David Jordan
- Institut de Biologie Intégrative et des Systèmes, Université Laval, G1V 0A6, Canada; PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, G1V 0A6, Canada; Centre de Recherche sur les Données Massives, Université Laval, G1V 0A6, Canada; Département de Biochimie, de Microbiologie et de Bio-informatique, Faculté des Sciences et de Génie, Université Laval, G1V 0A6, Canada. https://twitter.com/@DavidJordan1997
- Christian R Landry
- Département de Biologie, Faculté des Sciences et de Génie, Université Laval, G1V 0A6, Canada; Institut de Biologie Intégrative et des Systèmes, Université Laval, G1V 0A6, Canada; PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, G1V 0A6, Canada; Centre de Recherche sur les Données Massives, Université Laval, G1V 0A6, Canada; Département de Biochimie, de Microbiologie et de Bio-informatique, Faculté des Sciences et de Génie, Université Laval, G1V 0A6, Canada.
9
Shibai A, Kotani H, Sakata N, Furusawa C, Tsuru S. Purifying selection enduringly acts on the sequence evolution of highly expressed proteins in Escherichia coli. G3 (Bethesda) 2022; 12:6694045. [PMID: 36073932 PMCID: PMC9635659 DOI: 10.1093/g3journal/jkac235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2022] [Accepted: 08/27/2022] [Indexed: 11/17/2022]
Abstract
The evolutionary speed of a protein sequence is constrained by its expression level, with highly expressed proteins evolving relatively slowly. This negative correlation between expression levels and evolutionary rates (known as the E–R anticorrelation) has already been widely observed in past macroevolution between species from bacteria to animals. However, it remains unclear whether this seemingly general law also governs recent evolution, including past and de novo, within a species. However, the advent of genomic sequencing and high-throughput phenotyping, particularly for bacteria, has revealed fundamental gaps between the 2 evolutionary processes and has provided empirical data opposing the possible underlying mechanisms which are widely believed. These conflicts raise questions about the generalization of the E–R anticorrelation and the relevance of plausible mechanisms. To explore the ubiquitous impact of expression levels on molecular evolution and test the relevance of the possible underlying mechanisms, we analyzed the genome sequences of 99 strains of Escherichia coli for evolution within species in nature. We also analyzed genomic mutations accumulated under laboratory conditions as a model of de novo evolution within species. Here, we show that E–R anticorrelation is significant in both past and de novo evolution within species in E. coli. Our data also confirmed ongoing purifying selection on highly expressed genes. Ongoing selection included codon-level purifying selection, supporting the relevance of the underlying mechanisms. However, the impact of codon-level purifying selection on the constraints in evolution within species might be smaller than previously expected from evolution between species.
Affiliation(s)
- Atsushi Shibai
- Center for Biosystems Dynamics Research (BDR), RIKEN, Osaka 565-0874, Japan
- Hazuki Kotani
- Center for Biosystems Dynamics Research (BDR), RIKEN, Osaka 565-0874, Japan
- Natsue Sakata
- Center for Biosystems Dynamics Research (BDR), RIKEN, Osaka 565-0874, Japan
- Chikara Furusawa
- Center for Biosystems Dynamics Research (BDR), RIKEN, Osaka 565-0874, Japan
- Universal Biology Institute, School of Science, The University of Tokyo, Tokyo 113-0033, Japan
- Saburo Tsuru
- Universal Biology Institute, School of Science, The University of Tokyo, Tokyo 113-0033, Japan
10
Low protein expression enhances phenotypic evolvability by intensifying selection on folding stability. Nat Ecol Evol 2022; 6:1155-1164. [PMID: 35798838 PMCID: PMC7613228 DOI: 10.1038/s41559-022-01797-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/17/2021] [Accepted: 05/19/2022] [Indexed: 01/09/2023]
Abstract
Protein abundance affects the evolution of protein genotypes, but we do not know how it affects the evolution of protein phenotypes. Here we investigate the role of protein abundance in the evolvability of green fluorescent protein (GFP) towards the novel phenotype of cyan fluorescence. We evolve GFP in E. coli through multiple cycles of mutation and selection and show that low GFP expression facilitates the evolution of cyan fluorescence. A computational model whose predictions we test experimentally helps explain why: lowly expressed proteins are under stronger selection for proper folding, which facilitates their evolvability on short evolutionary time scales. The reason is that high fluorescence can be achieved by either few proteins that fold well or by many proteins that fold less well. In other words, we observe a synergy between a protein's scarcity and its stability. Because many proteins meet the essential requirements for this scarcity-stability synergy, it may be a widespread mechanism by which low expression helps proteins evolve new phenotypes and functions.
11
Expression level is a major modifier of the fitness landscape of a protein coding gene. Nat Ecol Evol 2021; 6:103-115. [PMID: 34795386 DOI: 10.1038/s41559-021-01578-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/22/2021] [Accepted: 10/01/2021] [Indexed: 11/09/2022]
Abstract
The phenotypic consequence of a genetic mutation depends on many factors including the expression level of a gene. However, a comprehensive quantification of this expression effect is still lacking, as is a further general mechanistic understanding of the effect. Here, we measured the fitness effect of almost all (>97.5%) single-nucleotide mutations in GFP, an exogenous gene with no physiological function, and URA3, a conditionally essential gene. Both genes were driven by two promoters whose expression levels differed by around tenfold. The resulting fitness landscapes revealed that the fitness effects of at least 42% of all single-nucleotide mutations within the genes were expression dependent. Although only a small fraction of variation in fitness effects among different mutations can be explained by biophysical properties of the protein and messenger RNA of the gene, our analyses revealed that the avoidance of stochastic molecular errors generally underlies the expression dependency of mutational effects and suggested protein misfolding as the most important type of molecular error among those examined. Our results therefore directly explained the slower evolution of highly expressed genes and highlighted cytotoxicity due to stochastic molecular errors as a non-negligible component for understanding the phenotypic consequence of mutations.
12
Sarkar C, Alvarez-Ponce D. Extracellular domains of transmembrane proteins defy the expression level-evolutionary rate anticorrelation. Genome Biol Evol 2021; 14:6402012. [PMID: 34665250 PMCID: PMC8755491 DOI: 10.1093/gbe/evab235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/14/2021] [Indexed: 11/13/2022] Open
Abstract
Highly expressed proteins tend to evolve slowly, a trend known as the expression level-rate of evolution (E-R) anticorrelation. Whereas the reasons for this anticorrelation remain unclear, the most influential hypotheses attribute it to highly expressed proteins being subjected to strong selective pressures to avoid misfolding and/or misinteraction. In accordance with these hypotheses, work in our laboratory has recently shown that extracellular (secreted) proteins lack an E-R anticorrelation (or exhibit a weaker than usual E-R anticorrelation). Extracellular proteins are folded inside the endoplasmic reticulum, where enhanced quality control of folding mechanisms exist, and function in the extracellular space, where misinteraction is unlikely to occur or to produce deleterious effects. Transmembrane proteins contain both intracellular domains (which are folded and function in the cytosol) and extracellular domains (which complete their folding in the endoplasmic reticulum and function in the extracellular space). We thus hypothesized that the extracellular domains of transmembrane proteins should exhibit a weaker E-R anticorrelation than their intracellular domains. Our analyses of human, Saccharomyces and Arabidopsis transmembrane proteins allowed us to confirm our hypothesis. Our results are in agreement with models attributing the E-R anticorrelation to the deleterious effects of misfolding and/or misinteraction.
Affiliation(s)
- Chandra Sarkar
- Department of Biology, University of Nevada, Reno, NV, USA
13
Biesiadecka MK, Sliwa P, Tomala K, Korona R. An Overexpression Experiment Does Not Support the Hypothesis That Avoidance of Toxicity Determines the Rate of Protein Evolution. Genome Biol Evol 2021; 12:589-596. [PMID: 32259256 PMCID: PMC7250497 DOI: 10.1093/gbe/evaa067] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Accepted: 04/01/2020] [Indexed: 12/22/2022] Open
Abstract
The misfolding avoidance hypothesis postulates that sequence mutations render proteins cytotoxic and therefore the higher the gene expression, the stronger the operation of selection against substitutions. This translates into prediction that relative toxicity of extant proteins is higher for those evolving faster. In the present experiment, we selected pairs of yeast genes which were paralogous but evolving at different rates. We expressed them artificially to high levels. We expected that toxicity would be higher for ones bearing more mutations, especially that overcrowding should rather exacerbate than reverse the already existing differences in misfolding rates. We did find that the applied mode of overexpression caused a considerable decrease in fitness and that the decrease was proportional to the amount of excessive protein. However, it was not higher for proteins which are normally expressed at lower levels (and have less conserved sequence). This result was obtained consistently, regardless whether the rate of growth or ability to compete in common cultures was used as a proxy for fitness. In additional experiments, we applied factors that reduce accuracy of translation or enhance structural instability of proteins. It did not change a consistent pattern of independence between the fitness cost caused by overexpression of a protein and the rate of its sequence evolution.
Affiliation(s)
- Piotr Sliwa
- Department of Genetics, Faculty of Biotechnology, University of Rzeszów, Poland
- Katarzyna Tomala
- Institute of Environmental Sciences, Faculty of Biology, Jagiellonian University, Cracow, Poland
- Ryszard Korona
- Institute of Environmental Sciences, Faculty of Biology, Jagiellonian University, Cracow, Poland
14
Maddamsetti R. Universal Constraints on Protein Evolution in the Long-Term Evolution Experiment with Escherichia coli. Genome Biol Evol 2021; 13:evab070. [PMID: 33856016 PMCID: PMC8233687 DOI: 10.1093/gbe/evab070] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 03/30/2021] [Indexed: 12/18/2022] Open
Abstract
Although it is well known that abundant proteins evolve slowly across the tree of life, there is little consensus for why this is true. Here, I report that abundant proteins evolve slowly in the hypermutator populations of Lenski's long-term evolution experiment with Escherichia coli (LTEE). Specifically, the density of all observed mutations per gene, as measured in metagenomic time series covering 60,000 generations of the LTEE, significantly anticorrelates with mRNA abundance, protein abundance, and degree of protein-protein interaction. The same pattern holds for nonsynonymous mutation density. However, synonymous mutation density, measured across the LTEE hypermutator populations, positively correlates with protein abundance. These results show that universal constraints on protein evolution are visible in data spanning three decades of experimental evolution. Therefore, it should be possible to design experiments to answer why abundant proteins evolve slowly.
Collapse
Affiliation(s)
- Rohan Maddamsetti
- Department of Biomedical Engineering, Duke University, Durham, North Carolina, USA
15
Razban RM, Dasmeh P, Serohijos AWR, Shakhnovich EI. Avoidance of protein unfolding constrains protein stability in long-term evolution. Biophys J 2021; 120:2413-2424. [PMID: 33932438 PMCID: PMC8390877 DOI: 10.1016/j.bpj.2021.03.042] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/16/2020] [Revised: 02/24/2021] [Accepted: 03/17/2021] [Indexed: 11/28/2022]
Abstract
Every amino acid residue can influence a protein's overall stability, making stability highly susceptible to change throughout evolution. We consider the distribution of protein stabilities evolutionarily permittable under two previously reported protein fitness functions: flux dynamics and misfolding avoidance. We develop an evolutionary dynamics theory and find that it agrees better with an extensive protein stability data set for dihydrofolate reductase orthologs under the misfolding avoidance fitness function rather than the flux dynamics fitness function. Further investigation with ribonuclease H data demonstrates that not any misfolded state is avoided; rather, it is only the unfolded state. At the end, we discuss how our work pertains to the universal protein abundance-evolutionary rate correlation seen across organisms' proteomes. We derive a closed-form expression relating protein abundance to evolutionary rate that captures Escherichia coli, Saccharomyces cerevisiae, and Homo sapiens experimental trends without fitted parameters.
Affiliation(s)
- Rostam M Razban
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts
- Pouria Dasmeh
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts; Departement de Biochimie, Université de Montréal, Montreal, Quebec, Canada
- Eugene I Shakhnovich
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts.
16
Wei C, Chen YM, Chen Y, Qian W. The Missing Expression Level-Evolutionary Rate Anticorrelation in Viruses Does Not Support Protein Function as a Main Constraint on Sequence Evolution. Genome Biol Evol 2021; 13:evab049. [PMID: 33713114 PMCID: PMC7989579 DOI: 10.1093/gbe/evab049] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Accepted: 03/06/2021] [Indexed: 12/13/2022]
Abstract
One of the central goals in molecular evolutionary biology is to determine the sources of variation in the rate of sequence evolution among proteins. Gene expression level is widely accepted as the primary determinant of protein evolutionary rate, because it scales with the extent of selective constraints imposed on a protein, leading to the well-known negative correlation between expression level and protein evolutionary rate (the E-R anticorrelation). Selective constraints have been hypothesized to entail the maintenance of protein function, the avoidance of cytotoxicity caused by protein misfolding or nonspecific protein-protein interactions, or both. However, empirical tests evaluating the relative importance of these hypotheses remain scarce, likely due to the nontrivial difficulties in distinguishing the effect of a deleterious mutation on a protein's function versus its cytotoxicity. We realized that examining the sequence evolution of viral proteins could overcome this hurdle. It is because purifying selection against mutations in a viral protein that result in cytotoxicity per se is likely relaxed, whereas purifying selection against mutations that impair viral protein function persists. Multiple analyses of SARS-CoV-2 and nine other virus species revealed a complete absence of any E-R anticorrelation. As a control, the E-R anticorrelation does exist in human endogenous retroviruses where purifying selection against cytotoxicity is present. Taken together, these observations do not support the maintenance of protein function as the main constraint on protein sequence evolution in cellular organisms.
Affiliation(s)
- Changshuo Wei
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Yan-Ming Chen
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Ying Chen
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing, China
- Wenfeng Qian
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
17
Specificity of RNA Folding and Its Association with Evolutionarily Adaptive mRNA Secondary Structures. GENOMICS PROTEOMICS & BIOINFORMATICS 2021; 19:882-900. [PMID: 33607297 PMCID: PMC9403030 DOI: 10.1016/j.gpb.2019.11.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2018] [Revised: 08/03/2019] [Accepted: 11/08/2019] [Indexed: 11/23/2022]
Abstract
The secondary structure is a fundamental feature of both noncoding and messenger RNAs. However, our understanding of the secondary structure of mRNA, especially that of the coding regions, remains elusive, likely due to translation and the lack of RNA-binding proteins that sustain the consensus structure, such as those that bind to noncoding RNA. Indeed, mRNA has recently been found to adopt diverse alternative structures, the overall functional significance of which remains untested. We hereby approached this problem by estimating the folding specificity, i.e., the probability that a fragment of RNA folds back to the same partner once refolded. We showed that the folding specificity of mRNA is lower than that of noncoding RNA and exhibits moderate evolutionary conservation. Notably, we found that specific rather than alternative folding is likely evolutionarily adaptive since specific folding is frequently associated with functionally important genes or sites within a gene. Additional analysis in combination with ribosome density suggests the ability to modulate ribosome movement as one potential functional advantage provided by specific folding. Our findings revealed a novel facet of the RNA structurome with important functional and evolutionary implications and indicated a potential method for distinguishing the mRNA secondary structures maintained by natural selection from molecular noise.
18
Usmanova DR, Plata G, Vitkup D. The Relationship between the Misfolding Avoidance Hypothesis and Protein Evolutionary Rates in the Light of Empirical Evidence. Genome Biol Evol 2021; 13:6081017. [PMID: 33432359 PMCID: PMC7874998 DOI: 10.1093/gbe/evab006] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Accepted: 01/07/2021] [Indexed: 12/14/2022]
Abstract
For more than a decade, the misfolding avoidance hypothesis (MAH) and related theories have dominated evolutionary discussions aimed at explaining the variance of the molecular clock across cellular proteins. In this study, we use various experimental data to further investigate the consistency of the MAH predictions with empirical evidence. We also critically discuss experimental results that motivated the MAH development and that are often viewed as evidence of its major contribution to the variability of protein evolutionary rates. We demonstrate, in Escherichia coli and Homo sapiens, the lack of a substantial negative correlation between protein evolutionary rates and Gibbs free energies of unfolding, a direct measure of protein stability. We then analyze multiple new genome-scale data sets characterizing protein aggregation and interaction propensities, the properties that are likely optimized in evolution to alleviate deleterious effects associated with toxic protein misfolding and misinteractions. Our results demonstrate that the propensity of proteins to aggregate, the fraction of charged amino acids, and protein stickiness do correlate with protein abundances. Nevertheless, across multiple organisms and various data sets we do not observe substantial correlations between proteins’ aggregation- and stability-related properties and evolutionary rates. Therefore, diverse empirical data support the conclusion that the MAH and similar hypotheses do not play a major role in mediating a strong negative correlation between protein expression and the molecular clock, and thus in explaining the variability of evolutionary rates across cellular proteins.
Affiliation(s)
- Dinara R Usmanova
- Department of Systems Biology, Columbia University, New York, NY, USA
- Germán Plata
- Department of Systems Biology, Columbia University, New York, NY, USA; Elanco Animal Health, Greenfield, IN, USA
- Dennis Vitkup
- Department of Systems Biology, Columbia University, New York, NY, USA; Department of Biomedical Informatics, Columbia University, New York, NY, USA
19
Evans P, Cox NJ, Gamazon ER. The regulatory genome constrains protein sequence evolution: implications for the search for disease-associated genes. PeerJ 2020; 8:e9554. [PMID: 32765967 PMCID: PMC7380284 DOI: 10.7717/peerj.9554] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2020] [Accepted: 06/24/2020] [Indexed: 11/20/2022]
Abstract
The development of explanatory models of protein sequence evolution has broad implications for our understanding of cellular biology, population history, and disease etiology. Here we analyze the GTEx transcriptome resource to quantify the effect of the transcriptome on protein sequence evolution in a multi-tissue framework. We find substantial variation among the central nervous system tissues in the effect of expression variance on evolutionary rate, with highly variable genes in the cortex showing significantly greater purifying selection than highly variable genes in subcortical regions (Mann-Whitney U p = 1.4 × 10⁻⁴). The remaining tissues cluster in observed expression correlation with evolutionary rate, enabling evolutionary analysis of genes in diverse physiological systems, including digestive, reproductive, and immune systems. Importantly, the tissue in which a gene attains its maximum expression variance significantly varies (p = 5.55 × 10⁻²⁸⁴) with evolutionary rate, suggesting a tissue-anchored model of protein sequence evolution. Using a large-scale reference resource, we show that the tissue-anchored model provides a transcriptome-based approach to predicting the primary affected tissue of developmental disorders. Using gradient boosted regression trees to model evolutionary rate under a range of model parameters, selected features explain up to 62% of the variation in evolutionary rate and provide additional support for the tissue model. Finally, we investigate several methodological implications, including the importance of evolutionary-rate-aware gene expression imputation models using genetic data for improved search for disease-associated genes in transcriptome-wide association studies. Collectively, this study presents a comprehensive transcriptome-based analysis of a range of factors that may constrain molecular evolution and proposes a novel framework for the study of gene function and disease mechanism.
Affiliation(s)
- Patrick Evans
- Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Nancy J Cox
- Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Eric R Gamazon
- Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, United States of America; Clare Hall, University of Cambridge, Cambridge, United Kingdom; MRC Epidemiology Unit, University of Cambridge, Cambridge, United Kingdom; Data Science Institute, Vanderbilt University, Nashville, TN, United States of America
20
Abstract
Cells adapt to changing environments. Perturb a cell and it returns to a point of homeostasis. Perturb a population and it evolves toward a fitness peak. We review quantitative models of the forces of adaptation and their visualizations on landscapes. While some adaptations result from single mutations or few-gene effects, others are more cooperative, more delocalized in the genome, and more universal and physical. For example, homeostasis and evolution depend on protein folding and aggregation, energy and protein production, protein diffusion, molecular motor speeds and efficiencies, and protein expression levels. Models provide a way to learn about the fitness of cells and cell populations by making and testing hypotheses.
Affiliation(s)
- Luca Agozzino
- The Louis and Beatrice Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, USA; Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York 11794, USA
- Gábor Balázsi
- The Louis and Beatrice Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, USA; Department of Biomedical Engineering, Stony Brook University, Stony Brook, New York 11794, USA
- Jin Wang
- The Louis and Beatrice Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, USA; Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York 11794, USA; Department of Chemistry, Stony Brook University, Stony Brook, New York 11790, USA
- Ken A Dill
- The Louis and Beatrice Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, USA; Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York 11794, USA; Department of Chemistry, Stony Brook University, Stony Brook, New York 11790, USA
21
Effects of Single Mutations on Protein Stability Are Gaussian Distributed. Biophys J 2020; 118:2872-2878. [PMID: 32416078 DOI: 10.1016/j.bpj.2020.04.027] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/17/2019] [Revised: 04/14/2020] [Accepted: 04/24/2020] [Indexed: 12/16/2022]
Abstract
The distribution of protein stability effects is known to be well approximated by a Gaussian distribution from previous empirical fits. Starting from first-principles statistical mechanics, we more rigorously motivate this empirical observation by deriving per-residue-position protein stability effects to be Gaussian. Our derivation requires the number of amino acids to be large, which is satisfied by the standard set of 20 amino acids found in nature. No assumption is needed on the number of residues in close proximity in space, in contrast to previous applications of the central limit theorem to protein energetics. We support our derivation results with computational and experimental data on mutant protein stabilities across all types of protein residues.
22
Razban RM. Protein Melting Temperature Cannot Fully Assess Whether Protein Folding Free Energy Underlies the Universal Abundance-Evolutionary Rate Correlation Seen in Proteins. Mol Biol Evol 2019; 36:1955-1963. [PMID: 31093676 PMCID: PMC6736436 DOI: 10.1093/molbev/msz119] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 02/07/2023]
Abstract
The protein misfolding avoidance hypothesis explains the universal negative correlation between protein abundance and sequence evolutionary rate across the proteome by identifying protein folding free energy (ΔG) as the confounding variable. Abundant proteins resist toxic misfolding events by being more stable, and more stable proteins evolve slower because their mutations are more destabilizing. Direct supporting evidence consists only of computer simulations. A study taking advantage of a recent experimental breakthrough in measuring protein stability proteome-wide through melting temperature (Tm) (Leuenberger et al. 2017), found weak misfolding avoidance hypothesis support for the Escherichia coli proteome, and no support for the Saccharomyces cerevisiae, Homo sapiens, and Thermus thermophilus proteomes (Plata and Vitkup 2018). I find that the nontrivial relationship between Tm and ΔG and inaccuracy in Tm measurements by Leuenberger et al. 2017 can be responsible for not observing strong positive abundance-Tm and strong negative Tm-evolutionary rate correlations.
Affiliation(s)
- Rostam M Razban
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA
23
Hajizadeh M, Gibbs AJ, Amirnia F, Glasa M. The global phylogeny of Plum pox virus is emerging. J Gen Virol 2019; 100:1457-1468. [PMID: 31418674 DOI: 10.1099/jgv.0.001308] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 01/04/2023]
Abstract
The 206 complete genomic sequences of Plum pox virus in GenBank (January 2019) were downloaded. Their main open reading frames (ORF)s were compared by phylogenetic and population genetic methods. All fell into the nine previously recognized strain clusters; the PPV-Rec and PPV-T strain ORFs were all recombinants, whereas most of those in the PPV-C, PPV-CR, PPV-CV, PPV-D, PPV-EA, PPV-M and PPV-W strain clusters were not. The strain clusters ranged in size from 2 (PPV-CV and PPV-EA) to 74 (PPV-D). The isolates of eight of the nine strains came solely from Europe and the Levant (with an exception resulting from a quarantine breach), but many PPV-D strain isolates also came from east and south Asia and the Americas. The estimated time to the most recent common ancestor (TMRCA) of all 134 non-recombinant ORFs was 820 (865-775) BCE. Most strain populations were only a few decades old, and had small intra-strain, but large inter-strain, differences; strain PPV-W was the oldest. Eurasia is clearly the 'centre of emergence' of PPV and the several PPV-D strain populations found elsewhere only show evidence of gene flow with Europe, so have come from separate introductions from Europe. All ORFs and their individual genes show evidence of strong negative selection, except the positively selected pipo gene of the recently migrant populations. The possible ancient origins of PPV are discussed.
Affiliation(s)
- Mohammad Hajizadeh
- Department of Plant Protection, Faculty of Agriculture, University of Kurdistan, Sanandaj, Iran
- Adrian J Gibbs
- Emeritus Faculty Australian National University, Canberra, Australia
- Fahimeh Amirnia
- Department of Plant Protection, Faculty of Agriculture, University of Kurdistan, Sanandaj, Iran
- Miroslav Glasa
- Institute of Virology, Biomedical Research Centre, Slovak Academy of Sciences, Dúbravská cesta 9, 84505 Bratislava, Slovakia
24
Davydov II, Salamin N, Robinson-Rechavi M. Large-Scale Comparative Analysis of Codon Models Accounting for Protein and Nucleotide Selection. Mol Biol Evol 2019; 36:1316-1332. [PMID: 30847475 PMCID: PMC6526913 DOI: 10.1093/molbev/msz048] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 12/17/2022]
Abstract
There are numerous sources of variation in the rate of synonymous substitutions inside genes, such as direct selection on the nucleotide sequence, or mutation rate variation. Yet scans for positive selection rely on codon models which incorporate an assumption of effectively neutral synonymous substitution rate, constant between sites of each gene. Here we perform a large-scale comparison of approaches which incorporate codon substitution rate variation and propose our own simple yet effective modification of existing models. We find strong effects of substitution rate variation on positive selection inference. More than 70% of the genes detected by the classical branch-site model are presumably false positives caused by the incorrect assumption of uniform synonymous substitution rate. We propose a new model which is strongly favored by the data while remaining computationally tractable. With the new model we can capture signatures of nucleotide level selection acting on translation initiation and on splicing sites within the coding region. Finally, we show that rate variation is highest in the highly recombining regions, and we propose that recombination and mutation rate variation, such as high CpG mutation rate, are the two main sources of nucleotide rate variation. Although we detect fewer genes under positive selection in Drosophila than without rate variation, the genes which we detect contain a stronger signal of adaptation of dynein, which could be associated with Wolbachia infection. We provide software to perform positive selection analysis using the new model.
Affiliation(s)
- Iakov I Davydov
- Department of Computational Biology, Biophore, University of Lausanne, Lausanne, Switzerland; Department of Ecology and Evolution, Biophore, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Nicolas Salamin
- Department of Computational Biology, Biophore, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Marc Robinson-Rechavi
- Department of Ecology and Evolution, Biophore, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
25
Guo M, Wang H, Shao Y, Xing R, Zhao X, Zhang W, Li C. Gene identification and antimicrobial activity analysis of a novel lysozyme from razor clam Sinonovacula constricta. FISH & SHELLFISH IMMUNOLOGY 2019; 89:198-206. [PMID: 30946959 DOI: 10.1016/j.fsi.2019.03.077] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/14/2018] [Revised: 03/19/2019] [Accepted: 03/30/2019] [Indexed: 06/09/2023]
Abstract
Lysozymes are important immune effectors present in phylogenetically diverse organisms. They play vital roles in bacterial elimination during early immune responses. In the present study, a second invertebrate-type (i-type) lysozyme gene from razor clam Sinonovacula constricta (denoted as ScLYZ-2) was cloned by RACE and nested PCR methods. The full-length cDNA sequence of ScLYZ-2 was 1558 bp, including a 5' untranslated region (UTR) of 375 bp, an open reading frame of 426 bp, and a 3'-UTR of 757 bp with a polyadenylation signal sequence (AATAAA) located upstream of the poly(A) tail. SMART analysis showed that ScLYZ-2 contains a signal peptide in the first 16 amino acids (AA) and a destabilase domain spanning AA 24 to 134. The deduced AA sequence of ScLYZ-2 was highly similar (42%-58%) to those of other known lysozymes of bivalve species. Multiple alignments of AA sequences showed that ScLYZ-2 possesses the classical i-type lysozyme family signature of two motifs ["MDVGSLSCGP(Y/F)QIK" and "CL(E/L/R/H)C(I/M)C"] and two catalytic residues (Glu35 and Asp46). Moreover, phylogenetic analysis showed that ScLYZ-2 is a new member of the i-type lysozyme family. In healthy razor clams, ScLYZ-2 was highly expressed in the hepatopancreas, followed by the gills, water pipes, and abdominal foot. Lysozyme activity and ScLYZ-2 expression levels were significantly upregulated in the hepatopancreas and gills after infection with V. splendidus, V. harveyi, V. parahaemolyticus, S. aureus, and M. luteus. Moreover, the recombinant ScLYZ-2 had strong antimicrobial activities against V. splendidus, V. harveyi, and V. parahaemolyticus. Furthermore, the minimal inhibitory concentration of the recombinant ScLYZ-2 against V. parahaemolyticus was 7.2 μmol/mL. Taken together, our results show that ScLYZ-2 plays an important role in the immune defense of razor clam by eliminating pathogenic microorganisms.
Affiliation(s)
- Ming Guo
- School of Marine Sciences, Ningbo University, Ningbo, 315211, PR China
- Huihui Wang
- School of Marine Sciences, Ningbo University, Ningbo, 315211, PR China
- Yina Shao
- School of Marine Sciences, Ningbo University, Ningbo, 315211, PR China
- Ronglian Xing
- College of Life Sciences, Yantai University, Yantai, 264005, PR China
- Xuelin Zhao
- School of Marine Sciences, Ningbo University, Ningbo, 315211, PR China
- Weiwei Zhang
- School of Marine Sciences, Ningbo University, Ningbo, 315211, PR China
- Chenghua Li
- School of Marine Sciences, Ningbo University, Ningbo, 315211, PR China; Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao, 266071, PR China
26
Feyertag F, Berninsone PM, Alvarez-Ponce D. N-glycoproteins exhibit a positive expression level-evolutionary rate correlation. J Evol Biol 2019; 32:390-394. [PMID: 30697857 DOI: 10.1111/jeb.13420] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/30/2018] [Revised: 01/23/2019] [Accepted: 01/25/2019] [Indexed: 12/22/2022]
Abstract
The different proteins of any proteome evolve at enormously different rates. One of the primary factors influencing rates of protein evolution is expression level, with highly expressed proteins tending to evolve at slow rates. This phenomenon, known as the expression level-evolutionary rate (E-R) anticorrelation, has been attributed to the abundance-dependent deleterious effects of misfolding or misinteraction. We have recently shown that secreted proteins either lack an E-R anticorrelation or exhibit a significantly reduced E-R anticorrelation. This effect may be due to the strict quality control to which secreted proteins are subject in the endoplasmic reticulum (which is expected to reduce the rate of misfolding and its deleterious effects) or to their extracellular location (expected to reduce the rate of misinteraction and its deleterious effects). Among secreted proteins, N-glycosylated ones are under particularly strong quality control. Here, we investigate how N-linked glycosylation affects the E-R anticorrelation. Strikingly, we observe a positive E-R correlation among N-glycosylated proteins. That is, N-glycoproteins that are highly expressed evolve at faster rates than lowly expressed N-glycoproteins, in contrast to what is observed among intracellular proteins.
Affiliation(s)
- Felix Feyertag
- Department of Biology, University of Nevada, Reno, Reno, Nevada
27
Marek A, Tomala K. The Contribution of Purifying Selection, Linkage, and Mutation Bias to the Negative Correlation between Gene Expression and Polymorphism Density in Yeast Populations. Genome Biol Evol 2018; 10:2986-2996. [PMID: 30321329 PMCID: PMC6250307 DOI: 10.1093/gbe/evy225] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Accepted: 10/10/2018] [Indexed: 11/13/2022]
Abstract
The negative correlation between the rate of protein evolution and the expression level of a gene has been recognized as a universal law of evolutionary biology (Koonin 2011). In our study, we apply a population-based approach to systematically investigate the relative importance of unequal mutation rate, linkage, and selection in the origin of the expression-polymorphism anticorrelation. We analyzed the DNA sequences of protein-coding genes of 24 Saccharomyces cerevisiae and 58 Schizosaccharomyces pombe strains. We found that highly expressed genes had a substantially decreased number of polymorphic sites when compared with genes transcribed less extensively. This expression-dependent reduction was especially strong in the nonsynonymous sites, although it was also present in the synonymous sites and untranslated regions, both upstream and downstream of a gene. Most importantly, no such trend was found in introns. We used these observations, as well as analyses of site frequency spectra and data from mutation accumulation experiments, to show that purifying selection acting on nonsynonymous sites was the main, but not exclusive, factor impeding molecular evolution within the coding sequences of highly expressed genes. Linkage could not fully explain the observed pattern of polymorphism within the untranslated regions and synonymous sites, although the contribution of selection acting directly on synonymous variants was extremely small. Finally, we found that the impact of mutational bias was negligible.
Affiliation(s)
- Agnieszka Marek
- Institute of Environmental Sciences, Jagiellonian University, Krakow, Poland
- Katarzyna Tomala
- Institute of Environmental Sciences, Jagiellonian University, Krakow, Poland
28
Protein evolution speed depends on its stability and abundance and on chaperone concentrations. Proc Natl Acad Sci U S A 2018; 115:9092-9097. [PMID: 30150386 PMCID: PMC6140491 DOI: 10.1073/pnas.1810194115] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Indexed: 12/17/2022]
Abstract
Some biological evolution is slow (millions of years), and some is fast (months to years). The speed at which a protein evolves depends on how stable a protein’s folded structure is, how well it avoids aggregation, and how well-chaperoned it is. What are the mechanisms? We compute fitness landscapes by combining a model of protein-folding equilibria with sequence-change dynamics. We find that adapting to a new environment is fastest for proteins that are least stably folded, because those sit on steep downhill parts of fitness potentials. The modeling shows that cells should adapt to warmer environments faster than to colder ones, explains why increasing a protein’s abundance slows cell evolution, and explains how chaperones accelerate evolution by mitigating this effect. Proteins evolve at different rates. What drives the speed of protein sequence changes? Two main factors are a protein’s folding stability and aggregation propensity. By combining the hydrophobic–polar (HP) model with the Zwanzig–Szabo–Bagchi rate theory, we find that: (i) Adaptation is strongly accelerated by selection pressure, explaining the broad variation from days to thousands of years over which organisms adapt to new environments. (ii) The proteins that adapt fastest are those that are not very stably folded, because their fitness landscapes are steepest. And because heating destabilizes folded proteins, we predict that cells should adapt faster when put into warmer rather than cooler environments. (iii) Increasing protein abundance slows down evolution (the substitution rate of the sequence) because a typical protein is not perfectly fit, so increasing its number of copies reduces the cell’s fitness. (iv) However, chaperones can mitigate this abundance effect and accelerate evolution (also called evolutionary capacitance) by effectively enhancing protein stability. This model explains key observations about protein evolution rates.
29
Yang T, Zhong J, Zhang J, Li C, Yu X, Xiao J, Jia X, Ding N, Ma G, Wang G, Yue L, Liang Q, Sheng Y, Sun Y, Huang H, Chen F. Pan-Genomic Study of Mycobacterium tuberculosis Reflecting the Primary/Secondary Genes, Generality/Individuality, and the Interconversion Through Copy Number Variations. Front Microbiol 2018; 9:1886. [PMID: 30177918 PMCID: PMC6109687 DOI: 10.3389/fmicb.2018.01886] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 05/30/2018] [Accepted: 07/27/2018] [Indexed: 11/13/2022] Open
Abstract
Tuberculosis (TB) has surpassed HIV as the leading infectious disease killer worldwide since 2014. The main pathogen, Mycobacterium tuberculosis (Mtb), contains ~4,000 genes that account for ~90% of the genome. However, it is still unclear which of these genes are primary/secondary, which are responsible for generality/individuality, and which interconvert during evolution. Here we utilized a pan-genomic analysis of 36 Mtb genomes to address these questions. We identified 3,679 Mtb core (i.e., primary) genes, determining their phenotypic generality (e.g., virulence, slow growth, dormancy). We also observed 1,122 dispensable and 964 strain-specific secondary genes, reflecting partially shared and lineage-/strain-specific individualities. Among these, five L2 lineage-specific genes might be related to the increased virulence of the L2 lineage. Notably, we discovered 28 Mtb “Super Core Genes” (SCGs: more than one copy in at least 90% of strains), which might be of increased importance, and reflected the “super phenotype generality.” Most SCGs encode PE/PPE, virulence factors, antigens, and transposases, and have been verified as playing crucial roles in Mtb pathogenicity. Further investigation of the 28 SCGs demonstrated the interconversion among SCGs, single-copy core, dispensable, and strain-specific genes through copy number variations (CNVs) during evolution; different mutations on different copies highlight the delicate adaptive-evolution regulation amongst Mtb lineages. This suggests that the importance of genes varies through CNVs, which might be driven by selective pressure from environment/host-adaptation. In addition, compared with Mycobacterium bovis (Mbo), Mtb possesses 48 specific single core genes that partially reflect the differences between Mtb and Mbo individuality.
Affiliation(s)
- Tingting Yang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Jun Zhong
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Ju Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Cuidan Li
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Xia Yu
- National Clinical Laboratory on Tuberculosis, Beijing Key Laboratory on Drug-Resistant Tuberculosis Research, Beijing Chest Hospital, Capital Medical University, Beijing Tuberculosis and Thoracic Tumor Institute, Beijing, China
- Jingfa Xiao
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Xinmiao Jia
- Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Nan Ding
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Guannan Ma
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Guirong Wang
- National Clinical Laboratory on Tuberculosis, Beijing Key Laboratory on Drug-Resistant Tuberculosis Research, Beijing Chest Hospital, Capital Medical University, Beijing Tuberculosis and Thoracic Tumor Institute, Beijing, China
- Liya Yue
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Qian Liang
- National Clinical Laboratory on Tuberculosis, Beijing Key Laboratory on Drug-Resistant Tuberculosis Research, Beijing Chest Hospital, Capital Medical University, Beijing Tuberculosis and Thoracic Tumor Institute, Beijing, China
- Yongjie Sheng
- Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, Jilin University, Changchun, China
- Yanhong Sun
- Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, Jilin University, Changchun, China
- Hairong Huang
- National Clinical Laboratory on Tuberculosis, Beijing Key Laboratory on Drug-Resistant Tuberculosis Research, Beijing Chest Hospital, Capital Medical University, Beijing Tuberculosis and Thoracic Tumor Institute, Beijing, China
- Fei Chen
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; Collaborative Innovation Center for Genetics and Development, Beijing, China
30
Roux J, Liu J, Robinson-Rechavi M. Selective Constraints on Coding Sequences of Nervous System Genes Are a Major Determinant of Duplicate Gene Retention in Vertebrates. Mol Biol Evol 2018; 34:2773-2791. [PMID: 28981708 PMCID: PMC5850798 DOI: 10.1093/molbev/msx199] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Indexed: 12/12/2022] Open
Abstract
The evolutionary history of vertebrates is marked by three ancient whole-genome duplications: two successive rounds in the ancestor of vertebrates, and a third one specific to teleost fishes. Biased loss of most duplicates enriched the genome for specific genes, such as slow evolving genes, but this selective retention process is not well understood. To understand what drives the long-term preservation of duplicate genes, we characterized duplicated genes in terms of their expression patterns. We used a new method of expression enrichment analysis, TopAnat, applied to in situ hybridization data from thousands of genes from zebrafish and mouse. We showed that the presence of expression in the nervous system is a good predictor of a higher rate of retention of duplicate genes after whole-genome duplication. Further analyses suggest that purifying selection against the toxic effects of misfolded or misinteracting proteins, which is particularly strong in nonrenewing neural tissues, likely constrains the evolution of coding sequences of nervous system genes, leading indirectly to the preservation of duplicate genes after whole-genome duplication. Whole-genome duplications thus greatly contributed to the expansion of the toolkit of genes available for the evolution of profound novelties of the nervous system at the base of the vertebrate radiation.
Affiliation(s)
- Julien Roux
- Département d'Ecologie et d'Evolution, Université de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Jialin Liu
- Département d'Ecologie et d'Evolution, Université de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Marc Robinson-Rechavi
- Département d'Ecologie et d'Evolution, Université de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
31
Yang JR. Does mRNA structure contain genetic information for regulating co-translational protein folding? Zool Res 2018; 38:36-43. [PMID: 28271668 PMCID: PMC5368379 DOI: 10.13918/j.issn.2095-8137.2017.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 01/25/2023] Open
Abstract
Currently many facets of genetic information are ill-defined. In particular, how protein folding is genetically regulated has been a long-standing issue for genetics and protein biology, and a generic mechanistic model supported by genomic data is still lacking. Recent technological advances have enabled much-needed genome-wide experiments. While opening the effect of codon optimality to debate, these studies have supplied mounting evidence suggesting a role of mRNA structure in the regulation of protein folding by modulating translational elongation rate. In conjunction with previous theories, this mechanistic model of protein folding guided by mRNA structure should expand our understanding of genetic information and offer new insights into various biomedical puzzles.
Affiliation(s)
- Jian-Rong Yang
- Department of Biology, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China.
32
Farkas Z, Kalapis D, Bódi Z, Szamecz B, Daraba A, Almási K, Kovács K, Boross G, Pál F, Horváth P, Balassa T, Molnár C, Pettkó-Szandtner A, Klement É, Rutkai E, Szvetnik A, Papp B, Pál C. Hsp70-associated chaperones have a critical role in buffering protein production costs. eLife 2018; 7:29845. [PMID: 29377792 PMCID: PMC5788500 DOI: 10.7554/elife.29845] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 06/27/2017] [Accepted: 12/23/2017] [Indexed: 02/01/2023] Open
Abstract
Proteins are necessary for cellular growth. Concurrently, however, protein production has high energetic demands associated with transcription and translation. Here, we propose that the activity of molecular chaperones shapes protein burden, that is, the fitness costs associated with expression of unneeded proteins. To test this hypothesis, we performed a genome-wide genetic interaction screen in baker's yeast. Impairment of transcription, translation, and protein folding rendered cells hypersensitive to protein burden. Specifically, deletion of specific regulators of the Hsp70-associated chaperone network increased protein burden. In agreement with expectation, temperature stress, increased mistranslation and a chemical misfolding agent all substantially enhanced protein burden. Finally, unneeded protein perturbed interactions between key components of the Hsp70-Hsp90 network involved in folding of native proteins. We conclude that specific chaperones contribute to protein burden. Our work indicates that by minimizing the damaging impact of gratuitous protein overproduction, chaperones enable tolerance to massive changes in genomic expression.
Affiliation(s)
- Zoltán Farkas
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
- Dorottya Kalapis
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
- Zoltán Bódi
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
- Béla Szamecz
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
- Andreea Daraba
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
- Karola Almási
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
- Károly Kovács
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
- Gábor Boross
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
- Ferenc Pál
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
- Péter Horváth
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
- Tamás Balassa
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
- Csaba Molnár
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
- Aladár Pettkó-Szandtner
- Institute of Plant Biology, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary; Laboratory of Proteomic Research, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
- Éva Klement
- Laboratory of Proteomic Research, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
- Edit Rutkai
- Division for Biotechnology, Bay Zoltán Nonprofit Ltd, Budapest, Hungary
- Attila Szvetnik
- Division for Biotechnology, Bay Zoltán Nonprofit Ltd, Budapest, Hungary
- Balázs Papp
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
- Csaba Pál
- Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
33
Plata G, Vitkup D. Protein Stability and Avoidance of Toxic Misfolding Do Not Explain the Sequence Constraints of Highly Expressed Proteins. Mol Biol Evol 2017; 35:700-703. [PMID: 29309671 DOI: 10.1093/molbev/msx323] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 11/13/2022] Open
Abstract
The avoidance of cytotoxic effects associated with protein misfolding has been proposed as a dominant constraint on the sequence evolution and molecular clock of highly expressed proteins. Recently, Leuenberger et al. developed an elegant experimental approach to measure protein thermal stability at the proteome scale. The collected data allow us to rigorously test the predictions of the misfolding avoidance hypothesis that highly expressed proteins have evolved to be more stable, and that maintaining thermodynamic stability significantly constrains their evolution. Notably, reanalysis of the Leuenberger et al. data across four different organisms reveals no substantial correlation between protein stability and protein abundance. Therefore, the key predictions of the misfolding toxicity and related hypotheses are not supported by available empirical data. The data also suggest that, regardless of protein expression, protein stability does not substantially affect the protein molecular clock across organisms.
Affiliation(s)
- Germán Plata
- Department of Systems Biology, Columbia University, New York, NY
- Dennis Vitkup
- Department of Systems Biology, Columbia University, New York, NY; Department of Biomedical Informatics, Columbia University, New York, NY
34
Feyertag F, Berninsone PM, Alvarez-Ponce D. Secreted Proteins Defy the Expression Level-Evolutionary Rate Anticorrelation. Mol Biol Evol 2017; 34:692-706. [PMID: 28007979 DOI: 10.1093/molbev/msw268] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 12/15/2022] Open
Abstract
The rates of evolution of the proteins of any organism vary across orders of magnitude. A primary factor influencing rates of protein evolution is expression. A strong negative correlation between expression levels and evolutionary rates (the so-called E-R anticorrelation) has been observed in virtually all studied organisms. This effect is currently attributed to the abundance-dependent fitness costs of misfolding and unspecific protein-protein interactions, among other factors. Secreted proteins are folded in the endoplasmic reticulum, a compartment where chaperones, folding catalysts, and stringent quality control mechanisms promote their correct folding and may reduce the fitness costs of misfolding. In addition, confinement of secreted proteins to the extracellular space may reduce misinteractions and their deleterious effects. We hypothesize that each of these factors (the secretory pathway quality control and extracellular location) may reduce the strength of the E-R anticorrelation. Indeed, here we show that among human proteins that are secreted to the extracellular space, rates of evolution do not correlate with protein abundances. This trend is robust to controlling for several potentially confounding factors and is also observed when analyzing protein abundance data for 6 human tissues. In addition, analysis of mRNA abundance data for 32 human tissues shows that the E-R correlation is always less negative, and sometimes nonsignificant, in secreted proteins. Similar observations were made in Caenorhabditis elegans and in Escherichia coli, and to a lesser extent in Drosophila melanogaster, Saccharomyces cerevisiae and Arabidopsis thaliana. Our observations contribute to understanding the causes of the E-R anticorrelation.
Affiliation(s)
- Felix Feyertag
- Department of Biology, University of Nevada, Reno, Reno, NV
35
Echave J, Wilke CO. Biophysical Models of Protein Evolution: Understanding the Patterns of Evolutionary Sequence Divergence. Annu Rev Biophys 2017; 46:85-103. [PMID: 28301766 DOI: 10.1146/annurev-biophys-070816-033819] [Citation(s) in RCA: 68] [Impact Index Per Article: 9.7] [Indexed: 12/30/2022]
Abstract
For decades, rates of protein evolution have been interpreted in terms of the vague concept of functional importance. Slowly evolving proteins or sites within proteins were assumed to be more functionally important and thus subject to stronger selection pressure. More recently, biophysical models of protein evolution, which combine evolutionary theory with protein biophysics, have completely revolutionized our view of the forces that shape sequence divergence. Slowly evolving proteins have been found to evolve slowly because of selection against toxic misfolding and misinteractions, linking their rate of evolution primarily to their abundance. Similarly, most slowly evolving sites in proteins are not directly involved in function, but mutating these sites has a large impact on protein structure and stability. In this article, we review the studies in the emerging field of biophysical protein evolution that have shaped our current understanding of sequence divergence patterns. We also propose future research directions to develop this nascent field.
Affiliation(s)
- Julian Echave
- Escuela de Ciencia y Tecnología, Universidad Nacional de San Martín, 1650 San Martín, Buenos Aires, Argentina; Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Argentina
- Claus O Wilke
- Department of Integrative Biology, The University of Texas at Austin, Texas 78712
36
Bershtein S, Serohijos AW, Shakhnovich EI. Bridging the physical scales in evolutionary biology: from protein sequence space to fitness of organisms and populations. Curr Opin Struct Biol 2016; 42:31-40. [PMID: 27810574 DOI: 10.1016/j.sbi.2016.10.013] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Received: 08/25/2016] [Accepted: 10/14/2016] [Indexed: 01/11/2023]
Abstract
Bridging the gap between the molecular properties of proteins and organismal/population fitness is essential for understanding evolutionary processes. This task requires the integration of the several physical scales of biological organization, each defined by a distinct set of mechanisms and constraints, into a single unifying model. The molecular scale is dominated by the constraints imposed by the physico-chemical properties of proteins and their substrates, which give rise to trade-offs and epistatic (non-additive) effects of mutations. At the systems scale, biological networks modulate protein expression and can either buffer or enhance the fitness effects of mutations. The population scale is influenced by the mutational input, selection regimes, and stochastic changes affecting the size and structure of populations, which eventually determine the evolutionary fate of mutations. Here, we summarize the recent advances in theory, computer simulations, and experiments that advance our understanding of the links between various physical scales in biology.
Affiliation(s)
- Shimon Bershtein
- Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva 84501, Israel
- Adrian Wr Serohijos
- Département de Biochimie, Centre Robert-Cedergren en Bioinformatique & Génomique, Université de Montréal, Montréal, QC H3T 1J4, Canada
- Eugene I Shakhnovich
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, United States
37
Chesmore KN, Bartlett J, Cheng C, Williams SM. Complex Patterns of Association between Pleiotropy and Transcription Factor Evolution. Genome Biol Evol 2016; 8:3159-3170. [PMID: 27635052 PMCID: PMC5174740 DOI: 10.1093/gbe/evw228] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 12/14/2022] Open
Abstract
Pleiotropy has been claimed to constrain gene evolution but specific mechanisms and extent of these constraints have been difficult to demonstrate. The expansion of molecular data makes it possible to investigate these pleiotropic effects. Few classes of genes have been characterized as intensely as human transcription factors (TFs). We therefore analyzed the evolutionary rates of full TF proteins, along with their DNA binding domains and protein-protein interacting domains (PID) in light of the degree of pleiotropy, measured by the number of TF-TF interactions, or the number of DNA-binding targets. Data were extracted from the ENCODE ChIP-Seq dataset, the String v 9.2 database, and the NHGRI GWAS catalog. Evolutionary rates of proteins and domains were calculated using the PAML CodeML package. Our analysis shows that the numbers of TF-TF interactions and DNA binding targets are associated with constrained gene evolution; however, the constraint caused by the number of DNA binding targets was restricted to the DNA binding domains, whereas the number of TF-TF interactions constrained the full protein and did so more strongly. Additionally, we found a positive correlation between the number of protein-PIDs and the evolutionary rates of the protein-PIDs. These findings show that not only does pleiotropy associate with constrained protein evolution but the constraint differs by domain function. Finally, we show that GWAS-associated TF genes are more highly pleiotropic: the GWAS data illustrate that mutations in highly pleiotropic genes are more likely to be associated with disease phenotypes.
Affiliation(s)
- Kevin N Chesmore
- Department of Genetics, Geisel School of Medicine, Dartmouth College, Hanover, NH
- Jacquelaine Bartlett
- Department of Genetics, Geisel School of Medicine, Dartmouth College, Hanover, NH
- Chao Cheng
- Department of Genetics, Geisel School of Medicine, Dartmouth College, Hanover, NH
- Scott M Williams
- Department of Genetics, Geisel School of Medicine, Dartmouth College, Hanover, NH
38
Abstract
Our genome is protected from the introduction of mutations by high fidelity replication and an extensive network of DNA damage response and repair mechanisms. However, the expression of our genome, via RNA and protein synthesis, allows for more diversity in translating genetic information. In addition, the splicing process has become less stringent over evolutionary time allowing for a substantial increase in the diversity of transcripts generated. The result is a diverse transcriptome and proteome that harbor selective advantages over a more tightly regulated system. Here, we describe mechanisms in place that both safeguard the genome and promote translational diversity, with emphasis on post-transcriptional RNA processing.
Affiliation(s)
- Brian Magnuson
- Department of Radiation Oncology, University of Michigan Comprehensive Cancer Center, and Translational Oncology Program, University of Michigan, Ann Arbor, USA; Department of Environmental Health Sciences, School of Public Health, University of Michigan, Ann Arbor, USA
- Karan Bedi
- Department of Radiation Oncology, University of Michigan Comprehensive Cancer Center, and Translational Oncology Program, University of Michigan, Ann Arbor, USA
- Mats Ljungman
- Department of Radiation Oncology, University of Michigan Comprehensive Cancer Center, and Translational Oncology Program, University of Michigan, Ann Arbor, USA; Department of Environmental Health Sciences, School of Public Health, University of Michigan, Ann Arbor, USA
39
Mannakee BK, Gutenkunst RN. Selection on Network Dynamics Drives Differential Rates of Protein Domain Evolution. PLoS Genet 2016; 12:e1006132. [PMID: 27380265 PMCID: PMC4933380 DOI: 10.1371/journal.pgen.1006132] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/05/2016] [Accepted: 05/27/2016] [Indexed: 11/19/2022] Open
Abstract
The long-held principle that functionally important proteins evolve slowly has recently been challenged by studies in mice and yeast showing that the severity of a protein knockout only weakly predicts that protein's rate of evolution. However, the relevance of these studies to evolutionary changes within proteins is unknown, because amino acid substitutions, unlike knockouts, often only slightly perturb protein activity. To quantify the phenotypic effect of small biochemical perturbations, we developed an approach to use computational systems biology models to measure the influence of individual reaction rate constants on network dynamics. We show that this dynamical influence is predictive of protein domain evolutionary rate within networks in vertebrates and yeast, even after controlling for expression level and breadth, network topology, and knockout effect. Thus, our results not only demonstrate the importance of protein domain function in determining evolutionary rate, but also the power of systems biology modeling to uncover unanticipated evolutionary forces.
Affiliation(s)
- Brian K. Mannakee
- Division of Epidemiology and Biostatistics, Mel and Enid Zuckerman College of Public Health, University of Arizona, Tucson, Arizona, United States of America
- Ryan N. Gutenkunst
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona, United States of America
40
Abstract
The rate and mechanism of protein sequence evolution have been central questions in evolutionary biology since the 1960s. Although the rate of protein sequence evolution depends primarily on the level of functional constraint, exactly what determines functional constraint has remained unclear. The increasing availability of genomic data has enabled much needed empirical examinations on the nature of functional constraint. These studies found that the evolutionary rate of a protein is predominantly influenced by its expression level rather than functional importance. A combination of theoretical and empirical analyses has identified multiple mechanisms behind these observations and demonstrated a prominent role in protein evolution of selection against errors in molecular and cellular processes.
Affiliation(s)
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, 830 North University Avenue, Ann Arbor, Michigan 48109, USA
- Jian-Rong Yang
- Department of Ecology and Evolutionary Biology, University of Michigan, 830 North University Avenue, Ann Arbor, Michigan 48109, USA
41
Sikosek T, Chan HS. Biophysics of protein evolution and evolutionary protein biophysics. J R Soc Interface 2015; 11:20140419. [PMID: 25165599 DOI: 10.1098/rsif.2014.0419] [Citation(s) in RCA: 150] [Impact Index Per Article: 16.7] [Indexed: 01/05/2023] Open
Abstract
The study of molecular evolution at the level of protein-coding genes often entails comparing large datasets of sequences to infer their evolutionary relationships. Despite the importance of a protein's structure and conformational dynamics to its function and thus its fitness, common phylogenetic methods embody minimal biophysical knowledge of proteins. To underscore the biophysical constraints on natural selection, we survey effects of protein mutations, highlighting the physical basis for marginal stability of natural globular proteins and how requirement for kinetic stability and avoidance of misfolding and misinteractions might have affected protein evolution. The biophysical underpinnings of these effects have been addressed by models with an explicit coarse-grained spatial representation of the polypeptide chain. Sequence-structure mappings based on such models are powerful conceptual tools that rationalize mutational robustness, evolvability, epistasis, promiscuous function performed by 'hidden' conformational states, resolution of adaptive conflicts and conformational switches in the evolution from one protein fold to another. Recently, protein biophysics has been applied to derive more accurate evolutionary accounts of sequence data. Methods have also been developed to exploit sequence-based evolutionary information to predict biophysical behaviours of proteins. The success of these approaches demonstrates a deep synergy between the fields of protein biophysics and protein evolution.
Affiliation(s)
- Tobias Sikosek
- Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Physics, University of Toronto, Toronto, Ontario, Canada M5S 1A8
- Hue Sun Chan
- Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Physics, University of Toronto, Toronto, Ontario, Canada M5S 1A8
42
Çetinbaş M, Shakhnovich EI. Is catalytic activity of chaperones a selectable trait for the emergence of heat shock response? Biophys J 2015; 108:438-48. [PMID: 25606691 DOI: 10.1016/j.bpj.2014.11.3468] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/24/2014] [Revised: 10/28/2014] [Accepted: 11/24/2014] [Indexed: 10/24/2022] Open
Abstract
Although heat shock response is ubiquitous in bacterial cells, the underlying physical chemistry behind heat shock response remains poorly understood. To study the response of cell populations to heat shock, we employ a physics-based ab initio model of living cells where protein biophysics (i.e., folding and protein-protein interactions in crowded cellular environments) and important aspects of protein homeostasis are coupled with realistic population dynamics simulations. By postulating a genotype-phenotype relationship we define a cell division rate in terms of functional concentrations of proteins and protein complexes, whose Boltzmann stabilities of folding and strengths of their functional interactions are exactly evaluated from their sequence information. We compare and contrast evolutionary dynamics for two models of chaperone action. In the active model, foldase chaperones function as nonequilibrium machines to accelerate the rate of protein folding. In the passive model, holdase chaperones form reversible complexes with proteins in their misfolded conformations to maintain their solubility. We find that only cells expressing foldase chaperones are capable of genuine heat shock response to the increase in the amount of unfolded proteins at elevated temperatures. In response to heat shock, cells' limited resources are redistributed differently for active and passive models. For the active model, foldase chaperones are overexpressed at the expense of downregulation of high abundance proteins, whereas for the passive model, cells react to heat shock by downregulating their high abundance proteins, as their low abundance proteins are upregulated.
Affiliation(s)
- Murat Çetinbaş
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts
- Eugene I Shakhnovich
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts
43
Schüler A, Ghanbarian AT, Hurst LD. Purifying selection on splice-related motifs, not expression level nor RNA folding, explains nearly all constraint on human lincRNAs. Mol Biol Evol 2014; 31:3164-83. [PMID: 25158797 PMCID: PMC4245815 DOI: 10.1093/molbev/msu249] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Indexed: 12/28/2022]
Abstract
There are two strong and equally important predictors of rates of human protein evolution: The amount the gene is expressed and the proportion of exonic sequence devoted to control splicing, mediated largely by selection on exonic splice enhancer (ESE) motifs. Is the same true for noncoding RNAs, known to be under very weak purifying selection? Prior evidence suggests that selection at splice sites in long intergenic noncoding RNAs (lincRNAs) is important. We now report multiple lines of evidence indicating that the great majority of purifying selection operating on lincRNAs in humans is splice related. Splice-related parameters explain much of the between-gene variation in evolutionary rate in humans. Expression rate is not a relevant predictor, although expression breadth is weakly so. In contrast to protein-coding RNAs, we observe no relationship between evolutionary rate and lincRNA stability. As in protein-coding genes, ESEs are especially abundant near splice junctions and evolve slower than non-ESE sequence equidistant from boundaries. Nearly all constraint in lincRNAs is at exon ends (N.B. the same is not witnessed in Drosophila). Although we cannot definitely answer the question as to why splice-related selection is so important, we find no evidence that splicing might enable the nonsense-mediated decay pathway to capture transcripts incorrectly processed by ribosomes. We find evidence consistent with the notion that splicing modifies the underlying chromatin through recruitment of splice-coupled chromatin modifiers, such as CHD1, which in turn might modulate neighbor gene activity. We conclude that most selection on human lincRNAs is splice mediated and suggest that the possibility of splice-chromatin coupling is worthy of further scrutiny.
Affiliation(s)
- Andreas Schüler
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Avazeh T Ghanbarian
- Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Laurence D Hurst
- Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
44
Abstract
Protein metabolism is one of the most costly processes in the cell and is therefore expected to be under the effective control of natural selection. We stimulated yeast strains to overexpress each single gene product to approximately 1% of the total protein content. Consistent with previous reports, we found that excessive expression of proteins containing disordered or membrane-protruding regions resulted in an especially high fitness cost. We estimated these costs to be nearly twice as high as for other proteins. There was a ten-fold difference in cost if, instead of entire proteins, only the disordered or membrane-embedded regions were compared with other segments. Although the cost of processing bulk protein was measurable, it could not be explained by several tested protein features, including those linked to translational efficiency or intensity of physical interactions after maturation. It most likely included a number of individually indiscernible effects arising during protein synthesis, maturation, maintenance, (mal)functioning, and disposal. When scaled to the levels normally achieved by proteins in the cell, the fitness cost of dealing with one amino acid in a standard protein appears to be generally very low. Many single amino acid additions or deletions are likely to be neutral even if the effective population size is as large as that of the budding yeast. This should also apply to substitutions. Selection is much more likely to operate if point mutations affect protein structure by, for example, extending or creating stretches that tend to unfold or interact improperly with membranes.
45
Codon-by-codon modulation of translational speed and accuracy via mRNA folding. PLoS Biol 2014; 12:e1001910. [PMID: 25051069 PMCID: PMC4106722 DOI: 10.1371/journal.pbio.1001910] [Citation(s) in RCA: 85] [Impact Index Per Article: 8.5] [Received: 01/28/2014] [Accepted: 06/12/2014] [Indexed: 11/20/2022]
Abstract
Secondary structure in mRNAs modulates the speed of protein synthesis codon-by-codon to improve accuracy at important sites while ensuring high speed elsewhere. Rapid cell growth demands fast protein translational elongation to alleviate ribosome shortage. However, speedy elongation undermines translational accuracy because of a mechanistic tradeoff. Here we provide genomic evidence in budding yeast and mouse embryonic stem cells that the efficiency–accuracy conflict is alleviated by slowing down the elongation at structurally or functionally important residues to ensure their translational accuracies while sacrificing the accuracy for speed at other residues. Our computational analysis in yeast with codon resolution suggests that mRNA secondary structures serve as elongation brakes to control the speed and hence the fidelity of protein translation. The position-specific effect of mRNA folding on translational accuracy is further demonstrated experimentally by swapping synonymous codons in a yeast transgene. Our findings explain why highly expressed genes tend to have strong mRNA folding, slow translational elongation, and conserved protein sequences. The exquisite codon-by-codon translational modulation uncovered here is a testament to the power of natural selection in mitigating efficiency–accuracy conflicts, which are prevalent in biology. Protein synthesis by ribosomal translation is a vital cellular process, but our understanding of its regulation has been poor. Because the number of ribosomes in the cell is limited, rapid growth relies on fast translational elongation. The accuracy of translation must also be maintained, and in an ideal scenario, both speed and accuracy should be maximized to sustain rapid and productive growth. However, existing data suggest a tradeoff between speed and accuracy, making it impossible to simultaneously maximize both. 
A potential solution is slowing the elongation at functionally or structurally important sites to ensure their translational accuracies, while sacrificing accuracy for speed at other sites. Here, we show that budding yeast and mouse embryonic stem cells indeed use this strategy. We discover that a codon-by-codon adaptive modulation of translational elongation is accomplished by mRNA secondary structures, which serve as brakes to control the elongation speed and hence translational fidelity. Our findings explain why highly expressed genes tend to have strong mRNA folding, slow translational elongation, and conserved protein sequences. The exquisite translational modulation reflects the power of natural selection in mitigating efficiency–accuracy conflicts, and our study offers a general framework for analyzing similar conflicts, which are widespread in biology.
46
Pechmann S, Frydman J. Interplay between chaperones and protein disorder promotes the evolution of protein networks. PLoS Comput Biol 2014; 10:e1003674. [PMID: 24968255 PMCID: PMC4072544 DOI: 10.1371/journal.pcbi.1003674] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 07/08/2013] [Accepted: 05/03/2014] [Indexed: 11/19/2022]
Abstract
Evolution is driven by mutations, which lead to new protein functions but come at a cost to protein stability. Non-conservative substitutions are of interest in this regard because they may most profoundly affect both function and stability. Accordingly, organisms must balance the benefit of accepting advantageous substitutions with the possible cost of deleterious effects on protein folding and stability. We here examine factors that systematically promote non-conservative mutations at the proteome level. Intrinsically disordered regions in proteins play pivotal roles in protein interactions, but many questions regarding their evolution remain unanswered. Similarly, whether and how molecular chaperones, which have been shown to buffer destabilizing mutations in individual proteins, generally provide robustness during proteome evolution remains unclear. To this end, we introduce an evolutionary parameter λ that directly estimates the rate of non-conservative substitutions. Our analysis of λ in Escherichia coli, Saccharomyces cerevisiae, and Homo sapiens sequences reveals how co- and post-translationally acting chaperones differentially promote non-conservative substitutions in their substrates, likely through buffering of their destabilizing effects. We further find that λ serves well to quantify the evolution of intrinsically disordered proteins even though the unstructured, thus generally variable regions in proteins are often flanked by very conserved sequences. Crucially, we show that both intrinsically disordered proteins and highly re-wired proteins in protein interaction networks, which have evolved new interactions and functions, exhibit a higher λ at the expense of enhanced chaperone assistance. Our findings thus highlight an intricate interplay of molecular chaperones and protein disorder in the evolvability of protein networks. 
Our results illuminate the role of chaperones in enabling protein evolution, and underline the importance of the cellular context and integrated approaches for understanding proteome evolution. We feel that the development of λ may be a valuable addition to the toolbox applied to understand the molecular basis of evolution.
Affiliation(s)
- Sebastian Pechmann
- Department of Biology, Stanford University, Stanford, California, United States of America
- Judith Frydman
- Department of Biology, Stanford University, Stanford, California, United States of America
47
Loss of quaternary structure is associated with rapid sequence divergence in the OSBS family. Proc Natl Acad Sci U S A 2014; 111:8535-40. [PMID: 24872444 DOI: 10.1073/pnas.1318703111] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Indexed: 01/03/2023]
Abstract
The rate of protein evolution is determined by a combination of selective pressure on protein function and biophysical constraints on protein folding and structure. Determining the relative contributions of these properties is an unsolved problem in molecular evolution with broad implications for protein engineering and function prediction. As a case study, we examined the structural divergence of the rapidly evolving o-succinylbenzoate synthase (OSBS) family, which catalyzes a step in menaquinone synthesis in diverse microorganisms and plants. On average, the OSBS family is much more divergent than other protein families from the same set of species, with the most divergent family members sharing <15% sequence identity. Comparing 11 representative structures revealed that loss of quaternary structure and large deletions or insertions are associated with the family's rapid evolution. Neither of these properties has been investigated in previous studies to identify factors that affect the rate of protein evolution. Intriguingly, one subfamily retained a multimeric quaternary structure and has small insertions and deletions compared with related enzymes that catalyze diverse reactions. Many proteins in this subfamily catalyze both OSBS and N-succinylamino acid racemization (NSAR). Retention of ancestral structural characteristics in the NSAR/OSBS subfamily suggests that the rate of protein evolution is not proportional to the capacity to evolve new protein functions. Instead, structural features that are conserved among proteins with diverse functions might contribute to the evolution of new functions.
48
Abstract
Impairment of RNA editing at a handful of coding sites causes severe disorders, prompting the view that coding RNA editing is highly advantageous. Recent genomic studies have expanded the list of human coding RNA editing sites by more than 100 times, raising the question of how common advantageous RNA editing is. Analyzing 1,783 human coding A-to-G editing sites, we show that both the frequency and level of RNA editing decrease as the importance of a site or gene increases; that during evolution, edited As are more likely than unedited As to be replaced with Gs but not with Ts or Cs; and that among nonsynonymously edited As, those that are evolutionarily least conserved exhibit the highest editing levels. These and other observations reveal the overall nonadaptive nature of coding RNA editing, despite the presence of a few sites in which editing is clearly beneficial. We propose that most observed coding RNA editing results from tolerable promiscuous targeting by RNA editing enzymes, the original physiological functions of which remain elusive.
49
Tomala K, Pogoda E, Jakubowska A, Korona R. Fitness costs of minimal sequence alterations causing protein instability and toxicity. Mol Biol Evol 2013; 31:703-7. [PMID: 24361995 DOI: 10.1093/molbev/mst264] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/15/2022]
Abstract
Destabilization of a protein impairs its metabolic efficiency. It is less clear how often destabilization also results in a gain of toxicity. We derived collections of temperature-sensitive, and thus structurally unstable, mutants of the yeast ADE2 and LYS2 genes by introducing single or very few amino acid substitutions. Overexpression of these mutant proteins led to a common, although unequal, fitness decrease. Interestingly, although the mutant proteins were functionally redundant, higher expression levels were associated with higher fitness. This result suggests that growth was hampered not by the accumulation of damaged chains but by the activities needed to remove them or by the damage caused before they were removed. Our results support the idea that any protein can become toxic when destabilized by a point mutation.
Affiliation(s)
- Katarzyna Tomala
- Institute of Environmental Sciences, Jagiellonian University, Krakow, Poland
50
Abstract
Levels of selective constraint vary among proteins. Although strong constraint on a protein is often attributed to its functional importance, evolutionary rate may also be limited if a protein is fragile, such that a large proportion of amino acid replacements reduce its fitness. To determine the relative contributions of essentiality and fragility to selective constraint, we compared relationships of selection against nonsense mutations (snon) and selection against missense mutations (smis) to protein sequence conservation (Ka). As expected, snon is greater than smis; however, the correlation between smis and Ka is nearly three times stronger than the correlation between snon and Ka. Moreover, examination of relationships to gene expression level, tissue specificity, and number of protein-protein interactions shows that smis is more strongly correlated than snon to all three measures of biological function. Thus, our analysis reveals that slowly evolving proteins are under strong selective constraint primarily because they are fragile, and that this association likely exists because allowing a protein to function improperly, rather than removing it from a biological network, can negatively affect the functions of other molecules it interacts with and their downstream products.
Affiliation(s)
- Raquel Assis
- Department of Biology, Pennsylvania State University