1
Usmanova DR, Plata G, Vitkup D. Functional Optimization in Distinct Tissues and Conditions Constrains the Rate of Protein Evolution. Mol Biol Evol 2024; 41:msae200. [PMID: 39431545 PMCID: PMC11523136 DOI: 10.1093/molbev/msae200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 07/29/2024] [Accepted: 08/05/2024] [Indexed: 10/22/2024]
Abstract
Understanding the main determinants of protein evolution is a fundamental challenge in biology. Despite many decades of active research, the molecular and cellular mechanisms underlying the substantial variability of evolutionary rates across cellular proteins are not currently well understood. It also remains unclear how protein molecular function is optimized in the context of multicellular species and why many proteins, such as enzymes, are only moderately efficient on average. Our analysis of genomics and functional datasets reveals in multiple organisms a strong inverse relationship between the optimality of protein molecular function and the rate of protein evolution. Furthermore, we find that highly expressed proteins tend to be substantially more functionally optimized. These results suggest that cellular expression costs lead to more pronounced functional optimization of abundant proteins and that the purifying selection to maintain high levels of functional optimality significantly slows protein evolution. We observe that in multicellular species both the rate of protein evolution and the degree of protein functional efficiency are primarily affected by expression in several distinct cell types and tissues, specifically, in developed neurons with upregulated synaptic processes in animals and in young and fast-growing tissues in plants. Overall, our analysis reveals how various constraints from the molecular, cellular, and species' levels of biological organization jointly affect the rate of protein evolution and the level of protein functional adaptation.
Affiliation(s)
- Dinara R Usmanova
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
- Germán Plata
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
  - BiomEdit, Fishers, IN 46037, USA
- Dennis Vitkup
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
2
Basu Choudhury G, Datta S. Implication of Molecular Constraints Facilitating the Functional Evolution of Pseudomonas aeruginosa KPR2 into a Versatile α-Keto-Acid Reductase. Biochemistry 2024; 63:1808-1823. [PMID: 38962820 DOI: 10.1021/acs.biochem.4c00087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/05/2024]
Abstract
Theoretical concepts linking the structure, function, and evolution of a protein, while often intuitive, necessitate validation through investigations in real-world systems. Our study empirically explores the evolutionary implications of multiple gene copies in an organism by shedding light on the structure-function modulations observed in Pseudomonas aeruginosa's second copy of ketopantoate reductase (PaKPR2). We demonstrated with two apo structures that the typical active site cleft of the protein transforms into a two-sided pocket where a molecular gate made up of two residues controls the substrate entry site, resulting in its inactivity toward the natural substrate ketopantoate. Strikingly, this structural modification made the protein active against several important α-keto-acid substrates with varied efficiency. Structural constraints at the binding site for this altered functional trait were analyzed with two binary complexes that show the conserved residue microenvironment faces restricted movements due to domain closure. Finally, its mechanistic highlights gathered from a ternary complex structure help in delineating the molecular perspectives behind its kinetic cooperativity toward this broad range of substrates. Detailed structural characteristics of the protein presented here also identified four key amino acid residues responsible for its versatile α-keto-acid reductase activity, which can be further modified to improve its functional properties through protein engineering.
Affiliation(s)
- Gourab Basu Choudhury
  - CSIR-Indian Institute of Chemical Biology, Raja S C Mullick Road, Jadavpur, Kolkata 700032, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Saumen Datta
  - CSIR-Indian Institute of Chemical Biology, Raja S C Mullick Road, Jadavpur, Kolkata 700032, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
3
Teyssonniere EM, Shichino Y, Mito M, Friedrich A, Iwasaki S, Schacherer J. Translation variation across genetic backgrounds reveals a post-transcriptional buffering signature in yeast. Nucleic Acids Res 2024; 52:2434-2445. [PMID: 38261993 PMCID: PMC10954453 DOI: 10.1093/nar/gkae030] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/15/2023] [Revised: 12/21/2023] [Accepted: 01/11/2024] [Indexed: 01/25/2024]
Abstract
Gene expression is known to vary among individuals, and this variability can impact the phenotypic diversity observed in natural populations. While the transcriptome and proteome have been extensively studied, little is known about the translation process itself. Here, we therefore performed ribosome and transcriptomic profiling on a genetically and ecologically diverse set of natural isolates of the Saccharomyces cerevisiae yeast. Interestingly, we found that the Euclidean distances between each profile and the expression fold changes in each pairwise isolate comparison were higher at the transcriptomic level. This observation clearly indicates that the transcriptional variation observed in the different isolates is buffered through a phenomenon known as post-transcriptional buffering at the translation level. Furthermore, this phenomenon seemed to have a specific signature by preferentially affecting essential genes as well as genes involved in complex-forming proteins, and low transcribed genes. We also explored the translation of the S. cerevisiae pangenome and found that the accessory genes related to introgression events displayed similar transcription and translation levels as the core genome. By contrast, genes acquired through horizontal gene transfer events tended to be less efficiently translated. Together, our results highlight both the extent and signature of the post-transcriptional buffering.
Affiliation(s)
- Yuichi Shichino
  - RNA Systems Biochemistry Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Mari Mito
  - RNA Systems Biochemistry Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Anne Friedrich
  - Université de Strasbourg, CNRS, GMGM UMR, 7156 Strasbourg, France
- Shintaro Iwasaki
  - RNA Systems Biochemistry Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8561, Japan
- Joseph Schacherer
  - Université de Strasbourg, CNRS, GMGM UMR, 7156 Strasbourg, France
  - Institut Universitaire de France (IUF), Paris, France
4
Uruén C, Gimeno J, Sanz M, Fraile L, Marín CM, Arenas J. Invasive Streptococcus suis isolated in Spain contain a highly promiscuous and dynamic resistome. Front Cell Infect Microbiol 2024; 13:1329632. [PMID: 38317790 PMCID: PMC10839070 DOI: 10.3389/fcimb.2023.1329632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2023] [Accepted: 12/26/2023] [Indexed: 02/07/2024]
Abstract
Introduction: Streptococcus suis is a major pathogen of swine and humans. Here we aimed to determine the rates of antimicrobial resistance (AMR) in invasive S. suis isolates recovered across Spain between 2016 and 2021 and to elucidate their genetic origin. Methods: Antibiotic susceptibility testing was performed for 116 isolates of different genetic backgrounds and geographic origins against 18 antibiotics of 9 families. The association between AMR and genotypes and the origin of the isolates were statistically analyzed using Pearson's chi-square test and the likelihood ratio. The antimicrobial resistance genes were identified by whole genome sequencing analysis and PCR screenings. Results: High AMR rates (>80%) were detected for tetracyclines, spectinomycin, lincosamides, and marbofloxacin, medium (20-40%) for sulphonamides/trimethoprim, tiamulin, penicillin G, and enrofloxacin, and low (<20%) for florfenicol and four additional β-lactams. Multidrug resistance was observed in 90% of isolates. For certain antibiotics (penicillin G, enrofloxacin, marbofloxacin, tilmicosin, and erythromycin), AMR was significantly associated with particular sequence types (STs), geographic regions, age of pigs, and time course. Whole genome sequencing comparisons and PCR screenings identified 23 AMR genes, of which 19 were previously reported in S. suis (aph(3')-IIIa, sat4, aadE, spw, aac(6')-Ie-aph(2'')-Ia, fexA, optrA, erm(B), mef(A/E), mrs(D), mph(C), lnu(B), lsa(E), vga(F), tet(M), tet(O), tet(O/W/32/O), tet(W)), and 4 were novel (aph(2'')-IIIa, apmA, erm(47), tet(T)). These AMR genes explained the AMR to spectinomycin, macrolides, lincosamides, tiamulin, and tetracyclines. Several genes were located on mobile genetic elements which showed a variable organization and composition. As AMR gene homologs were identified in many human and animal pathogens, the resistome of S. suis has a different phylogenetic origin. Moreover, AMR to penicillin G, fluoroquinolones, and trimethoprim was related to mutations in genes coding for target enzymes (pbp1a, pbp2b, pbp2x, mraY, gyrA, parC, and dhfr). Bioinformatic analysis estimated traits of recombination on target genes, also indicative of gene transfer events. Conclusions: Our work provides evidence that S. suis is a major contributor to AMR dissemination across veterinary and human pathogens. Therefore, control of AMR in S. suis should be considered from a One Health approach in regions with high pig production to properly tackle the issue of antimicrobial drug resistance.
Affiliation(s)
- Cristina Uruén
  - Unit of Microbiology and Immunology, Faculty of Veterinary, University of Zaragoza, Zaragoza, Spain
  - Institute Agrofood of Aragón-IA2, University of Zaragoza-CITA, Zaragoza, Spain
- Jorge Gimeno
  - Unit of Microbiology and Immunology, Faculty of Veterinary, University of Zaragoza, Zaragoza, Spain
  - Institute Agrofood of Aragón-IA2, University of Zaragoza-CITA, Zaragoza, Spain
- Marina Sanz
  - Unit of Microbiology and Immunology, Faculty of Veterinary, University of Zaragoza, Zaragoza, Spain
  - Institute Agrofood of Aragón-IA2, University of Zaragoza-CITA, Zaragoza, Spain
- Lorenzo Fraile
  - Department of Animal Science, ETSEA, University of Lleida-Agrotecno, Lleida, Spain
- Clara M. Marín
  - Institute Agrofood of Aragón-IA2, University of Zaragoza-CITA, Zaragoza, Spain
  - Department of Animal Production and Health, CITA, Zaragoza, Spain
- Jesús Arenas
  - Unit of Microbiology and Immunology, Faculty of Veterinary, University of Zaragoza, Zaragoza, Spain
  - Institute Agrofood of Aragón-IA2, University of Zaragoza-CITA, Zaragoza, Spain
5
Roberts M, Josephs EB. Weaker selection on genes with treatment-specific expression consistent with a limit on plasticity evolution in Arabidopsis thaliana. Genetics 2023; 224:iyad074. [PMID: 37094602 PMCID: PMC10484170 DOI: 10.1093/genetics/iyad074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 03/06/2023] [Accepted: 04/07/2023] [Indexed: 04/26/2023]
Abstract
Differential gene expression between environments often underlies phenotypic plasticity. However, environment-specific expression patterns are hypothesized to relax selection on genes, and thus limit plasticity evolution. We collated over 27 terabases of RNA-sequencing data on Arabidopsis thaliana from over 300 peer-reviewed studies and 200 treatment conditions to investigate this hypothesis. Consistent with relaxed selection, genes with more treatment-specific expression have higher levels of nucleotide diversity and divergence at nonsynonymous sites but lack stronger signals of positive selection. This result persisted even after controlling for expression level, gene length, GC content, the tissue specificity of expression, and technical variation between studies. Overall, our investigation supports the existence of a hypothesized trade-off between the environment specificity of a gene's expression and the strength of selection on said gene in A. thaliana. Future studies should leverage multiple genome-scale datasets to tease apart the contributions of many variables in limiting plasticity evolution.
Affiliation(s)
- Miles Roberts
  - Genetics and Genome Sciences Program, Michigan State University, East Lansing, MI 48824, USA
- Emily B Josephs
  - Department of Plant Biology, Michigan State University, East Lansing, MI 48824, USA
  - Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, MI 48824, USA
6
Biesiadecka MK, Sliwa P, Tomala K, Korona R. An Overexpression Experiment Does Not Support the Hypothesis That Avoidance of Toxicity Determines the Rate of Protein Evolution. Genome Biol Evol 2021; 12:589-596. [PMID: 32259256 PMCID: PMC7250497 DOI: 10.1093/gbe/evaa067] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Accepted: 04/01/2020] [Indexed: 12/22/2022]
Abstract
The misfolding avoidance hypothesis postulates that sequence mutations render proteins cytotoxic and that therefore the higher the gene expression, the stronger the selection against substitutions. This translates into the prediction that the relative toxicity of extant proteins is higher for those evolving faster. In the present experiment, we selected pairs of yeast genes which were paralogous but evolving at different rates. We expressed them artificially to high levels. We expected that toxicity would be higher for the ones bearing more mutations, especially since overcrowding should exacerbate rather than reverse the already existing differences in misfolding rates. We did find that the applied mode of overexpression caused a considerable decrease in fitness and that the decrease was proportional to the amount of excessive protein. However, it was not higher for proteins which are normally expressed at lower levels (and have less conserved sequence). This result was obtained consistently, regardless of whether the rate of growth or the ability to compete in common cultures was used as a proxy for fitness. In additional experiments, we applied factors that reduce the accuracy of translation or enhance the structural instability of proteins. This did not change the consistent pattern of independence between the fitness cost caused by overexpression of a protein and the rate of its sequence evolution.
Affiliation(s)
- Piotr Sliwa
  - Department of Genetics, Faculty of Biotechnology, University of Rzeszów, Poland
- Katarzyna Tomala
  - Institute of Environmental Sciences, Faculty of Biology, Jagiellonian University, Cracow, Poland
- Ryszard Korona
  - Institute of Environmental Sciences, Faculty of Biology, Jagiellonian University, Cracow, Poland
7
Wei C, Chen YM, Chen Y, Qian W. The Missing Expression Level-Evolutionary Rate Anticorrelation in Viruses Does Not Support Protein Function as a Main Constraint on Sequence Evolution. Genome Biol Evol 2021; 13:evab049. [PMID: 33713114 PMCID: PMC7989579 DOI: 10.1093/gbe/evab049] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Accepted: 03/06/2021] [Indexed: 12/13/2022]
Abstract
One of the central goals in molecular evolutionary biology is to determine the sources of variation in the rate of sequence evolution among proteins. Gene expression level is widely accepted as the primary determinant of protein evolutionary rate, because it scales with the extent of selective constraints imposed on a protein, leading to the well-known negative correlation between expression level and protein evolutionary rate (the E-R anticorrelation). Selective constraints have been hypothesized to entail the maintenance of protein function, the avoidance of cytotoxicity caused by protein misfolding or nonspecific protein-protein interactions, or both. However, empirical tests evaluating the relative importance of these hypotheses remain scarce, likely due to the nontrivial difficulties in distinguishing the effect of a deleterious mutation on a protein's function versus its cytotoxicity. We realized that examining the sequence evolution of viral proteins could overcome this hurdle. It is because purifying selection against mutations in a viral protein that result in cytotoxicity per se is likely relaxed, whereas purifying selection against mutations that impair viral protein function persists. Multiple analyses of SARS-CoV-2 and nine other virus species revealed a complete absence of any E-R anticorrelation. As a control, the E-R anticorrelation does exist in human endogenous retroviruses where purifying selection against cytotoxicity is present. Taken together, these observations do not support the maintenance of protein function as the main constraint on protein sequence evolution in cellular organisms.
Affiliation(s)
- Changshuo Wei
  - State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Yan-Ming Chen
  - State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Ying Chen
  - State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing, China
- Wenfeng Qian
  - State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
8
Lian S, Zhou Y, Liu Z, Gong A, Cheng L. The differential expression patterns of paralogs in response to stresses indicate expression and sequence divergences. BMC Plant Biol 2020; 20:277. [PMID: 32546126 PMCID: PMC7298774 DOI: 10.1186/s12870-020-02460-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/27/2019] [Accepted: 05/24/2020] [Indexed: 05/22/2023]
Abstract
BACKGROUND: Theoretically, paralogous genes generated through whole genome duplications should share identical expression levels due to their identical sequences and chromatin environments. However, functional divergences and expression differences have arisen due to selective pressures throughout evolution. A comprehensive investigation of the expression patterns of paralogous gene pairs in response to various stresses and a study of correlations between the expression levels and sequence divergences of the paralogs are needed. RESULTS: In this study, we analyzed the expression patterns of paralogous genes under different types of stress and investigated the correlations between the expression levels and sequence divergences of the paralogs. We analyzed the differential expression patterns of the paralogs under four different types of stress (drought, cold, infection, and herbivory) and classified them into three main types according to their expression patterns. We then further analyzed the differential expression patterns under various degrees of stress and constructed corresponding co-expression networks of differentially expressed paralogs and transcription factors. Finally, we investigated the correlations between the expression levels and sequence divergences of the paralogs and identified positive correlations between expression level and sequence divergence. With regard to sequence divergence, we identified correlations between selective pressures and phylogenetic relationships. CONCLUSIONS: These results shed light on differential expression patterns of paralogs in response to environmental stresses and are helpful for understanding the relationships between expression levels and sequence divergences.
Affiliation(s)
- Shuaibin Lian
  - College of Physics and Electronic Engineering, Xinyang Normal University, Xinyang, China
- Yongjie Zhou
  - College of Physics and Electronic Engineering, Xinyang Normal University, Xinyang, China
- Zixiao Liu
  - College of Physics and Electronic Engineering, Xinyang Normal University, Xinyang, China
- Andong Gong
  - College of Life Sciences, Xinyang Normal University, Xinyang, China
- Lin Cheng
  - College of Life Sciences, Xinyang Normal University, Xinyang, China
9
Alvarez-Ponce D, Aguilar-Rodríguez J, Fares MA. Molecular Chaperones Accelerate the Evolution of Their Protein Clients in Yeast. Genome Biol Evol 2020; 11:2360-2375. [PMID: 31297528 PMCID: PMC6735891 DOI: 10.1093/gbe/evz147] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Accepted: 07/05/2019] [Indexed: 12/23/2022]
Abstract
Protein stability is a major constraint on protein evolution. Molecular chaperones, also known as heat-shock proteins, can relax this constraint and promote protein evolution by diminishing the deleterious effect of mutations on protein stability and folding. This effect, however, has only been established for a few chaperones. Here, we use a comprehensive chaperone–protein interaction network to study the effect of all yeast chaperones on the evolution of their protein substrates, that is, their clients. In particular, we analyze how yeast chaperones affect the evolutionary rates of their clients at two very different evolutionary time scales. We first study the effect of chaperone-mediated folding on protein evolution over the evolutionary divergence of Saccharomyces cerevisiae and S. paradoxus. We then test whether yeast chaperones have left a similar signature on the patterns of standing genetic variation found in modern wild and domesticated strains of S. cerevisiae. We find that genes encoding chaperone clients have diverged faster than genes encoding non-client proteins when controlling for their number of protein–protein interactions. We also find that genes encoding client proteins have accumulated more intraspecific genetic diversity than those encoding non-client proteins. In a number of multivariate analyses, controlling for other well-known factors that affect protein evolution, we find that chaperone dependence explains the largest fraction of the observed variance in the rate of evolution at both evolutionary time scales. Chaperones affecting rates of protein evolution mostly belong to two major chaperone families: Hsp70s and Hsp90s. Our analyses show that protein chaperones, by virtue of their ability to buffer destabilizing mutations and their role in modulating protein genotype–phenotype maps, have a considerable accelerating effect on protein evolution.
Affiliation(s)
- David Alvarez-Ponce
  - Biology Department, University of Nevada, Reno
  - Instituto de Biología Molecular y Celular de Plantas, CSIC-UPV, Valencia, Spain
- José Aguilar-Rodríguez
  - Department of Biology, Stanford University, CA
  - Department of Chemical and Systems Biology, Stanford University School of Medicine, CA
- Mario A Fares
  - Instituto de Biología Molecular y Celular de Plantas, CSIC-UPV, Valencia, Spain
  - Smurfit Institute of Genetics, University of Dublin, Trinity College Dublin, Ireland
10
Liu K, Hao X, Wang Q, Hou J, Lai X, Dong Z, Shao C. Genome-wide identification and characterization of heat shock protein family 70 provides insight into its divergent functions on immune response and development of Paralichthys olivaceus. PeerJ 2019; 7:e7781. [PMID: 31737440 PMCID: PMC6855204 DOI: 10.7717/peerj.7781] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 01/15/2019] [Accepted: 08/28/2019] [Indexed: 01/16/2023]
Abstract
Flatfish undergo extreme morphological development and settle into a benthic lifestyle in the adult stage, and are likely to be more susceptible to environmental stress. Heat shock proteins 70 (hsp70) are involved in embryonic development and stress response in metazoan animals. However, the evolutionary history and functions of hsp70 in flatfish are poorly understood. Here, we identified 15 hsp70 genes in the genome of Japanese flounder (Paralichthys olivaceus), a flatfish endemic to the northwestern Pacific Ocean. Gene structure and motifs of the Japanese flounder hsp70 were conserved, and there were few structure variants compared to other fish species. We constructed a maximum likelihood tree to understand the evolutionary relationship of the hsp70 genes among surveyed fish. Selection pressure analysis suggested that four genes, hspa4l, hspa9, hspa13, and hyou1, showed signs of positive selection. We then extracted transcriptome data on Japanese flounder challenged with Edwardsiella tarda to induce stress, and found that hspa9, hspa12b, hspa4l, hspa13, and hyou1 were highly expressed, likely to protect cells from stress. Interestingly, expression patterns of hsp70 genes were divergent in different developmental stages of the Japanese flounder. We found that at least one hsp70 gene was always highly expressed at various stages of embryonic development of the Japanese flounder, thereby indicating that hsp70 genes were constitutively expressed in the Japanese flounder. Our findings provide basic and useful resources to better understand hsp70 genes in flatfish.
Affiliation(s)
- Kaiqiang Liu
  - Key Laboratory for Sustainable Utilization of Marine Fisheries Resource, Ministry of Agriculture, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, China
  - Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
  - Jiangsu Key Laboratory of Marine Bioresources and Environment, Jiangsu Key Laboratory of Marine Biotechnology, Huaihai Institute of Technology, Lianyungang, China
- Xiancai Hao
  - Key Laboratory for Sustainable Utilization of Marine Fisheries Resource, Ministry of Agriculture, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, China
  - Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
- Qian Wang
  - Key Laboratory for Sustainable Utilization of Marine Fisheries Resource, Ministry of Agriculture, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, China
  - Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
- Jilun Hou
  - Beidaihe Central Experiment Station, Chinese Academy of Fishery Sciences, Beidaihe, China
- Xiaofang Lai
  - Jiangsu Key Laboratory of Marine Bioresources and Environment, Jiangsu Key Laboratory of Marine Biotechnology, Huaihai Institute of Technology, Lianyungang, China
- Zhiguo Dong
  - Jiangsu Key Laboratory of Marine Bioresources and Environment, Jiangsu Key Laboratory of Marine Biotechnology, Huaihai Institute of Technology, Lianyungang, China
- Changwei Shao
  - Key Laboratory for Sustainable Utilization of Marine Fisheries Resource, Ministry of Agriculture, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, China
  - Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
11
Feyertag F, Berninsone PM, Alvarez-Ponce D. N-glycoproteins exhibit a positive expression level-evolutionary rate correlation. J Evol Biol 2019; 32:390-394. [PMID: 30697857 DOI: 10.1111/jeb.13420] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/30/2018] [Revised: 01/23/2019] [Accepted: 01/25/2019] [Indexed: 12/22/2022]
Abstract
The different proteins of any proteome evolve at enormously different rates. One of the primary factors influencing rates of protein evolution is expression level, with highly expressed proteins tending to evolve at slow rates. This phenomenon, known as the expression level-evolutionary rate (E-R) anticorrelation, has been attributed to the abundance-dependent deleterious effects of misfolding or misinteraction. We have recently shown that secreted proteins either lack an E-R anticorrelation or exhibit a significantly reduced E-R anticorrelation. This effect may be due to the strict quality control to which secreted proteins are subject in the endoplasmic reticulum (which is expected to reduce the rate of misfolding and its deleterious effects) or to their extracellular location (expected to reduce the rate of misinteraction and its deleterious effects). Among secreted proteins, N-glycosylated ones are under particularly strong quality control. Here, we investigate how N-linked glycosylation affects the E-R anticorrelation. Strikingly, we observe a positive E-R correlation among N-glycosylated proteins. That is, N-glycoproteins that are highly expressed evolve at faster rates than lowly expressed N-glycoproteins, in contrast to what is observed among intracellular proteins.
Affiliation(s)
- Felix Feyertag
  - Department of Biology, University of Nevada, Reno, Reno, Nevada
12
Furman BLS, Dang UJ, Evans BJ, Golding GB. Divergent subgenome evolution after allopolyploidization in African clawed frogs (Xenopus). J Evol Biol 2018; 31:1945-1958. [DOI: 10.1111/jeb.13391] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 07/10/2018] [Revised: 09/26/2018] [Accepted: 10/06/2018] [Indexed: 12/22/2022]
Affiliation(s)
- Utkarsh J. Dang
- Department of Health Outcomes and Administrative Sciences; School of Pharmacy and Pharmaceutical Sciences; Binghamton University; State University of New York; Binghamton NY USA
- Ben J. Evans
- Department of Biology; McMaster University; Hamilton ON Canada
|
13
|
Cole CT, Ingvarsson PK. Pathway position constrains the evolution of an ecologically important pathway in aspens (Populus tremula L.). Mol Ecol 2018; 27:3317-3330. [PMID: 29972878 DOI: 10.1111/mec.14785] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 09/21/2016] [Revised: 01/30/2018] [Accepted: 02/20/2018] [Indexed: 12/22/2022]
Abstract
Many ecological interactions of aspens and their relatives (Populus spp.) are affected by products of the phenylpropanoid pathway synthesizing condensed tannins (CTs), whose production involves trade-offs with other ecologically important compounds and with growth. Genes of this pathway are candidates for investigating the role of selection on ecologically important, polygenic traits. We analysed sequences from 25 genes representing 10 steps of the CT synthesis pathway, which produces CTs used in defence and lignins used for growth, in 12 individuals of European aspen (Populus tremula). We compared these to homologs from P. trichocarpa, to a control set of 77 P. tremula genes, to genome-wide resequencing data and to RNA-seq expression levels, in order to identify signatures of selection distinct from those of demography. In Populus, pathway position exerts a strong influence on the evolution of these genes. Nonsynonymous diversity, divergence and allele frequency shifts (Tajima's D) were much lower than for synonymous measures. Expression levels were higher, and the direction of selection more negative, for upstream genes than for those downstream. Selective constraints act with increasing intensity on upstream genes, despite the presence of multiple paralogs in most gene families. Pleiotropy, expression level, flux control and codon bias appear to interact in determining levels and patterns of variation in genes of this pathway, whose products mediate a wide array of ecological interactions for this widely distributed species.
Affiliation(s)
- Christopher T Cole
- Division of Science and Mathematics, University of Minnesota, Morris, Morris, Minnesota
- Pär K Ingvarsson
- Department of Ecology and Environmental Science, Umeå University, Umeå, Sweden
|
14
|
Alvarez-Ponce D, Feyertag F, Chakraborty S. Position Matters: Network Centrality Considerably Impacts Rates of Protein Evolution in the Human Protein-Protein Interaction Network. Genome Biol Evol 2018; 9:1742-1756. [PMID: 28854629 PMCID: PMC5570066 DOI: 10.1093/gbe/evx117] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Accepted: 07/01/2017] [Indexed: 02/06/2023] Open
Abstract
The proteins of any organism evolve at disparate rates. A long list of factors affecting rates of protein evolution have been identified. However, the relative importance of each factor in determining rates of protein evolution remains unresolved. The prevailing view is that evolutionary rates are dominantly determined by gene expression, and that other factors such as network centrality have only a marginal effect, if any. However, this view is largely based on analyses in yeasts, and accurately measuring the importance of the determinants of rates of protein evolution is complicated by the fact that the different factors are often correlated with each other, and by the relatively poor quality of available functional genomics data sets. Here, we use correlation, partial correlation and principal component regression analyses to measure the contributions of several factors to the variability of the rates of evolution of human proteins. For this purpose, we analyzed the entire human protein–protein interaction data set and the human signal transduction network—a network data set of exceptionally high quality, obtained by manual curation, which is expected to be virtually free from false positives. In contrast with the prevailing view, we observe that network centrality (measured as the number of physical and nonphysical interactions, betweenness, and closeness) has a considerable impact on rates of protein evolution. Surprisingly, the impact of centrality on rates of protein evolution seems to be comparable, or even superior according to some analyses, to that of gene expression. Our observations seem to be independent of potentially confounding factors and from the limitations (biases and errors) of interactomic data sets.
|
15
|
Feyertag F, Alvarez-Ponce D. Disulfide Bonds Enable Accelerated Protein Evolution. Mol Biol Evol 2018; 34:1833-1837. [PMID: 28431018 DOI: 10.1093/molbev/msx135] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 02/07/2023] Open
Abstract
The different proteins of any proteome evolve at enormously different rates. What factors contribute to this variability, and to what extent, is still a largely open question. We hypothesized that disulfide bonds, by increasing protein stability, should make proteins' structures relatively independent of their amino acid sequences, thus acting as buffers of deleterious mutations and enabling accelerated sequence evolution. In agreement with this hypothesis, we observed that membrane proteins with disulfide bonds evolved 88% faster than those without disulfide bonds, and that extracellular proteins with disulfide bonds evolved 49% faster than those without disulfide bonds. In addition, genes encoding proteins with disulfide bonds exhibit an increased likelihood of showing signatures of positive selection. Multivariate analyses indicate that the trend is independent of a number of potentially confounding factors. The effect, however, is not observed among the longest proteins, which can become stabilized by mechanisms other than disulfide bonds.
Affiliation(s)
- Felix Feyertag
- Department of Biology, University of Nevada-Reno, Reno, NV
|
16
|
Feyertag F, Berninsone PM, Alvarez-Ponce D. Secreted Proteins Defy the Expression Level-Evolutionary Rate Anticorrelation. Mol Biol Evol 2017; 34:692-706. [PMID: 28007979 DOI: 10.1093/molbev/msw268] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 12/15/2022] Open
Abstract
The rates of evolution of the proteins of any organism vary across orders of magnitude. A primary factor influencing rates of protein evolution is expression. A strong negative correlation between expression levels and evolutionary rates (the so-called E-R anticorrelation) has been observed in virtually all studied organisms. This effect is currently attributed to the abundance-dependent fitness costs of misfolding and unspecific protein-protein interactions, among other factors. Secreted proteins are folded in the endoplasmic reticulum, a compartment where chaperones, folding catalysts, and stringent quality control mechanisms promote their correct folding and may reduce the fitness costs of misfolding. In addition, confinement of secreted proteins to the extracellular space may reduce misinteractions and their deleterious effects. We hypothesize that each of these factors (the secretory pathway quality control and extracellular location) may reduce the strength of the E-R anticorrelation. Indeed, here we show that among human proteins that are secreted to the extracellular space, rates of evolution do not correlate with protein abundances. This trend is robust to controlling for several potentially confounding factors and is also observed when analyzing protein abundance data for 6 human tissues. In addition, analysis of mRNA abundance data for 32 human tissues shows that the E-R correlation is always less negative, and sometimes nonsignificant, in secreted proteins. Similar observations were made in Caenorhabditis elegans and in Escherichia coli, and to a lesser extent in Drosophila melanogaster, Saccharomyces cerevisiae and Arabidopsis thaliana. Our observations contribute to understand the causes of the E-R anticorrelation.
Affiliation(s)
- Felix Feyertag
- Department of Biology, University of Nevada, Reno, Reno, NV
|
17
|
Genomic analysis reveals major determinants of cis-regulatory variation in Capsella grandiflora. Proc Natl Acad Sci U S A 2017; 114:1087-1092. [PMID: 28096395 DOI: 10.1073/pnas.1612561114] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Indexed: 12/30/2022] Open
Abstract
Understanding the causes of cis-regulatory variation is a long-standing aim in evolutionary biology. Although cis-regulatory variation has long been considered important for adaptation, we still have a limited understanding of the selective importance and genomic determinants of standing cis-regulatory variation. To address these questions, we studied the prevalence, genomic determinants, and selective forces shaping cis-regulatory variation in the outcrossing plant Capsella grandiflora. We first identified a set of 1,010 genes with common cis-regulatory variation using analyses of allele-specific expression (ASE). Population genomic analyses of whole-genome sequences from 32 individuals showed that genes with common cis-regulatory variation (i) are under weaker purifying selection and (ii) undergo less frequent positive selection than other genes. We further identified genomic determinants of cis-regulatory variation. Gene body methylation (gbM) was a major factor constraining cis-regulatory variation, whereas presence of nearby transposable elements (TEs) and tissue specificity of expression increased the odds of ASE. Our results suggest that most common cis-regulatory variation in C. grandiflora is under weak purifying selection, and that gene-specific functional constraints are more important for the maintenance of cis-regulatory variation than genome-scale variation in the intensity of selection. Our results agree with previous findings that suggest TE silencing affects nearby gene expression, and provide evidence for a link between gbM and cis-regulatory constraint, possibly reflecting greater dosage sensitivity of body-methylated genes. Given the extensive conservation of gbM in flowering plants, this suggests that gbM could be an important predictor of cis-regulatory variation in a wide range of plant species.
|
18
|
Alvarez-Ponce D, Sabater-Muñoz B, Toft C, Ruiz-González MX, Fares MA. Essentiality Is a Strong Determinant of Protein Rates of Evolution during Mutation Accumulation Experiments in Escherichia coli. Genome Biol Evol 2016; 8:2914-2927. [PMID: 27566759 PMCID: PMC5630975 DOI: 10.1093/gbe/evw205] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Indexed: 12/16/2022] Open
Abstract
The Neutral Theory of Molecular Evolution is considered the most powerful theory to understand the evolutionary behavior of proteins. One of the main predictions of this theory is that essential proteins should evolve slower than dispensable ones owing to increased selective constraints. Comparison of genomes of different species, however, has revealed only small differences between the rates of evolution of essential and nonessential proteins. In some analyses, these differences vanish once confounding factors are controlled for, whereas in other cases essentiality seems to have an independent, albeit small, effect. It has been argued that comparing relatively distant genomes may entail a number of limitations. For instance, many of the genes that are dispensable in controlled lab conditions may be essential in some of the conditions faced in nature. Moreover, essentiality can change during evolution, and rates of protein evolution are simultaneously shaped by a variety of factors, whose individual effects are difficult to isolate. Here, we conducted two parallel mutation accumulation experiments in Escherichia coli, during 5,500–5,750 generations, and compared the genomes at different points of the experiments. Our approach (a short-term experiment, under highly controlled conditions) enabled us to overcome many of the limitations of previous studies. We observed that essential proteins evolved substantially slower than nonessential ones during our experiments. Strikingly, rates of protein evolution were only moderately affected by expression level and protein length.
Affiliation(s)
- Beatriz Sabater-Muñoz
- Instituto de Biología Molecular y Celular de Plantas (CSIC-UPV), Valencia, Spain; Department of Genetics, Smurfit Institute of Genetics, University of Dublin, Trinity College Dublin, Dublin, Ireland
- Christina Toft
- Department of Genetics, University of Valencia, Valencia, Spain; Departamento de Biotecnología, Instituto de Agroquímica y Tecnología de los Alimentos (CSIC), Valencia, Spain
- Mario X Ruiz-González
- Instituto de Biología Molecular y Celular de Plantas (CSIC-UPV), Valencia, Spain; Current Address: Secretaría de Educación Superior, Ciencia, Tecnología e Innovación, Proyecto Prometeo; Departamento de Ciencias Biológicas, Universidad Técnica Particular de Loja, Loja, Ecuador
- Mario A Fares
- Instituto de Biología Molecular y Celular de Plantas (CSIC-UPV), Valencia, Spain; Department of Genetics, Smurfit Institute of Genetics, University of Dublin, Trinity College Dublin, Dublin, Ireland
|
19
|
Steige KA, Slotte T. Genomic legacies of the progenitors and the evolutionary consequences of allopolyploidy. Curr Opin Plant Biol 2016; 30:88-93. [PMID: 26943938 DOI: 10.1016/j.pbi.2016.02.006] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 10/23/2015] [Revised: 02/13/2016] [Accepted: 02/15/2016] [Indexed: 06/05/2023]
Abstract
The formation of an allopolyploid species involves the merger of genomes with separate evolutionary histories and thereby different genomic legacies. Contrary to expectations from theory, genes from one are often lost preferentially in allopolyploids - there is biased fractionation. Here, we provide an overview of two ways in which the genomic legacies of the progenitors may impact the fate of duplicated genes in allopolyploids. Specifically, we discuss the role of homeolog expression biases in setting the stage for biased fractionation, and the evidence for transposable element silencing as a possible mechanism for homeolog expression biases. Finally, we highlight how differences between the progenitors with respect to accumulation of deleterious variation may affect trajectories of duplicate gene evolution in allopolyploids.
Affiliation(s)
- Kim A Steige
- Department of Ecology and Genetics, Uppsala University, Sweden; Department of Ecology, Environment and Plant Sciences, Science for Life Laboratory, Stockholm University, Sweden
- Tanja Slotte
- Department of Ecology, Environment and Plant Sciences, Science for Life Laboratory, Stockholm University, Sweden
|
20
|
Newton MS, Arcus VL, Patrick WM. Rapid bursts and slow declines: on the possible evolutionary trajectories of enzymes. J R Soc Interface 2016; 12:rsif.2015.0036. [PMID: 25926697 DOI: 10.1098/rsif.2015.0036] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Indexed: 11/12/2022] Open
Abstract
The evolution of enzymes is often viewed as following a smooth and steady trajectory, from barely functional primordial catalysts to the highly active and specific enzymes that we observe today. In this review, we summarize experimental data that suggest a different reality. Modern examples, such as the emergence of enzymes that hydrolyse human-made pesticides, demonstrate that evolution can be extraordinarily rapid. Experiments to infer and resurrect ancient sequences suggest that some of the first organisms present on the Earth are likely to have possessed highly active enzymes. Reconciling these observations, we argue that rapid bursts of strong selection for increased catalytic efficiency are interspersed with much longer periods in which the catalytic power of an enzyme erodes, through neutral drift and selection for other properties such as cellular energy efficiency or regulation. Thus, many enzymes may have already passed their catalytic peaks.
Affiliation(s)
- Matilda S Newton
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
- Vickery L Arcus
- School of Biology, University of Waikato, Hamilton, New Zealand
- Wayne M Patrick
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
|
21
|
Hodgins KA, Yeaman S, Nurkowski KA, Rieseberg LH, Aitken SN. Expression Divergence Is Correlated with Sequence Evolution but Not Positive Selection in Conifers. Mol Biol Evol 2016; 33:1502-16. [PMID: 26873578 DOI: 10.1093/molbev/msw032] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Indexed: 12/16/2022] Open
Abstract
The evolutionary and genomic determinants of sequence evolution in conifers are poorly understood, and previous studies have found only limited evidence for positive selection. Using RNAseq data, we compared gene expression profiles to patterns of divergence and polymorphism in 44 seedlings of lodgepole pine (Pinus contorta) and 39 seedlings of interior spruce (Picea glauca × engelmannii) to elucidate the evolutionary forces that shape their genomes and their plastic responses to abiotic stress. We found that rapidly diverging genes tend to have greater expression divergence, lower expression levels, reduced levels of synonymous site diversity, and longer proteins than slowly diverging genes. Similar patterns were identified for the untranslated regions, but with some exceptions. We found evidence that genes with low expression levels had a larger fraction of nearly neutral sites, suggesting a primary role for negative selection in determining the association between evolutionary rate and expression level. There was limited evidence for differences in the rate of positive selection among genes with divergent versus conserved expression profiles and some evidence supporting relaxed selection in genes diverging in expression between the species. Finally, we identified a small number of genes that showed evidence of site-specific positive selection using divergence data alone. However, estimates of the proportion of sites fixed by positive selection (α) were in the range of other plant species with large effective population sizes suggesting relatively high rates of adaptive divergence among conifers.
Affiliation(s)
- Kathryn A Hodgins
- School of Biological Sciences, Monash University, Melbourne, VIC, Australia
- Sam Yeaman
- Department of Botany, University of British Columbia, Vancouver, BC, Canada; Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC, Canada; Department of Biological Sciences, University of Calgary, Calgary, AB, Canada
- Loren H Rieseberg
- Department of Botany, University of British Columbia, Vancouver, BC, Canada
- Sally N Aitken
- Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC, Canada
|
22
|
Mukherjee D, Mukherjee A, Ghosh TC. Evolutionary Rate Heterogeneity of Primary and Secondary Metabolic Pathway Genes in Arabidopsis thaliana. Genome Biol Evol 2015; 8:17-28. [PMID: 26556590 PMCID: PMC4758233 DOI: 10.1093/gbe/evv217] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 12/11/2022] Open
Abstract
Primary metabolism is essential to plants for growth and development, and secondary metabolism helps plants to interact with the environment. Many plant metabolites are industrially important. These metabolites are produced by plants through complex metabolic pathways. Lack of knowledge about these pathways is hindering the successful breeding practices for these metabolites. For a better knowledge of the metabolism in plants as a whole, evolutionary rate variation of primary and secondary metabolic pathway genes is a prerequisite. In this study, evolutionary rate variation of primary and secondary metabolic pathway genes has been analyzed in the model plant Arabidopsis thaliana. Primary metabolic pathway genes were found to be more conserved than secondary metabolic pathway genes. Several factors such as gene structure, expression level, tissue specificity, multifunctionality, and domain number are the key factors behind this evolutionary rate variation. This study will help to better understand the evolutionary dynamics of plant metabolism.
Affiliation(s)
- Dola Mukherjee
- Bioinformatics Centre, Bose Institute, Kolkata, West Bengal, India
- Ashutosh Mukherjee
- Department of Botany, Vivekananda College, Thakurpukur, Kolkata, West Bengal, India
|
23
|
Kryuchkova-Mostacci N, Robinson-Rechavi M. Tissue-Specific Evolution of Protein Coding Genes in Human and Mouse. PLoS One 2015; 10:e0131673. [PMID: 26121354 PMCID: PMC4488272 DOI: 10.1371/journal.pone.0131673] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 03/23/2015] [Accepted: 06/04/2015] [Indexed: 12/23/2022] Open
Abstract
Protein-coding genes evolve at different rates, and the influence of different parameters, from gene size to expression level, has been extensively studied. While in yeast gene expression level is the major causal factor of gene evolutionary rate, the situation is more complex in animals. Here we investigate these relations further, especially taking in account gene expression in different organs as well as indirect correlations between parameters. We used RNA-seq data from two large datasets, covering 22 mouse tissues and 27 human tissues. Over all tissues, evolutionary rate only correlates weakly with levels and breadth of expression. The strongest explanatory factors of purifying selection are GC content, expression in many developmental stages, and expression in brain tissues. While the main component of evolutionary rate is purifying selection, we also find tissue-specific patterns for sites under neutral evolution and for positive selection. We observe fast evolution of genes expressed in testis, but also in other tissues, notably liver, which are explained by weak purifying selection rather than by positive selection.
Affiliation(s)
- Nadezda Kryuchkova-Mostacci
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Marc Robinson-Rechavi
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
24
|
Shin SH, Choi SS. Lengths of coding and noncoding regions of a gene correlate with gene essentiality and rates of evolution. Genes Genomics 2015. [DOI: 10.1007/s13258-015-0265-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 12/21/2022]
|
25
|
Abstract
Protein metabolism is one of the most costly processes in the cell and is therefore expected to be under the effective control of natural selection. We stimulated yeast strains to overexpress each single gene product to approximately 1% of the total protein content. Consistent with previous reports, we found that excessive expression of proteins containing disordered or membrane-protruding regions resulted in an especially high fitness cost. We estimated these costs to be nearly twice as high as for other proteins. There was a ten-fold difference in cost if, instead of entire proteins, only the disordered or membrane-embedded regions were compared with other segments. Although the cost of processing bulk protein was measurable, it could not be explained by several tested protein features, including those linked to translational efficiency or intensity of physical interactions after maturation. It most likely included a number of individually indiscernible effects arising during protein synthesis, maturation, maintenance, (mal)functioning, and disposal. When scaled to the levels normally achieved by proteins in the cell, the fitness cost of dealing with one amino acid in a standard protein appears to be generally very low. Many single amino acid additions or deletions are likely to be neutral even if the effective population size is as large as that of the budding yeast. This should also apply to substitutions. Selection is much more likely to operate if point mutations affect protein structure by, for example, extending or creating stretches that tend to unfold or interact improperly with membranes.
|
26
|
Nuñez PA, Romero H, Farber MD, Rocha EPC. Natural selection for operons depends on genome size. Genome Biol Evol 2014; 5:2242-54. [PMID: 24201372 PMCID: PMC3845653 DOI: 10.1093/gbe/evt174] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 12/12/2022] Open
Abstract
In prokaryotes, genome size is associated with metabolic versatility, regulatory complexity, effective population size, and horizontal transfer rates. We therefore analyzed the covariation of genome size and operon conservation to assess the evolutionary models of operon formation and maintenance. In agreement with previous results, intraoperonic pairs of essential and of highly expressed genes are more conserved. Interestingly, intraoperonic pairs of genes are also more conserved when they encode proteins at similar cell concentrations, suggesting a role of cotranscription in diminishing the cost of waste and shortfall in gene expression. Larger genomes have fewer and smaller operons that are also less conserved. Importantly, lower conservation in larger genomes was observed for all classes of operons in terms of gene expression, essentiality, and balanced protein concentration. We reached very similar conclusions in independent analyses of three major bacterial clades (α- and β-Proteobacteria and Firmicutes). Operon conservation is inversely correlated to the abundance of transcription factors in the genome when controlled for genome size. This suggests a negative association between the complexity of genetic networks and operon conservation. These results show that genome size and/or its proxies are key determinants of the intensity of natural selection for operon organization. Our data fit better the evolutionary models based on the advantage of coregulation than those based on genetic linkage or stochastic gene expression. We suggest that larger genomes with highly complex genetic networks and many transcription factors endure weaker selection for operons than smaller genomes with fewer alternative tools for genetic regulation.
Affiliation(s)
- Pablo A Nuñez
- Instituto de Biotecnología, Instituto Nacional de Tecnología Agropecuaria (CICVyA-INTA), Buenos Aires, Argentina
|
27
|
Bertels F, Silander OK, Pachkov M, Rainey PB, van Nimwegen E. Automated reconstruction of whole-genome phylogenies from short-sequence reads. Mol Biol Evol 2014; 31:1077-88. [PMID: 24600054 PMCID: PMC3995342 DOI: 10.1093/molbev/msu088] [Citation(s) in RCA: 324] [Impact Index Per Article: 32.4] [Indexed: 01/01/2023] Open
Abstract
Studies of microbial evolutionary dynamics are being transformed by the availability of affordable high-throughput sequencing technologies, which allow whole-genome sequencing of hundreds of related taxa in a single study. Reconstructing a phylogenetic tree of these taxa is generally a crucial step in any evolutionary analysis. Instead of constructing genome assemblies for all taxa, annotating these assemblies, and aligning orthologous genes, many recent studies 1) directly map raw sequencing reads to a single reference sequence, 2) extract single nucleotide polymorphisms (SNPs), and 3) infer the phylogenetic tree using maximum likelihood methods from the aligned SNP positions. However, here we show that, when using such methods to reconstruct phylogenies from sets of simulated sequences, both the exclusion of nonpolymorphic positions and the alignment to a single reference genome, introduce systematic biases and errors in phylogeny reconstruction. To address these problems, we developed a new method that combines alignments from mappings to multiple reference sequences and show that this successfully removes biases from the reconstructed phylogenies. We implemented this method as a web server named REALPHY (Reference sequence Alignment-based Phylogeny builder), which fully automates phylogenetic reconstruction from raw sequencing reads.
Affiliation(s)
- Frederic Bertels
- Biozentrum, University of Basel and Swiss Institute of Bioinformatics, Basel, Switzerland
|
28
|
Paape T, Bataillon T, Zhou P, Kono TJY, Briskine R, Young ND, Tiffin P. Selection, genome-wide fitness effects and evolutionary rates in the model legume Medicago truncatula. Mol Ecol 2013; 22:3525-38. [PMID: 23773281 DOI: 10.1111/mec.12329] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Received: 11/21/2012] [Revised: 02/22/2013] [Accepted: 03/12/2013] [Indexed: 12/15/2022]
Abstract
Sequence data for >20 000 annotated genes from 56 accessions of Medicago truncatula were used to identify potential targets of positive selection, the determinants of evolutionary rate variation and the relative importance of positive and purifying selection in shaping nucleotide diversity. Based upon patterns of intraspecific diversity and interspecific divergence, c. 50-75% of nonsynonymous polymorphisms are subject to strong purifying selection and 1% of the sampled genes harbour a signature of positive selection. Combining polymorphism with expression data, we estimated the distribution of fitness effects and found that the proportion of deleterious mutations is significantly greater for expressed genes than for genes with undetected transcripts (nonexpressed) in a previous RNA-seq experiment and greater for broadly expressed genes than those expressed in only a single tissue. Expression level is the strongest correlate of evolutionary rates at nonsynonymous sites, and despite multiple genomic features being significantly correlated with evolutionary rates, they explain less than 20% of the variation in nonsynonymous rates (dN) and <15% of the variation in either synonymous rates (dS) or dN:dS. Among putative targets of selection were genes involved in defence against pathogens and herbivores, genes with roles in mediating the relationship with rhizobial symbionts and one-third of annotated histone-lysine methyltransferases. Adaptive evolution of the methyltransferases suggests that positive selection in gene expression may have occurred through evolution of enzymes involved in epigenetic modification.
Affiliation(s)
- Timothy Paape
- Institute of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, 8057, Switzerland
|
29
|
Alvarez-Ponce D, Fares MA. Evolutionary rate and duplicability in the Arabidopsis thaliana protein-protein interaction network. Genome Biol Evol 2013; 4:1263-74. [PMID: 23160177 PMCID: PMC3542556 DOI: 10.1093/gbe/evs101] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Indexed: 12/13/2022] Open
Abstract
Genes show a bewildering variation in their patterns of molecular evolution, as a result of the action of different levels and types of selective forces. The factors underlying this variation are, however, still poorly understood. In the last decade, the position of proteins in the protein-protein interaction network has been put forward as a determinant factor of the evolutionary rate and duplicability of their encoding genes. This conclusion, however, has been based on the analysis of the limited number of microbes and animals for which interactome-level data are available (essentially, Escherichia coli, yeast, worm, fly, and humans). Here, we study, for the first time, the relationship between the position of proteins in the high-density interactome of a plant (Arabidopsis thaliana) and the patterns of molecular evolution of their encoding genes. We found that genes whose encoded products act at the center of the network are more evolutionarily constrained than those acting at the network periphery. This trend remains significant when potential confounding factors (gene expression level and breadth, duplicability, function, and length of the encoded products) are controlled for. Even though the correlation between centrality measures and rates of evolution is generally weak, for some functional categories, it is comparable in strength to (or even stronger than) the correlation between evolutionary rates and expression levels or breadths. In addition, genes encoding interacting proteins in the network evolve at relatively similar rates. Finally, Arabidopsis proteins encoded by duplicated genes are more highly connected than those encoded by singleton genes. This observation is in agreement with the patterns observed in humans, but in contrast with those observed in E. coli, yeast, worm, and fly (whose duplicated genes tend to act at the periphery of the network), implying that the relationship between duplicability and centrality inverted at least twice during eukaryote evolution. Taken together, these results indicate that the structure of the A. thaliana network constrains the evolution of its components at multiple levels.
Affiliation(s)
- David Alvarez-Ponce
- Department of Abiotic Stress, Integrative and Systems Biology Laboratory, Instituto de Biología Molecular y Celular de Plantas, Consejo Superior de Investigaciones Científicas (CSIC-UPV), Valencia, Spain.
|
30
|
Niu SH, Li ZX, Yuan HW, Chen XY, Li Y, Li W. Transcriptome characterisation of Pinus tabuliformis and evolution of genes in the Pinus phylogeny. BMC Genomics 2013; 14:263. [PMID: 23597112 PMCID: PMC3640921 DOI: 10.1186/1471-2164-14-263] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Received: 11/08/2012] [Accepted: 04/15/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The Chinese pine (Pinus tabuliformis) is an indigenous conifer species in northern China but is relatively underdeveloped as a genomic resource; thus, limiting gene discovery and breeding. Large-scale transcriptome data were obtained using a next-generation sequencing platform to compensate for the lack of P. tabuliformis genomic information. RESULTS The increasing amount of transcriptome data on Pinus provides an excellent resource for multi-gene phylogenetic analysis and studies on how conserved genes and functions are maintained in the face of species divergence. The first P. tabuliformis transcriptome from a normalised cDNA library of multiple tissues and individuals was sequenced in a full 454 GS-FLX run, producing 911,302 sequencing reads. The high quality overlapping expressed sequence tags (ESTs) were assembled into 46,584 putative transcripts, and more than 700 SSRs and 92,000 SNPs/InDels were characterised. Comparative analysis of the transcriptome of six conifer species yielded 191 orthologues, from which we inferred a phylogenetic tree, evolutionary patterns and calculated rates of gene diversion. We also identified 938 fast evolving sequences that may be useful for identifying genes that perhaps evolved in response to positive selection and might be responsible for speciation in the Pinus lineage. CONCLUSIONS A large collection of high-quality ESTs was obtained, de novo assembled and characterised, which represents a dramatic expansion of the current transcript catalogues of P. tabuliformis and which will gradually be applied in breeding programs of P. tabuliformis. Furthermore, these data will facilitate future studies of the comparative genomics of P. tabuliformis and other related species.
Affiliation(s)
- Shi-Hui Niu
- National Engineering Laboratory for Forest Tree Breeding, College of Biological Science and Technology, Beijing Forestry University, Beijing 100083, People's Republic of China
|
31
|
Choi SS, Hannenhalli S. Three independent determinants of protein evolutionary rate. J Mol Evol 2013; 76:98-111. [PMID: 23400388 DOI: 10.1007/s00239-013-9543-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 11/09/2012] [Accepted: 01/16/2013] [Indexed: 12/15/2022]
Abstract
One of the most widely accepted ideas related to the evolutionary rates of proteins is that functionally important residues or regions evolve slower than other regions, a reasonable outcome of which should be a slower evolutionary rate of the proteins with a higher density of functionally important sites. Oddly, the role of functional importance, mainly measured by essentiality, in determining evolutionary rate has been challenged in recent studies. Several variables other than protein essentiality, such as expression level, gene compactness, protein-protein interactions, etc., have been suggested to affect protein evolutionary rate. In the present review, we try to refine the concept of functional importance of a gene, and consider three factors-functional importance, expression level, and gene compactness, as independent determinants of evolutionary rate of a protein, based not only on their known correlation with evolutionary rate but also on a reasonable mechanistic model. We suggest a framework based on these mechanistic models to correctly interpret the correlations between evolutionary rates and the various variables as well as the interrelationships among the variables.
Affiliation(s)
- Sun Shim Choi
- Department of Medical Biotechnology, College of Biomedical Science, and Institute of Bioscience & Biotechnology, Kangwon National University, Chuncheon, South Korea.
|
32
|
Shabalina SA, Spiridonov NA, Kashina A. Sounds of silence: synonymous nucleotides as a key to biological regulation and complexity. Nucleic Acids Res 2013; 41:2073-94. [PMID: 23293005 PMCID: PMC3575835 DOI: 10.1093/nar/gks1205] [Citation(s) in RCA: 187] [Impact Index Per Article: 17.0] [Indexed: 01/10/2023] Open
Abstract
Messenger RNA is a key component of an intricate regulatory network of its own. It accommodates numerous nucleotide signals that overlap protein coding sequences and are responsible for multiple levels of regulation and generation of biological complexity. A wealth of structural and regulatory information, which mRNA carries in addition to the encoded amino acid sequence, raises the question of how these signals and overlapping codes are delineated along non-synonymous and synonymous positions in protein coding regions, especially in eukaryotes. Silent or synonymous codon positions, which do not determine amino acid sequences of the encoded proteins, define mRNA secondary structure and stability and affect the rate of translation, folding and post-translational modifications of nascent polypeptides. The RNA level selection is acting on synonymous sites in both prokaryotes and eukaryotes and is more common than previously thought. Selection pressure on the coding gene regions follows three-nucleotide periodic pattern of nucleotide base-pairing in mRNA, which is imposed by the genetic code. Synonymous positions of the coding regions have a higher level of hybridization potential relative to non-synonymous positions, and are multifunctional in their regulatory and structural roles. Recent experimental evidence and analysis of mRNA structure and interspecies conservation suggest that there is an evolutionary tradeoff between selective pressure acting at the RNA and protein levels. Here we provide a comprehensive overview of the studies that define the role of silent positions in regulating RNA structure and processing that exert downstream effects on proteins and their functions.
Affiliation(s)
- Svetlana A Shabalina
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20984, USA.
|
33
|
Levy ED, Michnick SW, Landry CR. Protein abundance is key to distinguish promiscuous from functional phosphorylation based on evolutionary information. Philos Trans R Soc Lond B Biol Sci 2012; 367:2594-606. [PMID: 22889910 DOI: 10.1098/rstb.2012.0078] [Citation(s) in RCA: 79] [Impact Index Per Article: 6.6] [Indexed: 01/17/2023] Open
Abstract
In eukaryotic cells, protein phosphorylation is an important and widespread mechanism used to regulate protein function. Yet, of the thousands of phosphosites identified to date, only a few hundred at best have a characterized function. It was recently shown that these functional sites are significantly more conserved than phosphosites of unknown function, stressing the importance of considering evolutionary conservation in assessing the global functional landscape of phosphosites. This leads us to review studies that examined the impact of phosphorylation on evolutionary conservation. While all these studies have shown that conservation is greater among phosphorylated sites compared with non-phosphorylated ones, the magnitude of this difference varies greatly. Further, not all studies have considered key factors that may influence the rate of phosphosite evolution. Such key factors are their localization in ordered or disordered regions, their stoichiometry or the abundance of their corresponding protein. Here we take into account all of these factors simultaneously, which reveals remarkable evolutionary patterns. First, while it is well established that protein conservation increases with abundance, we show that phosphosites partly follow an opposite trend. More precisely, Saccharomyces cerevisiae phosphosites present among abundant proteins are 1.5 times more likely to diverge in the closely related species Saccharomyces bayanus when compared with phosphosites present in the 5 per cent least abundant proteins. Second, we show that conservation is coupled to stoichiometry, whereby sites frequently phosphorylated are more conserved than those rarely phosphorylated. Finally, we provide a model of functional and noisy or 'accidental' phosphorylation that explains these observations.
Affiliation(s)
- Emmanuel D Levy
- Département de Biochimie, Université de Montréal, Montréal, Québec, Canada.
|
34
|
Alvarez-Ponce D. The relationship between the hierarchical position of proteins in the human signal transduction network and their rate of evolution. BMC Evol Biol 2012; 12:192. [PMID: 23020283 PMCID: PMC3527147 DOI: 10.1186/1471-2148-12-192] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 05/25/2012] [Accepted: 09/14/2012] [Indexed: 11/23/2022] Open
Abstract
Background Proteins evolve at disparate rates, as a result of the action of different types and strengths of evolutionary forces. An open question in evolutionary biology is what factors are responsible for this variability. In general, proteins whose function has a great impact on organisms’ fitness are expected to evolve under stronger selective pressures. In biosynthetic pathways, upstream genes usually evolve under higher levels of selective constraint than those acting at the downstream part, as a result of their higher hierarchical position. Similar observations have been made in transcriptional regulatory networks, whose upstream elements appear to be more essential and subject to selection. Less well understood is, however, how selective pressures distribute along signal transduction pathways. Results Here, I combine comparative genomics and directed protein interaction data to study the distribution of evolutionary forces across the human signal transduction network. Surprisingly, no evidence was found for higher levels of selective constraint at the upstream network genes (those occupying more hierarchical positions). On the contrary, purifying selection was found to act more strongly on genes acting at the downstream part of the network, which seems to be due to downstream genes being more highly and broadly expressed, performing certain functions and, in particular, encoding proteins that are more highly connected in the protein–protein interaction network. When the effect of these confounding factors is discounted, upstream and downstream genes evolve at similar rates. The trends found in the overall signaling network are exemplified by analysis of the distribution of purifying selection along the mammalian Ras signaling pathway, showing that upstream and downstream genes evolve at similar rates. 
Conclusions These results indicate that the upstream/downstream position of proteins in the signal transduction network has, in general, no direct effect on their rates of evolution, suggesting that upstream and downstream genes are similarly important for the function of the network. This implies that natural selection differently distributes across signal transduction networks and across biosynthetic and transcriptional regulatory networks, which might reflect fundamental differences in their function and organization.
Affiliation(s)
- David Alvarez-Ponce
- Department of Biology, National University of Ireland Maynooth, Maynooth, County Kildare, Ireland.
|
35
|
Buschiazzo E, Ritland C, Bohlmann J, Ritland K. Slow but not low: genomic comparisons reveal slower evolutionary rate and higher dN/dS in conifers compared to angiosperms. BMC Evol Biol 2012; 12:8. [PMID: 22264329 PMCID: PMC3328258 DOI: 10.1186/1471-2148-12-8] [Citation(s) in RCA: 101] [Impact Index Per Article: 8.4] [Received: 07/28/2011] [Accepted: 01/20/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Comparative genomics can inform us about the processes of mutation and selection across diverse taxa. Among seed plants, gymnosperms have been lacking in genomic comparisons. Recent EST and full-length cDNA collections for two conifers, Sitka spruce (Picea sitchensis) and loblolly pine (Pinus taeda), together with full genome sequences for two angiosperms, Arabidopsis thaliana and poplar (Populus trichocarpa), offer an opportunity to infer the evolutionary processes underlying thousands of orthologous protein-coding genes in gymnosperms compared with an angiosperm orthologue set. RESULTS Based upon pairwise comparisons of 3,723 spruce and pine orthologues, we found an average synonymous genetic distance (dS) of 0.191, and an average dN/dS ratio of 0.314. Using a fossil-established divergence time of 140 million years between spruce and pine, we extrapolated a nucleotide substitution rate of 0.68 × 10⁻⁹ synonymous substitutions per site per year. When compared to angiosperms, this indicates a dramatically slower rate of nucleotide substitution rates in conifers: on average 15-fold. Coincidentally, we found a three-fold higher dN/dS for the spruce-pine lineage compared to the poplar-Arabidopsis lineage. This joint occurrence of a slower evolutionary rate in conifers with higher dN/dS, and possibly positive selection, showcases the uniqueness of conifer genome evolution. CONCLUSIONS Our results are in line with documented reduced nucleotide diversity, conservative genome evolution and low rates of diversification in conifers on the one hand and numerous examples of local adaptation in conifers on the other hand. We propose that reduced levels of nucleotide mutation in large and long-lived conifer trees, coupled with large effective population size, were the main factors leading to slow substitution rates but retention of beneficial mutations.
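The substitution rate quoted in this abstract can be checked from the two figures it reports. The abstract does not spell out the formula; the sketch below assumes the common convention that a pairwise distance accumulates along both lineages since divergence, so the per-lineage rate is the distance divided by twice the divergence time. The function and variable names are illustrative only.

```python
# Figures quoted in the abstract above.
mean_ds = 0.191            # average synonymous distance (dS), spruce vs. pine
divergence_years = 140e6   # fossil-established divergence time, in years


def substitution_rate(distance, divergence_time):
    """Per-lineage substitution rate.

    Assumption (not stated in the abstract): the pairwise distance
    accumulated along both lineages, hence the factor of 2.
    """
    return distance / (2 * divergence_time)


rate = substitution_rate(mean_ds, divergence_years)
print(f"{rate:.2e} synonymous substitutions per site per year")  # 6.82e-10
```

Under that assumption the result rounds to the 0.68 × 10⁻⁹ reported in the abstract, which suggests the authors used the same two-lineage convention.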
Affiliation(s)
- Emmanuel Buschiazzo
- Department of Forest Sciences, University of British Columbia, 2424 Main Mall, Vancouver, BC V6T 1Z4, Canada.
|
36
|
Gaut B, Yang L, Takuno S, Eguiarte LE. The Patterns and Causes of Variation in Plant Nucleotide Substitution Rates. Annu Rev Ecol Evol Syst 2011. [DOI: 10.1146/annurev-ecolsys-102710-145119] [Citation(s) in RCA: 114] [Impact Index Per Article: 8.8] [Indexed: 11/09/2022]
Affiliation(s)
- Brandon Gaut
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697
- Liang Yang
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697
- Shohei Takuno
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697
- Luis E. Eguiarte
- Instituto de Ecología, Universidad Nacional Autónoma de México, CP 04510 Mexico City, Mexico
|
37
|
Investment in rapid growth shapes the evolutionary rates of essential proteins. Proc Natl Acad Sci U S A 2011; 108:20030-5. [PMID: 22135464 DOI: 10.1073/pnas.1110972108] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022] Open
Abstract
Proteins evolve at very different rates and, most notably, at rates inversely proportional to the level at which they are produced. The relative frequency of highly expressed proteins in the proteome, and thus their impact on the cell budget, increases steeply with growth rate. The maximal growth rate is a key life-history trait reflecting trade-offs between rapid growth and other fitness components. We show that the maximal growth rate is weakly affected by genetic drift. The negative correlation between protein expression levels and evolutionary rate and the positive correlation between expression levels of highly expressed proteins and growth rates, suggest that investment in growth affects the evolutionary rate of proteins, especially the highly expressed ones. Accordingly, analysis of 61 families of orthologs in 74 proteobacteria shows that differences in evolutionary rates between lowly and highly expressed proteins depend on maximal growth rates. Analyses of complexes with key roles in bacterial growth and strikingly different expression levels, the ribosome and the replisome, confirm these patterns and suggest that the growth-related sequence conservation is associated with protein synthesis. Maximal growth rates also shape protein evolution in the other bacterial clades. Long-branch attractions associated with this effect might explain why clades with persistent history of slow growth are attracted to the root when the tree of prokaryotes is inferred using highly, but not lowly, expressed proteins. These results indicate that reconstruction of deep phylogenies can be strongly affected by maximal growth rates, and highlight the importance of life-history traits and their physiological consequences for protein evolution.
|
38
|
Slotte T, Bataillon T, Hansen TT, St Onge K, Wright SI, Schierup MH. Genomic determinants of protein evolution and polymorphism in Arabidopsis. Genome Biol Evol 2011; 3:1210-9. [PMID: 21926095 PMCID: PMC3296466 DOI: 10.1093/gbe/evr094] [Citation(s) in RCA: 84] [Impact Index Per Article: 6.5] [Indexed: 11/25/2022] Open
Abstract
Recent results from Drosophila suggest that positive selection has a substantial impact on genomic patterns of polymorphism and divergence. However, species with smaller population sizes and/or stronger population structure may not be expected to exhibit Drosophila-like patterns of sequence variation. We test this prediction and identify determinants of levels of polymorphism and rates of protein evolution using genomic data from Arabidopsis thaliana and the recently sequenced Arabidopsis lyrata genome. We find that, in contrast to Drosophila, there is no negative relationship between nonsynonymous divergence and silent polymorphism at any spatial scale examined. Instead, synonymous divergence is a major predictor of silent polymorphism, which suggests variation in mutation rate as the main determinant of silent variation. Variation in rates of protein divergence is mainly correlated with gene expression level and breadth, consistent with results for a broad range of taxa, and map-based estimates of recombination rate are only weakly correlated with nonsynonymous divergence. Variation in mutation rates and the strength of purifying selection seem to be major drivers of patterns of polymorphism and divergence in Arabidopsis. Nevertheless, a model allowing for varying negative and positive selection by functional gene category explains the data better than a homogeneous model, implying the action of positive selection on a subset of genes. Genes involved in disease resistance and abiotic stress display high proportions of adaptive substitution. Our results are important for a general understanding of the determinants of rates of protein evolution and the impact of selection on patterns of polymorphism and divergence.
Affiliation(s)
- Tanja Slotte
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Sweden.
|
39
|
Jovelin R, Phillips PC. Expression level drives the pattern of selective constraints along the insulin/Tor signal transduction pathway in Caenorhabditis. Genome Biol Evol 2011; 3:715-22. [PMID: 21849326 PMCID: PMC3157841 DOI: 10.1093/gbe/evr071] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 11/15/2022] Open
Abstract
Genes do not act in isolation but perform their biological functions within genetic pathways that are connected in larger networks. Investigation of nucleotide variation within genetic pathways and networks has shown that topology can affect the rate of protein evolution; however, it remains unclear whether a same pattern of nucleotide variation is expected within functionally similar networks and whether it may be due to similar or different biological mechanisms. We address these questions by investigating nucleotide variation in the context of the structure of the insulin/Tor-signaling pathway in Caenorhabditis, which is well characterized and is functionally conserved across phylogeny. In Drosophila and vertebrates, the rate of protein evolution is negatively correlated with the position of a gene within the insulin/Tor pathway. Similarly, we find that in Caenorhabditis, the rate of amino acid replacement is lower for downstream genes. However, in Caenorhabditis, the rate of synonymous substitution is also strongly affected by the position of a gene in the pathway, and we show that the distribution of selective pressure along the pathway is driven by differential expression level. A full understanding of the effect of pathway structure on selective constraints is therefore likely to require inclusion of specific biological function into more general network models.
Affiliation(s)
- Richard Jovelin
- Department of Biology, Center for Ecology and Evolutionary Biology, University of Oregon, USA.
|
40
|
Theis FJ, Latif N, Wong P, Frishman D. Complex principal component and correlation structure of 16 yeast genomic variables. Mol Biol Evol 2011; 28:2501-12. [PMID: 21444651 DOI: 10.1093/molbev/msr077] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 12/21/2022] Open
Abstract
A quickly growing number of characteristics reflecting various aspects of gene function and evolution can be either measured experimentally or computed from DNA and protein sequences. The study of pairwise correlations between such quantitative genomic variables as well as collective analysis of their interrelations by multidimensional methods have delivered crucial insights into the processes of molecular evolution. Here, we present a principal component analysis (PCA) of 16 genomic variables from Saccharomyces cerevisiae, the largest data set analyzed so far. Because many missing values and potential outliers hinder the direct calculation of principal components, we introduce the application of Bayesian PCA. We confirm some of the previously established correlations, such as evolutionary rate versus protein expression, and reveal new correlations such as those between translational efficiency, phosphorylation density, and protein age. Although the first principal component primarily contrasts genomic change and protein expression, the second component separates variables related to gene existence and expressed protein functions. Enrichment analysis on genes affecting variable correlations unveils classes of influential genes. For example, although ribosomal and nuclear transport genes make important contributions to the correlation between protein isoelectric point and molecular weight, protein synthesis and amino acid metabolism genes help cause the lack of significant correlation between propensity for gene loss and protein age. We present the novel Quagmire database (Quantitative Genomics Resource) which allows exploring relationships between more genomic variables in three model organisms-Escherichia coli, S. cerevisiae, and Homo sapiens (http://webclu.bio.wzw.tum.de:18080/quagmire).
Affiliation(s)
- Fabian J Theis
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Bioinformatics and Systems Biology, Ingolstädter Landstraße 1, Neuherberg, Germany
|
41
|
Yang L, Gaut BS. Factors that contribute to variation in evolutionary rate among Arabidopsis genes. Mol Biol Evol 2011; 28:2359-69. [PMID: 21389272 DOI: 10.1093/molbev/msr058] [Citation(s) in RCA: 136] [Impact Index Per Article: 10.5] [Indexed: 12/23/2022] Open
Abstract
Surprisingly, few studies have described evolutionary rate variation among plant nuclear genes, with little investigation of the causes of rate variation. Here, we describe evolutionary rates for 11,492 ortholog pairs between Arabidopsis thaliana and A. lyrata and investigate possible contributors to rate variation among these genes. Rates of evolution at synonymous sites vary along chromosomes, suggesting that mutation rates vary on genomic scales, perhaps as a function of recombination rate. Rates of evolution at nonsynonymous sites correlate most strongly with expression patterns, but they also vary as to whether a gene is duplicated and retained after a whole-genome duplication (WGD) event. WGD genes evolve more slowly, on average, than nonduplicated genes and non-WGD duplicates. We hypothesize that levels and patterns of expression are not only the major determinants that explain nonsynonymous rate variation among genes but also a critical determinant of gene retention after duplication.
Affiliation(s)
- Liang Yang
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, USA
|
42
|
Shortridge MD, Triplet T, Revesz P, Griep MA, Powers R. Bacterial protein structures reveal phylum dependent divergence. Comput Biol Chem 2011; 35:24-33. [PMID: 21315656 DOI: 10.1016/j.compbiolchem.2010.12.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/27/2010] [Revised: 12/28/2010] [Accepted: 12/29/2010] [Indexed: 01/26/2023]
Abstract
Protein sequence space is vast compared to protein fold space. This raises important questions about how structures adapt to evolutionary changes in protein sequences. A growing trend is to regard protein fold space as a continuum rather than a series of discrete structures. From this perspective, homologous protein structures within the same functional classification should reveal a constant rate of structural drift relative to sequence changes. The clusters of orthologous groups (COG) classification system was used to annotate homologous bacterial protein structures in the Protein Data Bank (PDB). The structures and sequences of proteins within each COG were compared against each other to establish their relatedness. As expected, the analysis demonstrates a sharp structural divergence between the bacterial phyla Firmicutes and Proteobacteria. Additionally, each COG had a distinct sequence/structure relationship, indicating that different evolutionary pressures affect the degree of structural divergence. However, our analysis also shows the relative drift rate between sequence identity and structure divergence remains constant.
Affiliation(s)
- Matthew D Shortridge
- Department of Chemistry, University of Nebraska-Lincoln, 68588-0304, United States
|
43
|
Pang K, Cheng C, Xuan Z, Sheng H, Ma X. Understanding protein evolutionary rate by integrating gene co-expression with protein interactions. BMC Syst Biol 2010; 4:179. [PMID: 21190591 PMCID: PMC3022652 DOI: 10.1186/1752-0509-4-179] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 02/26/2010] [Accepted: 12/30/2010] [Indexed: 12/23/2022]
Abstract
Background Among the many factors determining protein evolutionary rate, protein-protein interaction degree (PPID) has been intensively investigated in recent years, but its precise effect on protein evolutionary rate is still heavily debated. Results We first confirmed that the correlation between protein evolutionary rate and PPID varies considerably across different protein interaction datasets. Specifically, because of the maximal inconsistency between yeast two-hybrid and other datasets, we reasoned that the difference in experimental methods contributes to our inability to clearly define how PPID affects protein evolutionary rate. To address this, we integrated protein interaction and gene co-expression data to derive a co-expressed protein-protein interaction degree (ePPID) measure, which reflects the number of partners with which a protein can permanently interact. Thus, irrespective of the experimental method employed, we found that (1) ePPID is a better predictor of protein evolutionary rate than PPID, (2) ePPID is a more robust predictor of protein evolutionary rate than PPID, and (3) the contribution of ePPID to protein evolutionary rate is statistically independent of expression level. Analysis of hub proteins in the Structural Interaction Network further supported ePPID as a better predictor of protein evolutionary rate than the number of distinct binding interfaces and clarified the slower evolution of co-expressed multi-interface hub proteins over that of other hub proteins. Conclusions Our study firmly established ePPID as a robust predictor of protein evolutionary rate, irrespective of experimental method, and underscored the importance of permanent interactions in shaping the evolutionary outcome.
Affiliation(s)
- Kaifang Pang
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, China
|
44
|
Montanucci L, Laayouni H, Dall'Olio GM, Bertranpetit J. Molecular evolution and network-level analysis of the N-glycosylation metabolic pathway across primates. Mol Biol Evol 2010; 28:813-23. [PMID: 20924085 DOI: 10.1093/molbev/msq259] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Indexed: 12/22/2022] Open
Abstract
N-glycosylation is one of the most important forms of protein modification, serving key biological functions in multicellular organisms. N-glycans at the cell surface mediate the interaction between cells and the surrounding matrix and may act as pathogen receptors, making the genes responsible for their synthesis good candidates to show signatures of adaptation to different pathogen environments. Here, we study the forces that shaped the evolution of the genes involved in the synthesis of the N-glycans during the divergence of primates within the framework of their functional network. We have found that, despite their function of producing glycan repertoires capable of evading rapidly evolving pathogens, genes involved in the synthesis of the glycans are highly conserved, and no signals of positive selection have been detected within the time of divergence of primates. This suggests strong functional constraints as the main force driving their evolution. We studied the strength of the purifying selection acting on the genes in relation to the network structure considering the position of each gene along the pathway, its connectivity, and the rates of evolution in neighboring genes. We found a strong and highly significant negative correlation between the strength of purifying selection and the connectivity of each gene, indicating that genes encoding highly connected enzymes evolve more slowly and thus are subject to stronger selective constraints. This result confirms that network topology does shape the evolution of the genes and that the connectivity within metabolic pathways and networks plays a major role in constraining evolutionary rates.
Affiliation(s)
- Ludovica Montanucci
- Department of Experimental and Health Sciences, Institute of Evolutionary Biology, Universitat Pompeu Fabra-Consejo Superior de Investigaciones Cientificas, Barcelona, Catalonia, Spain
|
45
|
Plata G, Gottesman ME, Vitkup D. The rate of the molecular clock and the cost of gratuitous protein synthesis. Genome Biol 2010; 11:R98. [PMID: 20920270 PMCID: PMC2965390 DOI: 10.1186/gb-2010-11-9-r98] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 06/08/2010] [Revised: 09/03/2010] [Accepted: 09/29/2010] [Indexed: 01/05/2023] Open
Abstract
Background The nature of the protein molecular clock, the protein-specific rate of amino acid substitutions, is among the central questions of molecular evolution. Protein expression level is the dominant determinant of the clock rate in a number of organisms. It has been suggested that highly expressed proteins evolve slowly in all species mainly to maintain robustness to translation errors that generate toxic misfolded proteins. Here we investigate this hypothesis experimentally by comparing the growth rate of Escherichia coli expressing wild type and misfolding-prone variants of the LacZ protein. Results We show that the cost of toxic protein misfolding is small compared to other costs associated with protein synthesis. Complementary computational analyses demonstrate that there is also a relatively weaker, but statistically significant, selection for increasing solubility and polarity in highly expressed E. coli proteins. Conclusions Although we cannot rule out the possibility that selection against misfolding toxicity significantly affects the protein clock in species other than E. coli, our results suggest that it is unlikely to be the dominant and universal factor determining the clock rate in all organisms. We find that in this bacterium other costs associated with protein synthesis are likely to play an important role. Interestingly, our experiments also suggest significant costs associated with volume effects, such as jamming of the cellular environment with unnecessary proteins.
Affiliation(s)
- Germán Plata
- Center for Computational Biology and Bioinformatics, Columbia University, 1130 St Nicholas Ave, New York City, NY 10032, USA.
|
46
|
The relationship among gene expression, the evolution of gene dosage, and the rate of protein evolution. PLoS Genet 2010; 6:e1000944. [PMID: 20485561 PMCID: PMC2869310 DOI: 10.1371/journal.pgen.1000944] [Citation(s) in RCA: 143] [Impact Index Per Article: 10.2] [Received: 11/03/2009] [Accepted: 04/09/2010] [Indexed: 01/08/2023] Open
Abstract
The understanding of selective constraints affecting genes is a major issue in biology. It is well established that gene expression level is a major determinant of the rate of protein evolution, but the reasons for this relationship remain highly debated. Here we demonstrate that gene expression is also a major determinant of the evolution of gene dosage: the rate of gene losses after whole genome duplications in the Paramecium lineage is negatively correlated to the level of gene expression, and this relationship is not a byproduct of other factors known to affect the fate of gene duplicates. This indicates that changes in gene dosage are generally more deleterious for highly expressed genes. This rule also holds for other taxa: in yeast, we find a clear relationship between gene expression level and the fitness impact of reduction in gene dosage. To explain these observations, we propose a model based on the fact that the optimal expression level of a gene corresponds to a trade-off between the benefit and cost of its expression. This COSTEX model predicts that selective pressure against mutations changing gene expression level or affecting the encoded protein should on average be stronger in highly expressed genes and hence that both the frequency of gene loss and the rate of protein evolution should correlate negatively with gene expression. Thus, the COSTEX model provides a simple and common explanation for the general relationship observed between the level of gene expression and the different facets of gene evolution.
The analysis of gene evolution is a powerful approach to recognize the genetic features that contribute to the fitness of organisms. It was shown previously that selective constraints on protein sequences increase with expression level. This observation was surprising because there is a priori no reason why lowly expressed genes should be less important than highly expressed genes for the proper function of an organism. Here we show that selective pressure on the evolution of gene dosage, which is another important aspect of gene evolution, is also directly dependent on gene expression level. To explain these observations, we propose a model based on the fact that gene expression is a costly process (notably protein synthesis), so that there is an optimal expression level for each gene corresponding to a trade-off between the benefit and the cost of its expression. This model predicts that selective pressure on gene expression level or on the encoded protein should on average be stronger in highly expressed genes, providing a simple and common explanation for the general relationship observed between gene expression and the different facets of gene evolution.
|
47
|
Eory L, Halligan DL, Keightley PD. Distributions of selectively constrained sites and deleterious mutation rates in the hominid and murid genomes. Mol Biol Evol 2010; 27:177-92. [PMID: 19759235 DOI: 10.1093/molbev/msp219] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.8] [Indexed: 12/30/2022] Open
Abstract
Protein-coding sequences make up only about 1% of the mammalian genome. Much of the remaining 99% has been long assumed to be junk DNA, with little or no functional significance. Here, we show that in hominids, a group with historically low effective population sizes, all classes of noncoding DNA evolve more slowly than ancestral transposable elements and so appear to be subject to significant evolutionary constraints. Under the nearly neutral theory, we expected to see lower levels of selective constraints on most sequence types in hominids than murids, a group that is thought to have a higher effective population size. We found that this is the case for many sequence types examined, the most extreme example being 5'UTRs, for which constraint in hominids is only about one-third that of murids. Surprisingly, however, we observed higher constraints for some sequence types in hominids, notably 4-fold sites, where constraint is more than twice as high as in murids. This implies that more than about one-fifth of mutations at 4-fold sites are effectively selected against in hominids. The higher constraint at 4-fold sites in hominids suggests a more complex protein-coding gene structure than murids and indicates that methods for detecting selection on protein-coding sequences (e.g., using the d(N)/d(S) ratio), with 4-fold sites as a neutral standard, may lead to biased estimates, particularly in hominids. Our constraint estimates imply that 5.4% of nucleotide sites in the human genome are subject to effective negative selection and that there are three times as many constrained sites within noncoding sequences as within protein-coding sequences. Including coding and noncoding sites, we estimate that the genomic deleterious mutation rate U = 4.2. The mutational load predicted under a multiplicative model is therefore about 99% in hominids.
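The load figure quoted at the end of this abstract can be checked with a short calculation. The sketch below assumes the standard multiplicative-fitness result, in which the number of new deleterious mutations per genome is Poisson distributed with mean U and the equilibrium load depends only on U; the abstract does not spell out the formula, so this is a plausible reconstruction rather than the authors' stated derivation.

```latex
L = 1 - e^{-U}, \qquad
U = 4.2 \;\Rightarrow\; L = 1 - e^{-4.2} \approx 1 - 0.015 = 0.985
```

Under that assumption the load is roughly 98.5%, in line with the "about 99%" reported.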
Affiliation(s)
- Lél Eory
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom.
|
48
|
Singh ND, Larracuente AM, Sackton TB, Clark AG. Comparative Genomics on the Drosophila Phylogenetic Tree. Annu Rev Ecol Evol Syst 2009. [DOI: 10.1146/annurev.ecolsys.110308.120214] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Indexed: 11/09/2022]
Abstract
With the sequencing of 12 complete euchromatic Drosophila genomes, the genus Drosophila is a leading model for comparative genomics. In this review, we discuss the novel insights into evolutionary processes afforded by the newly available genomic sequences when placed in the context of the phylogeny. We focus on three levels: insights into whole-genome content, such as changes in genome size and content across the phylogeny; insights into large-scale patterns of divergence and conservation, such as selective constraints on genes and chromosome-level evolution of sex chromosomes; and insights into finer-scale processes in individual lineages and genes, such as lineage-specific evolution in response to ecological context. As the field of comparative genomics is still young, we also discuss current challenges, such as the development of more sophisticated evolutionary models to capture nonequilibrium processes and the improvement of assembly and alignment algorithms to better capture uncertainty in the data.
Affiliation(s)
- Nadia D. Singh
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853
- Amanda M. Larracuente
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853
- Timothy B. Sackton
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138
- Andrew G. Clark
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853
|
49
|
Yang YH, Zhang FM, Ge S. Evolutionary rate patterns of the Gibberellin pathway genes. BMC Evol Biol 2009; 9:206. [PMID: 19689796 PMCID: PMC2794029 DOI: 10.1186/1471-2148-9-206] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Received: 12/16/2008] [Accepted: 08/18/2009] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND Analysis of molecular evolutionary patterns of different genes within metabolic pathways allows us to determine whether these genes are subject to equivalent evolutionary forces and how natural selection shapes the evolution of proteins in an interacting system. Although previous studies found that upstream genes in the pathway evolved more slowly than downstream genes, the correlation between evolutionary rate and position of the genes in metabolic pathways, as well as its implications for molecular evolution, is still poorly understood. RESULTS We sequenced and characterized 7 core structural genes of the gibberellin (GA) biosynthetic pathway from 8 representative species of the rice tribe (Oryzeae) to address alternative hypotheses regarding evolutionary rates and patterns of metabolic pathway genes. We detected significant rate heterogeneity among the 7 GA pathway genes for both synonymous and nonsynonymous sites. Such rate variation is most likely attributable to differences in selection intensity rather than differential mutation pressures on the genes. Contrary to the previous argument that downstream genes in metabolic pathways evolve faster than upstream genes, the downstream genes in the GA pathway did not exhibit an elevated substitution rate; instead, the genes that encode either the enzyme at the branch point (GA20ox) or enzymes catalyzing multiple steps (KO, KAO and GA3ox) in the pathway had the lowest evolutionary rates due to strong purifying selection. Our branch and codon models failed to detect a signature of positive selection for any lineage or codon of the GA pathway genes. CONCLUSION This study suggests that the significant heterogeneity in evolutionary rate among the GA pathway genes is mainly ascribed to differential constraint relaxation rather than positive selection, and supports the pathway flux theory, which predicts that natural selection primarily targets enzymes that have the greatest control on fluxes.
Affiliation(s)
- Yan-hua Yang
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, PR China
- Graduate University, Chinese Academy of Sciences, Beijing 100039, PR China
- Fu-min Zhang
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, PR China
- Song Ge
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, PR China
- Graduate University, Chinese Academy of Sciences, Beijing 100039, PR China
|
50
|
Franzosa EA, Xia Y. Structural determinants of protein evolution are context-sensitive at the residue level. Mol Biol Evol 2009; 26:2387-95. [PMID: 19597162 DOI: 10.1093/molbev/msp146] [Citation(s) in RCA: 154] [Impact Index Per Article: 10.3] [Indexed: 12/21/2022] Open
Abstract
Structural properties of a protein residue's microenvironment have long been implicated as agents of selective constraint. Although these properties are inherently quantitative, structure-based studies of protein evolution tend to rely upon coarse distinctions between "surface" and "buried" residues and between "interfacial" and "noninterfacial" residues. Using homology-mapped yeast protein structures, we explore the relationships between residue evolution and continuous structural properties of the residue microenvironment, including solvent accessibility, density and distribution of residue-residue contacts, and burial depth. We confirm the role of solvent exposure as a major structural determinant of residue evolution and also identify a weak secondary effect arising from packing density. The relationship between solvent exposure and evolutionary rate (d(N)/d(S)) is found to be strong, positive, and linear. This reinforces the notion that residue burial is a continuous property with quantitative fitness implications. Next, we demonstrate systematic variation in residue-level structure-evolution relationships resulting from changes in global physical and biological contexts. We find that increasing protein-core size yields a more rapid relaxation of selective constraint as solvent exposure increases, although solvent-excluded residues remain similarly constrained. Finally, we analyze the selective constraint in protein-protein interfaces, revealing two fundamentally different yet separable components: continuous structural constraint that scales with total residue burial and a more surprising fixed functional constraint that accompanies any degree of interface involvement. These discoveries serve to elucidate and unite structure-evolution relationships at the residue and whole-protein levels.
|