1
Zhao Y, Zhang Y, Feng J, He Z, Li T. Codon Usage Bias: A Potential Factor Affecting VGLUT Developmental Expression and Protein Evolution. Mol Neurobiol 2025; 62:3508-3522. [PMID: 39305444 DOI: 10.1007/s12035-024-04426-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Accepted: 08/05/2024] [Indexed: 02/04/2025]
Abstract
Increasing attention has been paid to the role of synonymous substitution in evolution, in which codon usage preference can affect the distribution of gene expression and protein structure and function. The vesicular glutamate transporter (VGLUT) has three isoforms, among which VGLUT3 differs significantly from the other VGLUTs in functional importance, expression level, and distribution range, for reasons that remain unclear. This study sought to analyze the role of codon preference in VGLUT differentiation. To conduct an evolutionary analysis of the three VGLUTs, this paper uses bioinformatics methods to analyze their coding sequences in different species and compare their codon usage patterns. Furthermore, the differences among the three VGLUTs were analyzed by combining functional importance, expression level, distribution range, gene structure, protein interaction network, expression at specific developmental stages, and phylogenetic tree, and the influence of codon usage pattern was explored. The results showed that the VGLUT with greater codon preference had less functional importance, lower expression levels, a more peripheral distribution away from the CNS, lower exon density, miRNA regulatory sites that were less conserved and farther from the CDS region, simpler and less tightly connected protein interaction networks, delayed developmental expression, and more distant evolutionary relationships. Codon usage preference is a potential factor affecting VGLUT developmental expression and protein evolution.
Affiliation(s)
- Yiran Zhao
  - College of Life Sciences, Yunlong District, Xuzhou Medical University, No. 209, Tongshan Road, Xuzhou City, Jiangsu, 221000, China
- Yu Zhang
  - College of Life Sciences, Yunlong District, Xuzhou Medical University, No. 209, Tongshan Road, Xuzhou City, Jiangsu, 221000, China
- Jiaxing Feng
  - College of Life Sciences, Yunlong District, Xuzhou Medical University, No. 209, Tongshan Road, Xuzhou City, Jiangsu, 221000, China
- Zixian He
  - College of Life Sciences, Yunlong District, Xuzhou Medical University, No. 209, Tongshan Road, Xuzhou City, Jiangsu, 221000, China
- Ting Li
  - College of Life Sciences, Yunlong District, Xuzhou Medical University, No. 209, Tongshan Road, Xuzhou City, Jiangsu, 221000, China.
2
Kurmi A, Sen P, Dash M, Ray SK, Satapathy SS. Differentially used codons among essential genes in bacteria identified by machine learning-based analysis. Mol Genet Genomics 2024; 299:72. [PMID: 39060647 DOI: 10.1007/s00438-024-02163-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Accepted: 07/10/2024] [Indexed: 07/28/2024]
Abstract
Codon usage bias (CUB), the uneven usage of synonymous codons encoding the same amino acid, differs among genes within and across bacterial genomes. CUB is known to be influenced by gene expression and, accordingly, CUB differs between high-expression and low-expression genes in several bacteria. In this article, we have extended the study of codon usage by considering gene essentiality as a feature. Using machine learning (ML) based approaches, we have analysed Relative Synonymous Codon Usage (RSCU) values between essential and non-essential genes in Escherichia coli and thirty-four other bacterial genomes whose gene essentiality features were available in public databases. We observed significant differences in codon usage patterns between essential and non-essential genes for the majority of the bacterial genomes and, accordingly, ML based classifiers achieved high area under curve (AUC) scores, with a minimum score of 70.0 across twenty-eight organisms. Further, the importance of individual codons for classifying genes was found to differ within each genome. Arg codon CGT and Gly codon GGT were observed to be the most preferred codons among essential genes in Escherichia coli. Interestingly, some codons, such as CGT, ATA, GGT and GGG, were observed to contribute consistently towards classifying essential genes across the thirty-five bacterial genomes studied. On the other hand, codons TGY and CAY, encoding amino acids Cys and His respectively, were among the least contributing codons towards classification among all these bacteria. This study demonstrates gene essentiality based differences in synonymous codon usage in bacterial genomes and presents a common codon usage pattern across bacteria.
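For readers unfamiliar with the metric named in this abstract, RSCU for a codon is commonly defined as its observed count divided by the mean count of all synonymous codons for the same amino acid, so a value of 1 indicates no bias. The sketch below illustrates that standard definition only; it is not the authors' pipeline. The function name `rscu`, the deliberately partial codon table (Arg and Gly only), and the toy sequence are assumptions made for illustration.

```python
from collections import Counter

# Partial codon table (assumption for illustration): only the Arg and Gly
# families from the standard genetic code are included here.
SYNONYMOUS = {
    "Arg": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}

def rscu(cds: str) -> dict:
    """Return RSCU per codon for the amino acid families in SYNONYMOUS."""
    cds = cds.upper()
    # Split into in-frame codons, ignoring any trailing partial codon.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = Counter(codons)
    values = {}
    for family in SYNONYMOUS.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the sequence; RSCU undefined
        expected = total / len(family)  # count expected under uniform usage
        for c in family:
            values[c] = counts[c] / expected
    return values

# Hypothetical toy sequence: CGT CGT CGT AGA GGT GGT GGG
example = rscu("CGTCGTCGTAGAGGTGGTGGG")
```

Within each amino acid family the RSCU values sum to the number of synonymous codons, which gives a quick sanity check on any implementation.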
Affiliation(s)
- Annushree Kurmi
  - Department of Computer Science and Engineering, Tezpur University, Napaam, Assam, 784028, India
  - Department of Computer Science and Engineering, The Assam Kaziranga University, Jorhat, Assam, 785006, India
- Piyali Sen
  - Department of Computer Science and Engineering, Tezpur University, Napaam, Assam, 784028, India
- Madhusmita Dash
  - Department of Electronics and Communication Engineering, NIT, Jote, Arunachal Pradesh, 791113, India
- Suvendra Kumar Ray
  - Department of Molecular Biology and Biotechnology, Tezpur University, Napaam, Assam, 784028, India
3
Akeju OJ, Cope AL. Re-examining Correlations Between Synonymous Codon Usage and Protein Bond Angles in Escherichia coli. Genome Biol Evol 2024; 16:evae080. [PMID: 38619010 PMCID: PMC11077309 DOI: 10.1093/gbe/evae080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2023] [Revised: 04/05/2024] [Accepted: 04/10/2024] [Indexed: 04/16/2024] Open
Abstract
Rosenberg AA, Marx A, Bronstein AM (Codon-specific Ramachandran plots show amino acid backbone conformation depends on identity of the translated codon. Nat Commun. 2022;13:2815) recently found a surprising correlation between synonymous codon usage and the dihedral bond angles of the resulting amino acid. However, their analysis did not account for the strongest known correlate of codon usage: gene expression. We re-examined the relationship between bond angles and codon usage by applying the approach of Rosenberg et al. to simulated protein-coding sequences that (i) have random codon usage, (ii) have codon usage determined by mutation biases, and (iii) maintain the general relationship between codon usage and gene expression via the assumption of selection-mutation-drift equilibrium. We observed correlations between dihedral bond angle and codon usage when codon usage is entirely random, indicating possible conflation of noise with differences in bond angle distributions between synonymous codons. More relevant to the general analysis of codon usage patterns, we found surprisingly good agreement between the analysis of the real sequences and the analysis of sequences simulated assuming selection-mutation-drift equilibrium, with 91% of significant synonymous codon pairs detected in the former also detected in the latter. We believe the correlation between codon usage and dihedral bond angles resulted from the variation in codon usage across genes due to the interplay between mutation bias, natural selection for translation efficiency, and gene expression, further underscoring that these factors must be controlled for when looking for novel patterns related to codon usage.
Affiliation(s)
- Alexander L Cope
  - Department of Genetics, Rutgers University, Piscataway, New Jersey, USA
  - Human Genetics Institute of New Jersey, Rutgers University, Piscataway, New Jersey, USA
  - Robert Wood Johnson Medical School, Rutgers University, Piscataway, New Jersey, USA
4
Safadi A, Lovell SC, Doig AJ. Essentiality, protein-protein interactions and evolutionary properties are key predictors for identifying cancer-associated genes using machine learning. Sci Rep 2024; 14:9199. [PMID: 38649399 PMCID: PMC11035574 DOI: 10.1038/s41598-023-44118-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Accepted: 10/04/2023] [Indexed: 04/25/2024] Open
Abstract
The distinctive nature of cancer as a disease prompts an exploration of the special characteristics exhibited by genes implicated in cancer. The identification of cancer-associated genes and their characteristics is crucial to furthering our understanding of this disease and enhancing the likelihood of success for therapeutic drug targets. However, the rate at which cancer genes are being identified experimentally is slow. Applying predictive analysis techniques, through the building of accurate machine learning models, is potentially a useful approach to enhancing the identification rate of these genes and their characteristics. Here, we investigated gene essentiality scores and found that they tend to be higher for cancer-associated genes compared to other protein-coding human genes. We built a dataset of extended gene properties linked to essentiality and used it to train a machine-learning model; this model reached 89% accuracy and > 0.85 for the Area Under Curve (AUC). The model showed that essentiality, evolutionary-related properties, and properties arising from protein-protein interaction networks are particularly effective in predicting cancer-associated genes. We were able to use the model to identify potential candidate genes that have not been previously linked to cancer. Prioritising genes that score highly by our methods could aid scientists in their cancer gene research.
Affiliation(s)
- Amro Safadi
  - Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, M13 9PT, UK
- Simon C Lovell
  - Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, M13 9PT, UK
- Andrew J Doig
  - Division of Neuroscience, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, M13 9BL, UK.
5
Balogun EJ, Ness RW. The Effects of De Novo Mutation on Gene Expression and the Consequences for Fitness in Chlamydomonas reinhardtii. Mol Biol Evol 2024; 41:msae035. [PMID: 38366781 PMCID: PMC10910851 DOI: 10.1093/molbev/msae035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 02/01/2024] [Accepted: 02/13/2024] [Indexed: 02/18/2024] Open
Abstract
Mutation is the ultimate source of genetic variation, the bedrock of evolution. Yet, predicting the consequences of new mutations remains a challenge in biology. Gene expression provides a potential link between a genotype and its phenotype. But the variation in gene expression created by de novo mutation and the fitness consequences of mutational changes to expression remain relatively unexplored. Here, we investigate the effects of >2,600 de novo mutations on gene expression across the transcriptome of 28 mutation accumulation (MA) lines derived from 2 independent wild-type genotypes of the green alga Chlamydomonas reinhardtii. We observed that the amount of genetic variance in gene expression created by mutation (Vm) was similar to the variance that mutation generates in typical polygenic phenotypic traits and approximately 15-fold the variance seen in the limited species where Vm in gene expression has been estimated. Despite the clear effect of mutation on expression, we did not observe a simple additive effect of mutation on expression change, with no linear correlation between the total expression change and mutation count of individual MA lines. We therefore inferred the distribution of expression effects (DEE) of new mutations to connect the number of mutations to the number of differentially expressed genes (DEGs). Our inferred DEE is highly L-shaped with 95% of mutations causing 0-1 DEG while the remaining 5% are spread over a long tail of large effect mutations that cause multiple genes to change expression. The distribution is consistent with many cis-acting mutation targets that affect the expression of only 1 gene and a large target of trans-acting targets that have the potential to affect tens or hundreds of genes. Further evidence for cis-acting mutations can be seen in the overabundance of mutations in or near differentially expressed genes. Supporting evidence for trans-acting mutations comes from a 15:1 ratio of DEGs to mutations and the clusters of DEGs in the co-expression network, indicative of shared regulatory architecture. Lastly, we show that there is a negative correlation between the extent of expression divergence from the ancestor and fitness, providing direct evidence of the deleterious effects of perturbing gene expression.
Affiliation(s)
- Eniolaye J Balogun
  - Department of Biology, William G. Davis Building, University of Toronto, Mississauga L5L-1C6, Canada
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto M5S-3B2, Canada
- Rob W Ness
  - Department of Biology, William G. Davis Building, University of Toronto, Mississauga L5L-1C6, Canada
6
Li W, Li R, Tang X, Cheng J, Zhan L, Shang Z, Wu J. Genomics evolution of Jingmen viruses associated with ticks and vertebrates. Genomics 2023; 115:110734. [PMID: 37890641 DOI: 10.1016/j.ygeno.2023.110734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 10/08/2023] [Accepted: 10/24/2023] [Indexed: 10/29/2023]
Abstract
Jingmen viruses (JMVs) associated with ticks and vertebrates have been found to be related to human disease. We obtained the genome of a Jingmen tick virus (JMTV) strain from Rhipicephalus microplus in Guizhou province and compared the genomes of seven JMV species associated with ticks and vertebrates to understand their evolutionary relationships. The topologies of the phylogenetic trees of segment 1 and segment 3 are similar, whereas segment 2 and segment 4 formed two different topologies, with the main differences being between Alongshan virus (ALSV), Takachi virus, Yanggou tick virus and Pteropus lylei jingmen virus (PLJV), indicating the possibility of genetic reassortment among these viruses. Moreover, we detected recombination within JMTV and between PLJV and ALSV. The genetic reassortment and recombination that occur during cross-species transmission of these JMVs associated with ticks and vertebrates not only complicate their evolutionary relationships, but also raise the risk these viruses pose to humans.
Affiliation(s)
- Weiyi Li
  - School of Public Health, the key Laboratory of Environmental Pollution Monitoring and Disease Control, Ministry of Education, Guizhou Medical University, Guiyang 561113, China; Key Laboratory of Modern Pathogen Biology and Characteristics, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China
- Rongting Li
  - School of Public Health, the key Laboratory of Environmental Pollution Monitoring and Disease Control, Ministry of Education, Guizhou Medical University, Guiyang 561113, China; Key Laboratory of Modern Pathogen Biology and Characteristics, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China
- Xiaomin Tang
  - Key Laboratory of Modern Pathogen Biology and Characteristics, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China; Department of Human Parasitology, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China
- Jinzhi Cheng
  - Key Laboratory of Modern Pathogen Biology and Characteristics, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China; Department of Human Parasitology, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China
- Lin Zhan
  - School of Public Health, the key Laboratory of Environmental Pollution Monitoring and Disease Control, Ministry of Education, Guizhou Medical University, Guiyang 561113, China; Central Laboratory, Guizhou Provincial People's Hospital, Guiyang, Guizhou 550002, China
- Zhengling Shang
  - Department of Immunology, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China
- Jiahong Wu
  - Key Laboratory of Modern Pathogen Biology and Characteristics, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China; Department of Human Parasitology, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China.
7
Vigué L, Tenaillon O. Predicting the effect of mutations to investigate recent events of selection across 60,472 Escherichia coli strains. Proc Natl Acad Sci U S A 2023; 120:e2304177120. [PMID: 37487088 PMCID: PMC10401003 DOI: 10.1073/pnas.2304177120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Accepted: 05/25/2023] [Indexed: 07/26/2023] Open
Abstract
Microbial genomics studies focusing on the dynamics of selection have often used a small number of distant genomes. As a result, they could only analyze mutations that had become fixed during the divergence between species. However, thousands of genomes of some species are now available in public databases, thanks to high-throughput sequencing. These data provide a more complete picture of the polymorphisms segregating within a species, offering a unique insight into the processes that shape the recent evolution of a species. In this study, we present GLASS (Gene-Level Amino-acid Score Shift), a selection test that is based on the predicted effects of amino acid changes. By comparing the distribution of effects of mutations observed in a gene to the expectation in the absence of selection, GLASS can quantify the intensity of selection. We applied GLASS to a dataset of 60,472 Escherichia coli strains and used this to reexamine the longstanding debate about the role of essentiality versus expression level in the rate of protein evolution. We found that selection has contrasting short-term and long-term dynamics, with essential genes being subject to strong purifying selection in the short term, while expression level determines the rate of gene evolution in the long term. GLASS also found an overrepresentation of inactivating mutations in specific transcription factors, such as efflux pump repressors, which is consistent with selection for antibiotic resistance. These gene-inactivating polymorphisms do not reach fixation, suggesting another contrast between short-term fitness gains and long-term counterselection.
Affiliation(s)
- Lucile Vigué
  - Université Paris Cité and Université Sorbonne Paris Nord, Inserm, Infection, Antimicrobials, Modelling, Evolution, F-75018 Paris, France
- Olivier Tenaillon
  - Université Paris Cité and Université Sorbonne Paris Nord, Inserm, Infection, Antimicrobials, Modelling, Evolution, F-75018 Paris, France
8
Bédard C, Cisneros AF, Jordan D, Landry CR. Correlation between protein abundance and sequence conservation: what do recent experiments say? Curr Opin Genet Dev 2022; 77:101984. [PMID: 36162152 DOI: 10.1016/j.gde.2022.101984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Revised: 08/23/2022] [Accepted: 08/26/2022] [Indexed: 01/27/2023]
Abstract
Cells evolve in a space of parameter values set by physical and chemical forces. These constraints create associations among cellular properties. A particularly strong association is the negative correlation between the rate of evolution of proteins and their abundance in the cell. Highly expressed proteins evolve slower than lowly expressed ones. Multiple hypotheses have been put forward to explain this relationship, including, for instance, the requirement for higher mRNA stability, misfolding avoidance, and misinteraction avoidance for highly expressed proteins. Here, we review some of these hypotheses, their predictions, and how they are supported to finally discuss recent experiments that have been performed to test these predictions.
Affiliation(s)
- Camille Bédard
  - Département de Biologie, Faculté des Sciences et de Génie, Université Laval, G1V 0A6, Canada; Institut de Biologie Intégrative et des Systèmes, Université Laval, G1V 0A6, Canada; PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, G1V 0A6, Canada; Centre de Recherche sur les Données Massives, Université Laval, G1V 0A6, Canada. https://twitter.com/@CamilleBed17
- Angel F Cisneros
  - Institut de Biologie Intégrative et des Systèmes, Université Laval, G1V 0A6, Canada; PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, G1V 0A6, Canada; Centre de Recherche sur les Données Massives, Université Laval, G1V 0A6, Canada; Département de Biochimie, de Microbiologie et de Bio-informatique, Faculté des Sciences et de Génie, Université Laval, G1V 0A6, Canada. https://twitter.com/@AngelFCC119
- David Jordan
  - Institut de Biologie Intégrative et des Systèmes, Université Laval, G1V 0A6, Canada; PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, G1V 0A6, Canada; Centre de Recherche sur les Données Massives, Université Laval, G1V 0A6, Canada; Département de Biochimie, de Microbiologie et de Bio-informatique, Faculté des Sciences et de Génie, Université Laval, G1V 0A6, Canada. https://twitter.com/@DavidJordan1997
- Christian R Landry
  - Département de Biologie, Faculté des Sciences et de Génie, Université Laval, G1V 0A6, Canada; Institut de Biologie Intégrative et des Systèmes, Université Laval, G1V 0A6, Canada; PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, G1V 0A6, Canada; Centre de Recherche sur les Données Massives, Université Laval, G1V 0A6, Canada; Département de Biochimie, de Microbiologie et de Bio-informatique, Faculté des Sciences et de Génie, Université Laval, G1V 0A6, Canada.
9
Ghadie MA, Xia Y. Are transient protein-protein interactions more dispensable? PLoS Comput Biol 2022; 18:e1010013. [PMID: 35404956 PMCID: PMC9000134 DOI: 10.1371/journal.pcbi.1010013] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/13/2021] [Accepted: 03/11/2022] [Indexed: 12/12/2022] Open
Abstract
Protein-protein interactions (PPIs) are key drivers of cell function and evolution. While it is widely assumed that most permanent PPIs are important for cellular function, it remains unclear whether transient PPIs are equally important. Here, we estimate and compare dispensable content among transient PPIs and permanent PPIs in human. Starting with a human reference interactome mapped by experiments, we construct a human structural interactome by building three-dimensional structural models for PPIs, and then distinguish transient PPIs from permanent PPIs using several structural and biophysical properties. We map common mutations from healthy individuals and disease-causing mutations onto the structural interactome, and perform structure-based calculations of the probabilities for common mutations (assumed to be neutral) and disease mutations (assumed to be mildly deleterious) to disrupt transient PPIs and permanent PPIs. Using Bayes' theorem we estimate that a similarly small fraction (<~20%) of both transient and permanent PPIs are completely dispensable, i.e., effectively neutral upon disruption. Hence, transient and permanent interactions are subject to similarly strong selective constraints in the human interactome.
Affiliation(s)
- Yu Xia
  - Department of Bioengineering, McGill University, Montreal, Canada
10
Abstract
Because gene expression is important for evolutionary adaptation, its misregulation is an important cause of maladaptation. A misregulated gene can be incorrectly silent ("off") when a transcription factor (TF) that is required for its activation does not bind its regulatory region. Conversely, a misregulated gene can be incorrectly active ("on") when a TF not normally involved in its activation binds its regulatory region, a phenomenon also known as regulatory crosstalk. DNA mutations that destroy or create TF binding sites on DNA are an important source of misregulation and crosstalk. Although misregulation reduces fitness in an environment to which an organism is well-adapted, it may become adaptive in a new environment. Here, I derive simple yet general mathematical expressions that delimit the conditions under which misregulation can be adaptive. These expressions depend on the strength of selection against misregulation, on the fraction of DNA sequence space filled with TF binding sites, and on the fraction of genes that must be expressed for optimal adaptation. I then use empirical data from RNA sequencing, protein-binding microarrays, and genome evolution, together with population genetic simulations to ask when these conditions are likely to be met. I show that they can be met under realistic circumstances, but these circumstances may vary among organisms and environments. My analysis provides a framework in which improved theory and data collection can help us demonstrate the role of misregulation in adaptation. It also shows that misregulation, like DNA mutation, is one of life's many imperfections that can help propel Darwinian evolution.
Affiliation(s)
- Andreas Wagner
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, CH-8057, Switzerland; The Santa Fe Institute, Santa Fe, NM 87501, USA; Swiss Institute of Bioinformatics, Lausanne, Switzerland
11
Biesiadecka MK, Sliwa P, Tomala K, Korona R. An Overexpression Experiment Does Not Support the Hypothesis That Avoidance of Toxicity Determines the Rate of Protein Evolution. Genome Biol Evol 2021; 12:589-596. [PMID: 32259256 PMCID: PMC7250497 DOI: 10.1093/gbe/evaa067] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Accepted: 04/01/2020] [Indexed: 12/22/2022] Open
Abstract
The misfolding avoidance hypothesis postulates that sequence mutations render proteins cytotoxic and therefore the higher the gene expression, the stronger the operation of selection against substitutions. This translates into the prediction that the relative toxicity of extant proteins is higher for those evolving faster. In the present experiment, we selected pairs of yeast genes which were paralogous but evolving at different rates. We expressed them artificially to high levels. We expected that toxicity would be higher for the ones bearing more mutations, especially since overcrowding should exacerbate rather than reverse the already existing differences in misfolding rates. We did find that the applied mode of overexpression caused a considerable decrease in fitness and that the decrease was proportional to the amount of excessive protein. However, it was not higher for proteins which are normally expressed at lower levels (and have less conserved sequences). This result was obtained consistently, regardless of whether the rate of growth or the ability to compete in common cultures was used as a proxy for fitness. In additional experiments, we applied factors that reduce the accuracy of translation or enhance the structural instability of proteins. This did not change the consistent pattern of independence between the fitness cost caused by overexpression of a protein and the rate of its sequence evolution.
Affiliation(s)
- Piotr Sliwa
  - Department of Genetics, Faculty of Biotechnology, University of Rzeszów, Poland
- Katarzyna Tomala
  - Institute of Environmental Sciences, Faculty of Biology, Jagiellonian University, Cracow, Poland
- Ryszard Korona
  - Institute of Environmental Sciences, Faculty of Biology, Jagiellonian University, Cracow, Poland
12
Razban RM, Dasmeh P, Serohijos AWR, Shakhnovich EI. Avoidance of protein unfolding constrains protein stability in long-term evolution. Biophys J 2021; 120:2413-2424. [PMID: 33932438 PMCID: PMC8390877 DOI: 10.1016/j.bpj.2021.03.042] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/16/2020] [Revised: 02/24/2021] [Accepted: 03/17/2021] [Indexed: 11/28/2022] Open
Abstract
Every amino acid residue can influence a protein's overall stability, making stability highly susceptible to change throughout evolution. We consider the distribution of protein stabilities evolutionarily permittable under two previously reported protein fitness functions: flux dynamics and misfolding avoidance. We develop an evolutionary dynamics theory and find that it agrees better with an extensive protein stability data set for dihydrofolate reductase orthologs under the misfolding avoidance fitness function rather than the flux dynamics fitness function. Further investigation with ribonuclease H data demonstrates that not any misfolded state is avoided; rather, it is only the unfolded state. At the end, we discuss how our work pertains to the universal protein abundance-evolutionary rate correlation seen across organisms' proteomes. We derive a closed-form expression relating protein abundance to evolutionary rate that captures Escherichia coli, Saccharomyces cerevisiae, and Homo sapiens experimental trends without fitted parameters.
Affiliation(s)
- Rostam M Razban
  - Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts
- Pouria Dasmeh
  - Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts; Departement de Biochimie, Université de Montréal, Montreal, Quebec, Canada
- Eugene I Shakhnovich
  - Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts.
13
Specificity of RNA Folding and Its Association with Evolutionarily Adaptive mRNA Secondary Structures. Genomics Proteomics & Bioinformatics 2021; 19:882-900. [PMID: 33607297 PMCID: PMC9403030 DOI: 10.1016/j.gpb.2019.11.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2018] [Revised: 08/03/2019] [Accepted: 11/08/2019] [Indexed: 11/23/2022]
Abstract
The secondary structure is a fundamental feature of both noncoding and messenger RNAs. However, our understanding of the secondary structure of mRNA, especially that of the coding regions, remains elusive, likely due to translation and the lack of RNA-binding proteins that sustain the consensus structure, such as those that bind to noncoding RNA. Indeed, mRNA has recently been found to adopt diverse alternative structures, the overall functional significance of which remains untested. We hereby approached this problem by estimating the folding specificity, i.e., the probability that a fragment of RNA folds back to the same partner once refolded. We showed that the folding specificity of mRNA is lower than that of noncoding RNA and exhibits moderate evolutionary conservation. Notably, we found that specific rather than alternative folding is likely evolutionarily adaptive since specific folding is frequently associated with functionally important genes or sites within a gene. Additional analysis in combination with ribosome density suggests the ability to modulate ribosome movement as one potential functional advantage provided by specific folding. Our findings revealed a novel facet of the RNA structurome with important functional and evolutionary implications and indicated a potential method for distinguishing the mRNA secondary structures maintained by natural selection from molecular noise.
|
14
|
Usmanova DR, Plata G, Vitkup D. The Relationship between the Misfolding Avoidance Hypothesis and Protein Evolutionary Rates in the Light of Empirical Evidence. Genome Biol Evol 2021; 13:6081017. [PMID: 33432359 PMCID: PMC7874998 DOI: 10.1093/gbe/evab006] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Accepted: 01/07/2021] [Indexed: 12/14/2022] [Open Access]
Abstract
For more than a decade, the misfolding avoidance hypothesis (MAH) and related theories have dominated evolutionary discussions aimed at explaining the variance of the molecular clock across cellular proteins. In this study, we use various experimental data to further investigate the consistency of the MAH predictions with empirical evidence. We also critically discuss experimental results that motivated the MAH development and that are often viewed as evidence of its major contribution to the variability of protein evolutionary rates. We demonstrate, in Escherichia coli and Homo sapiens, the lack of a substantial negative correlation between protein evolutionary rates and Gibbs free energies of unfolding, a direct measure of protein stability. We then analyze multiple new genome-scale data sets characterizing protein aggregation and interaction propensities, the properties that are likely optimized in evolution to alleviate deleterious effects associated with toxic protein misfolding and misinteractions. Our results demonstrate that the propensity of proteins to aggregate, the fraction of charged amino acids, and protein stickiness do correlate with protein abundances. Nevertheless, across multiple organisms and various data sets we do not observe substantial correlations between proteins’ aggregation- and stability-related properties and evolutionary rates. Therefore, diverse empirical data support the conclusion that the MAH and similar hypotheses do not play a major role in mediating a strong negative correlation between protein expression and the molecular clock, and thus in explaining the variability of evolutionary rates across cellular proteins.
Affiliation(s)
- Dinara R Usmanova: Department of Systems Biology, Columbia University, New York, NY, USA
- Germán Plata: Department of Systems Biology, Columbia University, New York, NY, USA; Elanco Animal Health, Greenfield, IN, USA
- Dennis Vitkup: Department of Systems Biology, Columbia University, New York, NY, USA; Department of Biomedical Informatics, Columbia University, New York, NY, USA
|
15
|
Abstract
Background: Essential proteins play important roles in the survival or reproduction of an organism and support the stability of the system. Essential proteins are the minimum set of proteins absolutely required to maintain a living cell. The identification of essential proteins is a very important topic not only for a better comprehension of the minimal requirements for cellular life, but also for a more efficient discovery of human disease genes and drug targets. Traditionally, as the experimental identification of essential proteins is complex, it usually requires great time and expense. With the accumulation of high-throughput experimental data, many computational methods that usefully complement experimental methods have been proposed to identify essential proteins. In addition, the ability to rapidly and precisely identify essential proteins is of great significance for discovering disease genes and for drug design, and has great potential for applications in basic and synthetic biology research.
Objective: The aim of this paper is to review the identification of essential proteins and genes, focusing on current developments in different types of computational methods, to point out the progress and limitations of existing methods, and to discuss the challenges and directions for further research.
Affiliation(s)
- Ming Fang: School of Computer Science, Shaanxi Normal University, Xi'an 710119, China
- Xiujuan Lei: School of Computer Science, Shaanxi Normal University, Xi'an 710119, China
- Ling Guo: College of Life Sciences, Shaanxi Normal University, Xi'an 710119, China
|
16
|
Systematic analysis reveals the prevalence and principles of bypassable gene essentiality. Nat Commun 2019; 10:1002. [PMID: 30824696 PMCID: PMC6397241 DOI: 10.1038/s41467-019-08928-1] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 10/24/2018] [Accepted: 02/07/2019] [Indexed: 12/12/2022] [Open Access]
Abstract
Gene essentiality is a variable phenotypic trait, but to what extent and how essential genes can become dispensable for viability remain unclear. Here, we investigate 'bypass of essentiality (BOE)' - an underexplored type of digenic genetic interaction that renders essential genes dispensable. Through analyzing essential genes on one of the six chromosome arms of the fission yeast Schizosaccharomyces pombe, we find that, remarkably, as many as 27% of them can be converted to non-essential genes by BOE interactions. Using this dataset we identify three principles of essentiality bypass: bypassable essential genes tend to have lower importance, tend to exhibit differential essentiality between species, and tend to act with other bypassable genes. In addition, we delineate mechanisms underlying bypassable essentiality, including the previously unappreciated mechanism of dormant redundancy between paralogs. The new insights gained on bypassable essentiality deepen our understanding of genotype-phenotype relationships and will facilitate drug development related to essential genes.
|
17
|
Alvarez-Ponce D, Feyertag F, Chakraborty S. Position Matters: Network Centrality Considerably Impacts Rates of Protein Evolution in the Human Protein-Protein Interaction Network. Genome Biol Evol 2018; 9:1742-1756. [PMID: 28854629 PMCID: PMC5570066 DOI: 10.1093/gbe/evx117] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Accepted: 07/01/2017] [Indexed: 02/06/2023] [Open Access]
Abstract
The proteins of any organism evolve at disparate rates. A long list of factors affecting rates of protein evolution have been identified. However, the relative importance of each factor in determining rates of protein evolution remains unresolved. The prevailing view is that evolutionary rates are dominantly determined by gene expression, and that other factors such as network centrality have only a marginal effect, if any. However, this view is largely based on analyses in yeasts, and accurately measuring the importance of the determinants of rates of protein evolution is complicated by the fact that the different factors are often correlated with each other, and by the relatively poor quality of available functional genomics data sets. Here, we use correlation, partial correlation and principal component regression analyses to measure the contributions of several factors to the variability of the rates of evolution of human proteins. For this purpose, we analyzed the entire human protein–protein interaction data set and the human signal transduction network—a network data set of exceptionally high quality, obtained by manual curation, which is expected to be virtually free from false positives. In contrast with the prevailing view, we observe that network centrality (measured as the number of physical and nonphysical interactions, betweenness, and closeness) has a considerable impact on rates of protein evolution. Surprisingly, the impact of centrality on rates of protein evolution seems to be comparable, or even superior according to some analyses, to that of gene expression. Our observations seem to be independent of potentially confounding factors and from the limitations (biases and errors) of interactomic data sets.
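As an illustration of the partial correlation analysis mentioned in this abstract, the following minimal Python sketch computes a first-order partial correlation between evolutionary rate and network degree while controlling for expression level. The function names, the use of Pearson rather than rank correlation, and the toy values are assumptions made here for illustration; they are not taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical toy values (not data from the study).
rate = [0.30, 0.25, 0.20, 0.12, 0.10, 0.05]   # evolutionary rate per protein
degree = [2, 5, 4, 9, 12, 15]                 # number of interaction partners
expression = [1.0, 1.5, 3.0, 2.5, 4.0, 5.0]   # expression level

print("raw correlation:", pearson(rate, degree))
print("controlling for expression:", partial_corr(rate, degree, expression))
```

Comparing the raw and partial coefficients shows how much of the rate-centrality association survives once the shared dependence on expression is removed, which is the kind of question the abstract describes.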
|
18
|
Li M, Li W, Wu FX, Pan Y, Wang J. Identifying essential proteins based on sub-network partition and prioritization by integrating subcellular localization information. J Theor Biol 2018; 447:65-73. [PMID: 29571709 DOI: 10.1016/j.jtbi.2018.03.029] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 08/12/2017] [Revised: 03/19/2018] [Accepted: 03/20/2018] [Indexed: 01/07/2023]
Abstract
Essential proteins are important participants in various life activities and play a vital role in the survival and reproduction of living organisms. Identification of essential proteins from protein-protein interaction (PPI) networks has great significance to facilitate the study of human complex diseases, the design of drugs and the development of bioinformatics and computational science. Studies have shown that highly connected proteins in a PPI network tend to be essential. A series of computational methods have been proposed to identify essential proteins by analyzing topological structures of PPI networks. However, the high noise in the PPI data can degrade the accuracy of essential protein prediction. Moreover, proteins must be located in the appropriate subcellular localization to perform their functions, and only when the proteins are located in the same subcellular localization, it is possible that they can interact with each other. In this paper, we propose a new network-based essential protein discovery method based on sub-network partition and prioritization by integrating subcellular localization information, named SPP. The proposed method SPP was tested on two different yeast PPI networks obtained from DIP database and BioGRID database. The experimental results show that SPP can effectively reduce the effect of false positives in PPI networks and predict essential proteins more accurately compared with other existing computational methods DC, BC, CC, SC, EC, IC, NC.
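The observation that highly connected proteins tend to be essential underlies degree centrality (DC), one of the baseline measures named in this abstract. The sketch below ranks proteins by degree in a small undirected PPI network. It is not the SPP method, and the function names and the toy edge list are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Count distinct interaction partners of each protein in an undirected network."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {protein: len(partners) for protein, partners in neighbors.items()}


def top_candidates(edges, k):
    """Return the k proteins with the highest degree (ties broken alphabetically)."""
    dc = degree_centrality(edges)
    return sorted(dc, key=lambda p: (-dc[p], p))[:k]


# Hypothetical toy network; the protein names are placeholders.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]
print(top_candidates(edges, 2))
```

Because the ranking depends only on edge counts, false-positive interactions directly inflate a protein's score, which is the weakness the abstract says localization information is meant to reduce.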
Affiliation(s)
- Min Li: School of Information Science and Engineering, Central South University, Changsha 410083, China
- Wenkai Li: School of Information Science and Engineering, Central South University, Changsha 410083, China
- Fang-Xiang Wu: Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
- Yi Pan: Department of Computer Science, Georgia State University, Atlanta, GA 30302-4110, USA
- Jianxin Wang: School of Information Science and Engineering, Central South University, Changsha 410083, China
|
19
|
Kabir M, Barradas A, Tzotzos GT, Hentges KE, Doig AJ. Properties of genes essential for mouse development. PLoS One 2017; 12:e0178273. [PMID: 28562614 PMCID: PMC5451031 DOI: 10.1371/journal.pone.0178273] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 02/21/2017] [Accepted: 05/10/2017] [Indexed: 12/20/2022] [Open Access]
Abstract
Essential genes are those that are critical for life. In the specific case of the mouse, they are the set of genes whose deletion means that a mouse is unable to survive after birth. As such, they are the key minimal set of genes needed for all the steps of development to produce an organism capable of life ex utero. We explored a wide range of sequence and functional features to characterise essential (lethal) and non-essential (viable) genes in mice. Experimental data curated manually identified 1301 essential genes and 3451 viable genes. Very many sequence features show highly significant differences between essential and viable mouse genes. Essential genes generally encode complex proteins, with multiple domains and many introns. These genes tend to be: long, highly expressed, old and evolutionarily conserved. These genes tend to encode ligases, transferases, phosphorylated proteins, intracellular proteins, nuclear proteins, and hubs in protein-protein interaction networks. They are involved with regulating protein-protein interactions, gene expression and metabolic processes, cell morphogenesis, cell division, cell proliferation, DNA replication, cell differentiation, DNA repair and transcription, cell differentiation and embryonic development. Viable genes tend to encode: membrane proteins or secreted proteins, and are associated with functions such as cellular communication, apoptosis, behaviour and immune response, as well as housekeeping and tissue specific functions. Viable genes are linked to transport, ion channels, signal transduction, calcium binding and lipid binding, consistent with their location in membranes and involvement with cell-cell communication. From the analysis of the composite features of essential and viable genes, we conclude that essential genes tend to be required for intracellular functions, and viable genes tend to be involved with extracellular functions and cell-cell communication. 
Knowledge of the features that are over-represented in essential genes allows for a deeper understanding of the functions and processes implemented during mammalian development.
Affiliation(s)
- Mitra Kabir: Faculty of Biology, Medicine, and Health, University of Manchester, Manchester, United Kingdom; Manchester Institute of Biotechnology and Department of Chemistry, Faculty of Science and Engineering, The University of Manchester, Manchester, United Kingdom
- Ana Barradas: Faculty of Biology, Medicine, and Health, University of Manchester, Manchester, United Kingdom
- George T. Tzotzos: Department of Agriculture, Food and Environmental Sciences, Marche Polytechnic University, Ancona, Italy
- Kathryn E. Hentges: Faculty of Biology, Medicine, and Health, University of Manchester, Manchester, United Kingdom
- Andrew J. Doig: Manchester Institute of Biotechnology and Department of Chemistry, Faculty of Science and Engineering, The University of Manchester, Manchester, United Kingdom
|
20
|
Morandin C, Mikheyev AS, Pedersen JS, Helanterä H. Evolutionary constraints shape caste-specific gene expression across 15 ant species. Evolution 2017; 71:1273-1284. [PMID: 28262920 DOI: 10.1111/evo.13220] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 09/01/2016] [Accepted: 02/16/2017] [Indexed: 12/22/2022]
Abstract
Development of polymorphic phenotypes from similar genomes requires gene expression differences. However, little is known about how morph-specific gene expression patterns vary on a broad phylogenetic scale. We hypothesize that evolution of morph-specific gene expression, and consequently morph-specific phenotypic evolution, may be constrained by gene essentiality and the amount of pleiotropic constraints. Here, we use comparative transcriptomics of queen and worker morphs, that is, castes, from 15 ant species to understand the constraints of morph-biased gene expression. In particular, we investigate how measures of evolutionary constraints at the sequence level (expression level, connectivity, and number of gene ontology [GO] terms) correlate with morph-biased expression. Our results show that genes indeed vary in their potential to become morph-biased. The existence of genes that are constrained in becoming caste-biased potentially limits the evolutionary decoupling of the caste phenotypes, that is, it might result in "caste load" occasioning from antagonistic fitness variation, similarly to sexually antagonistic fitness variation between males and females. On the other hand, we suggest that genes under low constraints are released from antagonistic variation and thus more likely to be co-opted for morph specific use. Overall, our results suggest that the factors that affect sequence evolutionary rates and evolution of plastic expression may largely overlap.
Affiliation(s)
- Claire Morandin: Centre of Excellence in Biological Interactions, Department of Biosciences, University of Helsinki, Helsinki, Finland; Tvärminne Zoological Station, University of Helsinki, J.A. Palménin tie 260, FI-10900, Hanko, Finland
- Alexander S Mikheyev: Okinawa Institute of Science and Technology, 1919-1 Tancha, Onna-son, Kunigami-gun, Okinawa, 904-0412, Japan; Research School of Biology, Australian National University, Canberra, ACT, 0200, Australia
- Jes Søe Pedersen: Centre for Social Evolution, University of Copenhagen, Universitetsparken 15, 2100, Copenhagen, Denmark
- Heikki Helanterä: Centre of Excellence in Biological Interactions, Department of Biosciences, University of Helsinki, Helsinki, Finland; Tvärminne Zoological Station, University of Helsinki, J.A. Palménin tie 260, FI-10900, Hanko, Finland
|
21
|
Li M, Lu Y, Niu Z, Wu FX. United Complex Centrality for Identification of Essential Proteins from PPI Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2017; 14:370-380. [PMID: 28368815 DOI: 10.1109/tcbb.2015.2394487] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Indexed: 06/07/2023]
Abstract
Essential proteins are indispensable for the survival or reproduction of an organism. Identification of essential proteins is not only necessary for understanding the minimal requirements for cellular life, but also important for disease study and drug design. With the development of high-throughput techniques, a large amount of protein-protein interaction data has become available, which promotes the study of essential proteins at the network level. Although a series of computational methods have been proposed, the prediction precision still needs to be improved. In this paper, we propose a new method, United complex Centrality (UC), to identify essential proteins by integrating protein complexes with the topological features of protein-protein interaction (PPI) networks. By analyzing the relationship between the essential proteins and the known protein complexes of S. cerevisiae and human, we find that proteins in complexes are more likely to be essential than proteins not included in any complex, and that proteins appearing in multiple complexes are more likely to be essential than those appearing in only a single complex. Considering that some protein complexes generated by computational methods are inaccurate, we also provide a modified version of UC with parameter alpha, named UC-P. The experimental results show that protein complex information can help identify essential proteins more accurately for the PPI networks of both S. cerevisiae and human. The proposed method UC performs clearly better than eight previously proposed methods (DC, IC, EC, SC, BC, CC, NC, and LAC) for identifying essential proteins.
|
22
|
Alvarez-Ponce D, Sabater-Muñoz B, Toft C, Ruiz-González MX, Fares MA. Essentiality Is a Strong Determinant of Protein Rates of Evolution during Mutation Accumulation Experiments in Escherichia coli. Genome Biol Evol 2016; 8:2914-2927. [PMID: 27566759 PMCID: PMC5630975 DOI: 10.1093/gbe/evw205] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 12/16/2022] [Open Access]
Abstract
The Neutral Theory of Molecular Evolution is considered the most powerful theory to understand the evolutionary behavior of proteins. One of the main predictions of this theory is that essential proteins should evolve slower than dispensable ones owing to increased selective constraints. Comparison of genomes of different species, however, has revealed only small differences between the rates of evolution of essential and nonessential proteins. In some analyses, these differences vanish once confounding factors are controlled for, whereas in other cases essentiality seems to have an independent, albeit small, effect. It has been argued that comparing relatively distant genomes may entail a number of limitations. For instance, many of the genes that are dispensable in controlled lab conditions may be essential in some of the conditions faced in nature. Moreover, essentiality can change during evolution, and rates of protein evolution are simultaneously shaped by a variety of factors, whose individual effects are difficult to isolate. Here, we conducted two parallel mutation accumulation experiments in Escherichia coli, during 5,500–5,750 generations, and compared the genomes at different points of the experiments. Our approach (a short-term experiment, under highly controlled conditions) enabled us to overcome many of the limitations of previous studies. We observed that essential proteins evolved substantially slower than nonessential ones during our experiments. Strikingly, rates of protein evolution were only moderately affected by expression level and protein length.
Affiliation(s)
- Beatriz Sabater-Muñoz: Instituto de Biología Molecular y Celular de Plantas (CSIC-UPV), Valencia, Spain; Department of Genetics, Smurfit Institute of Genetics, University of Dublin, Trinity College Dublin, Dublin, Ireland
- Christina Toft: Department of Genetics, University of Valencia, Valencia, Spain; Departamento de Biotecnología, Instituto de Agroquímica y Tecnología de los Alimentos (CSIC), Valencia, Spain
- Mario X Ruiz-González: Instituto de Biología Molecular y Celular de Plantas (CSIC-UPV), Valencia, Spain; current address: Secretaría de Educación Superior, Ciencia, Tecnología e Innovación, Proyecto Prometeo; Departamento de Ciencias Biológicas, Universidad Técnica Particular de Loja, Loja, Ecuador
- Mario A Fares: Instituto de Biología Molecular y Celular de Plantas (CSIC-UPV), Valencia, Spain; Department of Genetics, Smurfit Institute of Genetics, University of Dublin, Trinity College Dublin, Dublin, Ireland
|
23
|
Qin C, Sun Y, Dong Y. A New Method for Identifying Essential Proteins Based on Network Topology Properties and Protein Complexes. PLoS One 2016; 11:e0161042. [PMID: 27529423 PMCID: PMC4987049 DOI: 10.1371/journal.pone.0161042] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 03/17/2016] [Accepted: 07/28/2016] [Indexed: 11/18/2022] [Open Access]
Abstract
Essential proteins are indispensable to the viability and reproduction of an organism. The identification of essential proteins is necessary not only for understanding the molecular mechanisms of cellular life but also for disease diagnosis, medical treatments and drug design. Many computational methods have been proposed for discovering essential proteins, but the precision of the prediction of essential proteins remains to be improved. In this paper, we propose a new method, LBCC, which is based on the combination of local density, betweenness centrality (BC) and in-degree centrality of complex (IDC). First, we introduce the common centrality measures; second, we propose the densities Den1(v) and Den2(v) of a node v to describe its local properties in the network; and finally, the combined strategy of Den1, Den2, BC and IDC is developed to improve the prediction precision. The experimental results demonstrate that LBCC outperforms traditional topological measures for predicting essential proteins, including degree centrality (DC), BC, subgraph centrality (SC), eigenvector centrality (EC), network centrality (NC), and the local average connectivity-based method (LAC). LBCC also improves the prediction precision by approximately 10 percent on the YMIPS and YMBD datasets compared to the most recently developed method, LIDC.
Affiliation(s)
- Chao Qin: Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, China
- Yongqi Sun: Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, China
- Yadong Dong: Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, China
|
24
|
Mannakee BK, Gutenkunst RN. Selection on Network Dynamics Drives Differential Rates of Protein Domain Evolution. PLoS Genet 2016; 12:e1006132. [PMID: 27380265 PMCID: PMC4933380 DOI: 10.1371/journal.pgen.1006132] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/05/2016] [Accepted: 05/27/2016] [Indexed: 11/19/2022] [Open Access]
Abstract
The long-held principle that functionally important proteins evolve slowly has recently been challenged by studies in mice and yeast showing that the severity of a protein knockout only weakly predicts that protein's rate of evolution. However, the relevance of these studies to evolutionary changes within proteins is unknown, because amino acid substitutions, unlike knockouts, often only slightly perturb protein activity. To quantify the phenotypic effect of small biochemical perturbations, we developed an approach to use computational systems biology models to measure the influence of individual reaction rate constants on network dynamics. We show that this dynamical influence is predictive of protein domain evolutionary rate within networks in vertebrates and yeast, even after controlling for expression level and breadth, network topology, and knockout effect. Thus, our results not only demonstrate the importance of protein domain function in determining evolutionary rate, but also the power of systems biology modeling to uncover unanticipated evolutionary forces.
Affiliation(s)
- Brian K. Mannakee: Division of Epidemiology and Biostatistics, Mel and Enid Zuckerman College of Public Health, University of Arizona, Tucson, Arizona, United States of America
- Ryan N. Gutenkunst: Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona, United States of America
|
25
|
Sensitivity and correlation of hypervariable regions in 16S rRNA genes in phylogenetic analysis. BMC Bioinformatics 2016; 17:135. [PMID: 27000765 PMCID: PMC4802574 DOI: 10.1186/s12859-016-0992-y] [Citation(s) in RCA: 298] [Impact Index Per Article: 33.1] [Received: 04/16/2015] [Accepted: 03/17/2016] [Indexed: 11/10/2022] [Open Access]
Abstract
Background: Prokaryotic 16S ribosomal RNA (rRNA) sequences are widely used in environmental microbiology and molecular evolution as reliable markers for the taxonomic classification and phylogenetic analysis of microbes. Restricted by current sequencing techniques, the massive sequencing of 16S rRNA gene amplicons encompassing the full length of genes is not yet feasible. Thus, the selection of the most efficient hypervariable regions for phylogenetic analysis and taxonomic classification is still debated. In the present study, several bioinformatics tools were integrated to build an in silico pipeline to evaluate the phylogenetic sensitivity of the hypervariable regions compared with the corresponding full-length sequences.
Results: The correlation of seven sub-regions was inferred from the geodesic distance, a parameter that is applied to quantitatively compare the topology of different phylogenetic trees constructed using the sequences from different sub-regions. The relationship between different sub-regions based on the geodesic distance indicated that V4-V6 were the most reliable regions for representing the full-length 16S rRNA sequences in the phylogenetic analysis of most bacterial phyla, while V2 and V8 were the least reliable regions.
Conclusions: Our results suggest that V4-V6 might be optimal sub-regions for the design of universal primers with superior phylogenetic resolution for bacterial phyla. A potential relationship between function and the evolution of 16S rRNA is also discussed.
|
26
|
Selection maintaining protein stability at equilibrium. J Theor Biol 2016; 391:21-34. [DOI: 10.1016/j.jtbi.2015.12.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/31/2015] [Revised: 11/29/2015] [Accepted: 12/01/2015] [Indexed: 11/24/2022]
|
27
|
Gu X, Tang W. Model parameters of molecular evolution explain genomic correlations. Brief Bioinform 2015; 18:37-42. [PMID: 26628558 DOI: 10.1093/bib/bbv098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2015] [Revised: 10/01/2015] [Indexed: 11/13/2022] [Open Access]
Abstract
One long-standing research focus in evolutionary genomics is resolving how biological variables (expression, essentiality, protein-protein interaction, structural stability, etc.) determine the rate of protein evolution. While these studies have considerably deepened our understanding of molecular evolution, many issues remain unsolved. In this opinion article, after a brief survey of the literature, we establish relationships between model parameters of molecular evolution and genomic variables. On this basis, most observed genomic correlations and confounds can be explained by combinations of model parameters under different conditions; these parameters include the strength of stabilizing selection, mutational variance, expression sufficiency, gene pleiotropy, and the effective population size. We suggest that the problem of discerning which biological variable(s) may determine the rate of protein evolution can be tackled at two levels. The first level, as discussed here, is to demonstrate how the model of molecular evolution can predict potential genomic correlations under various conditions. The second level is to estimate genome-wide variations of model parameters (or combinations) that help to identify canonical biological variables that may underlie the rate variation among genes, which ranges over at least three orders of magnitude.
|
28
|
Abstract
The rate and mechanism of protein sequence evolution have been central questions in evolutionary biology since the 1960s. Although the rate of protein sequence evolution depends primarily on the level of functional constraint, exactly what determines functional constraint has remained unclear. The increasing availability of genomic data has enabled much needed empirical examinations on the nature of functional constraint. These studies found that the evolutionary rate of a protein is predominantly influenced by its expression level rather than functional importance. A combination of theoretical and empirical analyses has identified multiple mechanisms behind these observations and demonstrated a prominent role in protein evolution of selection against errors in molecular and cellular processes.
Affiliation(s)
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, 830 North University Avenue, Ann Arbor, Michigan 48109, USA
- Jian-Rong Yang
- Department of Ecology and Evolutionary Biology, University of Michigan, 830 North University Avenue, Ann Arbor, Michigan 48109, USA
|
29
|
Ish-Am O, Kristensen DM, Ruppin E. Evolutionary Conservation of Bacterial Essential Metabolic Genes across All Bacterial Culture Media. PLoS One 2015; 10:e0123785. [PMID: 25894004 PMCID: PMC4403854 DOI: 10.1371/journal.pone.0123785] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 07/21/2014] [Accepted: 03/08/2015] [Indexed: 11/22/2022]
Abstract
One of the basic postulates of molecular evolution is that functionally important genes should evolve slower than genes of lesser significance. Essential genes, whose knockout leads to a lethal phenotype, are considered of high functional importance, yet whether they are truly more conserved than nonessential genes has been the topic of much debate, fuelled by a host of contradictory findings. Here we conduct the first large-scale study utilizing genome-scale metabolic modeling and spanning many bacterial species, which aims to answer this question. Using the novel Media Variation Analysis, we examine the range of conservation of essential vs. nonessential metabolic genes in a given species across all possible media. We are thus able to obtain, for the first time, exact upper and lower bounds on the levels of differential conservation of essential genes for each of the species studied. The results show that bacteria do exhibit an overall tendency for differential conservation of their essential genes vs. their nonessential ones, yet this tendency is highly variable across species. We show that the model bacterium E. coli K12 may or may not exhibit differential conservation of essential genes depending on its growth medium, shedding light on previous experimental studies showing opposite trends.
Affiliation(s)
- Oren Ish-Am
- The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- David M. Kristensen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Eytan Ruppin
- The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- The Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Dept. of Computer Science and the Center for Bioinformatics & Computational Biology, the University of Maryland, Maryland, United States of America
|
30
|
Rogozin IB, Managadze D, Shabalina SA, Koonin EV. Gene family level comparative analysis of gene expression in mammals validates the ortholog conjecture. Genome Biol Evol 2015; 6:754-62. [PMID: 24610837 PMCID: PMC4007545 DOI: 10.1093/gbe/evu051] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Indexed: 01/03/2023]
Abstract
The ortholog conjecture (OC), which is central to functional annotation of genomes, posits that orthologous genes are functionally more similar than paralogous genes at the same level of sequence divergence. However, a recent study challenged the OC by reporting a greater functional similarity, in terms of Gene Ontology (GO) annotations and expression profiles, among within-species paralogs compared with orthologs. These findings were taken to indicate that functional similarity of homologous genes is primarily determined by the cellular context of the genes, rather than evolutionary history. However, several subsequent studies suggest that GO annotations and microarray data could artificially inflate functional similarity between paralogs from the same organism. We sought to test the OC using approaches distinct from those used in previous studies. Analysis of a large RNAseq data set from multiple human and mouse tissues shows that expression similarity (correlation coefficients, ranks, or Z-scores) between orthologs is substantially greater than that for between-species paralogs with the same sequence divergence, in agreement with the OC and the results of recent detailed analyses. These findings are further corroborated by a fine-grain analysis in which expression profiles of orthologs and paralogs were compared separately for individual gene families. Expression profiles of within-species paralogs are more strongly correlated than profiles of orthologs, but it is shown that this is caused by high background noise, that is, correlation between profiles of unrelated genes in the same organism. Z-scores and rank scores show a nonmonotonic dependence of expression profile similarity on sequence divergence. This complexity of gene expression evolution after duplication might be at least partially caused by selection for protein dosage rebalancing following gene duplication.
Affiliation(s)
- Igor B Rogozin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland
|
31
|
Abstract
Years of meticulous curation of scientific literature and increasingly reliable computational predictions have resulted in the creation of vast databases of protein interaction data. Over the years, these repositories have become a basic framework in which experiments are analyzed and new directions of research are explored. Here we present an overview of the most widely used protein-protein interaction databases and the methods they employ to gather, combine, and predict interactions. We also point out the trade-off between comprehensiveness and accuracy and the main pitfall scientists have to be aware of before adopting protein interaction databases in any single-gene or genome-wide analysis.
|
32
|
Mason EA, Mar JC, Laslett AL, Pera MF, Quackenbush J, Wolvetang E, Wells CA. Gene expression variability as a unifying element of the pluripotency network. Stem Cell Reports 2014; 3:365-77. [PMID: 25254348 PMCID: PMC4175554 DOI: 10.1016/j.stemcr.2014.06.008] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 01/10/2014] [Revised: 06/18/2014] [Accepted: 06/20/2014] [Indexed: 12/16/2022]
Abstract
Heterogeneity is a hallmark of stem cell populations, in part due to the molecular differences between cells undergoing self-renewal and those poised to differentiate. We examined phenotypic and molecular heterogeneity in pluripotent stem cell populations, using public gene expression data sets. A high degree of concordance was observed between global gene expression variability and the reported heterogeneity of different human pluripotent lines. Network analysis demonstrated that low-variability genes were the most highly connected, suggesting that these are the most stable elements of the gene regulatory network and are under the highest regulatory constraints. Known drivers of pluripotency were among these, with lowest expression variability of POU5F1 in cells with the highest capacity for self-renewal. Variability of gene expression provides a reliable measure of phenotypic and molecular heterogeneity and predicts those genes with the highest degree of regulatory constraint within the pluripotency network. Gene expression variability is highly concordant with population heterogeneity. Genes within the pluripotency network have distinct variability profiles. Expression variability is a network property important for pluripotency.
Affiliation(s)
- Elizabeth A Mason
- Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, St. Lucia, Brisbane QSLD 4072, Australia
- Jessica C Mar
- The Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Andrew L Laslett
- Materials Science and Engineering, CSIRO, Clayton VIC 3168, Australia
- Martin F Pera
- The University of Melbourne, Florey Neuroscience and Mental Health Institute, and Walter and Eliza Hall Institute of Medical Research, Parkville VIC 3010, Australia
- John Quackenbush
- Dana-Farber Cancer Institute, Harvard University, Boston, MA 02215 USA
- Ernst Wolvetang
- Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, St. Lucia, Brisbane QSLD 4072, Australia
- Christine A Wells
- Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, St. Lucia, Brisbane QSLD 4072, Australia; Institute of Infection, Immunity and Inflammation, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8TA, UK.
|
33
|
Haldane A, Manhart M, Morozov AV. Biophysical fitness landscapes for transcription factor binding sites. PLoS Comput Biol 2014; 10:e1003683. [PMID: 25010228 PMCID: PMC4091707 DOI: 10.1371/journal.pcbi.1003683] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 12/03/2013] [Accepted: 05/11/2014] [Indexed: 11/18/2022]
Abstract
Phenotypic states and evolutionary trajectories available to cell populations are ultimately dictated by complex interactions among DNA, RNA, proteins, and other molecular species. Here we study how evolution of gene regulation in a single-cell eukaryote S. cerevisiae is affected by interactions between transcription factors (TFs) and their cognate DNA sites. Our study is informed by a comprehensive collection of genomic binding sites and high-throughput in vitro measurements of TF-DNA binding interactions. Using an evolutionary model for monomorphic populations evolving on a fitness landscape, we infer fitness as a function of TF-DNA binding to show that the shape of the inferred fitness functions is in broad agreement with a simple functional form inspired by a thermodynamic model of two-state TF-DNA binding. However, the effective parameters of the model are not always consistent with physical values, indicating selection pressures beyond the biophysical constraints imposed by TF-DNA interactions. We find little statistical support for the fitness landscape in which each position in the binding site evolves independently, indicating that epistasis is common in the evolution of gene regulation. Finally, by correlating TF-DNA binding energies with biological properties of the sites or the genes they regulate, we are able to rule out several scenarios of site-specific selection, under which binding sites of the same TF would experience different selection pressures depending on their position in the genome. These findings support the existence of universal fitness landscapes which shape evolution of all sites for a given TF, and whose properties are determined in part by the physics of protein-DNA interactions.
Specialized proteins called transcription factors turn genes on and off by binding to short stretches of DNA in their regulatory regions. Precise gene regulation is essential for cellular survival and proliferation, and its evolution and maintenance under mutational pressure are central issues in biology. Here we discuss how evolution of gene regulation is shaped by the need to maintain favorable binding energies between transcription factors and their genomic binding sites. We show that, surprisingly, transcription factor binding is not affected by many biological properties, such as the essentiality of the gene it regulates. Rather, all sites for a given factor appear to evolve under a universal set of constraints, which can be rationalized in terms of a simple model inspired by transcription factor-DNA binding thermodynamics.
Affiliation(s)
- Allan Haldane
- Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
- Michael Manhart
- Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
- Alexandre V. Morozov
- Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
- BioMaPS Institute for Quantitative Biology, Rutgers University, Piscataway, New Jersey, United States of America
|
34
|
Wei W, Ye YN, Luo S, Deng YY, Lin D, Guo FB. IFIM: a database of integrated fitness information for microbial genes. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2014; 2014:bau052. [PMID: 24923821 PMCID: PMC4207227 DOI: 10.1093/database/bau052] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 01/08/2023]
Abstract
Knowledge of an organism’s fitness for survival is important for a complete understanding of microbial genetics and effective drug design. Current essential gene databases provide only binary essentiality data from genome-wide experiments. We therefore developed a new database that Integrates quantitative Fitness Information for Microbial genes (IFIM). The IFIM database currently contains data from 16 experiments and 2186 theoretical predictions. The highly significant correlation between the experiment-derived fitness data and our computational simulations demonstrated that the computer-generated predictions were often as reliable as the experimental data. The data in IFIM can be accessed easily, and the interface allows users to browse through the gene fitness information that it contains. IFIM is the first resource that allows easy access to fitness data of microbial genes. We believe this database will contribute to a better understanding of microbial genetics and will be useful in designing drugs to resist microbial pathogens, especially when experimental data are unavailable. Database URL: http://cefg.uestc.edu.cn/ifim/ or http://cefg.cn/ifim/
Affiliation(s)
- Wen Wei
- Center of Bioinformatics and Key Laboratory for NeuroInformation of the Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yuan-Nong Ye
- Center of Bioinformatics and Key Laboratory for NeuroInformation of the Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Sen Luo
- Center of Bioinformatics and Key Laboratory for NeuroInformation of the Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yan-Yan Deng
- Center of Bioinformatics and Key Laboratory for NeuroInformation of the Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Dan Lin
- Center of Bioinformatics and Key Laboratory for NeuroInformation of the Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Feng-Biao Guo
- Center of Bioinformatics and Key Laboratory for NeuroInformation of the Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
|
35
|
Iriarte A, Baraibar JD, Diana L, Castro-Sowinski S, Romero H, Musto H. Trends in amino acid usage across the class Mollicutes. J Biomol Struct Dyn 2014; 32:65-74. [DOI: 10.1080/07391102.2012.748636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
|
36
|
Wei W, Zhang T, Lin D, Yang ZJ, Guo FB. Transcriptional abundance is not the single force driving the evolution of bacterial proteins. BMC Evol Biol 2013; 13:162. [PMID: 23914835 PMCID: PMC3734234 DOI: 10.1186/1471-2148-13-162] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 03/19/2013] [Accepted: 08/01/2013] [Indexed: 11/20/2022]
Abstract
Background: Despite rapid progress in understanding the mechanisms that shape the evolution of proteins, the relative importance of various factors remains to be elucidated. In this study, we have assessed the effects of 16 different biological features on the evolutionary rates (ERs) of protein-coding sequences in bacterial genomes. Results: Our analysis of 18 bacterial species revealed new correlations between ERs and constraining factors. Previous studies have suggested that transcriptional abundance overwhelmingly constrains the evolution of yeast protein sequences. This transcriptional abundance leads to selection against misfolding or misinteractions. In this study we found that no single factor determined the evolution of bacterial proteins. Not only transcriptional abundance (codon adaptation index and expression level), but also protein-protein associations (PPAs), essentiality (ESS), subcellular localization of cytoplasmic membrane (SLM), transmembrane helices (TMH) and hydropathicity score (HS) independently and significantly affected the ERs of bacterial proteins. In some species, PPA and ESS demonstrate higher correlations with ER than transcriptional abundance. Conclusions: Different forces drive the evolution of protein sequences in yeast and bacteria. In bacteria, the constraints are involved in avoiding a build-up of toxic molecules caused by misfolding/misinteraction (transcriptional abundance), while retaining important functions (ESS, PPA) and maintaining the cell membrane (SLM, TMH and HS). Each of these independently contributes to the variation in protein evolution.
Affiliation(s)
- Wen Wei
- Center of Bioinformatics and Key Laboratory for NeuroInformation of the Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, 610054 Chengdu, China
|
37
|
Fatakia SN, Costanzi S, Chow CC. Molecular evolution of the transmembrane domains of G protein-coupled receptors. PLoS One 2011; 6:e27813. [PMID: 22132149 PMCID: PMC3221663 DOI: 10.1371/journal.pone.0027813] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 10/05/2010] [Accepted: 10/25/2011] [Indexed: 11/19/2022]
Abstract
G protein-coupled receptors (GPCRs) are a superfamily of integral membrane proteins vital for signaling and are important targets for pharmaceutical intervention in humans. Previously, we identified a group of ten amino acid positions (called key positions), within the seven transmembrane domain (7TM) interhelical region, which had high mutual information with each other and many other positions in the 7TM. Here, we estimated the evolutionary selection pressure at those key positions. We found that the key positions of receptors for small molecule natural ligands were under strong negative selection. Receptors naturally activated by lipids had weaker negative selection in general when compared to small molecule-activated receptors. Selection pressure varied widely in peptide-activated receptors. We used this observation to predict that a subgroup of orphan GPCRs not under strong selection may not possess a natural small-molecule ligand. In the subgroup of MRGX1-type GPCRs, we identified a key position, along with two non-key positions, under statistically significant positive selection.
Affiliation(s)
- Sarosh N. Fatakia
- Laboratory of Biological Modeling, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
- Stefano Costanzi
- Laboratory of Biological Modeling, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
- Carson C. Chow
- Laboratory of Biological Modeling, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
|
38
|
Hudson CM, Puckett EE, Bekaert M, Pires JC, Conant GC. Selection for higher gene copy number after different types of plant gene duplications. Genome Biol Evol 2011; 3:1369-80. [PMID: 22056313 PMCID: PMC3240960 DOI: 10.1093/gbe/evr115] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Accepted: 10/31/2011] [Indexed: 01/03/2023]
Abstract
The evolutionary origins of the multitude of duplicate genes in the plant genomes are still incompletely understood. To gain an appreciation of the potential selective forces acting on these duplicates, we phylogenetically inferred the set of metabolic gene families from 10 flowering plant (angiosperm) genomes. We then compared the metabolic fluxes for these families, predicted using the Arabidopsis thaliana and Sorghum bicolor metabolic networks, with the families' duplication propensities. For duplications produced by both small scale (small-scale duplications) and genome duplication (whole-genome duplications), there is a significant association between the flux and the tendency to duplicate. Following this global analysis, we made a more fine-scale study of the selective constraints observed on plant sodium and phosphate transporters. We find that the different duplication mechanisms give rise to differing selective constraints. However, the exact nature of this pattern varies between the gene families, and we argue that the duplication mechanism alone does not define a duplicated gene's subsequent evolutionary trajectory. Collectively, our results argue for the interplay of history, function, and selection in shaping the duplicate gene evolution in plants.
|
39
|
Fares MA, Ruiz-González MX, Labrador JP. Protein coadaptation and the design of novel approaches to identify protein-protein interactions. IUBMB Life 2011; 63:264-71. [PMID: 21488148 DOI: 10.1002/iub.455] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/06/2022]
Abstract
Proteins rarely function in isolation; they form part of complex networks of interactions with other proteins within or among cells. The importance of a particular protein for cell viability is directly dependent upon the number of interactions in which it participates and the function it performs: the larger the number of interactions of a protein, the greater its functional importance for the cell. With the advent of genome sequencing and "omics" technologies it became feasible to conduct large-scale searches for protein interacting partners. Unfortunately, the accuracy of such analyses has been underwhelming owing to methodological limitations and to the inherent complexity of protein interactions. In addition to these experimental approaches, many computational methods have been developed to identify protein-protein interactions by assuming that interacting proteins coevolve as a result of the coadaptation dynamics between the amino acids of their interacting faces. We review the main technological advances made in the field of interactomics and discuss the feasibility of computational methods to identify protein-protein interactions based on the estimation of coevolution. As proof-of-concept, we present a classical case study: the interactions of cell surface proteins (receptors) and their ligands. Finally, we take this discussion one step further to include interactions between organisms and species to understand the generation of biological complexity. Development of technologies for accurate detection of protein-protein interactions may shed light on processes that go from the fine-tuning of pathways and metabolic networks to the emergence of biological complexity.
Affiliation(s)
- Mario A Fares
- Department of Abiotic Stress, Group of Integrative and Systems Biology, Instituto de Biología Molecular y Celular de Plantas (CSIC-Universidad Politécnica de Valencia), Valencia, Spain.
|
40
|
Gordon JL, Byrne KP, Wolfe KH. Mechanisms of chromosome number evolution in yeast. PLoS Genet 2011; 7:e1002190. [PMID: 21811419 PMCID: PMC3141009 DOI: 10.1371/journal.pgen.1002190] [Citation(s) in RCA: 97] [Impact Index Per Article: 6.9] [Received: 03/31/2011] [Accepted: 06/03/2011] [Indexed: 12/25/2022]
Abstract
The whole-genome duplication (WGD) that occurred during yeast evolution changed the basal number of chromosomes from 8 to 16. However, the number of chromosomes in post-WGD species now ranges between 10 and 16, and the number in non-WGD species (Zygosaccharomyces, Kluyveromyces, Lachancea, and Ashbya) ranges between 6 and 8. To study the mechanism by which chromosome number changes, we traced the ancestry of centromeres and telomeres in each species. We observe only two mechanisms by which the number of chromosomes has decreased, as indicated by the loss of a centromere. The most frequent mechanism, seen 8 times, is telomere-to-telomere fusion between two chromosomes with the concomitant death of one centromere. The other mechanism, seen once, involves the breakage of a chromosome at its centromere, followed by the fusion of the two arms to the telomeres of two other chromosomes. The only mechanism by which chromosome number has increased in these species is WGD. Translocations and inversions have cycled telomere locations, internalizing some previously telomeric genes and creating novel telomeric locations. Comparison of centromere structures shows that the length of the CDEII region is variable between species but uniform within species. We trace the complete rearrangement history of the Lachancea kluyveri genome since its common ancestor with Saccharomyces and propose that its exceptionally low level of rearrangement is a consequence of the loss of the non-homologous end joining (NHEJ) DNA repair pathway in this species.
The number of chromosomes in organisms often changes over evolutionary time. To study how the number changes, we compare several related species of yeast that share a common ancestor roughly 150 million years ago and have varying numbers of chromosomes. By inferring ancestral genome structures, we examine the changes in location of centromeres and telomeres, key elements that biologically define chromosomes. Their locations change over time by rearrangements of chromosome segments. By following these rearrangements, we trace an evolutionary path between existing centromeres and telomeres to those in the ancestral genomes, allowing us to identify the specific evolutionary events that caused changes in chromosome number. We show that, in these yeasts, chromosome number has generally decreased over time except for one notable exception: an event in an ancestor of several species where the whole genome was duplicated. Chromosome number reduction occurs by the simultaneous removal of a centromere from a chromosome and fusion of the rest of the chromosome to another that contains a working centromere. This process also results in telomere removal and the movement of genes from the ends of chromosomes to new locations in the middle of chromosomes.
Affiliation(s)
- Jonathan L Gordon
- Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland.
|
41
|
Slow protein evolutionary rates are dictated by surface-core association. Proc Natl Acad Sci U S A 2011; 108:11151-6. [PMID: 21690394 DOI: 10.1073/pnas.1015994108] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Indexed: 12/23/2022]
Abstract
Why do certain proteins evolve much slower than others? We compared not only rates per protein, but also rates per position within individual proteins. For ∼90% of proteins, the distribution of positional rates exhibits three peaks: a peak of slow evolving residues, with average log(2)[normalized rate], log(2)μ, of ca. -2, corresponding primarily to core residues; a peak of fast evolving residues (log(2)μ ∼ 0.5) largely corresponding to surface residues; and a very fast peak (log(2)μ ∼ 2) associated with disordered segments. However, a unique fraction of proteins that evolve very slowly exhibit not only a negligible fast peak, but also a peak with a log(2)μ ∼ -4, rather than the standard core peak of -2. Thus, a "freeze" of a protein's surface seems to stop core evolution as well. We also observed a much higher fraction of substitutions in potentially interacting residues than expected by chance, including substitutions in pairs of contacting surface-core residues. Overall, the data suggest that accumulation of surface substitutions enables the acceptance of substitutions in core positions. The underlying reason for slow evolution might therefore be a highly constrained surface due to protein-protein interactions or the need to prevent misfolding or aggregation. If the surface is inaccessible to substitutions, so becomes the core, thus resulting in very slow overall rates.
|
42
|
Hudson CM, Conant GC. Expression level, cellular compartment and metabolic network position all influence the average selective constraint on mammalian enzymes. BMC Evol Biol 2011; 11:89. [PMID: 21470417 PMCID: PMC3082228 DOI: 10.1186/1471-2148-11-89] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Received: 11/12/2010] [Accepted: 04/06/2011] [Indexed: 12/24/2022]
Abstract
BACKGROUND: A gene's position in regulatory, protein interaction or metabolic networks can be predictive of the strength of purifying selection acting on it, but these relationships are neither universal nor invariably strong. Following work in bacteria, fungi and invertebrate animals, we explore the relationship between selective constraint and metabolic function in mammals. RESULTS: We measure the association between selective constraint, estimated by the ratio of nonsynonymous (Ka) to synonymous (Ks) substitutions, and several, primarily metabolic, measures of gene function. We find significant differences between the selective constraints acting on enzyme-coding genes from different cellular compartments, with the nucleus showing higher constraint than genes from either the cytoplasm or the mitochondria. Among metabolic genes, the centrality of an enzyme in the metabolic network is significantly correlated with Ka/Ks. In contrast to yeasts, gene expression magnitude does not appear to be the primary predictor of selective constraint in these organisms. CONCLUSIONS: Our results imply that the relationship between selective constraint and enzyme centrality is complex: the strength of selective constraint acting on mammalian genes is quite variable and does not appear to exclusively follow patterns seen in other organisms.
Affiliation(s)
- Corey M Hudson
- Informatics Institute, University of Missouri, Columbia, MO, USA.
|
43
|
Vishnoi A, Sethupathy P, Simola D, Plotkin JB, Hannenhalli S. Genome-wide survey of natural selection on functional, structural, and network properties of polymorphic sites in Saccharomyces paradoxus. Mol Biol Evol 2011; 28:2615-27. [PMID: 21478372 DOI: 10.1093/molbev/msr085] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND To characterize the genetic basis of phenotypic evolution, numerous studies have identified individual genes that have likely evolved under natural selection. However, phenotypic changes may represent the cumulative effect of similar evolutionary forces acting on functionally related groups of genes. Phylogenetic analyses of divergent yeast species have identified functional groups of genes that have evolved at significantly different rates, suggestive of differential selection on the functional properties. However, due to environmental heterogeneity over long evolutionary timescales, selection operating within a single lineage may be dramatically different, and it is not detectable via interspecific comparisons alone. Moreover, interspecific studies typically quantify selection on protein-coding regions using the D(n)/D(s) ratio, which cannot be extended easily to study selection on noncoding regions or synonymous sites. The population genetic-based analysis of selection operating within a single lineage ameliorates these limitations. FINDINGS We investigated selection on several properties associated with genes, promoters, or polymorphic sites, by analyzing the derived allele frequency spectrum of single nucleotide polymorphisms (SNPs) in 28 strains of Saccharomyces paradoxus. We found evidence for significant differential selection between many functionally relevant categories of SNPs, underscoring the utility of function-centric approaches for discovering signatures of natural selection. When comparable, our findings are largely consistent with previous studies based on interspecific comparisons, with one notable exception: our study finds that mutations from an ancient amino acid to a relatively new amino acid are selectively disfavored, whereas interspecific comparisons have found selection against ancient amino acids. 
Several of our findings have not been addressed through prior interspecific studies: we find that synonymous mutations from preferred to unpreferred codons are selected against and that synonymous SNPs in the linker regions of proteins are relatively less constrained than those within protein domains. CONCLUSIONS We present the first global survey of selection acting on various functional properties in S. paradoxus. We found that selection pressures previously detected over long evolutionary timescales have also shaped the evolution of S. paradoxus. Importantly, we also make novel discoveries untenable via conventional interspecific analyses.
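The derived allele frequency spectrum mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the input format (one derived-allele count per SNP), and the use of mean derived frequency as a summary are all assumptions made for illustration. The general idea is that a category of SNPs whose derived alleles are skewed toward low frequencies is consistent with stronger purifying selection.

```python
from collections import Counter


def derived_allele_frequency_spectrum(derived_counts, n_strains):
    """Proportion of segregating SNPs at each derived-allele count 1..n_strains-1.

    derived_counts: hypothetical input, one integer per SNP giving how many of
    the n_strains sampled strains carry the derived allele.
    """
    # Sites where the derived allele is absent or fixed are not polymorphic.
    segregating = [c for c in derived_counts if 0 < c < n_strains]
    if not segregating:
        raise ValueError("no segregating sites")
    tally = Counter(segregating)
    total = len(segregating)
    return {k: tally.get(k, 0) / total for k in range(1, n_strains)}


def mean_derived_frequency(derived_counts, n_strains):
    """Average derived-allele frequency over segregating SNPs (a crude summary)."""
    segregating = [c for c in derived_counts if 0 < c < n_strains]
    if not segregating:
        raise ValueError("no segregating sites")
    return sum(c / n_strains for c in segregating) / len(segregating)


if __name__ == "__main__":
    # Toy counts for two made-up SNP categories in a sample of 6 strains.
    category_a = [1, 1, 2, 5, 0, 6]
    category_b = [3, 4, 5, 5, 2, 1]
    print(derived_allele_frequency_spectrum(category_a, 6))
    print(mean_derived_frequency(category_a, 6), mean_derived_frequency(category_b, 6))
```

Comparing categories in practice would need a formal test and a reliable outgroup to polarize alleles; the sketch only shows how the spectrum itself is tabulated.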
|
44
|
Abstract
There is great variation in the rates of sequence evolution among proteins encoded by the same genome. The strongest correlate of evolutionary rate is expression level: highly expressed proteins tend to evolve slowly. This observation has led to the proposal that a major determinant of protein evolutionary rate involves the toxic effects of protein that misfolds due to transcriptional and translational errors (the mistranslation-induced misfolding [MIM] hypothesis). Here, I present a model that explains the correlation of evolutionary rate and expression level by selection for function. The basis of this model is that selection keeps expression levels near optima that reflect a trade-off between beneficial effects of the protein's function and some nonspecific cost of expression (e.g., the biochemical cost of synthesizing protein). Simulations confirm the predictions of the model. Like the MIM hypothesis, this model predicts several other relationships that are observed empirically. Although the model is based on selection for protein function, it is consistent with findings that a protein's rate of evolution is at most weakly correlated with its importance for fitness as measured by gene knockout experiments.
Affiliation(s)
- Joshua L Cherry
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA.
|
45
|
Park SG, Choi SS. Expression breadth and expression abundance behave differently in correlations with evolutionary rates. BMC Evol Biol 2010; 10:241. [PMID: 20691101 PMCID: PMC2924872 DOI: 10.1186/1471-2148-10-241] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Received: 04/19/2010] [Accepted: 08/07/2010] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND One of the main objectives of the molecular evolution and evolutionary systems biology field is to reveal the underlying principles that dictate protein evolutionary rates. Several studies argue that expression abundance is the most critical component in determining the rate of evolution, especially in unicellular organisms. However, the expression breadth also needs to be considered for multicellular organisms. RESULTS In the present paper, we analyzed the relationship between the two expression variables and rates using two different genome-scale expression datasets, microarrays and ESTs. A significant positive correlation between the expression abundance (EA) and expression breadth (EB) was revealed by Kendall's rank correlation tests. A novel random shuffling approach was applied for EA and EB to compare the correlation coefficients obtained from real data sets to those estimated based on random chance. A novel method called a Fixed Group Analysis (FGA) was designed and applied to investigate the correlations between expression variables and rates when one of the two expression variables was evenly fixed. CONCLUSIONS In conclusion, all of these analyses and tests consistently showed that the breadth rather than the abundance of gene expression is tightly linked with the evolutionary rate in multicellular organisms.
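A rank correlation compared against a shuffled baseline, as described in this abstract, can be sketched as follows. This is a generic permutation test, not the authors' shuffling procedure or their Fixed Group Analysis, neither of which is specified here. The function names are hypothetical, and the statistic is Kendall's tau-a (no tie correction), chosen for brevity.

```python
import random
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    score = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        prod = (x1 - x2) * (y1 - y2)
        if prod > 0:
            score += 1
        elif prod < 0:
            score -= 1
    n = len(x)
    return score / (n * (n - 1) / 2)


def shuffle_test(x, y, n_shuffles=1000, seed=0):
    """Return (observed tau, two-sided permutation p-value estimate)."""
    rng = random.Random(seed)
    observed = kendall_tau_a(x, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(y_perm)  # break any real pairing between x and y
        if abs(kendall_tau_a(x, y_perm)) >= abs(observed):
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    # Toy values standing in for expression breadth and evolutionary rate.
    breadth = [1, 3, 4, 6, 8, 9, 12, 15]
    rate = [0.9, 0.8, 0.85, 0.5, 0.4, 0.45, 0.2, 0.1]
    print(shuffle_test(breadth, rate, n_shuffles=500))
```

The quadratic pair loop is fine for toy data; genome-scale inputs would call for an O(n log n) implementation and a tie-aware variant such as tau-b.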
Affiliation(s)
- Seung Gu Park
- Department of Medical Biotechnology, College of Biomedical Science, and Institute of Bioscience & Biotechnology, Kangwon National University, Chunchon 200-701, Korea
|
46
|
Abstract
Molecular chaperones are highly conserved and ubiquitous proteins that help other proteins in the cell to fold. Pioneering work by Rutherford and Lindquist suggested that the chaperone Hsp90 could buffer (i.e., suppress) phenotypic variation in its client proteins and that alternate periods of buffering and expression of these variants might be important in adaptive evolution. More recently, Tokuriki and Tawfik presented an explicit mechanism for chaperone-dependent evolution, in which the Escherichia coli chaperonin GroEL facilitated the folding of clients that had accumulated structurally destabilizing but neofunctionalizing mutations in the protein core. But how important an evolutionary force is chaperonin-mediated buffering in nature? Here, we address this question by modeling the per-residue evolutionary rate of the crystallized E. coli proteome, evaluating the relative contributions of chaperonin buffering, functional importance, and structural features such as residue contact density. Previous findings suggest an interaction between codon bias and GroEL in limiting the effects of misfolding errors. Our results suggest that the buffering of deleterious mutations by GroEL increases the evolutionary rate of client proteins. We then examine the evolutionary fate of GroEL clients in the Mycoplasmas, a group of bacteria containing the only known organisms that lack chaperonins. We show that GroEL was lost once in the common ancestor of a monophyletic subgroup of Mycoplasmas, and we evaluate the effect of this loss on the subsequent evolution of client proteins, providing evidence that client homologs in 11 Mycoplasma species have lost their obligate dependency on GroEL for folding. Our analyses indicate that individual molecules such as chaperonins can have significant effects on proteome evolution through their modulation of protein folding.
Affiliation(s)
- Tom A Williams
- Department of Genetics, University of Dublin, Trinity College, Dublin, Ireland
|
47
|
Inger A, Solomon A, Shenhav B, Olender T, Lancet D. Mutations and lethality in simulated prebiotic networks. J Mol Evol 2009; 69:568-78. [PMID: 19787385 DOI: 10.1007/s00239-009-9281-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 08/24/2009] [Accepted: 09/08/2009] [Indexed: 01/30/2023]
Abstract
The Graded Autocatalysis Replication Domain (GARD) model describes an origin of life scenario which involves non-covalent compositional assemblies, made of monomeric mutually catalytic molecules. GARD constitutes an alternative to informational biopolymers as a mechanism of primordial inheritance. In the present work, we examined the effect of mutations, one of the most fundamental mechanisms for evolution, in the context of the networks of mutual interaction within GARD prebiotic assemblies. We performed a systematic analysis analogous to single and double gene deletions within GARD. While most deletions have only a small effect on both growth rate and molecular composition of the assemblies, ~10% of the deletions caused lethality, or sometimes showed enhanced fitness. Analysis of 14 different network properties on 2,000 different GARD networks indicated that lethality usually takes place when the deleted node has a high molecular count, or when it is a catalyst for such a node. A correlation was also found between lethality and node degree centrality, similar to what is seen in real biological networks. Addressing double knockout mutations, our results demonstrate the occurrence of both synthetic lethality and extragenic suppression within GARD networks, and convey an attempt to correlate synthetic lethality to network node-pair properties. The analyses presented help establish GARD as a workable alternative prebiotic scenario, suggesting that life may have begun with large molecular networks of low fidelity, that later underwent evolutionary compaction and fidelity augmentation.
Affiliation(s)
- Aron Inger
- Department of Molecular Genetics and the Crown Human Genome Center, Weizmann Institute of Science, Rehovot, 76100, Israel
|
48
|
Abstract
Nutrigenetics and nutrigenomics are nascent areas that are evolving quickly and riding on the wave of "personalized medicine" that is providing opportunities in the discovery and development of nutraceutical compounds. The human genome sequence and sequences of model organisms provide the equivalent of comprehensive blueprints and parts lists that describe dynamic networks and the bases for understanding their responses to external and internal perturbations. Unfolding the interrelationships among genes, gene products, and dietary habits is fundamental for identifying individuals who will benefit most from, or be placed at risk by, intervention strategies. More accurate assessment of the inputs to human health and the consequences of those inputs measured as accurate transcriptomic, proteomic, and metabolomic analyses would bring personalized health/diet to practice far faster than would waiting for a predictive knowledge of genetic variation. It is widely recognized that systems and network biology has the potential to increase our understanding of how nutrition influences metabolic pathways and homeostasis, how this regulation is disturbed in a diet-related disease, and to what extent individual genotypes contribute to such diseases.
Affiliation(s)
- Gianni Panagiotou
- Department of Systems Biology, Center for Microbial Biotechnology, Technical University of Denmark, Kgs. Lyngby, Denmark
|
49
|
Worth CL, Gong S, Blundell TL. Structural and functional constraints in the evolution of protein families. Nat Rev Mol Cell Biol 2009; 10:709-20. [PMID: 19756040 DOI: 10.1038/nrm2762] [Citation(s) in RCA: 142] [Impact Index Per Article: 8.9] [Indexed: 01/19/2023]
|
50
|
Weber CC, Hurst LD. Protein rates of evolution are predicted by double-strand break events, independent of crossing-over rates. Genome Biol Evol 2009; 1:340-9. [PMID: 20333203 PMCID: PMC2817428 DOI: 10.1093/gbe/evp033] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Accepted: 08/28/2009] [Indexed: 12/22/2022] Open
Abstract
Theory predicts that, owing to reduced Hill–Robertson interference, genomic regions with high crossing-over rates should experience more efficient selection. In Saccharomyces cerevisiae a negative correlation between the local recombination rate, assayed as meiotic double-strand breaks (DSBs), and the local rate of protein evolution has been considered consistent with such a model. Although DSBs are a prerequisite for crossing-over, they need not result in crossing-over. With recent high-resolution crossover data, we now return to this issue comparing two species of yeast. Strikingly, even allowing for crossover rates, both the rate of premeiotic DSBs and of noncrossover recombination events predict a gene's rate of evolution. This both questions the validity of prior analyses and strongly suggests that any correlation between crossover rates and rates of protein evolution could be owing to slow-evolving genes being prone to DSBs or a direct effect of DSBs on sequence evolution. To ask if classical theory of recombination has any relevance, we determine whether crossover rates predict rates of protein evolution, controlling for noncrossover DSB events, gene ontology (GO) class, gene expression, protein abundance, nucleotide content, and dispensability. We find that genes with high crossing-over rates have low rates of protein evolution after such control, although any correlation is weaker than that previously reported considering meiotic DSBs as a proxy. The data are consistent both with recombination enhancing the efficiency of purifying selection and, independently, with DSBs being associated with low rates of evolution.
Affiliation(s)
- Claudia C Weber
- Department of Biology and Biochemistry, University of Bath, Bath, Somerset, UK
|