1
Abduljaleel Z. Molecular insights into TP53 mutation (p. Arg267Trp) and its connection to Choroid Plexus Carcinomas and Li-Fraumeni Syndrome. Genes Genomics 2024; 46:941-953. [PMID: 38896352 DOI: 10.1007/s13258-024-01531-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2024] [Accepted: 06/07/2024] [Indexed: 06/21/2024]
Abstract
BACKGROUND Choroid plexus carcinomas (CPCs) are rare malignant tumors primarily affecting pediatric patients and often co-occur with Li-Fraumeni Syndrome (LFS), an inherited predisposition to early-onset malignancies in multiple organ systems. LFS is closely linked to TP53 mutations, with germline TP53 gene mutations present in approximately 75% of Li-Fraumeni syndrome families and 25% of Li-Fraumeni-like syndrome families. Individuals with TP53 mutations also have an elevated probability of carrying mutations in BRCA1 and BRCA2 genes. OBJECTIVE To investigate the structural and functional implications of the TP53: 799C > T, p.(Arg267Trp) missense mutation, initially identified in a Saudi family, and understand its impact on TP53 functionality and related intermolecular interactions. METHODS Computational analyses were conducted to examine the structural modifications resulting from the TP53: 799C > T, p.(Arg267Trp) mutation. These analyses focused on the mutation's impact on hydrogen bonding, ionic interactions, and the specific interaction with Cell Cycle and Apoptosis Regulator 2 (CCAR2), as annotated in UniProt. RESULTS The study revealed that the native Arg267 residue is critical for a salt bridge interaction with glutamic acid at position 258. The mutation-induced charge alteration has the potential to disrupt this ionic bonding. Additionally, the mutation is located within an amino acid region crucial for interaction with CCAR2. The altered properties of the amino acid within this domain may affect its functionality and disrupt this interaction, thereby impacting the regulation of catalytic enzyme activity. CONCLUSIONS Our findings highlight the intricate intermolecular interactions governing TP53 functionality. The TP53: 799C > T, p.(Arg267Trp) mutation causes structural modifications that potentially disrupt critical ionic bonds and protein interactions, offering valuable insights for the development of targeted mutants with distinct functional attributes. These insights could inform therapeutic strategies for conditions associated with TP53 mutations.
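The relationship between the nucleotide change (799C > T) and the protein change (Arg267Trp) in the abstract above can be checked with a little arithmetic on the standard genetic code. The following is a minimal sketch, not the author's method: the helper names `codon_of` and `arg_codons_giving_trp` are hypothetical, and the conclusion that the affected codon reads CGG is inferred from the genetic code alone (tryptophan has a single codon, TGG), not stated in the abstract.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_of(cdna_pos):
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


def arg_codons_giving_trp(pos_in_codon):
    """List arginine codons that become tryptophan after C>T at the given position."""
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa == "R" and codon[pos_in_codon - 1] == "C":
            mutated = codon[: pos_in_codon - 1] + "T" + codon[pos_in_codon:]
            if CODON_TABLE[mutated] == "W":
                hits.append((codon, mutated))
    return hits


codon_number, position = codon_of(799)
print(codon_number, position)           # codon 267, first base
print(arg_codons_giving_trp(position))  # the only Arg codon compatible with Arg>Trp
```

Position 799 falls on the first base of codon 267, and the only arginine codon that a first-position C > T change turns into tryptophan is CGG (giving TGG), which is consistent with the reported p.(Arg267Trp).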
Affiliation(s)
- Zainularifeen Abduljaleel
- Science and Technology Unit, Umm Al Qura University, P.O. Box 715, 21955, Makkah, Saudi Arabia.
- Faculty of Medicine, Department of Medical Genetics, Umm Al-Qura University, P.O. Box 715, 21955, Makkah, Saudi Arabia.
- Molecular Diagnostics Unit, Department of Molecular Biology, The Regional Laboratory, Ministry of Health (MOH), P.O. Box 6251, Makkah, Kingdom of Saudi Arabia.
2
Akeju OJ, Cope AL. Re-examining Correlations Between Synonymous Codon Usage and Protein Bond Angles in Escherichia coli. Genome Biol Evol 2024; 16:evae080. [PMID: 38619010 PMCID: PMC11077309 DOI: 10.1093/gbe/evae080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2023] [Revised: 04/05/2024] [Accepted: 04/10/2024] [Indexed: 04/16/2024] Open
Abstract
Rosenberg AA, Marx A, Bronstein AM (Codon-specific Ramachandran plots show amino acid backbone conformation depends on identity of the translated codon. Nat Commun. 2022;13:2815) recently found a surprising correlation between synonymous codon usage and the dihedral bond angles of the resulting amino acid. However, their analysis did not account for the strongest known correlate of codon usage: gene expression. We re-examined the relationship between bond angles and codon usage by applying the approach of Rosenberg et al. to simulated protein-coding sequences that (i) have random codon usage, (ii) have codon usage determined by mutation biases, and (iii) maintain the general relationship between codon usage and gene expression via the assumption of selection-mutation-drift equilibrium. We observed correlations between dihedral bond angle and codon usage when codon usage is entirely random, indicating possible conflation of noise with differences in bond angle distributions between synonymous codons. More relevant to the general analysis of codon usage patterns, we found surprisingly good agreement between the analysis of the real sequences and the analysis of sequences simulated assuming selection-mutation-drift equilibrium, with 91% of the significant synonymous codon pairs detected in the former also detected in the latter. We believe the correlation between codon usage and dihedral bond angles resulted from the variation in codon usage across genes due to the interplay between mutation bias, natural selection for translation efficiency, and gene expression, further underscoring that these factors must be controlled for when looking for novel patterns related to codon usage.
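Scenario (i) in the abstract above, sequences with entirely random codon usage, can be illustrated with a short sketch. This is an assumption-laden illustration only: the abstract does not describe the authors' simulation pipeline, and the function names `translate` and `randomize_codons` are hypothetical. Each codon is replaced by a uniformly chosen synonymous codon, so the encoded protein is unchanged while any codon preference is erased.

```python
import random
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid (or stop signal) they encode.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(cds):
    """Translate a coding sequence (length divisible by 3) into one-letter codes."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def randomize_codons(cds, rng):
    """Replace every codon with a uniformly random synonymous codon."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(
        rng.choice(SYNONYMS[CODON_TABLE[cds[i:i + 3]]])
        for i in range(0, len(cds), 3)
    )


rng = random.Random(0)  # fixed seed so the sketch is reproducible
original = "ATGGCTCGTTGGTAA"
shuffled = randomize_codons(original, rng)
print(translate(original), translate(shuffled))  # identical protein sequences
```

Scenarios (ii) and (iii) would need codon weights derived from mutation bias and expression-dependent selection, which this uniform sketch deliberately leaves out.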
Affiliation(s)
- Alexander L Cope
- Department of Genetics, Rutgers University, Piscataway, New Jersey, USA
- Human Genetics Institute of New Jersey, Rutgers University, Piscataway, New Jersey, USA
- Robert Wood Johnson Medical School, Rutgers University, Piscataway, New Jersey, USA
3
Li W, Li R, Tang X, Cheng J, Zhan L, Shang Z, Wu J. Genomics evolution of Jingmen viruses associated with ticks and vertebrates. Genomics 2023; 115:110734. [PMID: 37890641 DOI: 10.1016/j.ygeno.2023.110734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 10/08/2023] [Accepted: 10/24/2023] [Indexed: 10/29/2023]
Abstract
Jingmen viruses (JMVs) associated with ticks and vertebrates have been found to be related to human disease. We obtained the genome of a Jingmen tick virus (JMTV) strain from Rhipicephalus microplus in Guizhou province and compared the genomes of seven JMV species associated with ticks and vertebrates to understand their evolutionary relationships. The phylogenetic trees of segment 1 and segment 3 have similar topologies, whereas segment 2 and segment 4 formed two different topologies; the main differences involve Alongshan virus (ALSV), Takachi virus, Yanggou tick virus and Pteropus lylei jingmen virus (PLJV), suggesting the possibility of genetic reassortment among these viruses. Moreover, we detected recombination within JMTV and between PLJV and ALSV. The genetic reassortment and recombination that occur during cross-species transmission of these JMVs not only complicate their evolutionary relationships but also raise the risk these viruses pose to humans.
Affiliation(s)
- Weiyi Li
- School of Public Health, the key Laboratory of Environmental Pollution Monitoring and Disease Control, Ministry of Education, Guizhou Medical University, Guiyang 561113, China; Key Laboratory of Modern Pathogen Biology and Characteristics, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China
- Rongting Li
- School of Public Health, the key Laboratory of Environmental Pollution Monitoring and Disease Control, Ministry of Education, Guizhou Medical University, Guiyang 561113, China; Key Laboratory of Modern Pathogen Biology and Characteristics, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China
- Xiaomin Tang
- Key Laboratory of Modern Pathogen Biology and Characteristics, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China; Department of Human Parasitology, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China
- Jinzhi Cheng
- Key Laboratory of Modern Pathogen Biology and Characteristics, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China; Department of Human Parasitology, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China
- Lin Zhan
- School of Public Health, the key Laboratory of Environmental Pollution Monitoring and Disease Control, Ministry of Education, Guizhou Medical University, Guiyang 561113, China; Central Laboratory, Guizhou Provincial People's Hospital, Guiyang, Guizhou 550002, China
- Zhengling Shang
- Department of Immunology, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China
- Jiahong Wu
- Key Laboratory of Modern Pathogen Biology and Characteristics, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China; Department of Human Parasitology, Basic Medical College, Guizhou Medical University, Guiyang, Guizhou 550025, China.
4
Yubero P, Lavin AA, Poyatos JF. The limitations of phenotype prediction in metabolism. PLoS Comput Biol 2023; 19:e1011631. [PMID: 37948461 PMCID: PMC10664875 DOI: 10.1371/journal.pcbi.1011631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2023] [Revised: 11/22/2023] [Accepted: 10/24/2023] [Indexed: 11/12/2023] Open
Abstract
Phenotype prediction is at the center of many questions in biology. Prediction is often achieved by determining statistical associations between genetic and phenotypic variation, ignoring the exact processes that cause the phenotype. Here, we present a framework based on genome-scale metabolic reconstructions to reveal the mechanisms behind the associations. We calculated a polygenic score (PGS) that identifies a set of enzymes as predictors of growth, the phenotype. This set arises from the synergy of the functional mode of metabolism in a particular setting and its evolutionary history, and is suitable to infer the phenotype across a variety of conditions. We also find that there is optimal genetic variation for predictability and demonstrate how the linear PGS can still explain phenotypes generated by the underlying nonlinear biochemistry. Therefore, the explicit model interprets the black box statistical associations of the genotype-to-phenotype map and helps to discover what limits the prediction in metabolism.
Affiliation(s)
- Pablo Yubero
- Logic of Genomic Systems Lab, CNB-CSIC, Madrid, Spain
5
Pál C, Papp B. How selection shapes the short- and long-term dynamics of molecular evolution. Proc Natl Acad Sci U S A 2023; 120:e2311012120. [PMID: 37531373 PMCID: PMC10433269 DOI: 10.1073/pnas.2311012120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/04/2023] Open
Affiliation(s)
- Csaba Pál
- Synthetic and System Biology Unit, Biological Research Centre, National Laboratory of Biotechnology, Eötvös Loránd Research Network, Szeged HU-6726, Hungary
- Balázs Papp
- Synthetic and System Biology Unit, Biological Research Centre, National Laboratory of Biotechnology, Eötvös Loránd Research Network, Szeged HU-6726, Hungary
- Hungarian Centre of Excellence for Molecular Medicine - Biological Research Centre Metabolic Systems Biology Research Group, Szeged HU-6726, Hungary
- National Laboratory for Health Security, Biological Research Centre, Eötvös Loránd Research Network, Szeged HU-6726, Hungary
6
Zhang J. What Has Genomics Taught An Evolutionary Biologist? Genomics Proteomics Bioinformatics 2023; 21:1-12. [PMID: 36720382 PMCID: PMC10373158 DOI: 10.1016/j.gpb.2023.01.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Revised: 01/06/2023] [Accepted: 01/19/2023] [Indexed: 01/30/2023]
Abstract
Genomics, an interdisciplinary field of biology on the structure, function, and evolution of genomes, has revolutionized many subdisciplines of life sciences, including my field of evolutionary biology, by supplying huge data, bringing high-throughput technologies, and offering a new approach to biology. In this review, I describe what I have learned from genomics and highlight the fundamental knowledge and mechanistic insights gained. I focus on three broad topics that are central to evolutionary biology and beyond (variation, interaction, and selection) and use primarily my own research and study subjects as examples. In the next decade or two, I expect that the most important contributions of genomics to evolutionary biology will be to provide genome sequences of nearly all known species on Earth, facilitate high-throughput phenotyping of natural variants and systematically constructed mutants for mapping genotype-phenotype-fitness landscapes, and assist the determination of causality in evolutionary processes using experimental evolution.
Affiliation(s)
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA.
7
Xiao L, Fan D, Qi H, Cong Y, Du Z. Defect-buffering cellular plasticity increases robustness of metazoan embryogenesis. Cell Syst 2022; 13:615-630.e9. [PMID: 35882226 DOI: 10.1016/j.cels.2022.07.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/10/2021] [Revised: 04/14/2022] [Accepted: 06/30/2022] [Indexed: 01/26/2023]
Abstract
Developmental processes are intrinsically robust so as to preserve a normal-like state in response to genetic and environmental fluctuations. However, the robustness and potential phenotypic plasticity of individual developing cells under genetic perturbations remain to be systematically evaluated. Using large-scale gene perturbation, live imaging, lineage tracing, and single-cell phenomics, we quantified the phenotypic landscape of C. elegans embryogenesis in >2,000 embryos following individual knockdown of over 750 conserved genes. We observed that cellular genetic systems are not sufficiently robust to single-gene perturbations across all cells; rather, gene knockdowns frequently induced cellular defects. Dynamic phenotypic analyses revealed many cellular defects to be transient, with cells exhibiting phenotypic plasticity that serves to alleviate, correct, and accommodate the defects. Moreover, potential developmentally related cell modules may buffer the phenotypic effects of individual cell position changes. Our findings reveal non-negligible contributions of cellular plasticity and multicellularity as compensatory strategies to increase developmental robustness.
Affiliation(s)
- Long Xiao
- State Key Laboratory of Molecular Developmental Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Duchangjiang Fan
- State Key Laboratory of Molecular Developmental Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Huan Qi
- State Key Laboratory of Molecular Developmental Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
- Yulin Cong
- State Key Laboratory of Molecular Developmental Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Zhuo Du
- State Key Laboratory of Molecular Developmental Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China.
8
Secretory quality control constrains functional selection-associated protein structure innovation. Commun Biol 2022; 5:268. [PMID: 35338247 PMCID: PMC8956723 DOI: 10.1038/s42003-022-03220-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/14/2021] [Accepted: 03/03/2022] [Indexed: 12/26/2022] Open
Abstract
Biophysical models suggest a dominant role of structural over functional constraints in shaping protein evolution. Selection on structural constraints is linked closely to expression levels of proteins, which together with structure-associated activities determine in vivo functions of proteins. Here we show that despite the up to two orders of magnitude differences in levels of C-reactive protein (CRP) in distinct species, the in vivo functions of CRP are paradoxically conserved. Such a pronounced level-function mismatch cannot be explained by activities associated with the conserved native structure, but is coupled to hidden activities associated with the unfolded, activated conformation. This is not the result of selection on structural constraints like foldability and stability, but is achieved by folding determinants-mediated functional selection that keeps a confined carrier structure to pass the stringent eukaryotic quality control on secretion. Further analysis suggests a folding threshold model which may partly explain the mismatch between the vast sequence space and the limited structure space of proteins. The mismatch in the conserved structure but different expression levels of C-reactive protein (CRP) in distinct species is reconciled by functional selection on hidden activities of unfolded CRPs.
9
Palazzo AF, Kejiou NS. Non-Darwinian Molecular Biology. Front Genet 2022; 13:831068. [PMID: 35251134 PMCID: PMC8888898 DOI: 10.3389/fgene.2022.831068] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/07/2021] [Accepted: 01/24/2022] [Indexed: 12/14/2022] Open
Abstract
With the discovery of the double helical structure of DNA, a shift occurred in how biologists investigated questions surrounding cellular processes, such as protein synthesis. Instead of viewing biological activity through the lens of chemical reactions, this new field used biological information to gain a new profound view of how biological systems work. Molecular biologists asked new types of questions that would have been inconceivable to the older generation of researchers, such as how cellular machineries convert inherited biological information into functional molecules like proteins. This new focus on biological information also gave molecular biologists a way to link their findings to concepts developed by genetics and the modern synthesis. However, by the late 1960s this all changed. Elevated rates of mutation, unsustainable genetic loads, and high levels of variation in populations challenged Darwinian evolution, a central tenet of the modern synthesis, where adaptation was the main driver of evolutionary change. Building on these findings, Motoo Kimura advanced the neutral theory of molecular evolution, which advocates that selection in multicellular eukaryotes is weak and that most genomic changes are neutral and due to random drift. This was further elaborated by Jack King and Thomas Jukes, in their paper “Non-Darwinian Evolution”, where they pointed out that the observed changes seen in proteins and the types of polymorphisms observed in populations only become understandable when we take into account biochemistry and Kimura’s new theory. Fifty years later, most molecular biologists remain unaware of these fundamental advances. Their adaptationist viewpoint fails to explain data collected from new powerful technologies which can detect exceedingly rare biochemical events. For example, high throughput sequencing routinely detects RNA transcripts that are produced from almost the entire genome yet are present at less than one copy per thousand cells and appear to lack any function. Molecular biologists must now reincorporate ideas from classical biochemistry and absorb modern concepts from molecular evolution, to craft a new lens through which they can evaluate the functionality of transcriptional units, and make sense of our messy, intricate, and complicated genome.
10
Huang R, Xie X, Chen A, Li F, Tian E, Chao Z. The chloroplast genomes of four Bupleurum (Apiaceae) species endemic to Southwestern China, a diversity center of the genus, as well as their evolutionary implications and phylogenetic inferences. BMC Genomics 2021; 22:714. [PMID: 34600494 PMCID: PMC8487540 DOI: 10.1186/s12864-021-08008-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 10/25/2020] [Accepted: 09/13/2021] [Indexed: 11/28/2022] Open
Abstract
Background As one of the largest genera in Apiaceae, Bupleurum L. is well known for its high medicinal value. The genus has frequently attracted the attention of evolutionary biologists and taxonomists for its distinctive characteristics in the Apiaceae family. Although some chloroplast genome data are now available, the changes in the structure of chloroplast genomes and selective pressure in the genus have not been fully understood. In addition, few of the species are endemic to Southwest China, a distribution and diversity center of Chinese Bupleurum. Endemic species are key components of biodiversity and ecosystems, and investigation of the chloroplast genome features of endemic species in Bupleurum will be helpful to develop a better understanding of the evolutionary process and phylogeny of the genus. In this study, we analyzed the sequences of whole chloroplast genomes of 4 Southwest China endemic Bupleurum species in comparison with the published data of 17 Bupleurum species to determine the evolutionary characteristics of the genus and the phylogenetic relationships of Asian Bupleurum. Results The complete chloroplast genome sequences of the 4 endemic Bupleurum species are 155,025 bp to 155,323 bp in length, including an SSC and an LSC region separated by a pair of IRs. Comparative analysis revealed an identical chloroplast gene content across the 21 Bupleurum species, including a total of 114 unique genes (30 tRNA genes, 4 rRNA genes and 80 protein-coding genes). Chloroplast genomes of the 21 Bupleurum species showed no rearrangements and a high sequence identity (96.4–99.2%). They also shared a similar tendency of SDRs and SSRs, but differed in number (59–83). In spite of their high conservation, they contained some mutational hotspots, which can be potentially exploited as high-resolution DNA barcodes for species discrimination. Selective pressure analysis showed that four genes were under positive selection. Phylogenetic analysis revealed that the 21 Bupleurum species formed two major clades, which are likely to correspond to their geographical distribution. Conclusions The chloroplast genome data of the four endemic Bupleurum species provide important insights into the characteristics and evolution of chloroplast genomes of this genus, and the phylogeny of Bupleurum. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-08008-z.
Affiliation(s)
- Rong Huang
- Department of Pharmacy, Zhujiang Hospital, Southern Medical University, Guangzhou, 510282, China
- Xuena Xie
- Department of Pharmacy, Zhujiang Hospital, Southern Medical University, Guangzhou, 510282, China
- Aimin Chen
- Department of Pharmacy, Zhujiang Hospital, Southern Medical University, Guangzhou, 510282, China
- Fang Li
- Department of Pharmacy, Zhujiang Hospital, Southern Medical University, Guangzhou, 510282, China
- Enwei Tian
- Department of Pharmacy, Zhujiang Hospital, Southern Medical University, Guangzhou, 510282, China
- Zhi Chao
- Department of Pharmacy, Zhujiang Hospital, Southern Medical University, Guangzhou, 510282, China; Faculty of Medicinal Plants and Pharmacognosy, School of Traditional Chinese Medicine, Southern Medical University, Guangzhou, 510515, China; Guangdong Provincial Key Laboratory of Chinese Medicine Pharmaceutics, Guangzhou, 510515, China.
11
Nassar R, Dignon GL, Razban RM, Dill KA. The Protein Folding Problem: The Role of Theory. J Mol Biol 2021; 433:167126. [PMID: 34224747 PMCID: PMC8547331 DOI: 10.1016/j.jmb.2021.167126] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 04/30/2021] [Revised: 06/21/2021] [Accepted: 06/26/2021] [Indexed: 10/20/2022]
Abstract
The protein folding problem was first articulated as a question of how order arose from disorder in proteins: How did the various native structures of proteins arise from interatomic driving forces encoded within their amino acid sequences, and how did they fold so fast? These matters have now been largely resolved by theory and statistical mechanics combined with experiments. There are general principles. Chain randomness is overcome by solvation-based codes. And in the needle-in-a-haystack metaphor, native states are found efficiently because protein haystacks (conformational ensembles) are funnel-shaped. Order-disorder theory has now grown to encompass a large swath of protein physical science across biology.
Affiliation(s)
- Roy Nassar
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA; Department of Chemistry, Stony Brook University, Stony Brook, NY, USA
- Gregory L Dignon
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA
- Rostam M Razban
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA
- Ken A Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA; Department of Chemistry, Stony Brook University, Stony Brook, NY, USA; Department of Physics and Astronomy, Stony Brook University, Stony Brook, NY, USA.
12
Razban RM, Dasmeh P, Serohijos AWR, Shakhnovich EI. Avoidance of protein unfolding constrains protein stability in long-term evolution. Biophys J 2021; 120:2413-2424. [PMID: 33932438 PMCID: PMC8390877 DOI: 10.1016/j.bpj.2021.03.042] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/16/2020] [Revised: 02/24/2021] [Accepted: 03/17/2021] [Indexed: 11/28/2022] Open
Abstract
Every amino acid residue can influence a protein's overall stability, making stability highly susceptible to change throughout evolution. We consider the distribution of protein stabilities evolutionarily permittable under two previously reported protein fitness functions: flux dynamics and misfolding avoidance. We develop an evolutionary dynamics theory and find that it agrees better with an extensive protein stability data set for dihydrofolate reductase orthologs under the misfolding avoidance fitness function rather than the flux dynamics fitness function. Further investigation with ribonuclease H data demonstrates that not any misfolded state is avoided; rather, it is only the unfolded state. At the end, we discuss how our work pertains to the universal protein abundance-evolutionary rate correlation seen across organisms' proteomes. We derive a closed-form expression relating protein abundance to evolutionary rate that captures Escherichia coli, Saccharomyces cerevisiae, and Homo sapiens experimental trends without fitted parameters.
Affiliation(s)
- Rostam M Razban
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts
- Pouria Dasmeh
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts; Departement de Biochimie, Université de Montréal, Montreal, Quebec, Canada
- Eugene I Shakhnovich
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts.
13
Gao X, Zhang X, Meng H, Li J, Zhang D, Liu C. Comparative chloroplast genomes of Paris Sect. Marmorata: insights into repeat regions and evolutionary implications. BMC Genomics 2018; 19:878. [PMID: 30598104 PMCID: PMC6311911 DOI: 10.1186/s12864-018-5281-x] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Indexed: 11/23/2022] Open
Abstract
Background Species of Paris Sect. Marmorata are valuable medicinal plants that synthesize steroidal saponins with pharmacological effects. However, the wild resources of the species are threatened by plundering exploitation before molecular genetics studies have uncovered their genomes and evolutionary significance. Thus, the availability of complete chloroplast genome sequences of Sect. Marmorata is necessary and crucial for understanding the plastome evolution of this section and facilitating future population genetics studies. Here, we determined chloroplast genomes of Sect. Marmorata, and conducted a whole chloroplast genome comparison. Results This study presented detailed sequences and structural variations of chloroplast genomes of Sect. Marmorata. Over 40 large repeats and approximately 130 simple sequence repeats as well as a group of genomic hotspots were detected. Inverted repeat contraction of this section was inferred by comparing the chloroplast genomes with that of P. verticillata. Additionally, almost all the plastid protein coding genes were found to prefer codons ending with A/U. Mutation bias and selection pressure predominantly shaped the codon bias of most genes. Most of the genes underwent purifying selection, whereas photosynthetic genes experienced a relatively relaxed purifying selection. Conclusions Repeat sequences and hotspot regions can be scanned to detect intraspecific and interspecific variability, and selected to infer the phylogenetic relationships of Sect. Marmorata and other species in subgenus Daiswa. Mutation and natural selection were the main forces driving the codon bias pattern of most plastid protein coding genes. Therefore, this study enhances the understanding of the evolution of Sect. Marmorata from the chloroplast genome, and provides genomic insights into genetic analyses of Sect. Marmorata.
Electronic supplementary material The online version of this article (10.1186/s12864-018-5281-x) contains supplementary material, which is available to authorized users.
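The observation above that plastid protein-coding genes prefer codons ending in A/U can be quantified as the fraction of third codon positions occupied by A or U (T in DNA). The following is a minimal sketch under stated assumptions: the function name `third_position_au_fraction` is hypothetical, the example sequence is made up for illustration, and the abstract does not say which statistic the authors actually used.

```python
def third_position_au_fraction(cds):
    """Fraction of codons in a coding sequence whose third base is A or U/T."""
    seq = cds.upper().replace("U", "T")  # accept RNA or DNA notation
    if not seq or len(seq) % 3 != 0:
        raise ValueError("coding sequence must be non-empty with length divisible by 3")
    third_bases = seq[2::3]  # every third base, starting at codon position 3
    return sum(base in "AT" for base in third_bases) / len(third_bases)


# Toy example (not real plastid data): codons ATG, GCT, AAA, GGC.
print(third_position_au_fraction("ATGGCTAAAGGC"))
```

A value well above 0.5 across most genes would be in line with the reported A/U-ending preference, although interpreting it properly requires comparing against background nucleotide composition.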
Affiliation(s)
- Xiaoyang Gao
- CAS Key Laboratory of Tropical Plant Resources and Sustainable Use, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Science, Menglun, 666303, Yunnan, China
- Xuan Zhang
- CAS Key Laboratory of Tropical Plant Resources and Sustainable Use, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Science, Menglun, 666303, Yunnan, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Honghu Meng
- Center for Integrative Conservation, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Kunming, 650223, Yunnan, China
| | - Jing Li
- CAS Key Laboratory of Tropical Plant Resources and Sustainable Use, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Science, Menglun, 666303, Yunnan, China.,University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Di Zhang
- CAS Key Laboratory of Tropical Plant Resources and Sustainable Use, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Science, Menglun, 666303, Yunnan, China.,University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Changning Liu
- CAS Key Laboratory of Tropical Plant Resources and Sustainable Use, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Science, Menglun, 666303, Yunnan, China.
| |
Collapse
|
14
|
Duan C, Huan Q, Chen X, Wu S, Carey LB, He X, Qian W. Reduced intrinsic DNA curvature leads to increased mutation rate. Genome Biol 2018; 19:132. [PMID: 30217230 PMCID: PMC6138893 DOI: 10.1186/s13059-018-1525-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 04/12/2018] [Accepted: 09/05/2018] [Indexed: 01/24/2023]
Abstract
BACKGROUND Mutation rates vary across the genome. Many trans factors that influence mutation rates have been identified, as have specific sequence motifs at the 1-7-bp scale, but cis elements remain poorly characterized. The lack of understanding regarding why different sequences have different mutation rates hampers our ability to identify positive selection in evolution and to identify driver mutations in tumorigenesis. RESULTS Here, we use a combination of synthetic genes and sequences of thousands of isolated yeast colonies to show that intrinsic DNA curvature is a major cis determinant of mutation rate. Mutation rate negatively correlates with DNA curvature within genes, and a 10% decrease in curvature results in a 70% increase in mutation rate. Consistently, both yeast and humans accumulate mutations in regions with small curvature. We further show that this effect is due to differences in the intrinsic mutation rate, likely due to differences in mutagen sensitivity and not due to differences in the local activity of DNA repair. CONCLUSIONS Our study establishes a framework for understanding the cis properties of DNA sequence in modulating the local mutation rate and identifies a novel causal source of non-uniform mutation rates across the genome.
Affiliation(s)
- Chaorui Duan
  - State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
  - Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Qing Huan
  - State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
  - Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
- Xiaoshu Chen
  - Human Genome Research Institute and Department of Medical Genetics, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China
- Shaohuan Wu
  - State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
  - Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Lucas B Carey
  - Department of Experimental and Health Sciences, Universitat Pompeu Fabra, 08003, Barcelona, Spain
- Xionglei He
  - State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, China
- Wenfeng Qian
  - State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
  - Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
15
Alvarez-Ponce D, Feyertag F, Chakraborty S. Position Matters: Network Centrality Considerably Impacts Rates of Protein Evolution in the Human Protein-Protein Interaction Network. Genome Biol Evol 2018; 9:1742-1756. [PMID: 28854629 PMCID: PMC5570066 DOI: 10.1093/gbe/evx117] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Accepted: 07/01/2017] [Indexed: 02/06/2023]
Abstract
The proteins of any organism evolve at disparate rates. A long list of factors affecting rates of protein evolution have been identified. However, the relative importance of each factor in determining rates of protein evolution remains unresolved. The prevailing view is that evolutionary rates are dominantly determined by gene expression, and that other factors such as network centrality have only a marginal effect, if any. However, this view is largely based on analyses in yeasts, and accurately measuring the importance of the determinants of rates of protein evolution is complicated by the fact that the different factors are often correlated with each other, and by the relatively poor quality of available functional genomics data sets. Here, we use correlation, partial correlation and principal component regression analyses to measure the contributions of several factors to the variability of the rates of evolution of human proteins. For this purpose, we analyzed the entire human protein–protein interaction data set and the human signal transduction network—a network data set of exceptionally high quality, obtained by manual curation, which is expected to be virtually free from false positives. In contrast with the prevailing view, we observe that network centrality (measured as the number of physical and nonphysical interactions, betweenness, and closeness) has a considerable impact on rates of protein evolution. Surprisingly, the impact of centrality on rates of protein evolution seems to be comparable, or even superior according to some analyses, to that of gene expression. Our observations seem to be independent of potentially confounding factors and from the limitations (biases and errors) of interactomic data sets.
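Partial correlation, one of the analyses named in this abstract, asks how strongly two variables are associated once a third, possibly confounding, variable is held fixed. The following is a minimal sketch of the first-order case, assuming Pearson correlation for simplicity (the abstract does not say which coefficient was used). The function names and the simulated data are hypothetical.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """Correlation of x and y after controlling for a single variable z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Simulated example: a shared factor drives both variables, so they
# correlate with each other only through that factor.
rng = random.Random(0)
shared = [rng.gauss(0, 1) for _ in range(5000)]
var_a = [s + rng.gauss(0, 1) for s in shared]
var_b = [s + rng.gauss(0, 1) for s in shared]

print(f"raw correlation:     {pearson(var_a, var_b):.3f}")
print(f"partial correlation: {partial_corr(var_a, var_b, shared):.3f}")
```

In this simulated case the raw correlation is substantial while the partial correlation is close to zero, which is the pattern that would indicate a purely confounded association. The abstract reports the opposite outcome for centrality: its effect persists after controlling for other factors.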
16
Kim H, Kim YM. Pan-cancer analysis of somatic mutations and transcriptomes reveals common functional gene clusters shared by multiple cancer types. Sci Rep 2018; 8:6041. [PMID: 29662161 PMCID: PMC5902616 DOI: 10.1038/s41598-018-24379-y] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 10/19/2017] [Accepted: 04/03/2018] [Indexed: 12/28/2022]
Abstract
To discover functional gene clusters across cancers, we performed a systematic pan-cancer analysis of 33 cancer types. We identified genes that were associated with somatic mutations and were the cores of a co-expression network. We found that multiple cancer types have relatively exclusive hub genes individually; however, the hub genes cooperate with each other based on their functional relationship. When we built a protein-protein interaction network of hub genes and found nine functional gene clusters across cancer types, the gene clusters divided not only the region of the network map, but also the function of the network by their distinct roles related to the development and progression of cancer. This functional relationship between the clusters and cancers was underpinned by the high expression of module genes and enrichment of programmed cell death, and known candidate cancer genes. In addition to protein-coding hub genes, non-coding hub genes had a possible relationship with cancer. Overall, our approach of investigating cancer genes enabled finding pan-cancer hub genes and common functional gene clusters shared by multiple cancer types based on the expression status of the primary tumour and the functional relationship of genes in the biological network.
Affiliation(s)
- Hyeongmin Kim
  - Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, 34141, Korea
- Yong-Min Kim
  - Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, 34141, Korea
17
Gumi AM, Guha PK, Mazumder A, Jayaswal P, Mondal TK. Characterization of OglDREB2A gene from African rice (Oryza glaberrima), comparative analysis and its transcriptional regulation under salinity stress. 3 Biotech 2018; 8:91. [PMID: 29430353 PMCID: PMC5796934 DOI: 10.1007/s13205-018-1098-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 07/29/2017] [Accepted: 01/05/2018] [Indexed: 01/17/2023]
Abstract
In this study, an AP2 DNA-binding domain-containing transcription factor, OglDREB2A, was cloned from African rice (Oryza glaberrima) and compared with 3000 rice genotypes. Phylogenetic and various structural analyses were performed using in silico approaches. To understand its allelic variation in rice, SNPs and indels were detected among the 3000 rice genotypes, which indicated that while the coding region is highly conserved, noncoding regions such as the UTRs and introns contained most of the variation. Phylogenetic analysis of the OglDREB2A sequence in different Oryza as well as in diverse eudicot species revealed that DREB genes from various Oryza species diverged much earlier than other genes. Further, structural features and in silico analyses provided insights into different properties of the OglDREB2A protein. The neutrality test on the coding region of OglDREB2A from different genotypes of O. glaberrima showed a lack of selection in this gene. Among the different developmental stages, it was upregulated at tillering and flag leaf under salinity treatment, indicating its positive role in seedling and reproductive stage tolerance. Real-time PCR analysis also indicated a conserved expression pattern of this gene under salinity stress across three different Oryza species having different degrees of salinity tolerance.
Affiliation(s)
- Abubakar Mohammad Gumi
  - ICAR-National Bureau of Plant Genetic Resources, IARI Campus, Pusa, New Delhi, 110012 India
  - Present Address: Department of Biological Sciences, Usmanu Danfodiyo University, Sokoto, Nigeria
- Pritam Kanti Guha
  - ICAR-National Bureau of Plant Genetic Resources, IARI Campus, Pusa, New Delhi, 110012 India
  - ICAR-National Research Centre on Plant Biotechnology, LBS Building, IARI, New Delhi, 110012 India
- Abhishek Mazumder
  - ICAR-National Research Centre on Plant Biotechnology, LBS Building, IARI, New Delhi, 110012 India
- Pawan Jayaswal
  - ICAR-National Research Centre on Plant Biotechnology, LBS Building, IARI, New Delhi, 110012 India
- Tapan Kumar Mondal
  - ICAR-National Bureau of Plant Genetic Resources, IARI Campus, Pusa, New Delhi, 110012 India
  - ICAR-National Research Centre on Plant Biotechnology, LBS Building, IARI, New Delhi, 110012 India
  - Present Address: Department of Biological Sciences, Usmanu Danfodiyo University, Sokoto, Nigeria
18
Ho WC, Zhang J. Evolutionary adaptations to new environments generally reverse plastic phenotypic changes. Nat Commun 2018; 9:350. [PMID: 29367589 PMCID: PMC5783951 DOI: 10.1038/s41467-017-02724-5] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 06/29/2017] [Accepted: 12/20/2017] [Indexed: 11/25/2022]
Abstract
Organismal adaptation to a new environment may start with plastic phenotypic changes followed by genetic changes, but whether the plastic changes are stepping stones to genetic adaptation is debated. Here we address this question by investigating gene expression and metabolic flux changes in the two-phase adaptation process using transcriptomic data from multiple experimental evolution studies and computational metabolic network analysis, respectively. We discover that genetic changes more frequently reverse than reinforce plastic phenotypic changes in virtually every adaptation. Metabolic network analysis reveals that, even in the presence of plasticity, organismal fitness drops after environmental shifts, but largely recovers through subsequent evolution. Such fitness trajectories explain why plastic phenotypic changes are genetically compensated rather than strengthened. In conclusion, although phenotypic plasticity may serve as an emergency response to a new environment that is necessary for survival, it does not generally facilitate genetic adaptation by bringing the organismal phenotype closer to the new optimum. Phenotypic plasticity has been suggested to facilitate survival in new environments and subsequent adaptation. Here, the authors reanalyze transcriptomic data from experimental evolution studies in combination with computational metabolic network analysis and show that genetic adaptation tends to reverse plastic changes in order to recover fitness.
Affiliation(s)
- Wei-Chin Ho
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, 48109, USA
  - Center for Mechanisms of Evolution, The Biodesign Institute, Arizona State University, Tempe, AZ, 85287, USA
- Jianzhi Zhang
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, 48109, USA
19
Kachroo AH, Laurent JM, Akhmetov A, Szilagyi-Jones M, McWhite CD, Zhao A, Marcotte EM. Systematic bacterialization of yeast genes identifies a near-universally swappable pathway. eLife 2017; 6:e25093. [PMID: 28661399 PMCID: PMC5536947 DOI: 10.7554/elife.25093] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 01/12/2017] [Accepted: 06/26/2017] [Indexed: 11/13/2022]
Abstract
Eukaryotes and prokaryotes last shared a common ancestor ~2 billion years ago, and while many present-day genes in these lineages predate this divergence, the extent to which these genes still perform their ancestral functions is largely unknown. To test principles governing retention of ancient function, we asked if prokaryotic genes could replace their essential eukaryotic orthologs. We systematically replaced essential genes in yeast by their 1:1 orthologs from Escherichia coli. After accounting for mitochondrial localization and alternative start codons, 31 out of 51 bacterial genes tested (61%) could complement a lethal growth defect and replace their yeast orthologs with minimal effects on growth rate. Replaceability was determined on a pathway-by-pathway basis; codon usage, abundance, and sequence similarity contributed predictive power. The heme biosynthesis pathway was particularly amenable to inter-kingdom exchange, with each yeast enzyme replaceable by its bacterial, human, or plant ortholog, suggesting it as a near-universally swappable pathway.
Affiliation(s)
- Aashiq H Kachroo
  - Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, United States
- Jon M Laurent
  - Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, United States
- Azat Akhmetov
  - Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, United States
- Madelyn Szilagyi-Jones
  - Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, United States
- Claire D McWhite
  - Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, United States
- Alice Zhao
  - Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, United States
- Edward M Marcotte
  - Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, United States
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, United States
20
Abstract
Acinetobacter baumannii is a clinical threat to human health, causing major infection outbreaks worldwide. As new drugs against Gram-negative bacteria do not seem to be forthcoming, and due to the microbial capability of acquiring multi-resistance, there is an urgent need for novel therapeutic targets. Here we have derived a list of new potential targets by means of metabolic reconstruction and modelling of A. baumannii ATCC 19606. By integrating constraint-based modelling with gene expression data, we simulated microbial growth in normal and stressful conditions (i.e. following antibiotic exposure). This allowed us to describe the metabolic reprogramming that occurs in this bacterium when treated with colistin (the currently adopted last-line treatment) and identify a set of genes that are primary targets for developing new drugs against A. baumannii, including colistin-resistant strains. It can be anticipated that the metabolic model presented herein will represent a solid and reliable resource for the future treatment of A. baumannii infections.
21
Oral Biosciences: The annual review 2016. J Oral Biosci 2017. [DOI: 10.1016/j.job.2016.12.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
22
Jacobs C, Lambourne L, Xia Y, Segrè D. Upon Accounting for the Impact of Isoenzyme Loss, Gene Deletion Costs Anticorrelate with Their Evolutionary Rates. PLoS One 2017; 12:e0170164. [PMID: 28107392 PMCID: PMC5249160 DOI: 10.1371/journal.pone.0170164] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/25/2016] [Accepted: 12/30/2016] [Indexed: 12/19/2022]
Abstract
System-level metabolic network models enable the computation of growth and metabolic phenotypes from an organism's genome. In particular, flux balance approaches have been used to estimate the contribution of individual metabolic genes to organismal fitness, offering the opportunity to test whether such contributions carry information about the evolutionary pressure on the corresponding genes. Previous failure to identify the expected negative correlation between such computed gene-loss cost and sequence-derived evolutionary rates in Saccharomyces cerevisiae has been ascribed to a real biological gap between a gene's fitness contribution to an organism "here and now" and the same gene's historical importance as evidenced by its accumulated mutations over millions of years of evolution. Here we show that this negative correlation does exist, and can be exposed by revisiting a broadly employed assumption of flux balance models. In particular, we introduce a new metric that we call "function-loss cost", which estimates the cost of a gene loss event as the total potential functional impairment caused by that loss. This new metric displays significant negative correlation with evolutionary rate, across several thousand minimal environments. We demonstrate that the improvement gained using function-loss cost over gene-loss cost is explained by replacing the base assumption that isoenzymes provide unlimited capacity for backup with the assumption that isoenzymes are completely non-redundant. We further show that this change of the assumption regarding isoenzymes increases the recall of epistatic interactions predicted by the flux balance model at the cost of a reduction in the precision of the predictions. In addition to suggesting that the gene-to-reaction mapping in genome-scale flux balance models should be used with caution, our analysis provides new evidence that evolutionary gene importance captures much more than strict essentiality.
Affiliation(s)
- Christopher Jacobs
  - Bioinformatics Program, Boston University, Boston, Massachusetts, United States of America
- Luke Lambourne
  - Bioinformatics Program, Boston University, Boston, Massachusetts, United States of America
  - Department of Bioengineering, Faculty of Engineering, McGill University, Montreal, Quebec, Canada
- Yu Xia
  - Bioinformatics Program, Boston University, Boston, Massachusetts, United States of America
  - Department of Bioengineering, Faculty of Engineering, McGill University, Montreal, Quebec, Canada
- Daniel Segrè
  - Bioinformatics Program, Boston University, Boston, Massachusetts, United States of America
  - Department of Biology, Boston University, Boston, Massachusetts, United States of America
  - Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
23
Cohen O, Oberhardt M, Yizhak K, Ruppin E. Essential Genes Embody Increased Mutational Robustness to Compensate for the Lack of Backup Genetic Redundancy. PLoS One 2016; 11:e0168444. [PMID: 27997585 PMCID: PMC5173180 DOI: 10.1371/journal.pone.0168444] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/07/2016] [Accepted: 12/01/2016] [Indexed: 11/23/2022]
Abstract
Genetic robustness is a hallmark of cells, occurring through many mechanisms and at many levels. Essential genes lack the common robustness mechanism of genetic redundancy (i.e., existing alongside other genes with the same function), and thus appear at first glance to leave cells highly vulnerable to genetic or environmental perturbations. Here we explore a hypothesis that cells might protect against essential gene loss through mechanisms that occur at various cellular levels aside from the level of the gene. Using Escherichia coli and Saccharomyces cerevisiae as models, we find that essential genes are enriched over non-essential genes for properties we call "coding efficiency" and "coding robustness", denoting respectively a gene's efficiency of translation and robustness to non-synonymous mutations. The coding efficiency levels of essential genes are highly positively correlated with their evolutionary conservation levels, suggesting that this feature plays a key role in protecting conserved, evolutionarily important genes. We then extend our hypothesis into the realm of metabolic networks, showing that essential metabolic reactions are encoded by more "robust" genes than non-essential reactions, and that essential metabolites are produced by more reactions than non-essential metabolites. Taken together, these results testify that robustness at the gene-loss level and at the mutation level (and more generally, at two cellular levels that are usually treated separately) are not decoupled, but rather, that cellular vulnerability exposed due to complete gene loss is compensated by increased mutational robustness. Why some genes are backed up primarily against loss and others against mutations still remains an open question.
Affiliation(s)
- Osher Cohen
  - School of Computer Sciences and Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Matthew Oberhardt
  - Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD, United States of America
- Keren Yizhak
  - School of Computer Sciences and Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Eytan Ruppin
  - School of Computer Sciences and Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD, United States of America
24
Dentin sialophosphoprotein is a potentially latent bioactive protein in dentin. J Oral Biosci 2016; 58:134-142. [DOI: 10.1016/j.job.2016.08.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/01/2016] [Accepted: 08/01/2016] [Indexed: 11/18/2022]
25
Alvarez-Ponce D, Sabater-Muñoz B, Toft C, Ruiz-González MX, Fares MA. Essentiality Is a Strong Determinant of Protein Rates of Evolution during Mutation Accumulation Experiments in Escherichia coli. Genome Biol Evol 2016; 8:2914-2927. [PMID: 27566759 PMCID: PMC5630975 DOI: 10.1093/gbe/evw205] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Indexed: 12/16/2022]
Abstract
The Neutral Theory of Molecular Evolution is considered the most powerful theory to understand the evolutionary behavior of proteins. One of the main predictions of this theory is that essential proteins should evolve slower than dispensable ones owing to increased selective constraints. Comparison of genomes of different species, however, has revealed only small differences between the rates of evolution of essential and nonessential proteins. In some analyses, these differences vanish once confounding factors are controlled for, whereas in other cases essentiality seems to have an independent, albeit small, effect. It has been argued that comparing relatively distant genomes may entail a number of limitations. For instance, many of the genes that are dispensable in controlled lab conditions may be essential in some of the conditions faced in nature. Moreover, essentiality can change during evolution, and rates of protein evolution are simultaneously shaped by a variety of factors, whose individual effects are difficult to isolate. Here, we conducted two parallel mutation accumulation experiments in Escherichia coli, during 5,500–5,750 generations, and compared the genomes at different points of the experiments. Our approach (a short-term experiment, under highly controlled conditions) enabled us to overcome many of the limitations of previous studies. We observed that essential proteins evolved substantially slower than nonessential ones during our experiments. Strikingly, rates of protein evolution were only moderately affected by expression level and protein length.
Affiliation(s)
- Beatriz Sabater-Muñoz
  - Instituto de Biología Molecular y Celular de Plantas (CSIC-UPV), Valencia, Spain
  - Department of Genetics, Smurfit Institute of Genetics, University of Dublin, Trinity College Dublin, Dublin, Ireland
- Christina Toft
  - Department of Genetics, University of Valencia, Valencia, Spain
  - Departamento de Biotecnología, Instituto de Agroquímica y Tecnología de los Alimentos (CSIC), Valencia, Spain
- Mario X Ruiz-González
  - Instituto de Biología Molecular y Celular de Plantas (CSIC-UPV), Valencia, Spain
  - Current Address: Secretaría de Educación Superior, Ciencia, Tecnología e Innovación, Proyecto Prometeo; Departamento de Ciencias Biológicas, Universidad Técnica Particular de Loja, Loja, Ecuador
- Mario A Fares
  - Instituto de Biología Molecular y Celular de Plantas (CSIC-UPV), Valencia, Spain
  - Department of Genetics, Smurfit Institute of Genetics, University of Dublin, Trinity College Dublin, Dublin, Ireland
26
Zhang XF, Ou-Yang L, Dai DQ, Wu MY, Zhu Y, Yan H. Comparative analysis of housekeeping and tissue-specific driver nodes in human protein interaction networks. BMC Bioinformatics 2016; 17:358. [PMID: 27612563 PMCID: PMC5016887 DOI: 10.1186/s12859-016-1233-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 11/14/2015] [Accepted: 08/31/2016] [Indexed: 12/31/2022]
Abstract
Background Several recent studies have used the Minimum Dominating Set (MDS) model to identify driver nodes, which provide the control of the underlying networks, in protein interaction networks. There may exist multiple MDS configurations in a given network, thus it is difficult to determine which one represents the real set of driver nodes. Because these previous studies only focus on static networks and ignore the contextual information on particular tissues, their findings could be insufficient or even be misleading. Results In this study, we develop a Collective-Influence-corrected Minimum Dominating Set (CI-MDS) model which takes into account the collective influence of proteins. By integrating molecular expression profiles and static protein interactions, 16 tissue-specific networks are established as well. We then apply the CI-MDS model to each tissue-specific network to detect MDS proteins. It generates almost the same MDSs when it is solved using different optimization algorithms. In addition, we classify MDS proteins into Tissue-Specific MDS (TS-MDS) proteins and HouseKeeping MDS (HK-MDS) proteins based on the number of tissues in which they are expressed and identified as MDS proteins. Notably, we find that TS-MDS proteins and HK-MDS proteins have significantly different topological and functional properties. HK-MDS proteins are more central in protein interaction networks, associated with more functions, evolving more slowly and subjected to a greater number of post-translational modifications than TS-MDS proteins. Unlike TS-MDS proteins, HK-MDS proteins significantly correspond to essential genes, ageing genes, virus-targeted proteins, transcription factors and protein kinases. Moreover, we find that besides HK-MDS proteins, many TS-MDS proteins are also linked to disease related genes, suggesting the tissue specificity of human diseases. 
Furthermore, functional enrichment analysis reveals that HK-MDS proteins carry out universally necessary biological processes and TS-MDS proteins are usually involved in tissue-dependent functions. Conclusions Our study uncovers key features of TS-MDS proteins and HK-MDS proteins, and is a step forward towards a better understanding of the controllability of human interactomes. Electronic supplementary material The online version of this article (doi:10.1186/s12859-016-1233-0) contains supplementary material, which is available to authorized users.
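A dominating set, the concept underlying the MDS model in this abstract, is a set of nodes such that every node in the network is either in the set or adjacent to a member of it. The sketch below is not the CI-MDS model from the paper. It is a simple greedy approximation, shown only to make the definition concrete, and the toy network and node names are hypothetical.

```python
def greedy_dominating_set(adjacency):
    """Greedy approximation of a small dominating set.

    `adjacency` maps each node to the set of its neighbours (undirected).
    At each step the node that covers the most still-uncovered nodes is chosen.
    """
    uncovered = set(adjacency)
    chosen = set()
    while uncovered:
        # Sorting makes tie-breaking deterministic (smallest node name wins).
        best = max(
            sorted(adjacency),
            key=lambda node: len(({node} | adjacency[node]) & uncovered),
        )
        chosen.add(best)
        uncovered -= {best} | adjacency[best]
    return chosen


def is_dominating(adjacency, nodes):
    """Check that every node is in `nodes` or has a neighbour in `nodes`."""
    return all(n in nodes or adjacency[n] & nodes for n in adjacency)


# Hypothetical toy interaction network: A is a hub, E hangs off D.
toy_network = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A"},
    "D": {"A", "E"},
    "E": {"D"},
}
driver_nodes = greedy_dominating_set(toy_network)
print(sorted(driver_nodes), is_dominating(toy_network, driver_nodes))
```

The greedy heuristic does not guarantee a minimum set, and, as the abstract notes, several equally small dominating sets can exist for the same network (here both A with D and A with E dominate the toy graph), which is the ambiguity the corrected model is described as addressing.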
Affiliation(s)
- Xiao-Fei Zhang
  - School of Mathematics and Statistics & Hubei Key Laboratory of Mathematical Sciences, Central China Normal University, Luoyu Road, Wuhan, 430079, China
- Le Ou-Yang
  - College of Information Engineering, Shenzhen University, Nanhai Ave 3688, Shenzhen, 518060, China
- Dao-Qing Dai
  - Intelligent Data Center and Department of Mathematics, Sun Yat-Sen University, Xingang West Road, Guangzhou, 510275, China
- Meng-Yun Wu
  - School of Statistics and Management, Shanghai University of Finance and Economics, Guoding Road, Shanghai, 200433, China
- Yuan Zhu
  - School of Automation, China University of Geosciences, Lumo Road, Wuhan, 430074, China
- Hong Yan
  - Department of Electronic and Engineering, City University of Hong Kong, Tat Chee Avenue, Hong Kong, China
27
Mannakee BK, Gutenkunst RN. Selection on Network Dynamics Drives Differential Rates of Protein Domain Evolution. PLoS Genet 2016; 12:e1006132. [PMID: 27380265 PMCID: PMC4933380 DOI: 10.1371/journal.pgen.1006132] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/05/2016] [Accepted: 05/27/2016] [Indexed: 11/19/2022] Open
Abstract
The long-held principle that functionally important proteins evolve slowly has recently been challenged by studies in mice and yeast showing that the severity of a protein knockout only weakly predicts that protein's rate of evolution. However, the relevance of these studies to evolutionary changes within proteins is unknown, because amino acid substitutions, unlike knockouts, often only slightly perturb protein activity. To quantify the phenotypic effect of small biochemical perturbations, we developed an approach to use computational systems biology models to measure the influence of individual reaction rate constants on network dynamics. We show that this dynamical influence is predictive of protein domain evolutionary rate within networks in vertebrates and yeast, even after controlling for expression level and breadth, network topology, and knockout effect. Thus, our results not only demonstrate the importance of protein domain function in determining evolutionary rate, but also the power of systems biology modeling to uncover unanticipated evolutionary forces.
Affiliation(s)
- Brian K. Mannakee
  - Division of Epidemiology and Biostatistics, Mel and Enid Zuckerman College of Public Health, University of Arizona, Tucson, Arizona, United States of America
- Ryan N. Gutenkunst
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona, United States of America
|
28
|
Abstract
Genetic robustness refers to phenotypic invariance in the face of mutation and is a common characteristic of life, but its evolutionary origin is highly controversial. Genetic robustness could be an intrinsic property of biological systems, a result of direct natural selection, or a byproduct of selection for environmental robustness. To differentiate among these hypotheses, we analyze the metabolic network of Escherichia coli and comparable functional random networks. Treating the flux of each reaction as a trait and computationally predicting trait values upon mutations or environmental shifts, we discover that 1) genetic robustness is greater for the actual network than the random networks, 2) the genetic robustness of a trait increases with trait importance and this correlation is stronger in the actual network than in the random networks, and 3) the above result holds even after the control of environmental robustness. These findings demonstrate an adaptive origin of genetic robustness, consistent with the theoretical prediction that, under certain conditions, direct selection is sufficiently powerful to promote genetic robustness in cellular organisms.
Affiliation(s)
- Wei-Chin Ho
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor
- Jianzhi Zhang
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor
|
29
|
mtDNA analysis of 174 Eurasian populations using a new iterative rank correlation method. Mol Genet Genomics 2015; 291:493-509. [PMID: 26142878 DOI: 10.1007/s00438-015-1084-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/25/2015] [Accepted: 06/19/2015] [Indexed: 10/23/2022]
Abstract
In this study, we analyse 27-dimensional mtDNA haplogroup distributions of 174 Eurasian, North African and American populations, including numerous ancient samples. The main contribution of this work is the description of the haplogroup distributions of recent and ancient populations as mixtures of certain hypothetical ancient core populations that directly or indirectly determined migration processes in Eurasia over a long period. To identify these core populations, we developed a new iterative algorithm that determines clusters of the 27 mtDNA haplogroups studied that have strong rank correlation with one another within a definite subset of the populations. Based on this study, the current Eurasian populations can be considered mixtures of three early core populations with regard to maternal lineages. We aimed to show that a simultaneous analysis of ancient and recent data using a new iterative rank correlation algorithm and the weighted SOC learning technique may reveal the most important and deterministic migration processes of the past. This technique allowed us to determine geographically, historically and linguistically well-interpretable clusters in our dataset, which has a very specific, hardly classifiable structure. The method was validated using a 2-dimensional stepping stone model.
|
30
|
Abstract
The rate and mechanism of protein sequence evolution have been central questions in evolutionary biology since the 1960s. Although the rate of protein sequence evolution depends primarily on the level of functional constraint, exactly what determines functional constraint has remained unclear. The increasing availability of genomic data has enabled much needed empirical examinations on the nature of functional constraint. These studies found that the evolutionary rate of a protein is predominantly influenced by its expression level rather than functional importance. A combination of theoretical and empirical analyses has identified multiple mechanisms behind these observations and demonstrated a prominent role in protein evolution of selection against errors in molecular and cellular processes.
Affiliation(s)
- Jianzhi Zhang
  - Department of Ecology and Evolutionary Biology, University of Michigan, 830 North University Avenue, Ann Arbor, Michigan 48109, USA
- Jian-Rong Yang
  - Department of Ecology and Evolutionary Biology, University of Michigan, 830 North University Avenue, Ann Arbor, Michigan 48109, USA
|
31
|
Nazareno AG, Carlsen M, Lohmann LG. Complete Chloroplast Genome of Tanaecium tetragonolobum: The First Bignoniaceae Plastome. PLoS One 2015; 10:e0129930. [PMID: 26103589 PMCID: PMC4478014 DOI: 10.1371/journal.pone.0129930] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Received: 03/06/2015] [Accepted: 05/13/2015] [Indexed: 12/13/2022] Open
Abstract
Bignoniaceae is a Pantropical plant family that is especially abundant in the Neotropics. Members of the Bignoniaceae are diverse in many ecosystems and represent key components of the Tropical flora. Despite the ecological importance of the Bignoniaceae and all the efforts to reconstruct the phylogeny of this group, whole chloroplast genome information has not yet been reported for any members of the family. Here, we report the complete chloroplast genome sequence of Tanaecium tetragonolobum (Jacq.) L.G. Lohmann, which was reconstructed using de novo and referenced-based assembly of single-end reads generated by shotgun sequencing of total genomic DNA in an Illumina platform. The gene order and organization of the chloroplast genome of T. tetragonolobum exhibits the general structure of flowering plants, and is similar to other Lamiales chloroplast genomes. The chloroplast genome of T. tetragonolobum is a circular molecule of 153,776 base pairs (bp) with a quadripartite structure containing two single copy regions, a large single copy region (LSC, 84,612 bp) and a small single copy region (SSC, 17,586 bp) separated by inverted repeat regions (IRs, 25,789 bp). In addition, the chloroplast genome of T. tetragonolobum has 38.3% GC content and includes 121 genes, of which 86 are protein-coding, 31 are transfer RNA, and four are ribosomal RNA. The chloroplast genome of T. tetragonolobum presents a total of 47 tandem repeats and 347 simple sequence repeats (SSRs) with mononucleotides being the most common and di-, tri-, tetra-, and hexanucleotides occurring with less frequency. The results obtained here were compared to other chloroplast genomes of Lamiales available to date, providing new insight into the evolution of chloroplast genomes within Lamiales. Overall, the evolutionary rates of genes in Lamiales are lineage-, locus-, and region-specific, indicating that the evolutionary pattern of nucleotide substitution in chloroplast genomes of flowering plants is complex. 
The discovery of tandem repeats within T. tetragonolobum and the presence of divergent regions between chloroplast genomes of Lamiales provides the basis for the development of markers at various taxonomic levels. The newly developed markers have the potential to greatly improve the resolution of molecular phylogenies.
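The region lengths quoted in the abstract can be checked with simple arithmetic. The sketch below assumes the quadripartite layout implies two inverted-repeat copies of 25,789 bp each, which is the reading under which the figures add up. The variable names are illustrative only.

```python
# Region lengths (bp) as quoted in the abstract.
lsc = 84_612   # large single copy region
ssc = 17_586   # small single copy region
ir = 25_789    # one inverted repeat copy (assumed: two copies in total)

total = lsc + ssc + 2 * ir
print(total)   # matches the reported genome size of 153,776 bp

# Gene counts as quoted: protein-coding, tRNA, rRNA.
genes = 86 + 31 + 4
print(genes)   # matches the reported total of 121 genes
```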
Affiliation(s)
- Alison Gonçalves Nazareno
  - Universidade de São Paulo, Instituto de Biociências, Departamento de Botânica, São Paulo, São Paulo, Brazil
- Monica Carlsen
  - University of Missouri-St. Louis, Biology Department, St. Louis, Missouri, United States of America
- Lúcia Garcez Lohmann
  - Universidade de São Paulo, Instituto de Biociências, Departamento de Botânica, São Paulo, São Paulo, Brazil
|
32
|
Ish-Am O, Kristensen DM, Ruppin E. Evolutionary Conservation of Bacterial Essential Metabolic Genes across All Bacterial Culture Media. PLoS One 2015; 10:e0123785. [PMID: 25894004 PMCID: PMC4403854 DOI: 10.1371/journal.pone.0123785] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 07/21/2014] [Accepted: 03/08/2015] [Indexed: 11/22/2022] Open
Abstract
One of the basic postulates of molecular evolution is that functionally important genes should evolve slower than genes of lesser significance. Essential genes, whose knockout leads to a lethal phenotype are considered of high functional importance, yet whether they are truly more conserved than nonessential genes has been the topic of much debate, fuelled by a host of contradictory findings. Here we conduct the first large-scale study utilizing genome-scale metabolic modeling and spanning many bacterial species, which aims to answer this question. Using the novel Media Variation Analysis, we examine the range of conservation of essential vs. nonessential metabolic genes in a given species across all possible media. We are thus able to obtain for the first time, exact upper and lower bounds on the levels of differential conservation of essential genes for each of the species studied. The results show that bacteria do exhibit an overall tendency for differential conservation of their essential genes vs. their non-essential ones, yet this tendency is highly variable across species. We show that the model bacterium E. coli K12 may or may not exhibit differential conservation of essential genes depending on its growth medium, shedding light on previous experimental studies showing opposite trends.
Affiliation(s)
- Oren Ish-Am
  - The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- David M. Kristensen
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Eytan Ruppin
  - The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
  - The Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Dept. of Computer Science and the Center for Bioinformatics & Computational Biology, the University of Maryland, Maryland, United States of America
|
33
|
Shin SH, Choi SS. Lengths of coding and noncoding regions of a gene correlate with gene essentiality and rates of evolution. Genes Genomics 2015. [DOI: 10.1007/s13258-015-0265-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 12/21/2022]
|
34
|
Abstract
Protein metabolism is one of the most costly processes in the cell and is therefore expected to be under the effective control of natural selection. We stimulated yeast strains to overexpress each single gene product to approximately 1% of the total protein content. Consistent with previous reports, we found that excessive expression of proteins containing disordered or membrane-protruding regions resulted in an especially high fitness cost. We estimated these costs to be nearly twice as high as for other proteins. There was a ten-fold difference in cost if, instead of entire proteins, only the disordered or membrane-embedded regions were compared with other segments. Although the cost of processing bulk protein was measurable, it could not be explained by several tested protein features, including those linked to translational efficiency or intensity of physical interactions after maturation. It most likely included a number of individually indiscernible effects arising during protein synthesis, maturation, maintenance, (mal)functioning, and disposal. When scaled to the levels normally achieved by proteins in the cell, the fitness cost of dealing with one amino acid in a standard protein appears to be generally very low. Many single amino acid additions or deletions are likely to be neutral even if the effective population size is as large as that of the budding yeast. This should also apply to substitutions. Selection is much more likely to operate if point mutations affect protein structure by, for example, extending or creating stretches that tend to unfold or interact improperly with membranes.
|
35
|
Breugelmans B, Jex AR, Korhonen PK, Mangiola S, Young ND, Sternberg PW, Boag PR, Hofmann A, Gasser RB. Bioinformatic exploration of RIO protein kinases of parasitic and free-living nematodes. Int J Parasitol 2014; 44:827-36. [PMID: 25038443 DOI: 10.1016/j.ijpara.2014.06.005] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 05/01/2014] [Revised: 06/17/2014] [Accepted: 06/18/2014] [Indexed: 01/07/2023]
Abstract
Despite right open reading frame kinases (RIOKs) being essential for life, their functions, substrates and cellular pathways remain enigmatic. In the present study, gene structures were characterised for 26 RIOKs from draft genomes of parasitic and free-living nematodes. RNA-seq transcription profiles of riok genes were investigated for selected parasitic nematodes and showed that these kinases are transcribed in developmental stages that infect their mammalian host. Three-dimensional structural models of Caenorhabditis elegans RIOKs were predicted, and elucidated functional domains and conserved regions in nematode homologs. These findings provide prospects for functional studies of riok genes in C. elegans, and an opportunity for the design and validation of nematode-specific inhibitors of these enzymes in socioeconomic parasitic worms.
Affiliation(s)
- Bert Breugelmans
  - Faculty of Veterinary Science, The University of Melbourne, Parkville, Victoria, Australia
- Aaron R Jex
  - Faculty of Veterinary Science, The University of Melbourne, Parkville, Victoria, Australia
- Pasi K Korhonen
  - Faculty of Veterinary Science, The University of Melbourne, Parkville, Victoria, Australia
- Stefano Mangiola
  - Faculty of Veterinary Science, The University of Melbourne, Parkville, Victoria, Australia
- Neil D Young
  - Faculty of Veterinary Science, The University of Melbourne, Parkville, Victoria, Australia
- Paul W Sternberg
  - Howard Hughes Medical Institute (HHMI), Division of Biology, California Institute of Technology, Pasadena, CA, USA
- Peter R Boag
  - Faculty of Medicine, Nursing and Health Sciences, Monash University, Clayton, Victoria, Australia
- Andreas Hofmann
  - Faculty of Veterinary Science, The University of Melbourne, Parkville, Victoria, Australia; Structural Chemistry Program, Eskitis Institute, Griffith University, Brisbane, Australia
- Robin B Gasser
  - Faculty of Veterinary Science, The University of Melbourne, Parkville, Victoria, Australia
|
36
|
Haldane A, Manhart M, Morozov AV. Biophysical fitness landscapes for transcription factor binding sites. PLoS Comput Biol 2014; 10:e1003683. [PMID: 25010228 PMCID: PMC4091707 DOI: 10.1371/journal.pcbi.1003683] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 12/03/2013] [Accepted: 05/11/2014] [Indexed: 11/18/2022] Open
Abstract
Phenotypic states and evolutionary trajectories available to cell populations are ultimately dictated by complex interactions among DNA, RNA, proteins, and other molecular species. Here we study how evolution of gene regulation in a single-cell eukaryote S. cerevisiae is affected by interactions between transcription factors (TFs) and their cognate DNA sites. Our study is informed by a comprehensive collection of genomic binding sites and high-throughput in vitro measurements of TF-DNA binding interactions. Using an evolutionary model for monomorphic populations evolving on a fitness landscape, we infer fitness as a function of TF-DNA binding to show that the shape of the inferred fitness functions is in broad agreement with a simple functional form inspired by a thermodynamic model of two-state TF-DNA binding. However, the effective parameters of the model are not always consistent with physical values, indicating selection pressures beyond the biophysical constraints imposed by TF-DNA interactions. We find little statistical support for the fitness landscape in which each position in the binding site evolves independently, indicating that epistasis is common in the evolution of gene regulation. Finally, by correlating TF-DNA binding energies with biological properties of the sites or the genes they regulate, we are able to rule out several scenarios of site-specific selection, under which binding sites of the same TF would experience different selection pressures depending on their position in the genome. These findings support the existence of universal fitness landscapes which shape evolution of all sites for a given TF, and whose properties are determined in part by the physics of protein-DNA interactions. Specialized proteins called transcription factors turn genes on and off by binding to short stretches of DNA in their regulatory regions. 
Precise gene regulation is essential for cellular survival and proliferation, and its evolution and maintenance under mutational pressure are central issues in biology. Here we discuss how evolution of gene regulation is shaped by the need to maintain favorable binding energies between transcription factors and their genomic binding sites. We show that, surprisingly, transcription factor binding is not affected by many biological properties, such as the essentiality of the gene it regulates. Rather, all sites for a given factor appear to evolve under a universal set of constraints, which can be rationalized in terms of a simple model inspired by transcription factor – DNA binding thermodynamics.
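For readers unfamiliar with two-state binding thermodynamics, a commonly used form is the Fermi-Dirac occupancy, in which the probability that a site is bound falls sigmoidally as its binding energy rises above the chemical potential. The sketch below illustrates that generic form only. The abstract does not give the authors' fitness function, so the function name, the parameter names (`energy`, `mu`, `beta`) and the example values are all assumptions.

```python
import math


def occupancy(energy, mu, beta=1.0):
    """Two-state (bound/unbound) binding probability.

    energy: binding energy of the site (lower means stronger binding)
    mu:     chemical potential set by transcription factor concentration
    beta:   inverse temperature; all three are illustrative parameters
    """
    return 1.0 / (1.0 + math.exp(beta * (energy - mu)))


# Hypothetical energies in arbitrary units, with mu = 2.0.
for e in (0.0, 2.0, 4.0):
    print(e, round(occupancy(e, mu=2.0), 3))
```

In a model of this kind, sites with energy well below `mu` are almost always bound, sites well above it are almost never bound, and the steepness of the transition is set by `beta`.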
Affiliation(s)
- Allan Haldane
  - Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
- Michael Manhart
  - Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
- Alexandre V. Morozov
  - Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey, United States of America
  - BioMaPS Institute for Quantitative Biology, Rutgers University, Piscataway, New Jersey, United States of America
|
37
|
Zarin T, Moses AM. Insights into molecular evolution from yeast genomics. Yeast 2014; 31:233-41. [PMID: 24760744 DOI: 10.1002/yea.3018] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/01/2014] [Revised: 04/09/2014] [Accepted: 04/10/2014] [Indexed: 12/13/2022] Open
Abstract
Enabled by comparative genomics, yeasts have increasingly developed into a powerful model system for molecular evolution. Here we survey several areas in which yeast studies have made important contributions, including regulatory evolution, gene duplication and divergence, evolution of gene order and evolution of complexity. In each area we highlight key studies and findings based on techniques ranging from statistical analysis of large datasets to direct laboratory measurements of fitness. Future work will combine traditional evolutionary genetics analysis and experimental evolution with tools from systems biology to yield mechanistic insight into complex phenotypes.
Affiliation(s)
- Taraneh Zarin
  - Department of Cell and Systems Biology, University of Toronto, ON, Canada
|
38
|
Zhang H, Li C, Miao H, Xiong S. Insights from the complete chloroplast genome into the evolution of Sesamum indicum L. PLoS One 2013; 8:e80508. [PMID: 24303020 PMCID: PMC3841184 DOI: 10.1371/journal.pone.0080508] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 05/09/2013] [Accepted: 10/02/2013] [Indexed: 11/18/2022] Open
Abstract
Sesame (Sesamum indicum L.) is one of the oldest oilseed crops. In order to investigate the evolutionary characters according to the Sesame Genome Project, apart from sequencing its nuclear genome, we sequenced the complete chloroplast genome of S. indicum cv. Yuzhi 11 (white seeded) using Illumina and 454 sequencing. Comparisons of chloroplast genomes between S. indicum and the 18 other higher plants were then analyzed. The chloroplast genome of cv. Yuzhi 11 contains 153,338 bp and a total of 114 unique genes (KC569603). The number of chloroplast genes in sesame is the same as that in Nicotiana tabacum, Vitis vinifera and Platanus occidentalis. The variation in the length of the large single-copy (LSC) regions and inverted repeats (IR) in sesame compared to 18 other higher plant species was the main contributor to size variation in the cp genome in these species. The 77 functional chloroplast genes, except for ycf1 and ycf2, were highly conserved. The deletion of the cp ycf1 gene sequence in cp genomes may be due either to its transfer to the nuclear genome, as has occurred in sesame, or direct deletion, as has occurred in Panax ginseng and Cucumis sativus. The sesame ycf2 gene is only 5,721 bp in length and has lost about 1,179 bp. Nucleotides 1–585 of ycf2 when queried in BLAST had hits in the sesame draft genome. Five repeats (R10, R12, R13, R14 and R17) were unique to the sesame chloroplast genome. We also found that IR contraction/expansion in the cp genome alters its rate of evolution. Chloroplast genes and repeats display the signature of convergent evolution in sesame and other species. These findings provide a foundation for further investigation of cp genome evolution in Sesamum and other higher plants.
Affiliation(s)
- Haiyang Zhang
  - Henan Sesame Research Center, Henan Academy of Agricultural Sciences, Zhengzhou, People's Republic of China
- Chun Li
  - Henan Sesame Research Center, Henan Academy of Agricultural Sciences, Zhengzhou, People's Republic of China
- Hongmei Miao
  - Henan Sesame Research Center, Henan Academy of Agricultural Sciences, Zhengzhou, People's Republic of China
- Songjin Xiong
  - TEDA School of Biological Sciences and Biotechnology, Nankai University, Tianjin, People's Republic of China
|
39
|
Simon-Loriere E, Holmes EC, Pagán I. The effect of gene overlapping on the rate of RNA virus evolution. Mol Biol Evol 2013; 30:1916-28. [PMID: 23686658 DOI: 10.1093/molbev/mst094] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Indexed: 01/19/2023] Open
Abstract
Gene overlapping is widely employed by RNA viruses to generate genetic novelty while retaining a small genome size. However, gene overlapping also increases the deleterious effect of mutations as they affect more than one gene, thereby reducing the evolutionary rate of RNA viruses and hence their adaptive capacity. Although there is general agreement on the benefits of gene overlapping as a mechanism of genomic compression for rapidly evolving organisms, its effect on the pace of RNA virus evolution remains a source of debate. To address this issue, we collected sequence data from 117 instances of gene overlapping across 19 families, 30 genera, and 55 species of RNA viruses. On these data, we analyzed how genetic distances, selective pressures, and the distribution of RNA secondary structures and conserved protein functional domains vary between overlapping (OV) and nonoverlapping (NOV) regions. We show that gene overlapping generally results in a decrease in the rate of RNA virus evolution through a reduction in the frequency of synonymous mutations. However, this effect is less pronounced in genes with a terminal rather than an internal gene overlap, which might result from a greater proportion of protein functional conserved domains in NOV than in OV regions, in turn reducing the number of nonsynonymous mutations in the former. Overall, our analyses clarify the role of gene overlapping as a modulator of the evolutionary rates exhibited by RNA viruses and shed light on the factors that shape the genetic diversity of this important group of pathogens.
Affiliation(s)
- Etienne Simon-Loriere
  - Institut Pasteur, Unité de Génétique Fonctionnelle des Maladies Infectieuses, Paris, France
|
40
|
Choi SS, Hannenhalli S. Three independent determinants of protein evolutionary rate. J Mol Evol 2013; 76:98-111. [PMID: 23400388 DOI: 10.1007/s00239-013-9543-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 11/09/2012] [Accepted: 01/16/2013] [Indexed: 12/15/2022]
Abstract
One of the most widely accepted ideas related to the evolutionary rates of proteins is that functionally important residues or regions evolve more slowly than other regions, a reasonable consequence of which is a slower evolutionary rate for proteins with a higher density of functionally important sites. Oddly, the role of functional importance, mainly measured by essentiality, in determining evolutionary rate has been challenged in recent studies. Several variables other than protein essentiality, such as expression level, gene compactness and protein-protein interactions, have been suggested to affect protein evolutionary rate. In the present review, we try to refine the concept of the functional importance of a gene, and consider three factors (functional importance, expression level, and gene compactness) as independent determinants of the evolutionary rate of a protein, based not only on their known correlation with evolutionary rate but also on a reasonable mechanistic model. We suggest a framework based on these mechanistic models to correctly interpret the correlations between evolutionary rates and the various variables, as well as the interrelationships among the variables.
Affiliation(s)
- Sun Shim Choi
  - Department of Medical Biotechnology, College of Biomedical Science, and Institute of Bioscience & Biotechnology, Kangwon National University, Chuncheon, South Korea
|
41
|
Differential requirements for mRNA folding partially explain why highly expressed proteins evolve slowly. Proc Natl Acad Sci U S A 2013; 110:E678-86. [PMID: 23382244 DOI: 10.1073/pnas.1218066110] [Citation(s) in RCA: 85] [Impact Index Per Article: 7.7] [Indexed: 11/18/2022] Open
Abstract
The cause of the tremendous among-protein variation in the rate of sequence evolution is a central subject of molecular evolution. Expression level has been identified as a leading determinant of this variation among genes encoded in the same genome, but the underlying mechanisms are not fully understood. We here propose and demonstrate that a requirement for stronger folding of more abundant mRNAs results in slower evolution of more highly expressed genes and proteins. Specifically, we show that: (i) the higher the expression level of a gene, the greater the selective pressure for its mRNA to fold; (ii) random mutations are more likely to decrease mRNA folding when occurring in highly expressed genes than in lowly expressed genes; and (iii) amino acid substitution rate is negatively correlated with mRNA folding strength, with or without the control of expression level. Furthermore, synonymous (d(S)) and nonsynonymous (d(N)) nucleotide substitution rates are both negatively correlated with mRNA folding strength. However, counterintuitively, d(S) and d(N) are differentially constrained by selection for mRNA folding, resulting in a significant correlation between mRNA folding strength and d(N)/d(S), even when gene expression level is controlled. The direction and magnitude of this correlation is determined primarily by the G+C frequency at third codon positions. Together, these findings explain why highly expressed genes evolve slowly, demonstrate a major role of natural selection at the mRNA level in constraining protein evolution, and reveal a previously unrecognized and unexpected form of nonprotein-level selection that impacts d(N)/d(S).
|
42
|
Protein misinteraction avoidance causes highly expressed proteins to evolve slowly. Proc Natl Acad Sci U S A 2012; 109:E831-40. [PMID: 22416125 DOI: 10.1073/pnas.1117408109] [Citation(s) in RCA: 129] [Impact Index Per Article: 10.8] [Indexed: 11/18/2022] Open
Abstract
The tempo and mode of protein evolution have been central questions in biology. Genomic data have shown a strong influence of the expression level of a protein on its rate of sequence evolution (E-R anticorrelation), which is currently explained by the protein misfolding avoidance hypothesis. Here, we show that this hypothesis does not fully explain the E-R anticorrelation, especially for protein surface residues. We propose that natural selection against protein-protein misinteraction, which wastes functional molecules and is potentially toxic, constrains the evolution of surface residues. Because highly expressed proteins are under stronger pressures to avoid misinteraction, surface residues are expected to show an E-R anticorrelation. Our molecular-level evolutionary simulation and yeast genomic analysis confirm multiple predictions of the hypothesis. These findings show a pluralistic origin of the E-R anticorrelation and reveal the role of protein misinteraction, an inherent property of complex cellular systems, in constraining protein evolution.
|
43
|
Regular patterns for proteome-wide distribution of protein abundance across species. PLoS One 2012; 7:e32423. [PMID: 22427835 PMCID: PMC3302874 DOI: 10.1371/journal.pone.0032423] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 11/22/2011] [Accepted: 01/26/2012] [Indexed: 11/26/2022] Open
Abstract
A proteome of the bio-entity, including cell, tissue, organ, and organism, consists of proteins of diverse abundance. The principle that determines the abundance of different proteins in a proteome is of fundamental significance for an understanding of the building blocks of the bio-entity. Here, we report three regular patterns in the proteome-wide distribution of protein abundance across species such as human, mouse, fly, worm, yeast, and bacteria: in most cases, protein abundance is positively correlated with the protein's origination time or sequence conservation during evolution; it is negatively correlated with the protein's domain number and positively correlated with domain coverage in protein structure, and the correlations became stronger during the course of evolution; protein abundance can be further stratified by the function of the protein, whereby proteins that act on material conversion and transportation (mass category) are more abundant than those that act on information modulation (information category). Thus, protein abundance is intrinsically related to the protein's inherent characters of evolution, structure, and function.
|
44
|
Level of gene expression is a major determinant of protein evolution in the viral order Mononegavirales. J Virol 2012; 86:5253-63. [PMID: 22345453 DOI: 10.1128/jvi.06050-11] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 12/25/2022] Open
Abstract
Although the rate at which proteins change is a key parameter in molecular evolution, its determinants are poorly understood in viruses. A variety of factors, including gene length, codon usage bias, protein abundance, protein function, and gene expression level, have been shown to affect the rate of protein evolution in a diverse array of organisms. However, the role of these factors in viral evolution has yet to be addressed. The polar 3'-5' stepwise attenuation of transcription in the Mononegavirales, a group of single-strand negative-sense RNA viruses, provides a unique system to explore the determinants of protein evolution in viruses. We analyzed the relative importance of a variety of factors in shaping patterns of sequence variation in full-length genomes from 13 Mononegavirales species. Our analysis suggests that the level of gene expression, and by extension the relative genomic position of each gene, is a key determinant of the protein evolution in these viruses. This appears to be the consequence of selection for translational robustness, but not for translational accuracy, in highly expressed genes. The small genome size and number of proteins encoded by these viruses allowed us to identify other protein-specific factors that may also play a role in virus evolution, such as host-virus interactions and functional constraints. Finally, we explored the evolutionary pressures acting on noncoding regions in Mononegavirales genomes and observed that, despite being less constrained than coding regions, their evolutionary rates are also associated with genomic position.
|
45
|
Ramani A, Chuluunbaatar T, Verster A, Na H, Vu V, Pelte N, Wannissorn N, Jiao A, Fraser A. The Majority of Animal Genes Are Required for Wild-Type Fitness. Cell 2012; 148:792-802. [DOI: 10.1016/j.cell.2012.01.019] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Received: 06/29/2011] [Revised: 10/07/2011] [Accepted: 01/05/2012] [Indexed: 01/18/2023]
|
46
|
Use of comparative genomics approaches to characterize interspecies differences in response to environmental chemicals: challenges, opportunities, and research needs. Toxicol Appl Pharmacol 2011; 271:372-85. [PMID: 22142766 DOI: 10.1016/j.taap.2011.11.011] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 06/08/2011] [Revised: 11/11/2011] [Accepted: 11/16/2011] [Indexed: 01/12/2023]
Abstract
A critical challenge for environmental chemical risk assessment is the characterization and reduction of uncertainties introduced when extrapolating inferences from one species to another. The purpose of this article is to explore the challenges, opportunities, and research needs surrounding the issue of how genomics data and computational and systems level approaches can be applied to inform differences in response to environmental chemical exposure across species. We propose that the data, tools, and evolutionary framework of comparative genomics be adapted to inform interspecies differences in chemical mechanisms of action. We compare and contrast existing approaches, from disciplines as varied as evolutionary biology, systems biology, mathematics, and computer science, that can be used, modified, and combined in new ways to discover and characterize interspecies differences in chemical mechanism of action which, in turn, can be explored for application to risk assessment. We consider how genetic, protein, pathway, and network information can be interrogated from an evolutionary biology perspective to effectively characterize variations in biological processes of toxicological relevance among organisms. We conclude that comparative genomics approaches show promise for characterizing interspecies differences in mechanisms of action, and further, for improving our understanding of the uncertainties inherent in extrapolating inferences across species in both ecological and human health risk assessment. To achieve long-term relevance and consistent use in environmental chemical risk assessment, improved bioinformatics tools, computational methods robust to data gaps, and quantitative approaches for conducting extrapolations across species are critically needed. Specific areas ripe for research to address these needs are recommended.
|
47
|
Testing hypotheses on the rate of molecular evolution in relation to gene expression using microRNAs. Proc Natl Acad Sci U S A 2011; 108:15942-7. [PMID: 21911382 DOI: 10.1073/pnas.1110098108] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 12/15/2022] Open
Abstract
There exists an inverse relationship between the rate of molecular evolution and the level of gene expression. Among the many explanations, the "toxic-error" hypothesis is a most general one, which posits that processing errors may often be toxic to the cells. However, toxic errors that constrain the evolution of highly expressed genes are often difficult to measure. In this study, we test the toxic-error hypothesis by using microRNA (miRNA) genes because their processing errors can be directly measured by deep sequencing. A miRNA gene consists of a small mature product (≈22 nt long) and a "backbone." Our analysis shows that (i) like the mature miRNA, the backbone is highly conserved; (ii) the rate of sequence evolution in the backbone is negatively correlated with expression; and (iii) although conserved between distantly related species, the error rate in miRNA processing is also negatively correlated with the expression level. The observations suggest that, as a miRNA gene becomes more highly (or more ubiquitously) expressed, its sequence evolves toward a structure that minimizes processing errors.
|
48
|
Abstract
Despite our extensive knowledge about the rate of protein sequence evolution for thousands of genes in hundreds of species, the corresponding rate of protein function evolution is virtually unknown, especially at the genomic scale. This lack of knowledge is primarily because of the huge diversity in protein function and the consequent difficulty in gauging and comparing rates of protein function evolution. Nevertheless, most proteins function through interacting with other proteins, and protein-protein interaction (PPI) can be tested by standard assays. Thus, the rate of protein function evolution may be measured by the rate of PPI evolution. Here, we experimentally examine 87 potential interactions between Kluyveromyces waltii proteins, whose one-to-one orthologs in the related budding yeast Saccharomyces cerevisiae have been reported to interact. Combining our results with available data from other eukaryotes, we estimate that the evolutionary rate of protein interaction is (2.6 ± 1.6) × 10⁻¹⁰ per PPI per year, which is three orders of magnitude lower than the rate of protein sequence evolution measured by the number of amino acid substitutions per protein per year. The extremely slow evolution of protein molecular function may account for the remarkable conservation of life at molecular and cellular levels and allow for studying the mechanistic basis of human disease in much simpler organisms.
|
49
|
Impact of gene expression noise on organismal fitness and the efficacy of natural selection. Proc Natl Acad Sci U S A 2011; 108:E67-76. [PMID: 21464323 DOI: 10.1073/pnas.1100059108] [Citation(s) in RCA: 160] [Impact Index Per Article: 12.3] [Indexed: 11/18/2022] Open
Abstract
Gene expression noise is a universal phenomenon across all life forms. Although beneficial under certain circumstances, expression noise is generally thought to be deleterious. However, neither the magnitude of the deleterious effect nor the primary mechanism of this effect is known. Here, we model the impact of expression noise on the fitness of unicellular organisms by considering the influence of suboptimal expressions of enzymes on the rate of biomass production and the energetic cost associated with imprecise amounts of protein synthesis. Our theoretical modeling and empirical analysis of yeast data show four findings. (i) Expression noise reduces the mean fitness of a cell by at least 25%, and this reduction cannot be substantially alleviated by gene overexpression. (ii) Higher sensitivity of fitness to the expression fluctuations of essential genes than nonessential genes creates stronger selection against noise in essential genes, resulting in a decrease in their noise. (iii) Reduction of expression noise by genome doubling offers a substantial fitness advantage to diploids over haploids, even in the absence of sex. (iv) Expression noise generates fitness variation among isogenic cells, which lowers the efficacy of natural selection similar to the effect of population shrinkage. Thus, expression noise renders organisms both less adapted and less adaptable. Because expression noise is only one of many manifestations of the stochasticity in cellular molecular processes, our results suggest a much more fundamental role of molecular stochasticity in evolution than is currently appreciated.
|
50
|
Yang JR, Zhuang SM, Zhang J. Impact of translational error-induced and error-free misfolding on the rate of protein evolution. Mol Syst Biol 2011; 6:421. [PMID: 20959819 PMCID: PMC2990641 DOI: 10.1038/msb.2010.78] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.8] [Received: 05/11/2010] [Accepted: 08/31/2010] [Indexed: 11/26/2022] Open
Abstract
Theoretical calculations suggest that, in addition to translational error-induced protein misfolding, a non-negligible fraction of misfolded proteins are error free. We propose that the anticorrelation between the expression level of a protein and its rate of sequence evolution be explained by an overarching protein-misfolding-avoidance hypothesis that includes selection against both error-induced and error-free protein misfolding, and verify this model by a molecular-level evolutionary simulation. We provide strong empirical evidence for the protein-misfolding-avoidance hypothesis, including a positive correlation between protein expression level and stability, enrichment of misfolding-minimizing codons and amino acids in highly expressed genes, and stronger evolutionary conservation of residues in which nonsynonymous changes are more likely to increase protein misfolding.
The rate of protein sequence evolution has long been of central interest to molecular evolutionists. Different proteins of the same species evolve at vastly different rates, which is commonly explained by a variation in functional constraint among different proteins (Kimura and Ohta, 1974). However, it is unclear how to quantify the functional constraint of a protein from the knowledge of its function. In the past decade, various types of genomic data from model organisms have been examined to look for the determinants of the rate of protein sequence evolution. The most unexpected discovery was a very strong anticorrelation between the expression level and evolutionary rate of a protein (E–R anticorrelation) (Pal et al, 2001). The prevailing explanation of the E–R anticorrelation is the translational robustness hypothesis (Drummond et al, 2005). This hypothesis posits that mistranslation induces protein misfolding, which is toxic to cells (Figure 1). Consequently, highly expressed proteins are under stronger pressures to be translationally robust and thus are more constrained in sequence evolution. However, the impact of the other source of misfolded proteins, translational error-free proteins (Figure 1), has not been evaluated. By theoretical calculation, computer simulation, and empirical data analysis, we examined the role of selection against both error-induced and error-free protein misfolding in creating the E–R correlation. Our theoretical calculations suggested that a non-negligible fraction of misfolded proteins are error free. We estimated that when a protein is not very stable, on average ∼20% of misfolded molecules are error free. However, when a protein is very stable, this fraction reduces to ∼5%, which is probably a result of natural selection against protein misfolding.
We conducted a molecular-level evolutionary simulation (Figure 2A) using three different schemes: error-induced misfolding only, error-free misfolding only, and both types of misfolding. As expected, results from the first simulation are similar to those from a previous study that considers only error-induced misfolding (Drummond and Wilke, 2008). Interestingly, the second and third simulations can also generate the same patterns, including a positive correlation between the protein expression level and the unfolding energy (ΔG) of the error-free protein (Figure 2B), a negative correlation between the expression level and the fraction of protein molecules that misfold after being mistranslated (Figure 2C), a negative correlation between ΔG and the evolutionary rate (Figure 2D), and a negative correlation between the expression level and the evolutionary rate (i.e., the E–R anticorrelation) (Figure 2E). Furthermore, we found that selection against protein misfolding is more effective in reducing error-free misfolding than error-induced misfolding. Based on these results, we propose that an overarching protein-misfolding-avoidance hypothesis that includes both sources of misfolding is superior to the prevailing translational robustness hypothesis, which considers only error-induced misfolding.
We tested three key predictions of the protein-misfolding-avoidance hypothesis using yeast data. First, we showed that, consistent with our prediction, a positive correlation exists between the protein expression level and stability, which is measured by the unfolding energy or melting temperature. In addition, protein expression level is negatively correlated with protein aggregation propensity. Second, we found that codons minimizing protein misfolding are used more frequently in highly expressed proteins than in lowly expressed ones. Third, we showed that, within the same protein, amino acid residues in which random nonsynonymous mutations are more likely to increase protein misfolding are evolutionarily more conserved. Together, these results provide unambiguous evidence that avoidance of both error-induced and error-free protein misfolding is a major source of the E–R anticorrelation and that protein stability and mistranslation have important roles in protein evolution.
What determines the rate of protein evolution is a fundamental question in biology. Recent genomic studies revealed a surprisingly strong anticorrelation between the expression level of a protein and its rate of sequence evolution. This observation is currently explained by the translational robustness hypothesis in which the toxicity of translational error-induced protein misfolding selects for higher translational robustness of more abundant proteins, which constrains sequence evolution. However, the impact of error-free protein misfolding has not been evaluated. We estimate that a non-negligible fraction of misfolded proteins are error free and demonstrate by a molecular-level evolutionary simulation that selection against protein misfolding results in a greater reduction of error-free misfolding than error-induced misfolding. Thus, an overarching protein-misfolding-avoidance hypothesis that includes both sources of misfolding is superior to the translational robustness hypothesis. We show that misfolding-minimizing amino acids are preferentially used in highly abundant yeast proteins and that these residues are evolutionarily more conserved than other residues of the same proteins. These findings provide unambiguous support to the role of protein-misfolding-avoidance in determining the rate of protein sequence evolution.
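The summary above ties the unfolding energy (ΔG) to how many molecules misfold, and ties expression level to how costly that misfolding is. A minimal sketch of that reasoning follows. It assumes a textbook two-state folding equilibrium and treats every unfolded molecule as misfolded. Both are simplifications introduced here for illustration and are not the authors' simulation model. The function names and the example numbers are hypothetical.

```python
import math

# Gas constant in kcal/(mol*K); delta_g values below are in kcal/mol.
R_KCAL = 0.0019872

def unfolded_fraction(delta_g, temperature=298.15):
    """Equilibrium fraction of unfolded molecules in a two-state model.

    delta_g is the unfolding free energy (positive means the folded state
    is favored). This Boltzmann-weight formula is an assumption of this
    sketch, not a formula stated in the abstract.
    """
    return 1.0 / (1.0 + math.exp(delta_g / (R_KCAL * temperature)))

def misfolded_copies(abundance, delta_g, temperature=298.15):
    """Expected number of unfolded copies for a protein present at `abundance` copies."""
    return abundance * unfolded_fraction(delta_g, temperature)

# Hypothetical comparison: the same stability at two expression levels,
# then a more stable variant of the abundant protein.
for abundance, delta_g in [(1_000, 3.0), (100_000, 3.0), (100_000, 5.0)]:
    copies = misfolded_copies(abundance, delta_g)
    print(f"abundance={abundance:>7}, dG={delta_g} kcal/mol -> {copies:.1f} unfolded copies")
```

Under these assumptions the unfolded burden scales linearly with abundance and falls steeply with ΔG, which is consistent with the reported positive correlation between expression level and stability.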
Affiliation(s)
- Jian-Rong Yang
- Key Laboratory of Gene Engineering of the Ministry of Education, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, PR China
|