1. Zhou Y, Liu Y, Gupta S, Paramo MI, Hou Y, Mao C, Luo Y, Judd J, Wierbowski S, Bertolotti M, Nerkar M, Jehi L, Drayman N, Nicolaescu V, Gula H, Tay S, Randall G, Wang P, Lis JT, Feschotte C, Erzurum SC, Cheng F, Yu H. A comprehensive SARS-CoV-2-human protein-protein interactome reveals COVID-19 pathobiology and potential host therapeutic targets. Nat Biotechnol 2023; 41:128-139. [PMID: 36217030 PMCID: PMC9851973 DOI: 10.1038/s41587-022-01474-0] [Received: 02/12/2022] [Accepted: 08/15/2022] [Indexed: 01/25/2023]
Abstract
Studying viral-host protein-protein interactions can facilitate the discovery of therapies for viral infection. We use high-throughput yeast two-hybrid experiments and mass spectrometry to generate a comprehensive SARS-CoV-2-human protein-protein interactome network consisting of 739 high-confidence binary and co-complex interactions, validating 218 known SARS-CoV-2 host factors and revealing 361 novel ones. Our results show the highest overlap of interaction partners between published datasets and of genes differentially expressed in samples from COVID-19 patients. We identify an interaction between the viral protein ORF3a and the human transcription factor ZNF579, illustrating a direct viral impact on host transcription. We perform network-based screens of >2,900 FDA-approved or investigational drugs and identify 23 with significant network proximity to SARS-CoV-2 host factors. One of these drugs, carvedilol, shows clinical benefits for COVID-19 patients in an electronic health records analysis and antiviral properties in a human lung cell line infected with SARS-CoV-2. Our study demonstrates the value of network systems biology to understand human-virus interactions and provides hits for further research on COVID-19 therapeutics.
Affiliation(s)
- Yadi Zhou
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
- Yuan Liu
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
  - Center for Advanced Proteomics, Cornell University, Ithaca, NY, USA
- Shagun Gupta
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
  - Center for Advanced Proteomics, Cornell University, Ithaca, NY, USA
  - Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Mauricio I Paramo
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
  - Center for Advanced Proteomics, Cornell University, Ithaca, NY, USA
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Yuan Hou
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
- Chengsheng Mao
  - Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University, Chicago, IL, USA
- Yuan Luo
  - Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University, Chicago, IL, USA
- Julius Judd
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Shayne Wierbowski
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
  - Center for Advanced Proteomics, Cornell University, Ithaca, NY, USA
  - Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Marta Bertolotti
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
  - Center for Advanced Proteomics, Cornell University, Ithaca, NY, USA
- Mriganka Nerkar
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Lara Jehi
  - Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
- Nir Drayman
  - Department of Molecular Biology and Biochemistry, University of California, Irvine, Irvine, CA, USA
- Vlad Nicolaescu
  - Department of Microbiology, Ricketts Laboratory, University of Chicago, Chicago, IL, USA
- Haley Gula
  - Department of Microbiology, Ricketts Laboratory, University of Chicago, Chicago, IL, USA
- Savaş Tay
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, IL, USA
- Glenn Randall
  - Department of Microbiology, Ricketts Laboratory, University of Chicago, Chicago, IL, USA
- Peihui Wang
  - Key Laboratory for Experimental Teratology of Ministry of Education and Advanced Medical Research Institute, Cheeloo College of Medicine, Shandong University, Jinan, China
- John T Lis
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Cédric Feschotte
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Feixiong Cheng
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA
  - Case Comprehensive Cancer Center, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH, USA
- Haiyuan Yu
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
  - Center for Advanced Proteomics, Cornell University, Ithaca, NY, USA
  - Department of Computational Biology, Cornell University, Ithaca, NY, USA

2. Weak selection on synonymous codons substantially inflates dN/dS estimates in bacteria. Proc Natl Acad Sci U S A 2021; 118:2023575118. [PMID: 33972434 DOI: 10.1073/pnas.2023575118] [Indexed: 01/21/2023]
Abstract
Synonymous codon substitutions are not always selectively neutral as revealed by several types of analyses, including studies of codon usage patterns among genes. We analyzed codon usage in 13 bacterial genomes sampled from across a large order of bacteria, Enterobacterales, and identified presumptively neutral and selected classes of synonymous substitutions. To estimate substitution rates, given a neutral/selected classification of synonymous substitutions, we developed a flexible dN/dS substitution model that allows multiple classes of synonymous substitutions. Under this multiclass synonymous substitution (MSS) model, the denominator of dN/dS includes only the strictly neutral class of synonymous substitutions. On average, the value of dN/dS under the MSS model was 80% of that under the standard codon model in which all synonymous substitutions are assumed to be neutral. The indication is that conventional dN/dS analyses overestimate these values and thus overestimate the frequency of positive diversifying selection and underestimate the strength of purifying selection. To quantify the strength of selection necessary to explain this reduction, we developed a model of selected compensatory codon substitutions. The reduction in synonymous substitution rate, and thus the contribution that selection makes to codon bias variation among genes, can be adequately explained by very weak selection, with a mean product of population size and selection coefficient, [Formula: see text].
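As a rough arithmetic illustration of why restricting the denominator to a neutral class lowers the ratio, the sketch below compares a ratio computed over all synonymous substitutions with one computed over a neutral class only. The rates, variable names, and the `omega` helper are hypothetical, chosen so that the result matches the 80% figure quoted in the abstract. This is not the authors' MSS likelihood model, only the ratio arithmetic behind it.

```python
def omega(d_n, d_s):
    """Ratio of nonsynonymous to synonymous substitution rate."""
    if d_s <= 0:
        raise ValueError("synonymous rate must be positive")
    return d_n / d_s

# Hypothetical per-site rates, invented for illustration only.
d_n = 0.10          # nonsynonymous rate
d_s_all = 0.80      # all synonymous substitutions pooled (standard codon model)
d_s_neutral = 1.00  # presumptively neutral synonymous class only

omega_standard = omega(d_n, d_s_all)
omega_neutral_only = omega(d_n, d_s_neutral)

# Selection slows the pooled synonymous rate, so the pooled denominator is
# smaller and the standard ratio is larger than the neutral-only ratio.
ratio = omega_neutral_only / omega_standard
print(f"standard={omega_standard:.3f} neutral-only={omega_neutral_only:.3f} ratio={ratio:.2f}")
```

With these invented rates the neutral-only ratio is four fifths of the pooled one, because the pooled synonymous rate is four fifths of the neutral rate.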

3. Zhou Y, Hou Y, Shen J, Mehra R, Kallianpur A, Culver DA, Gack MU, Farha S, Zein J, Comhair S, Fiocchi C, Stappenbeck T, Chan T, Eng C, Jung JU, Jehi L, Erzurum S, Cheng F. A network medicine approach to investigation and population-based validation of disease manifestations and drug repurposing for COVID-19. PLoS Biol 2020; 18:e3000970. [PMID: 33156843 PMCID: PMC7728249 DOI: 10.1371/journal.pbio.3000970] [Received: 06/05/2020] [Revised: 12/10/2020] [Accepted: 10/28/2020] [Indexed: 01/08/2023]
Abstract
The global coronavirus disease 2019 (COVID-19) pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has led to unprecedented social and economic consequences. The risk of morbidity and mortality due to COVID-19 increases dramatically in the presence of coexisting medical conditions, while the underlying mechanisms remain unclear. Furthermore, there are no approved therapies for COVID-19. This study aims to identify SARS-CoV-2 pathogenesis, disease manifestations, and COVID-19 therapies using network medicine methodologies along with clinical and multi-omics observations. We incorporate SARS-CoV-2 virus-host protein-protein interactions, transcriptomics, and proteomics into the human interactome. Network proximity measurement revealed underlying pathogenesis for broad COVID-19-associated disease manifestations. Analyses of single-cell RNA sequencing data show that co-expression of ACE2 and TMPRSS2 is elevated in absorptive enterocytes from the inflamed ileal tissues of Crohn disease patients compared to uninflamed tissues, revealing shared pathobiology between COVID-19 and inflammatory bowel disease. Integrative analyses of metabolomics and transcriptomics (bulk and single-cell) data from asthma patients indicate that COVID-19 shares an intermediate inflammatory molecular profile with asthma (including IRAK3 and ADRB2). To prioritize potential treatments, we combined network-based prediction and a propensity score (PS) matching observational study of 26,779 individuals from a COVID-19 registry. We identified that melatonin usage (odds ratio [OR] = 0.72, 95% CI 0.56-0.91) is significantly associated with a 28% reduced likelihood of a positive laboratory test result for SARS-CoV-2 confirmed by reverse transcription-polymerase chain reaction assay. 
Using a PS matching user active comparator design, we determined that melatonin usage was associated with a reduced likelihood of SARS-CoV-2 positive test result compared to use of angiotensin II receptor blockers (OR = 0.70, 95% CI 0.54-0.92) or angiotensin-converting enzyme inhibitors (OR = 0.69, 95% CI 0.52-0.90). Importantly, melatonin usage (OR = 0.48, 95% CI 0.31-0.75) is associated with a 52% reduced likelihood of a positive laboratory test result for SARS-CoV-2 in African Americans after adjusting for age, sex, race, smoking history, and various disease comorbidities using PS matching. In summary, this study presents an integrative network medicine platform for predicting disease manifestations associated with COVID-19 and identifying melatonin for potential prevention and treatment of COVID-19.
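The percentage reductions quoted in this abstract follow directly from the odds ratios (1 minus the OR, expressed as a percentage). The minimal sketch below shows that arithmetic, along with a standard Wald confidence interval for an odds ratio from a 2x2 table. The function names and the table counts in the usage lines are hypothetical, and the propensity score matching step of the study is not reproduced here.

```python
import math

def percent_reduction(odds_ratio):
    """Percent reduction in odds implied by an odds ratio below 1."""
    return round((1.0 - odds_ratio) * 100.0)

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z=1.96 gives an approximate 95% interval.
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

# Odds ratios reported in the abstract.
print(percent_reduction(0.72))  # overall cohort
print(percent_reduction(0.48))  # African American subgroup

# Hypothetical counts, invented for illustration only.
print(odds_ratio_with_ci(10, 20, 20, 10))
```

Strictly speaking, 1 minus the OR is a reduction in odds rather than in probability; the two are close only when the outcome is uncommon.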
Affiliation(s)
- Yadi Zhou
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Yuan Hou
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Jiayu Shen
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Reena Mehra
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio, United States of America
  - Neurological Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Asha Kallianpur
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio, United States of America
- Daniel A. Culver
  - Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
  - Department of Pulmonary Medicine, Respiratory Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Michaela U. Gack
  - Florida Research and Innovation Center, Cleveland Clinic, Port Saint Lucie, Florida, United States of America
- Samar Farha
  - Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
  - Department of Pulmonary Medicine, Respiratory Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Joe Zein
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio, United States of America
  - Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Suzy Comhair
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio, United States of America
  - Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Claudio Fiocchi
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio, United States of America
  - Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Thaddeus Stappenbeck
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio, United States of America
  - Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Timothy Chan
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio, United States of America
  - Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Charis Eng
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio, United States of America
  - Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
  - Department of Genetics and Genome Sciences, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
  - Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
- Jae U. Jung
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio, United States of America
  - Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Lara Jehi
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio, United States of America
  - Neurological Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Serpil Erzurum
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio, United States of America
  - Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Feixiong Cheng
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, Ohio, United States of America
  - Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America

4. Alvarez-Ponce D, Feyertag F, Chakraborty S. Position Matters: Network Centrality Considerably Impacts Rates of Protein Evolution in the Human Protein-Protein Interaction Network. Genome Biol Evol 2018; 9:1742-1756. [PMID: 28854629 PMCID: PMC5570066 DOI: 10.1093/gbe/evx117] [Accepted: 07/01/2017] [Indexed: 02/06/2023]
Abstract
The proteins of any organism evolve at disparate rates. A long list of factors affecting rates of protein evolution has been identified. However, the relative importance of each factor in determining rates of protein evolution remains unresolved. The prevailing view is that evolutionary rates are dominantly determined by gene expression, and that other factors such as network centrality have only a marginal effect, if any. However, this view is largely based on analyses in yeasts, and accurately measuring the importance of the determinants of rates of protein evolution is complicated by the fact that the different factors are often correlated with each other, and by the relatively poor quality of available functional genomics data sets. Here, we use correlation, partial correlation and principal component regression analyses to measure the contributions of several factors to the variability of the rates of evolution of human proteins. For this purpose, we analyzed the entire human protein–protein interaction data set and the human signal transduction network—a network data set of exceptionally high quality, obtained by manual curation, which is expected to be virtually free from false positives. In contrast with the prevailing view, we observe that network centrality (measured as the number of physical and nonphysical interactions, betweenness, and closeness) has a considerable impact on rates of protein evolution. Surprisingly, the impact of centrality on rates of protein evolution seems to be comparable, or even superior according to some analyses, to that of gene expression. Our observations seem to be independent of potentially confounding factors and of the limitations (biases and errors) of interactomic data sets.
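Partial correlation is one way to separate the effect of centrality from that of a correlated factor such as expression. The sketch below uses the standard first-order formula with Pearson coefficients. The choice of Pearson rather than a rank-based coefficient, the function names, and the toy values are all assumptions made for illustration; they are not taken from the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Toy per-protein values, invented for illustration only.
degree = [1, 2, 3, 4, 5, 6]                 # number of interactions
rate = [0.9, 0.8, 0.6, 0.5, 0.3, 0.2]       # evolutionary rate
expression = [2, 1, 4, 3, 6, 5]             # expression level

print(pearson(degree, rate))
print(partial_corr(degree, rate, expression))
```

If the centrality-rate association stays strong after controlling for expression, expression alone does not account for it, which is the kind of comparison the abstract describes.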

5. Jacobs C, Lambourne L, Xia Y, Segrè D. Upon Accounting for the Impact of Isoenzyme Loss, Gene Deletion Costs Anticorrelate with Their Evolutionary Rates. PLoS One 2017; 12:e0170164. [PMID: 28107392 PMCID: PMC5249160 DOI: 10.1371/journal.pone.0170164] [Received: 02/25/2016] [Accepted: 12/30/2016] [Indexed: 12/19/2022]
Abstract
System-level metabolic network models enable the computation of growth and metabolic phenotypes from an organism's genome. In particular, flux balance approaches have been used to estimate the contribution of individual metabolic genes to organismal fitness, offering the opportunity to test whether such contributions carry information about the evolutionary pressure on the corresponding genes. Previous failure to identify the expected negative correlation between such computed gene-loss cost and sequence-derived evolutionary rates in Saccharomyces cerevisiae has been ascribed to a real biological gap between a gene's fitness contribution to an organism "here and now" and the same gene's historical importance as evidenced by its accumulated mutations over millions of years of evolution. Here we show that this negative correlation does exist, and can be exposed by revisiting a broadly employed assumption of flux balance models. In particular, we introduce a new metric that we call "function-loss cost", which estimates the cost of a gene loss event as the total potential functional impairment caused by that loss. This new metric displays significant negative correlation with evolutionary rate, across several thousand minimal environments. We demonstrate that the improvement gained using function-loss cost over gene-loss cost is explained by replacing the base assumption that isoenzymes provide unlimited capacity for backup with the assumption that isoenzymes are completely non-redundant. We further show that this change of the assumption regarding isoenzymes increases the recall of epistatic interactions predicted by the flux balance model at the cost of a reduction in the precision of the predictions. In addition to suggesting that the gene-to-reaction mapping in genome-scale flux balance models should be used with caution, our analysis provides new evidence that evolutionary gene importance captures much more than strict essentiality.
Affiliation(s)
- Christopher Jacobs
  - Bioinformatics Program, Boston University, Boston, Massachusetts, United States of America
- Luke Lambourne
  - Bioinformatics Program, Boston University, Boston, Massachusetts, United States of America
  - Department of Bioengineering, Faculty of Engineering, McGill University, Montreal, Quebec, Canada
- Yu Xia
  - Bioinformatics Program, Boston University, Boston, Massachusetts, United States of America
  - Department of Bioengineering, Faculty of Engineering, McGill University, Montreal, Quebec, Canada
- Daniel Segrè
  - Bioinformatics Program, Boston University, Boston, Massachusetts, United States of America
  - Department of Biology, Boston University, Boston, Massachusetts, United States of America
  - Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America

6. Proteome-Scale Investigation of Protein Allosteric Regulation Perturbed by Somatic Mutations in 7,000 Cancer Genomes. Am J Hum Genet 2017; 100:5-20. [PMID: 27939638 DOI: 10.1016/j.ajhg.2016.09.020] [Received: 05/27/2016] [Accepted: 09/27/2016] [Indexed: 02/05/2023]
Abstract
Allosteric regulation, which triggers a protein's functional activity via conformational changes, is an intrinsic function of proteins under many physiological and pathological conditions, including cancer. Identification of the biological effects of specific somatic variants on allosteric proteins and the phenotypes that they alter during tumor initiation and progression is a central challenge for cancer genomics in the post-genomic era. Here, we mapped more than 47,000 somatic missense mutations observed in approximately 7,000 tumor-normal matched samples across 33 cancer types into protein allosteric sites to prioritize the mutated allosteric proteins and we tested our prediction in cancer cell lines. We found that the deleterious mutations identified in cancer genomes were more significantly enriched at protein allosteric sites than tolerated mutations, suggesting a critical role for protein allosteric variants in cancer. Next, we developed a statistical approach, namely AlloDriver, and further identified 15 potential mutated allosteric proteins during pan-cancer and individual cancer-type analyses. More importantly, we experimentally confirmed that p.Pro360Ala on PDE10A played a potential oncogenic role in mediating tumorigenesis in non-small cell lung cancer (NSCLC). In summary, these findings shed light on the role of allosteric regulation during tumorigenesis and provide a useful tool for the timely development of targeted cancer therapies.

7. Systems Biology-Based Investigation of Cellular Antiviral Drug Targets Identified by Gene-Trap Insertional Mutagenesis. PLoS Comput Biol 2016; 12:e1005074. [PMID: 27632082 PMCID: PMC5025164 DOI: 10.1371/journal.pcbi.1005074] [Received: 04/05/2016] [Accepted: 07/22/2016] [Indexed: 02/05/2023]
Abstract
Viruses require host cellular factors for successful replication. A comprehensive systems-level investigation of the virus-host interactome is critical for understanding the roles of host factors with the end goal of discovering new druggable antiviral targets. Gene-trap insertional mutagenesis is a high-throughput forward genetics approach to randomly disrupt (trap) host genes and discover host genes that are essential for viral replication, but not for host cell survival. In this study, we used libraries of randomly mutagenized cells to discover cellular genes that are essential for the replication of 10 distinct cytotoxic mammalian viruses, 1 gram-negative bacterium, and 5 toxins. We herein reported 712 candidate cellular genes, characterizing distinct topological network and evolutionary signatures, and occupying central hubs in the human interactome. Cell cycle phase-specific network analysis showed that host cell cycle programs played critical roles during viral replication (e.g. MYC and TAF4 regulating G0/1 phase). Moreover, the viral perturbation of host cellular networks reflected disease etiology in that host genes (e.g. CTCF, RHOA, and CDKN1B) identified were frequently essential and significantly associated with Mendelian and orphan diseases, or somatic mutations in cancer. A computational drug repositioning framework that incorporates drug-gene signatures from the Connectivity Map into the virus-host interactome identified 110 putative druggable antiviral targets and prioritized several existing drugs (e.g. ajmaline) with potential antiviral indications (e.g. anti-Ebola). In summary, this work provides a powerful methodology with a tight integration of gene-trap insertional mutagenesis testing and systems biology to identify new antiviral targets and drugs for the development of broadly acting and targeted clinical antiviral therapeutics.
Infectious diseases result in millions of deaths and cost billions of dollars annually. Hence, there is urgency for developing more innovative and effective antiviral therapeutics. In this study, we used libraries of randomly mutagenized cells to discover cellular genes that are essential for the replication of 10 distinct cytotoxic mammalian viruses. We herein reported over 700 candidate cellular genes, over 20% of which were independently selected by multiple viruses in one or more cell types. Using systems biology-based analysis, we found that host genes associated with viral replication tended to occupy central hubs in the human protein interactome and to be ancient genes with low evolutionary rates, compared to non-virus-associated genes. Cell cycle phase-specific sub-network analysis showed that host cell cycle program played important roles during viral replication by regulating specific cell cycle phases. Moreover, we presented novel evidence to suggest that host genes supporting viral replication were frequently implicated in Mendelian and orphan diseases, or played critical roles in cancer. Importantly, we found approximately 110 new putative druggable antiviral targets by merging genome-wide gene-trap insertional mutagenesis, drug-gene network, and bioinformatics data. Furthermore, we have demonstrated the use of a computable representation of genetic testing to effectively identify new potential antiviral indications for existing drugs. In summary, this study presents new and important methodologies for developing broadly active antiviral therapeutics.

8. Ogishima S, Tanaka H, Nakaya J. Modularity in the evolution of yeast protein interaction network. Bioinformation 2015; 11:127-30. [PMID: 25914446 PMCID: PMC4403033 DOI: 10.6026/97320630011127] [Received: 07/13/2014] [Accepted: 07/14/2014] [Indexed: 11/23/2022]
Abstract
Protein interaction networks are known to exhibit remarkable structures: scale-free, small-world, and modular. To explain the evolutionary processes of protein interaction networks possessing scale-free and small-world structures, preferential attachment and duplication-divergence models have been proposed as mathematical models. Protein interaction networks are also known to exhibit another remarkable structural characteristic, modular structure. How did protein interaction networks come to exhibit modularity during their evolution? Here, we propose a hypothesis of modularity in the evolution of the yeast protein interaction network based on molecular evolutionary evidence. We assigned yeast proteins to six evolutionary ages by constructing a phylogenetic profile. We found that almost half of the hub proteins are evolutionarily new. Examining the evolutionary processes of protein complexes, functional modules and topological modules, we also found that member proteins of these modules tend to appear in one or two evolutionary ages. Moreover, proteins in protein complexes and topological modules show significantly lower evolutionary rates than those not in these modules. Our results suggest a hypothesis of modularity in the evolution of the yeast protein interaction network as systems evolution.
Affiliation(s)
- Soichi Ogishima
  - Department of Bioclinical Informatics, Tohoku Medical and Megabank Organization, Tohoku University, Seiryo-cho 4-1, Aoba-ku, Sendai-shi Miyagi 980-8575 Japan
- Hiroshi Tanaka
  - Department of Bioclinical Informatics, Tohoku Medical and Megabank Organization, Tohoku University, Seiryo-cho 4-1, Aoba-ku, Sendai-shi Miyagi 980-8575 Japan
  - Department of Bioinformatics, Tokyo Medical and Dental University, Yushima 1-5-45, Bunkyo-ku, Tokyo 113-8510 Japan
- Jun Nakaya
  - Department of Bioclinical Informatics, Tohoku Medical and Megabank Organization, Tohoku University, Seiryo-cho 4-1, Aoba-ku, Sendai-shi Miyagi 980-8575 Japan

9. Kisslov I, Naamati A, Shakarchy N, Pines O. Dual-targeted proteins tend to be more evolutionarily conserved. Mol Biol Evol 2014; 31:2770-9. [PMID: 25063438 DOI: 10.1093/molbev/msu221] [Indexed: 11/13/2022]
Abstract
In eukaryotic cells, identical proteins can be located in more than a single subcellular compartment, a phenomenon termed dual targeting. We hypothesized that dual-targeted proteins should be more evolutionarily conserved than exclusive mitochondrial proteins, due to separate selective pressures administered by the different compartments to maintain the functions associated with the protein sequences. We employed codon usage bias, propensity for gene loss, phylogenetic relationships, conservation analysis at the DNA level, and gene expression, to test our hypothesis. Our findings indicate that, indeed, dual-targeted proteins are significantly more conserved than their exclusively targeted counterparts. We then used this trait of gene conservation, together with previously identified traits of dual-targeted proteins (such as protein net charge and mitochondrial targeting sequence strength) to 1) create, for the first time (due to addition of conservation parameters), a tool for the prediction of dual-targeted mitochondrial proteins based on protein and mRNA sequences, and 2) show that molecular mechanisms involving one versus two translation products are not correlated with specific dual-targeting parameters. Finally, we discuss what evolutionary pressure maintains protein dual targeting in eukaryotes and deduce, as we initially hypothesized, that it is the discrete functions of these proteins in the different subcellular compartments, regardless of their dual-targeting mechanism.
Affiliation(s)
- Irit Kisslov
  - Department of Microbiology Molecular Genetics, IMRIC, Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
- Adi Naamati
  - Department of Microbiology Molecular Genetics, IMRIC, Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
- Nitzan Shakarchy
  - Department of Microbiology Molecular Genetics, IMRIC, Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
- Ophry Pines
  - Department of Microbiology Molecular Genetics, IMRIC, Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
  - CREATE-NUS-HUJ Cellular & Molecular Mechanisms of Inflammation Program, National University of Singapore, Singapore
| |
Collapse
|
10
|
Cheng F, Jia P, Wang Q, Lin CC, Li WH, Zhao Z. Studying tumorigenesis through network evolution and somatic mutational perturbations in the cancer interactome. Mol Biol Evol 2014; 31:2156-69. [PMID: 24881052 DOI: 10.1093/molbev/msu167] [Citation(s) in RCA: 73] [Impact Index Per Article: 7.3] [Indexed: 12/14/2022]
Abstract
Cells govern biological functions through complex biological networks. Perturbations to networks may drive cells to new phenotypic states, for example, tumorigenesis. Identifying how genetic lesions perturb molecular networks is a fundamental challenge. This study used large-scale human interactome data to systematically explore the relationship among network topology, somatic mutation, evolutionary rate, and evolutionary origin of cancer genes. We found the unique network centrality of cancer proteins, which is largely independent of gene essentiality. Cancer genes likely have experienced a lower evolutionary rate and stronger purifying selection than those of noncancer, Mendelian disease, and orphan disease genes. Cancer proteins tend to have ancient histories, likely originated in early metazoan, although they are younger than proteins encoded by Mendelian disease genes, orphan disease genes, and essential genes. We found that the protein evolutionary origin (age) positively correlates with protein connectivity in the human interactome. Furthermore, we investigated the network-attacking perturbations due to somatic mutations identified from 3,268 tumors across 12 cancer types in The Cancer Genome Atlas. We observed a positive correlation between protein connectivity and the number of nonsynonymous somatic mutations, whereas a weaker or insignificant correlation between protein connectivity and the number of synonymous somatic mutations. These observations suggest that somatic mutational network-attacking perturbations to hub genes play an important role in tumor emergence and evolution. Collectively, this work has broad biomedical implications for both basic cancer biology and the development of personalized cancer therapy.
Affiliation(s)
- Feixiong Cheng
- Department of Biomedical Informatics, Vanderbilt University School of Medicine
- Peilin Jia
- Department of Biomedical Informatics, Vanderbilt University School of Medicine
- Quan Wang
- Department of Biomedical Informatics, Vanderbilt University School of Medicine
- Chen-Ching Lin
- Department of Biomedical Informatics, Vanderbilt University School of Medicine
- Wen-Hsiung Li
- Department of Ecology and Evolution, University of Chicago
- Biodiversity Research Center and Genomics Research Center, Academia Sinica, Taipei, Taiwan
- Zhongming Zhao
- Department of Biomedical Informatics, Vanderbilt University School of Medicine
- Department of Cancer Biology, Vanderbilt University School of Medicine
- Department of Psychiatry, Vanderbilt University School of Medicine
- Center for Quantitative Sciences, Vanderbilt University Medical Center
|
11
|
Chang X, Xu T, Li Y, Wang K. Dynamic modular architecture of protein-protein interaction networks beyond the dichotomy of 'date' and 'party' hubs. Sci Rep 2013; 3:1691. [PMID: 23603706 PMCID: PMC3631766 DOI: 10.1038/srep01691] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Received: 12/31/2012] [Accepted: 03/26/2013] [Indexed: 02/07/2023]
Abstract
The protein-protein interaction (PPI) networks are dynamically organized as modules, and are typically described by hub dichotomy: 'party' hubs act as intramodule hubs and are coexpressed with their partners, yet 'date' hubs act as coordinators among modules and are incoherently expressed with their partners. However, there remains skepticism about the existence of hub dichotomy. Since different algorithms and data sets were used in previous studies to test the model of hub classification, the conclusions may be largely influenced by the potential inherent biases. In this study, we evaluated two data sets of yeast interactome, and systematically investigated the behavior of hubs from multiple perspectives including co-expression patterns, topological roles and functional classifications. Our results revealed consistency between the two data sets, confirming the presence of hub dichotomy. Furthermore, we analyzed a human interactome data set, and demonstrated that the modular architecture of the PPI networks was more complicated than hub dichotomy.
Affiliation(s)
- Xiao Chang
- Zilkha Neurogenetic Institute, Keck School of Medicine, University of Southern California, Los Angeles, USA
|
12
|
Smith JD, McManus KF, Fraser HB. A novel test for selection on cis-regulatory elements reveals positive and negative selection acting on mammalian transcriptional enhancers. Mol Biol Evol 2013; 30:2509-18. [PMID: 23904330 DOI: 10.1093/molbev/mst134] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 12/29/2022]
Abstract
Measuring natural selection on genomic elements involved in the cis-regulation of gene expression--such as transcriptional enhancers and promoters--is critical for understanding the evolution of genomes, yet it remains a major challenge. Many studies have attempted to detect positive or negative selection in these noncoding elements by searching for those with the fastest or slowest rates of evolution, but this can be problematic. Here, we introduce a new approach to this issue, and demonstrate its utility on three mammalian transcriptional enhancers. Using results from saturation mutagenesis studies of these enhancers, we classified all possible point mutations as upregulating, downregulating, or silent, and determined which of these mutations have occurred on each branch of a phylogeny. Applying a framework analogous to Ka/Ks in protein-coding genes, we measured the strength of selection on upregulating and downregulating mutations, in specific branches as well as entire phylogenies. We discovered distinct modes of selection acting on different enhancers: although all three have experienced negative selection against downregulating mutations, the selection pressures on upregulating mutations vary. In one case, we detected positive selection for upregulation, whereas the other two had no detectable selection on upregulating mutations. Our methodology is applicable to the growing number of saturation mutagenesis data sets, and provides a detailed picture of the mode and strength of natural selection acting on cis-regulatory elements.
|
13
|
Ding Y, Shah P, Plotkin JB. Weak 5'-mRNA secondary structures in short eukaryotic genes. Genome Biol Evol 2013; 4:1046-53. [PMID: 23034215 PMCID: PMC3490412 DOI: 10.1093/gbe/evs082] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Indexed: 01/20/2023]
Abstract
Experimental studies of translation have found that short genes tend to exhibit greater densities of ribosomes than long genes in eukaryotic species. It remains an open question whether the elevated ribosome density on short genes is due to faster initiation or slower elongation dynamics. Here, we address this question computationally using 5′-mRNA folding energy as a proxy for translation initiation rates and codon bias as a proxy for elongation rates. We report a significant trend toward reduced 5′-secondary structure in shorter coding sequences, suggesting that short genes initiate faster during translation. We also find a trend toward higher 5′-codon bias in short genes, suggesting that short genes elongate faster than long genes. Both of these trends hold across a diverse set of eukaryotic taxa. Thus, the elevated ribosome density on short eukaryotic genes is likely caused by differential rates of initiation, rather than differential rates of elongation.
Affiliation(s)
- Yang Ding
- Department of Biology, University of Pennsylvania, PA, USA
|
14
|
Kim J, Kim I, Han SK, Bowie JU, Kim S. Network rewiring is an important mechanism of gene essentiality change. Sci Rep 2012. [PMID: 23198090 PMCID: PMC3509348 DOI: 10.1038/srep00900] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Indexed: 01/22/2023]
Abstract
Gene essentiality changes are crucial for organismal evolution. However, it is unclear how essentiality of orthologs varies across species. We investigated the underlying mechanism of gene essentiality changes between yeast and mouse based on the framework of network evolution and comparative genomic analysis. We found that yeast nonessential genes become essential in mouse when their network connections rapidly increase through engagement in protein complexes. The increased interactions allowed the previously nonessential genes to become members of vital pathways. By accounting for changes in gene essentiality, we firmly reestablished the centrality-lethality rule, which proposed the relationship of essential genes and network hubs. Furthermore, we discovered that the number of connections associated with essential and non-essential genes depends on whether they were essential in ancestral species. Our study describes for the first time how network evolution occurs to change gene essentiality.
Affiliation(s)
- Jinho Kim
- Division of Molecular and Life Science, Pohang University of Science and Technology, Pohang 790-784, Korea
|
15
|
Moura GR, Pinheiro M, Freitas A, Oliveira JL, Frommlet JC, Carreto L, Soares AR, Bezerra AR, Santos MAS. Species-specific codon context rules unveil non-neutrality effects of synonymous mutations. PLoS One 2011; 6:e26817. [PMID: 22046369 PMCID: PMC3202573 DOI: 10.1371/journal.pone.0026817] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Received: 04/13/2011] [Accepted: 10/05/2011] [Indexed: 11/18/2022]
Abstract
Background: Codon pair usage (codon context) is a species-specific gene primary structure feature whose evolutionary and functional roles are poorly understood. The data available show that codon-context has direct impact on both translation accuracy and efficiency, but one does not yet understand how it affects these two translation variables or whether context biases shape gene evolution. Methodologies/Principal Findings: Here we study codon-context biases using a set of 72 orthologous highly conserved genes from bacteria, archaea, fungi and high eukaryotes to identify 7 distinct groups of codon context rules. We show that synonymous mutations, i.e., neutral mutations that occur in synonymous codons of codon-pairs, are selected to maintain context biases and that non-synonymous mutations, i.e., non-neutral mutations that alter protein amino acid sequences, are also under selective pressure to preserve codon-context biases. Conclusions: Since in vivo studies provide evidence for a role of codon context on decoding fidelity in E. coli and for decoding efficiency in mammalian cells, our data support the hypothesis that, like codon usage, codon context modulates the evolution of gene primary structure and fine tunes the structure of open reading frames for high genome translational fidelity and efficiency in the 3 domains of life.
Affiliation(s)
- Gabriela R Moura
- RNA Biology Laboratory, Department of Biology and CESAM, University of Aveiro, Aveiro, Portugal.
|
16
|
Wang GZ, Lercher MJ. The effects of network neighbours on protein evolution. PLoS One 2011; 6:e18288. [PMID: 21532755 PMCID: PMC3075247 DOI: 10.1371/journal.pone.0018288] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 09/07/2010] [Accepted: 03/02/2011] [Indexed: 11/19/2022]
Abstract
Interacting proteins may often experience similar selection pressures. Thus, we may expect that neighbouring proteins in biological interaction networks evolve at similar rates. This has been previously shown for protein-protein interaction networks. Similarly, we find correlated rates of evolution of neighbours in networks based on co-expression, metabolism, and synthetic lethal genetic interactions. While the correlations are statistically significant, their magnitude is small, with network effects explaining only between 2% and 7% of the variation. The strongest known predictor of the rate of protein evolution remains expression level. We confirmed the previous observation that similar expression levels of neighbours indeed explain their similar evolution rates in protein-protein networks, and showed that the same is true for metabolic networks. In co-expression and synthetic lethal genetic interaction networks, however, neighbouring genes still show somewhat similar evolutionary rates even after simultaneously controlling for expression level, gene essentiality and gene length. Thus, similar expression levels and related functions (as inferred from co-expression and synthetic lethal interactions) seem to explain correlated evolutionary rates of network neighbours across all currently available types of biological networks.
Affiliation(s)
- Martin J. Lercher
- Institute for Computer Science, Heinrich-Heine-University, Düsseldorf, Germany
|
17
|
Cutter AD, Moses AM. Polymorphism, divergence, and the role of recombination in Saccharomyces cerevisiae genome evolution. Mol Biol Evol 2011; 28:1745-54. [PMID: 21199893 DOI: 10.1093/molbev/msq356] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Indexed: 01/26/2023]
Abstract
A contentious issue in molecular evolution and population genetics concerns the roles of recombination as a facilitator of natural selection and as a potential source of mutational input into genomes. The budding yeast Saccharomyces cerevisiae, in particular, has injected both insights and confusion into this topic, as an early system subject to genomic analysis with subsequent conflicting reports. Here, we revisit the role of recombination in mutation and selection with recent genome-wide maps of population polymorphism and recombination for S. cerevisiae. We confirm that recombination-associated mutation does not leave a genomic signature in yeast and conclude that a previously observed, enigmatic, negative recombination-divergence correlation is largely a consequence of weak selection and other genomic covariates. We also corroborate the presence of biased gene conversion from patterns of polymorphism. Moreover, we identify significant positive relations between recombination and population polymorphism at putatively neutrally evolving sites, independent of other factors and the genomic scale of interrogation. We conclude that widespread natural selection across the yeast genome has left its imprint on segregating genetic variation, but that this signature is much weaker than in Drosophila and Caenorhabditis.
Affiliation(s)
- Asher D Cutter
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada.
|
18
|
Zhou T, Gu W, Wilke CO. Detecting positive and purifying selection at synonymous sites in yeast and worm. Mol Biol Evol 2010; 27:1912-22. [PMID: 20231333 DOI: 10.1093/molbev/msq077] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.7] [Indexed: 12/12/2022]
Abstract
We present a new computational method to identify positive and purifying selection at synonymous sites in yeast and worm. We define synonymous substitutions that change codons from preferred to unpreferred or vice versa as nonconservative synonymous substitutions and all other substitutions as conservative. Using a maximum-likelihood framework, we then test whether conservative and nonconservative synonymous substitutions occur at equal rates. Our approach replaces the standard rate of synonymous substitutions per synonymous site, dS, with two new rates, the conservative synonymous substitution rate (dS(C)) and the nonconservative synonymous substitution rate (dS(N)). Based on the ratio dS(N)/dS(C), we find that 0.05% of all yeast genes and none of worm genes show evidence of positive selection at synonymous sites (dS(N)/dS(C) > 1). On the other hand, 9.44% of all yeast genes and 5.12% of all worm genes show evidence of significant purifying selection on synonymous sites (dS(N)/dS(C) < 1). We also find that dS(N) correlates strongly with gene expression level, whereas the correlation between expression level and dS(C) is very weak. Thus, dS(N) captures most of the signal of selection for translational accuracy and speed, whereas dS(C) is not strongly influenced by this selection pressure. We suggest that the ratio dN/dS(C) may be more appropriate than the ratio dN/dS to identify positive or purifying selection on amino acids.
Affiliation(s)
- Tong Zhou
- Center for Computational Biology and Bioinformatics, University of Texas at Austin, TX, USA
|
19
|
He J, Deem MW. Hierarchical evolution of animal body plans. Dev Biol 2010; 337:157-61. [DOI: 10.1016/j.ydbio.2009.09.038] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 06/04/2009] [Revised: 08/15/2009] [Accepted: 09/24/2009] [Indexed: 11/28/2022]
|
20
|
Jovelin R. Rapid sequence evolution of transcription factors controlling neuron differentiation in Caenorhabditis. Mol Biol Evol 2009; 26:2373-86. [PMID: 19589887 DOI: 10.1093/molbev/msp142] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 11/15/2022]
Abstract
Whether phenotypic evolution proceeds predominantly through changes in regulatory sequences is a controversial issue in evolutionary genetics. Ample evidence indicates that the evolution of gene regulatory networks via changes in cis-regulatory sequences is an important determinant of phenotypic diversity. However, recent experimental work suggests that the role of transcription factor (TF) divergence in developmental evolution may be underestimated. In order to help understand what levels of constraints are acting on the coding sequence of developmental regulatory genes, evolutionary rates were investigated among 48 TFs required for neuronal development in Caenorhabditis elegans. Allelic variation was then sampled for 28 of these genes within a population of the related species Caenorhabditis remanei. Neuronal TFs are more divergent, both within and between species, than structural genes. TFs affecting different neuronal classes are under different levels of selective constraints. The regulatory genes controlling the differentiation of chemosensory neurons evolve particularly fast and exhibit higher levels of within- and between-species nucleotide variation than TFs required for the development of several neuronal classes and TFs required for motorneuron differentiation. The TFs affecting chemosensory neuron development are also more divergent than chemosensory genes expressed in the neurons they differentiate. These results illustrate that TFs are not as highly constrained as commonly thought and suggest that the role of divergence in developmental regulatory genes during the evolution of gene regulatory networks requires further attention.
Affiliation(s)
- Richard Jovelin
- Center for Ecology and Evolutionary Biology, University of Oregon, Oregon, USA.
|
21
|
Xia Y, Franzosa EA, Gerstein MB. Integrated assessment of genomic correlates of protein evolutionary rate. PLoS Comput Biol 2009; 5:e1000413. [PMID: 19521505 PMCID: PMC2688033 DOI: 10.1371/journal.pcbi.1000413] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.6] [Received: 09/16/2008] [Accepted: 05/12/2009] [Indexed: 12/25/2022]
Abstract
Rates of evolution differ widely among proteins, but the causes and consequences of such differences remain under debate. With the advent of high-throughput functional genomics, it is now possible to rigorously assess the genomic correlates of protein evolutionary rate. However, dissecting the correlations among evolutionary rate and these genomic features remains a major challenge. Here, we use an integrated probabilistic modeling approach to study genomic correlates of protein evolutionary rate in Saccharomyces cerevisiae. We measure and rank degrees of association between (i) an approximate measure of protein evolutionary rate with high genome coverage, and (ii) a diverse list of protein properties (sequence, structural, functional, network, and phenotypic). We observe, among many statistically significant correlations, that slowly evolving proteins tend to be regulated by more transcription factors, deficient in predicted structural disorder, involved in characteristic biological functions (such as translation), biased in amino acid composition, and are generally more abundant, more essential, and enriched for interaction partners. Many of these results are in agreement with recent studies. In addition, we assess information contribution of different subsets of these protein properties in the task of predicting slowly evolving proteins. We employ a logistic regression model on binned data that is able to account for intercorrelation, non-linearity, and heterogeneity within features. Our model considers features both individually and in natural ensembles (“meta-features”) in order to assess joint information contribution and degree of contribution independence. Meta-features based on protein abundance and amino acid composition make strong, partially independent contributions to the task of predicting slowly evolving proteins; other meta-features make additional minor contributions. The combination of all meta-features yields predictions comparable to those based on paired species comparisons, and approaching the predictive limit of optimal lineage-insensitive features. Our integrated assessment framework can be readily extended to other correlational analyses at the genome scale.
Author Summary
Proteins encoded within a given genome are known to evolve at drastically different rates. Through recent large-scale studies, researchers have measured a wide variety of properties for all proteins in yeast. We are interested to know how these properties relate to one another and to what extent they explain evolutionary rate variation. Protein properties are a heterogeneous mix, a factor which complicates research in this area. For example, some properties (e.g., protein abundance) are numerical, while others (e.g., protein function) are descriptive; protein properties may also suffer from noise and hidden redundancies. We have addressed these issues within a flexible and robust statistical framework. We first ranked a large list of protein properties by the strength of their relationships with evolutionary rate; this confirms many known evolutionary relationships and also highlights several new ones. Similar protein properties were then grouped and applied to predict slowly evolving proteins. Some of these groups were as effective as paired species comparison in making correct predictions, although in both cases a great deal of evolutionary rate variation remained to be explained. Our work has helped to refine the set of protein properties that researchers should consider as they investigate the mechanisms underlying protein evolution.
Affiliation(s)
- Yu Xia
- Bioinformatics Program, Boston University, Boston, Massachusetts, USA.
|
22
|
Tuller T, Kupiec M, Ruppin E. Co-evolutionary networks of genes and cellular processes across fungal species. Genome Biol 2009; 10:R48. [PMID: 19416514 PMCID: PMC2718514 DOI: 10.1186/gb-2009-10-5-r48] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 02/24/2009] [Revised: 02/24/2009] [Accepted: 05/05/2009] [Indexed: 12/29/2022]
Abstract
BACKGROUND: The introduction of measures such as evolutionary rate and propensity for gene loss has significantly advanced our knowledge of the evolutionary history and selection forces acting upon individual genes and cellular processes. RESULTS: We present two new measures, the 'relative evolutionary rate pattern' (rERP), which records the relative evolutionary rates of conserved genes across the different branches of a species' phylogenetic tree, and the 'copy number pattern' (CNP), which quantifies the rate of gene loss of less conserved genes. Together, these measures yield a high-resolution study of the co-evolution of genes in 9 fungal species, spanning 3,540 sets of orthologs. We find that the evolutionary tempo of conserved genes varies in different evolutionary periods. The co-evolution of genes' Gene Ontology categories exhibits a significant correlation with their functional distance in the Gene Ontology hierarchy, but not with their location on chromosomes, showing that cellular functions are a more important driving force in gene co-evolution than their chromosomal proximity. Two fundamental patterns of co-evolution of conserved genes, cooperative and reciprocal, are identified; only genes co-evolving cooperatively functionally back each other up. The co-evolution of conserved and less conserved genes exhibits both commonalities and differences; DNA metabolism is positively correlated with nuclear traffic, transcription processes and vacuolar biology in both analyses. CONCLUSIONS: Overall, this study charts the first global network view of gene co-evolution in fungi. The future application of the approach presented here to other phylogenetic trees holds much promise in characterizing the forces that shape cellular co-evolution.
Affiliation(s)
- Tamir Tuller
- School of Computer Sciences, Tel Aviv University, Ramat Aviv 69978, Israel
- Department of Molecular Microbiology and Biotechnology, Tel Aviv University, Ramat Aviv 69978, Israel
- School of Medicine, Tel Aviv University, Ramat Aviv 69978, Israel
- Martin Kupiec
- Department of Molecular Microbiology and Biotechnology, Tel Aviv University, Ramat Aviv 69978, Israel
- Eytan Ruppin
- School of Computer Sciences, Tel Aviv University, Ramat Aviv 69978, Israel
- School of Medicine, Tel Aviv University, Ramat Aviv 69978, Israel
|
23
|
Evolutionary rates and centrality in the yeast gene regulatory network. Genome Biol 2009; 10:R35. [PMID: 19358738 PMCID: PMC2688926 DOI: 10.1186/gb-2009-10-4-r35] [Citation(s) in RCA: 63] [Impact Index Per Article: 4.2] [Received: 10/09/2008] [Accepted: 04/09/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Transcription factors play a fundamental role in regulating physiological responses and developmental processes. Here we examine the evolution of the yeast transcription factors in the context of the structure of the gene regulatory network. RESULTS: In contrast to previous results for the protein-protein interaction and metabolic networks, we find that the position of a gene within the transcription network affects the rate of protein evolution such that more central transcription factors tend to evolve faster. Centrality is also positively correlated with expression variability, suggesting that the higher rate of divergence among central transcription factors may be due to their role in controlling information flow and may be the result of adaptation to changing environmental conditions. Alternatively, more central transcription factors could be more buffered against environmental perturbations and, therefore, less subject to strong purifying selection. Importantly, the relationship between centrality and evolutionary rates is independent of expression level, expression variability and gene essentiality. CONCLUSIONS: Our analysis of the transcription network highlights the role of network structure on protein evolutionary rate. Further, the effect of network centrality on nucleotide divergence is different among the metabolic, protein-protein and transcriptional networks, suggesting that the effect of gene position is dependent on the function of the specific network under study. A better understanding of how these three cellular networks interact with one another may be needed to fully examine the impact of network structure on the function and evolution of biological systems.
|
24
|
Vinogradov AE, Anatskaya OV. Loss of protein interactions and regulatory divergence in yeast whole-genome duplicates. Genomics 2009; 93:534-42. [PMID: 19272438 DOI: 10.1016/j.ygeno.2009.02.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 08/12/2008] [Revised: 02/26/2009] [Accepted: 02/27/2009] [Indexed: 11/19/2022]
Abstract
Whole-genome duplications are important for the growth of genome complexity. We investigated various factors involved in the evolution of yeast whole-genome duplicates (ohnologs) making emphasis on the analysis of protein interactions. We found that ohnologs have a lower number of protein interactions compared with small-scale duplicates and singletons (by about -40%). The loss of interactions was proportional to their initial number and independent of ohnolog position in the protein interaction network. A faster evolving member of an ohnolog pair has a lower number of interactions compared to its counterpart. The Gene Ontology mapping of non-overlapping and overlapping interactants of paired ohnologs reveals a sharp asymmetry in GO terms related to regulation. The fraction of these terms is much higher in non-overlapping interactants (compared to overlapping interactants and total dataset). Network clustering coefficient is lower in ohnologs, yet they show an increased density of protein interactions restricted within the whole ohnologs set. These facts suggest that subfunctionalization (or subneofunctionalization) reflected in the loss of protein interactions was a prevailing process in the divergence of ohnologs, which distinguishes them from small-scale duplicates. The loss of protein interactions was associated with the regulatory divergence between the members of an ohnolog pair. A small-scale modularity (reflected in clustering coefficient) probably was not important for ohnologs retention, yet a larger-scale modularity could be involved in their evolution.
Affiliation(s)
- Alexander E Vinogradov
- Institute of Cytology, Russian Academy of Sciences, Tikhoretsky Ave. 4, St. Petersburg 194064, Russia.
25
He J, Sun J, Deem MW. Spontaneous emergence of modularity in a model of evolving individuals and in real networks. Phys Rev E Stat Nonlin Soft Matter Phys 2009; 79:031907. [PMID: 19391971 DOI: 10.1103/physreve.79.031907] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 06/23/2008] [Revised: 01/28/2009] [Indexed: 05/27/2023]
Abstract
We investigate the selective forces that promote the emergence of modularity in nature. We demonstrate the spontaneous emergence of modularity in a population of individuals that evolve in a changing environment. We show that the level of modularity correlates with the rapidity and severity of environmental change. The modularity arises as a synergistic response to the noise in the environment in the presence of horizontal gene transfer. We suggest that the hierarchical structure observed in the natural world may be a broken symmetry state, which generically results from evolution in a changing environment. To support our results, we analyze experimental protein interaction data and show that protein interaction networks became increasingly modular as evolution proceeded over the last four billion years. We also discuss a method to determine the divergence time of a protein.
Affiliation(s)
- Jiankui He
- Departments of Physics and Astronomy and Bioengineering, Rice University, Houston, Texas 77005, USA
26
Fokkens L, Snel B. Cohesive versus flexible evolution of functional modules in eukaryotes. PLoS Comput Biol 2009; 5:e1000276. [PMID: 19180181 PMCID: PMC2615111 DOI: 10.1371/journal.pcbi.1000276] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Received: 09/12/2008] [Accepted: 12/16/2008] [Indexed: 12/02/2022]
Abstract
Although functionally related proteins can be reliably predicted from phylogenetic profiles, many functional modules do not seem to evolve cohesively according to case studies and systematic analyses in prokaryotes. In this study we quantify the extent of evolutionary cohesiveness of functional modules in eukaryotes and probe the biological and methodological factors influencing our estimates. We have collected various datasets of protein complexes and pathways in Saccharomyces cerevisiae. We define orthologous groups on 34 eukaryotic genomes and measure the extent of cohesive evolution of sets of orthologous groups of which members constitute a known complex or pathway. Within this framework it appears that most functional modules evolve flexibly rather than cohesively. Even after correcting for uncertain module definitions and potentially problematic orthologous groups, only 46% of pathways and complexes evolve more cohesively than random modules. This flexibility seems partly coupled to the nature of the functional module because biochemical pathways are generally more cohesively evolving than complexes.
Components of a protein complex or a metabolic pathway strongly cooperate to perform a specific function. Because of this functional interdependence, proteins that form a complex or pathway are expected to be present and absent together in different species. Phylogenetic profiling methods, in which proteins with similar presence and absence patterns are inferred to be functionally linked, are based on this assumption. In this report, we quantify to what extent proteins that together constitute a complex or pathway (a functional module) in yeast are present and absent together (evolve cohesively) in other eukaryotic species. We find that more than half of all complexes and pathways are only partially present in a number of species. It appears that evolution of functional modules is very flexible; components are not indispensable; they can be replaced or reused in a different functional context. This places a limit on how well phylogenetic profiling methods can detect functionally related proteins. Functional modules that evolve cohesively are typically involved in biological processes such as translation and amino acid metabolism.
Affiliation(s)
- Like Fokkens
- Theoretical Biology and Bioinformatics, Utrecht University, Utrecht, The Netherlands.
27
Wong P, Althammer S, Hildebrand A, Kirschner A, Pagel P, Geissler B, Smialowski P, Blöchl F, Oesterheld M, Schmidt T, Strack N, Theis FJ, Ruepp A, Frishman D. An evolutionary and structural characterization of mammalian protein complex organization. BMC Genomics 2008; 9:629. [PMID: 19108706 PMCID: PMC2645396 DOI: 10.1186/1471-2164-9-629] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 07/17/2008] [Accepted: 12/23/2008] [Indexed: 12/25/2022]
Abstract
Background: We have recently released a comprehensive, manually curated database of mammalian protein complexes called CORUM. Combining CORUM with other resources, we assembled a dataset of over 2700 mammalian complexes. The availability of a rich information resource allows us to search for organizational properties concerning these complexes. Results: As the complexity of a protein complex in terms of the number of unique subunits increases, we observed that the number of such complexes and the mean non-synonymous to synonymous substitution ratio of associated genes tend to decrease. Similarly, as the number of different complexes a given protein participates in increases, the number of such proteins and the substitution ratio of the associated gene also tend to decrease. These observations provide evidence relating natural selection and the organization of mammalian complexes. We also observed greater homogeneity in terms of predicted protein isoelectric points, secondary structure and substitution ratio in annotated versus randomly generated complexes. A large proportion of the protein content and interactions in the complexes could be predicted from known binary protein-protein and domain-domain interactions. In particular, we found that large proteins interact preferentially with much smaller proteins. Conclusion: We observed similar trends in yeast and other data. Our results support the existence of conserved relations associated with the mammalian protein complexes.
Affiliation(s)
- Philip Wong
- Helmholtz Center Munich-German Research Center for Environmental Health (GmbH), Institute of Bioinformatics and Systems Biology, Ingolstädter Landstrasse 1, Neuherberg, Germany.
28
Abstract
Positive selection for protein function can lead to multiple mutations within a small stretch of DNA, i.e., to a cluster of mutations. Recently, Wagner proposed a method to detect such mutation clusters. His method, however, did not take into account that residues with high solvent accessibility are inherently more variable than residues with low solvent accessibility. Here, we propose a new algorithm to detect clustered evolution. Our algorithm controls for different substitution probabilities at buried and exposed sites in the tertiary protein structure, and uses random permutations to calculate accurate P values for inferred clusters. We apply the algorithm to genomes of bacteria, fly, and mammals, and find several clusters of mutations in functionally important regions of proteins. Surprisingly, clustered evolution is a relatively rare phenomenon. Only between 2% and 10% of the genes we analyze contain a statistically significant mutation cluster. We also find that not controlling for solvent accessibility leads to an excess of clusters in terminal and solvent-exposed regions of proteins. Our algorithm provides a novel method to identify functionally relevant divergence between groups of species. Moreover, it could also be useful to detect artifacts in automatically assembled genomes.
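As a rough illustration of the permutation idea described in this abstract, the Python sketch below scores a protein by the largest number of substituted sites in any sliding window, then estimates a P value by shuffling substitutions separately among buried and exposed sites. The window statistic, the two-class treatment of solvent accessibility, and all function and parameter names are assumptions made for illustration; this is not the authors' published algorithm.

```python
import random


def max_window_count(mutated, window):
    """Largest number of substituted sites in any run of `window` consecutive sites."""
    if len(mutated) <= window:
        return sum(mutated)
    count = sum(mutated[:window])
    best = count
    for i in range(window, len(mutated)):
        # Slide the window one site to the right.
        count += mutated[i] - mutated[i - window]
        best = max(best, count)
    return best


def cluster_p_value(mutated, exposed, window=5, n_perm=2000, seed=0):
    """Permutation P value for the densest window (hypothetical simplification).

    mutated: list of 0/1 flags, one per site (1 = substitution observed).
    exposed: list of booleans, one per site (True = solvent exposed).
    Substitutions are shuffled only within their own accessibility class,
    so buried and exposed sites keep their own substitution counts.
    """
    rng = random.Random(seed)
    observed = max_window_count(mutated, window)

    # Group site indices by accessibility class.
    classes = {}
    for i, is_exposed in enumerate(exposed):
        classes.setdefault(is_exposed, []).append(i)

    hits = 0
    for _ in range(n_perm):
        shuffled = [0] * len(mutated)
        for indices in classes.values():
            values = [mutated[i] for i in indices]
            rng.shuffle(values)
            for i, v in zip(indices, values):
                shuffled[i] = v
        if max_window_count(shuffled, window) >= observed:
            hits += 1

    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy example: five adjacent substitutions in a 40-site protein.
toy_mutated = [1 if 10 <= i <= 14 else 0 for i in range(40)]
toy_exposed = [i % 2 == 0 for i in range(40)]
print(cluster_p_value(toy_mutated, toy_exposed))
```

Shuffling within accessibility classes is what keeps a naturally variable, exposed stretch from being scored against a null that assumes every site is equally likely to change.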
Affiliation(s)
- Tong Zhou
- Center for Computational Biology and Bioinformatics, Section of Integrative Biology, University of Texas at Austin, Austin, Texas, United States of America
- Peter J. Enyeart
- Institute for Cell and Molecular Biology, University of Texas at Austin, Austin, Texas, United States of America
- Claus O. Wilke
- Center for Computational Biology and Bioinformatics, Section of Integrative Biology, University of Texas at Austin, Austin, Texas, United States of America
- Institute for Cell and Molecular Biology, University of Texas at Austin, Austin, Texas, United States of America
29
High nucleotide divergence in developmental regulatory genes contrasts with the structural elements of olfactory pathways in Caenorhabditis. Genetics 2008; 181:1387-97. [PMID: 19001295 DOI: 10.1534/genetics.107.082651] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Indexed: 11/18/2022]
Abstract
Almost all organismal function is controlled by pathways composed of interacting genetic components. The relationship between pathway structure and the evolution of individual pathway components is not completely understood. For the nematode Caenorhabditis elegans, chemosensory pathways regulate critical aspects of an individual's life history and development. To help understand how olfaction evolves in Caenorhabditis and to examine patterns of gene evolution within transduction pathways in general, we analyzed nucleotide variation within and between species across two well-characterized olfactory pathways, including regulatory genes controlling the fate of the cells in which the pathways are expressed. In agreement with previous studies, we found much higher levels of polymorphism within C. remanei than within the related species C. elegans and C. briggsae. There are significant differences in the rates of nucleotide evolution for genes across the two pathways but no particular association between evolutionary rate and gene position, suggesting that the evolution of functional pathways must be considered within the context of broader gene network structure. However, developmental regulatory genes show both higher levels of divergence and polymorphism than the structural genes of the pathway. These results show that, contrary to the emerging paradigm in the evolution of development, important structural changes can accumulate in transcription factors.
30
Amills M, Ramírez O, Tomàs A, Obexer-Ruff G, Vidal O. Positive selection on mammalian MHC-DQ genes revisited from a multispecies perspective. Genes Immun 2008; 9:651-8. [PMID: 18685643 DOI: 10.1038/gene.2008.62] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Indexed: 02/02/2023]
Abstract
Major histocompatibility complex class II DQA and DQB genes have been shown to be under positive selection in certain mammalian species but not in others, fuelling a debate about how their polymorphism has evolved. In this study, we have analysed whether polymorphism in the peptide-binding region (PBR) of DQA (190 sequences, 11 species) and DQB (209 sequences, 7 species) molecules is positively selected by using both approximate (Nei-Gojobori, Li-Wu-Luo and Pamilo-Bianchi-Li) and maximum-likelihood methods. The results obtained with approximate methods were rather inconsistent for DQA, probably due to the high inaccuracy with which d(S) (PBR) is estimated, whereas evidence of positive selection was observed for most of the DQB PBR sequences. A parallel analysis with CodeML allowed us to demonstrate, in a very consistent way, the occurrence of positive selection in the PBR-encoding region of both DQA and DQB genes. Moreover, we have identified several DQA (alpha47, alpha55, alpha56, alpha68, alpha69, alpha76 and alpha79) and DQB (beta9, beta26 and beta57) codons that appear to be under positive selection in different, and often unrelated, mammalian species. Non-synonymous polymorphism at these sites has been evolutionarily conserved meaning that it might have functional consequences on peptide binding.
Affiliation(s)
- M Amills
- Departament de Ciència Animal i dels Aliments, Facultat de Veterinària, Universitat Autònoma de Barcelona, Bellaterra, Spain.
31
Abstract
To find the most rapidly evolving regions in the yeast genome we compared most of chromosome III from three closely related lineages of the wild yeast Saccharomyces paradoxus. Unexpectedly, the centromere appears to be the fastest-evolving part of the chromosome, evolving even faster than DNA sequences unlikely to be under selective constraint (i.e., synonymous sites after correcting for codon usage bias and remnant transposable elements). Centromeres on other chromosomes also show an elevated rate of nucleotide substitution. Rapid centromere evolution has also been reported for some plants and animals and has been attributed to selection for inclusion in the egg or the ovule at female meiosis. But Saccharomyces yeasts have symmetrical meioses with all four products surviving, thus providing no opportunity for meiotic drive. In addition, yeast centromeres show the high levels of polymorphism expected under a neutral model of molecular evolution. We suggest that yeast centromeres suffer an elevated rate of mutation relative to other chromosomal regions and they change through a process of "centromere drift," not drive.
32
Warden CD, Kim SH, Yi SV. Predicted functional RNAs within coding regions constrain evolutionary rates of yeast proteins. PLoS One 2008; 3:e1559. [PMID: 18270559 PMCID: PMC2216430 DOI: 10.1371/journal.pone.0001559] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 08/01/2007] [Accepted: 12/30/2007] [Indexed: 11/25/2022]
Abstract
Functional RNAs (fRNAs) are being recognized as an important regulatory component in biological processes. Interestingly, recent computational studies suggest that the number and biological significance of functional RNAs within coding regions (coding fRNAs) may have been underestimated. We hypothesized that such coding fRNAs will impose additional constraint on sequence evolution because the DNA primary sequence has to simultaneously code for functional RNA secondary structures on the messenger RNA in addition to the amino acid codons for the protein sequence. To test this prediction, we first utilized computational methods to predict conserved fRNA secondary structures within multiple species alignments of Saccharomyces sensu stricto genomes. We predict that as much as 5% of the genes in the yeast genome contain at least one functional RNA secondary structure within their protein-coding region. We then analyzed the impact of coding fRNAs on the evolutionary rate of protein-coding genes because a decrease in evolutionary rate implies constraint due to biological functionality. We found that our predicted coding fRNAs have a significant influence on evolutionary rates (especially at synonymous sites), independent of other functional measures. Thus, coding fRNA may play a role in sequence evolution. Given that coding regions of humans and flies contain many more predicted coding fRNAs than yeast, the impact of coding fRNAs on sequence evolution may be substantial in genomes of higher eukaryotes.
Affiliation(s)
- Charles D. Warden
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Seong-Ho Kim
- Division of Biostatistics, Indiana University School of Medicine, Indianapolis, Indiana, United States of America
- Soojin V. Yi
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America
33
Bertin N, Simonis N, Dupuy D, Cusick ME, Han JDJ, Fraser HB, Roth FP, Vidal M. Confirmation of organized modularity in the yeast interactome. PLoS Biol 2008; 5:e153. [PMID: 17564493 PMCID: PMC1892830 DOI: 10.1371/journal.pbio.0050153] [Citation(s) in RCA: 82] [Impact Index Per Article: 5.1] [Indexed: 01/30/2023]
Affiliation(s)
- Hunter B Fraser
- * To whom correspondence should be addressed. E-mail: (MV); (HBF); (FPR)
- Frederick P Roth
- * To whom correspondence should be addressed. E-mail: (MV); (HBF); (FPR)
- Marc Vidal
- * To whom correspondence should be addressed. E-mail: (MV); (HBF); (FPR)
34
Batada NN, Reguly T, Breitkreutz A, Boucher L, Breitkreutz BJ, Hurst LD, Tyers M. Still stratus not altocumulus: further evidence against the date/party hub distinction. PLoS Biol 2008; 5:e154. [PMID: 17564494 PMCID: PMC1892831 DOI: 10.1371/journal.pbio.0050154] [Citation(s) in RCA: 89] [Impact Index Per Article: 5.6] [Indexed: 11/18/2022]
Abstract
Analysis of multi-validated protein interaction data reveals networks with greater interconnectivity than the more segregated structures seen in previously available data. To help visualize this, the authors draw comparisons between continuous stratus clouds and altocumulus clouds.
Affiliation(s)
- Nizar N Batada
- * To whom correspondence should be addressed. E-mail: (NNB); (LDH); (MT)
- Laurence D Hurst
- * To whom correspondence should be addressed. E-mail: (NNB); (LDH); (MT)
- Mike Tyers
- * To whom correspondence should be addressed. E-mail: (NNB); (LDH); (MT)
35
Heger A, Ponting CP. Variable strength of translational selection among 12 Drosophila species. Genetics 2007; 177:1337-48. [PMID: 18039870 PMCID: PMC2147958 DOI: 10.1534/genetics.107.070466] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.1] [Received: 01/04/2007] [Accepted: 09/05/2007] [Indexed: 01/06/2023]
Abstract
Codon usage bias in Drosophila melanogaster genes has been attributed to negative selection of those codons whose cellular tRNA abundance restricts rates of mRNA translation. Previous studies, which involved limited numbers of genes, can now be compared against analyses of the entire gene complements of 12 Drosophila species whose genome sequences have become available. Using large numbers (6138) of orthologs represented in all 12 species, we establish that the codon preferences of more closely related species are better correlated. Differences between codon usage biases are attributed, in part, to changes in mutational biases. These biases are apparent from the strong correlation (r = 0.92, P < 0.001) among these genomes' intronic G + C contents and exonic G + C contents at degenerate third codon positions. To perform a cross-species comparison of selection on codon usage, while accounting for changes in mutational biases, we calibrated each genome in turn using the codon usage bias indices of highly expressed ribosomal protein genes. The strength of translational selection was predicted to have varied between species largely according to their phylogeny, with the D. melanogaster group species exhibiting the strongest degree of selection.
Affiliation(s)
- Andreas Heger
- MRC Functional Genetics Unit, Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford OX1 3QX, United Kingdom.
36
Komurov K, White M. Revealing static and dynamic modular architecture of the eukaryotic protein interaction network. Mol Syst Biol 2007; 3:110. [PMID: 17453049 PMCID: PMC1865589 DOI: 10.1038/msb4100149] [Citation(s) in RCA: 102] [Impact Index Per Article: 6.0] [Received: 09/22/2006] [Accepted: 03/14/2007] [Indexed: 01/09/2023]
Abstract
In an effort to understand the dynamic organization of the protein interaction network and its role in the regulation of cell behavior, positioning of proteins into specific network localities was studied with respect to their expression dynamics. First, we find that constitutively expressed and dynamically co-regulated proteins cluster in distinct functionally specialized network neighborhoods to form static and dynamic functional modules, respectively. Then, we show that whereas dynamic modules are mainly responsible for condition-dependent regulation of cell behavior, static modules provide robustness to the cell against genetic perturbations or protein expression noise, and therefore may act as buffers of evolutionary as well as population variations in cell behavior. Observations in this study refine the previously proposed model of dynamic modularity in the protein interaction network, and propose a link between the evolution of gene expression regulation and biological robustness.
Affiliation(s)
- Kakajan Komurov
- Department of Cell Biology, University of Texas Southwestern Medical Center, Dallas, TX 75390-9039, USA.
37
Batada NN, Reguly T, Breitkreutz A, Boucher L, Breitkreutz BJ, Hurst LD, Tyers M. Stratus not altocumulus: a new view of the yeast protein interaction network. PLoS Biol 2007; 4:e317. [PMID: 16984220 PMCID: PMC1569888 DOI: 10.1371/journal.pbio.0040317] [Citation(s) in RCA: 167] [Impact Index Per Article: 9.8] [Received: 04/04/2006] [Accepted: 07/25/2006] [Indexed: 11/18/2022]
Abstract
Systems biology approaches can reveal intermediary levels of organization between genotype and phenotype that often underlie biological phenomena such as polygenic effects and protein dispensability. An important conceptualization is the module, which is loosely defined as a cohort of proteins that perform a dedicated cellular task. Based on a computational analysis of limited interaction datasets in the budding yeast Saccharomyces cerevisiae, it has been suggested that the global protein interaction network is segregated such that highly connected proteins, called hubs, tend not to link to each other. Moreover, it has been suggested that hubs fall into two distinct classes: “party” hubs are co-expressed and co-localized with their partners, whereas “date” hubs interact with incoherently expressed and diversely localized partners, and thereby cohere disparate parts of the global network. This structure may be compared with altocumulus clouds, i.e., cotton ball–like structures sparsely connected by thin wisps. However, this organization might reflect a small and/or biased sample set of interactions. In a multi-validated high-confidence (HC) interaction network, assembled from all extant S. cerevisiae interaction data, including recently available proteome-wide interaction data and a large set of reliable literature-derived interactions, we find that hub–hub interactions are not suppressed. In fact, the number of interactions a hub has with other hubs is a good predictor of whether a hub protein is essential or not. We find that date hubs are neither required for network tolerance to node deletion, nor do date hubs have distinct biological attributes compared to other hubs. Date and party hubs do not, for example, evolve at different rates. 
Our analysis suggests that the organization of global protein interaction network is highly interconnected and hence interdependent, more like the continuous dense aggregations of stratus clouds than the segregated configuration of altocumulus clouds. If the network is configured in a stratus format, cross-talk between proteins is potentially a major source of noise. In turn, control of the activity of the most highly connected proteins may be vital. Indeed, we find that a fluctuation in steady-state levels of the most connected proteins is minimized. Analysis of multi-validated protein interaction data reveals networks with greater interconnectivity than the more segregated structures seen in previously available data. To help visualize this, the authors draw comparisons between continuous stratus clouds and altocumulus clouds.
Affiliation(s)
- Nizar N Batada
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Canada
- Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- * To whom correspondence should be addressed. E-mail: (NNB); (LDH); (MT)
- Teresa Reguly
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Canada
- Ashton Breitkreutz
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Canada
- Lorrie Boucher
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Canada
- Department of Medical Genetics and Microbiology, University of Toronto, Toronto, Canada
- Laurence D Hurst
- Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- * To whom correspondence should be addressed. E-mail: (NNB); (LDH); (MT)
- Mike Tyers
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Canada
- Department of Medical Genetics and Microbiology, University of Toronto, Toronto, Canada
- * To whom correspondence should be addressed. E-mail: (NNB); (LDH); (MT)
38
Kim SH, Yi SV. Understanding relationship between sequence and functional evolution in yeast proteins. Genetica 2006; 131:151-6. [PMID: 17160620 DOI: 10.1007/s10709-006-9125-2] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.9] [Received: 07/25/2006] [Accepted: 11/09/2006] [Indexed: 10/23/2022]
Abstract
The underlying relationship between functional variables and sequence evolutionary rates is often assessed by partial correlation analysis. However, this strategy is impeded by the difficulty of conducting meaningful statistical analysis using noisy biological data. A recent study suggested that the partial correlation analysis is misleading when data is noisy and that the principal component regression analysis is a better tool to analyze biological data. In this paper, we evaluate how these two statistical tools (partial correlation and principal component regression) perform when data are noisy. Contrary to the earlier conclusion, we found that these two tools perform comparably in most cases. Furthermore, when there is more than one 'true' independent variable, partial correlation analysis delivers a better representation of the data. Employing both tools may provide a more complete and complementary representation of the real data. In this light, and with new analyses, we suggest that protein length and gene dispensability play significant, independent roles in yeast protein evolution.
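For readers unfamiliar with the first of the two tools compared here, a first-order partial correlation can be sketched in a few lines of Python. The formula is the standard one for three variables; the function names and the toy data are illustrative assumptions and do not come from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data (hypothetical): x and y are each z plus independent noise,
# so their raw correlation is driven entirely by the shared variable z.
z = [-1, -1, 1, 1]
x = [0, -2, 0, 2]
y = [0, -2, 2, 0]
print(pearson(x, y))          # raw correlation
print(partial_corr(x, y, z))  # correlation with z held fixed
```

In this toy case the raw correlation is 0.5 while the partial correlation is essentially zero, which is the kind of confounding that such an analysis of evolutionary rate against several functional variables is meant to expose.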
Affiliation(s)
- Seong-Ho Kim
- School of Biology, Georgia Institute of Technology, 310 Ferst Drive, Atlanta, GA 30332, USA.
39
Plotkin JB, Dushoff J, Desai MM, Fraser HB. Codon usage and selection on proteins. J Mol Evol 2006; 63:635-53. [PMID: 17043750 DOI: 10.1007/s00239-005-0233-x] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Received: 10/03/2005] [Accepted: 06/12/2006] [Indexed: 10/24/2022]
Abstract
Selection pressures on proteins are usually measured by comparing homologous nucleotide sequences (Zuckerkandl and Pauling 1965). Recently we introduced a novel method, termed volatility, to estimate selection pressures on proteins on the basis of their synonymous codon usage (Plotkin and Dushoff 2003; Plotkin et al. 2004). Here we provide a theoretical foundation for this approach. Under the Fisher-Wright model, we derive the expected frequencies of synonymous codons as a function of the strength of selection on amino acids, the mutation rate, and the effective population size. We analyze the conditions under which we can expect to draw inferences from biased codon usage, and we estimate the time scales required to establish and maintain such a signal. We find that synonymous codon usage can reliably distinguish between negative selection and neutrality only for organisms, such as some microbes, that experience large effective population sizes or periods of elevated mutation rates. The power of volatility to detect positive selection is also modest--requiring approximately 100 selected sites--but it depends less strongly on population size. We show that phenomena such as transient hyper-mutators can improve the power of volatility to detect selection, even when the neutral site heterozygosity is low. We also discuss several confounding factors, neglected by the Fisher-Wright model, that may limit the applicability of volatility in practice.
Affiliation(s)
- Joshua B Plotkin
- Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA.
40
Lin YS, Byrnes JK, Hwang JK, Li WH. Codon-usage bias versus gene conversion in the evolution of yeast duplicate genes. Proc Natl Acad Sci U S A 2006; 103:14412-6. [PMID: 16971485 PMCID: PMC1599977 DOI: 10.1073/pnas.0606348103] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.5] [Indexed: 11/18/2022]
Abstract
Many Saccharomyces cerevisiae duplicate genes that were derived from an ancient whole-genome duplication (WGD) unexpectedly show a small synonymous divergence (K(S)), a higher sequence similarity to each other than to orthologues in Saccharomyces bayanus, or slow evolution compared with the orthologue in Kluyveromyces waltii, a non-WGD species. This decelerated evolution was attributed to gene conversion between duplicates. Using approximately 300 WGD gene pairs in four species and their orthologues in non-WGD species, we show that codon-usage bias and protein-sequence conservation are two important causes for decelerated evolution of duplicate genes, whereas gene conversion is effective only in the presence of strong codon-usage bias or protein-sequence conservation. Furthermore, we find that change in mutation pattern or in tDNA copy number changed codon-usage bias and increased the K(S) distance between K. waltii and S. cerevisiae. Intriguingly, some proteins showed fast evolution before the radiation of WGD species but little or no sequence divergence between orthologues and paralogues thereafter, indicating that functional conservation after the radiation may also be responsible for decelerated evolution in duplicates.
Affiliation(s)
- Yeong-Shin Lin
- Department of Ecology and Evolution, University of Chicago, 1101 East 57th Street, Chicago, IL 60637; and
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 300, Taiwan
- Jake K. Byrnes
- Department of Ecology and Evolution, University of Chicago, 1101 East 57th Street, Chicago, IL 60637
- Jenn-Kang Hwang
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 300, Taiwan
- Wen-Hsiung Li
- Department of Ecology and Evolution, University of Chicago, 1101 East 57th Street, Chicago, IL 60637
|
41
|
Xing Y, Lee C. Alternative splicing and RNA selection pressure--evolutionary consequences for eukaryotic genomes. Nat Rev Genet 2006; 7:499-509. [PMID: 16770337 DOI: 10.1038/nrg1896] [Citation(s) in RCA: 206] [Impact Index Per Article: 11.4] [Indexed: 11/09/2022]
Abstract
Genome-wide analyses of alternative splicing have established its nearly ubiquitous role in gene regulation in many organisms. Genome sequencing and comparative genomics have made it possible to look in detail at the evolutionary history of specific alternative exons or splice sites, resulting in a flurry of publications in recent years. Here, we consider how alternative splicing has contributed to the evolution of modern genomes, and discuss constraints on evolution associated with alternative splicing that might have important medical implications.
Affiliation(s)
- Yi Xing
- Molecular Biology Institute, Center for Genomics and Proteomics, Department of Chemistry and Biochemistry, University of California, Los Angeles, California 90095, USA
|
42
|
Plotkin JB, Dushoff J, Desai MM, Fraser HB. Estimating selection pressures from limited comparative data. Mol Biol Evol 2006; 23:1457-9. [PMID: 16754640 DOI: 10.1093/molbev/msl021] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/13/2022] Open
Abstract
We recently introduced a novel method for estimating selection pressures on proteins, termed "volatility," which requires only a single genome sequence. Some criticisms that have been levied against this approach are valid, but many others are based on misconceptions of volatility, or they apply equally to comparative methods of estimating selection. Here, we introduce a simple regression technique for estimating selection pressures on all proteins in a genome, on the basis of limited comparative data. The regression technique does not depend on an underlying population-genetic mechanism. This new approach to estimating selection across a genome should be more powerful and more widely applicable than volatility itself.
|
43
|
Batada NN, Hurst LD, Tyers M. Evolutionary and physiological importance of hub proteins. PLoS Comput Biol 2006; 2:e88. [PMID: 16839197 PMCID: PMC1500817 DOI: 10.1371/journal.pcbi.0020088] [Citation(s) in RCA: 221] [Impact Index Per Article: 12.3] [Received: 03/01/2006] [Accepted: 05/31/2006] [Indexed: 02/05/2023] Open
Abstract
It has been claimed that proteins with more interaction partners (hubs) are both physiologically more important (i.e., less dispensable) and, owing to an assumed high density of binding sites, slow evolving. Not all analyses, however, support these results, probably because of biased and less-than reliable global protein interaction data. Here we provide the first examination of these issues using a comprehensive literature-curated dataset of well-substantiated protein interactions in Saccharomyces cerevisiae. Whereas use of less reliable yeast two-hybrid data alone can reject the possibility that local connectivity correlates with measures of dispensability, in higher quality datasets a relatively robust correlation is observed. In contrast, local connectivity does not correlate with the rate of protein evolution even in reliable datasets. This perhaps surprising lack of correlation with evolutionary rate appears in part to arise from the fact that hub proteins do not have a higher density of residues associated with binding. However, hub proteins do have at least one other set of unusual features, namely rapid turnover and regulation, as manifest in high mRNA decay rates and a large number of phosphorylation sites. This, we suggest, is an adaptation to minimize unwanted activation of pathways that might be mediated by adventitious binding to hubs, were they to actively persist longer than required at any given time point. We conclude that hub proteins are more important for cellular growth rate and under tight regulation but are not slow evolving.
Why do some proteins evolve so very slowly? Why are only a few proteins uniquely vital to the functioning of an organism? Understanding how proteins interact with other proteins may provide the answers. Some proteins are, it is suggested, like hubs on a wheel with multiple spokes (interacting partners) attached: take away a spoke and the wheel works, take away the hub and the wheel is useless. With so many proteins to bind with, hubs may also be as slow evolving as some interaction sites are constrained in their evolution. Unfortunately, prior analyses have been equivocal, not least because of an uncertainty about which proteins interact with which others. Here the authors employ an extensive literature-curated dataset of reliable protein–protein interactions to address the issue of essentiality, connectivity, and evolutionary rate. This study finds that hubs are more likely to be essential, and if not essential, at least have a larger impact on fitness. However, hub proteins are not slow evolving, in part, because hubs do not have a higher density of binding sites. Hub proteins do, however, appear to be under strong regulation, an adaptation the authors suggest that minimizes the risk of unwanted activation.
Affiliation(s)
- Nizar N Batada
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Canada
- * To whom correspondence should be addressed. E-mail: (NNB); (LDH)
- Laurence D Hurst
- Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- * To whom correspondence should be addressed. E-mail: (NNB); (LDH)
- Mike Tyers
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Canada
- Department of Medical Genetics and Microbiology, University of Toronto, Toronto, Canada
|
44
|
Ericson E, Pylvänäinen I, Fernandez-Ricaud L, Nerman O, Warringer J, Blomberg A. Genetic pleiotropy in Saccharomyces cerevisiae quantified by high-resolution phenotypic profiling. Mol Genet Genomics 2006; 275:605-14. [PMID: 16534619 DOI: 10.1007/s00438-006-0112-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 12/15/2005] [Accepted: 02/03/2006] [Indexed: 10/24/2022]
Abstract
Genetic pleiotropy, the ability of a mutation in a single gene to give rise to multiple phenotypic outcomes, constitutes an important but incompletely understood biological phenomenon. We used a high-resolution and high-precision phenotypic profiling approach to quantify the fitness contribution of genes on the five smallest yeast chromosomes during different forms of environmental stress, selected to probe a wide diversity of physiological features. We found that the extent of pleiotropy is much higher than previously claimed; 17% of the yeast genes were pleiotropic whereof one-fifth were hyper-pleiotropic. Pleiotropic genes preferentially participate in functions related to determination of protein fate, cell growth and morphogenesis, signal transduction and transcription. Contrary to what has earlier been proposed we did not find experimental evidence for slower evolutionary rate of pleiotropic genes/proteins. We also refute the existence of phenotypic islands along chromosomes but report on a remarkable loss both of pleiotropy and of phenotypic penetrance towards chromosomal ends. Thus, the here reported features of pleiotropy both have implications on our understanding of evolutionary processes as well as the mechanisms underlying disease.
Affiliation(s)
- Elke Ericson
- Department of Cell and Molecular Biology, Göteborg University, Medicinaregatan 9c, Box 462, 40530, Göteborg, Sweden
|
45
|
Xing Y, Lee C. Can RNA selection pressure distort the measurement of Ka/Ks? Gene 2006; 370:1-5. [PMID: 16488091 DOI: 10.1016/j.gene.2005.12.015] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Received: 10/11/2005] [Revised: 12/15/2005] [Accepted: 12/20/2005] [Indexed: 11/24/2022]
Abstract
Recently, an interesting question has emerged in the evolutionary interpretation of sequence substitution data as evidence of amino acid selection pressure. Specifically, the Ka/Ks metric was designed to measure selection pressure on amino acid substitutions, assuming that the synonymous substitution rate Ks reflects the neutral nucleotide substitution rate. However, there is increasing evidence for selection pressure at silent sites due to constraints of RNA splicing. Is Ka/Ks an appropriate metric for selection pressure on amino acid substitutions, in the presence of other selection pressures acting only at the RNA level (such as selection for exonic splicing enhancers)? Or can the resulting decreases in Ks from such selection pressures introduce bias into the Ka/Ks metric, so that it no longer gives an accurate measure of amino acid level selection pressure? In this review, we present both mathematical models and empirical evidence for these divergent points of view.
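The bias question raised in this abstract can be made concrete with a small arithmetic sketch. It is not taken from the review itself: the rates and the 20% silent-site constraint below are hypothetical values chosen for illustration, and the helper name `ka_ks` is an assumption.

```python
def ka_ks(ka: float, ks: float) -> float:
    """Return the Ka/Ks ratio for one gene."""
    return ka / ks

# Hypothetical values, chosen only for illustration.
ka = 0.02                 # nonsynonymous substitutions per nonsynonymous site
ks_neutral = 0.10         # synonymous rate if silent sites evolved neutrally
silent_constraint = 0.20  # assumed fraction of silent changes removed by RNA-level selection

# Selection at silent sites lowers the observed Ks but leaves Ka unchanged.
ks_observed = ks_neutral * (1.0 - silent_constraint)

ratio_neutral = ka_ks(ka, ks_neutral)        # about 0.20
ratio_observed = ka_ks(ka, ks_observed)      # about 0.25
inflation = ratio_observed / ratio_neutral   # about 1.25, i.e. 1 / (1 - silent_constraint)
```

Under these assumptions the amino acid constraint is identical in both cases, yet the observed ratio is about 25% higher, which is the kind of distortion the abstract asks about.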
Affiliation(s)
- Yi Xing
- Molecular Biology Institute, Center for Genomics and Proteomics, Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095, USA
|
46
|
Herbeck JT, Wall DP. Converging on a general model of protein evolution. Trends Biotechnol 2006; 23:485-7. [PMID: 16054255 DOI: 10.1016/j.tibtech.2005.07.009] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 05/26/2005] [Revised: 06/08/2005] [Accepted: 07/18/2005] [Indexed: 10/25/2022]
Abstract
The availability of high-throughput genomic databases that establish protein dispensability, expression and interaction networks enables rigorous tests of competing models of protein evolution. Recent research utilizing these new data sets shows that protein evolution is more complex than was previously thought. Several variables, including protein dispensability, expression, functional density, and genetic modularity, appear to have independent effects on the evolutionary rate of proteins, suggesting that proteomes have evolved via an assembly of selectional regimes. These results indicate that a general model of protein evolution will emerge as more functional genomic data from a diversity of organisms accumulate.
Affiliation(s)
- Joshua T Herbeck
- Department of Microbiology, University of Washington School of Medicine, Seattle, WA 98103, USA.
|
47
|
Popescu CE, Borza T, Bielawski JP, Lee RW. Evolutionary rates and expression level in Chlamydomonas. Genetics 2005; 172:1567-76. [PMID: 16361241 PMCID: PMC1456299 DOI: 10.1534/genetics.105.047399] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.9] [Indexed: 11/18/2022] Open
Abstract
In many biological systems, especially bacteria and unicellular eukaryotes, rates of synonymous and nonsynonymous nucleotide divergence are negatively correlated with the level of gene expression, a phenomenon that has been attributed to natural selection. Surprisingly, this relationship has not been examined in many important groups, including the unicellular model organism Chlamydomonas reinhardtii. Prior to this study, comparative data on protein-coding sequences from C. reinhardtii and its close noninterfertile relative C. incerta were very limited. We compiled and analyzed protein-coding sequences for 67 nuclear genes from these taxa; the sequences were mostly obtained from the C. reinhardtii EST database and our C. incerta EST data. Compositional and synonymous codon usage biases varied among genes within each species but were highly correlated between the orthologous genes of the two species. Relative rates of synonymous and nonsynonymous substitution across genes varied widely and showed a strong negative correlation with the level of gene expression estimated by the codon adaptation index. Our comparative analysis of substitution rates in introns of lowly and highly expressed genes suggests that natural selection has a larger contribution than mutation to the observed correlation between evolutionary rates and gene expression level in Chlamydomonas.
Affiliation(s)
- Cristina E Popescu
- Department of Biology, Dalhousie University, Halifax, Nova Scotia B3H 4J1, Canada
|
48
|
Drummond DA, Raval A, Wilke CO. A single determinant dominates the rate of yeast protein evolution. Mol Biol Evol 2005; 23:327-37. [PMID: 16237209 DOI: 10.1093/molbev/msj038] [Citation(s) in RCA: 308] [Impact Index Per Article: 16.2] [Indexed: 11/13/2022] Open
Abstract
A gene's rate of sequence evolution is among the most fundamental evolutionary quantities in common use, but what determines evolutionary rates has remained unclear. Here, we carry out the first combined analysis of seven predictors (gene expression level, dispensability, protein abundance, codon adaptation index, gene length, number of protein-protein interactions, and the gene's centrality in the interaction network) previously reported to have independent influences on protein evolutionary rates. Strikingly, our analysis reveals a single dominant variable linked to the number of translation events which explains 40-fold more variation in evolutionary rate than any other, suggesting that protein evolutionary rate has a single major determinant among the seven predictors. The dominant variable explains nearly half the variation in the rate of synonymous and protein evolution. We show that the two most commonly used methods to disentangle the determinants of evolutionary rate, partial correlation analysis and ordinary multivariate regression, produce misleading or spurious results when applied to noisy biological data. We overcome these difficulties by employing principal component regression, a multivariate regression of evolutionary rate against the principal components of the predictor variables. Our results support the hypothesis that translational selection governs the rate of synonymous and protein sequence evolution in yeast.
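Principal component regression, as described in this abstract, regresses the response on the principal components of the predictors rather than on the correlated predictors themselves. The following is a minimal NumPy sketch of that general technique, not the authors' code or data: the function name, the synthetic predictors, and the noise levels are all assumptions made for illustration.

```python
import numpy as np

def principal_component_regression(X, y, n_components):
    """Regress y on the leading principal components of the predictors X."""
    # Standardize predictors so no variable dominates through scale alone.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    # Principal axes are the right singular vectors of the standardized matrix.
    _, _, vt = np.linalg.svd(Xs, full_matrices=False)
    scores = Xs @ vt[:n_components].T
    # Least-squares fit of the centered response on the component scores.
    coefs, *_ = np.linalg.lstsq(scores, yc, rcond=None)
    # Scores are mutually orthogonal, so explained variance splits cleanly by component.
    r2 = (coefs ** 2) * (scores ** 2).sum(axis=0) / (yc ** 2).sum()
    return coefs, r2

# Synthetic stand-in data: three noisy predictors driven by one hidden variable.
rng = np.random.default_rng(0)
latent = rng.normal(size=500)
X = latent[:, None] + 0.5 * rng.normal(size=(500, 3))
y = 2.0 * latent + rng.normal(size=500)

coefs, r2 = principal_component_regression(X, y, n_components=3)
```

Because the component scores are orthogonal, each entry of `r2` is the share of variance in `y` attributable to one component, which sidesteps the collinearity that makes coefficients from ordinary multivariate regression hard to interpret.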
Affiliation(s)
- D Allan Drummond
- Program in Computation and Neural Systems, California Institute of Technology, Pasadena, USA
|
49
|
Wall DP, Hirsh AE, Fraser HB, Kumm J, Giaever G, Eisen MB, Feldman MW. Functional genomic analysis of the rates of protein evolution. Proc Natl Acad Sci U S A 2005; 102:5483-8. [PMID: 15800036 PMCID: PMC555735 DOI: 10.1073/pnas.0501761102] [Citation(s) in RCA: 225] [Impact Index Per Article: 11.8] [Indexed: 11/18/2022] Open
Abstract
The evolutionary rates of proteins vary over several orders of magnitude. Recent work suggests that analysis of large data sets of evolutionary rates in conjunction with the results from high-throughput functional genomic experiments can identify the factors that cause proteins to evolve at such dramatically different rates. To this end, we estimated the evolutionary rates of >3,000 proteins in four species of the yeast genus Saccharomyces and investigated their relationship with levels of expression and protein dispensability. Each protein's dispensability was estimated by the growth rate of mutants deficient for the protein. Our analyses of these improved evolutionary and functional genomic data sets yield three main results. First, dispensability and expression have independent, significant effects on the rate of protein evolution. Second, measurements of expression levels in the laboratory can be used to filter data sets of dispensability estimates, removing variates that are unlikely to reflect real biological effects. Third, structural equation models show that although we may reasonably infer that dispensability and expression have significant effects on protein evolutionary rate, we cannot yet accurately estimate the relative strengths of these effects.
Affiliation(s)
- Dennis P Wall
- Department of Biological Sciences, and Stanford Genome Technology Center, Stanford University, Stanford, CA 94305, USA.
|
50
|
Fraser HB. Modularity and evolutionary constraint on proteins. Nat Genet 2005; 37:351-2. [PMID: 15750592 DOI: 10.1038/ng1530] [Citation(s) in RCA: 158] [Impact Index Per Article: 8.3] [Received: 10/04/2004] [Accepted: 02/03/2005] [Indexed: 11/09/2022]
Abstract
Modularity, which has been found in the functional and physical protein interaction networks of many organisms, has been postulated to affect both the mode and tempo of evolution. Here I show that in the yeast Saccharomyces cerevisiae, protein interaction hubs situated in single modules are highly constrained, whereas those connecting different modules are more plastic. This pattern of change could reflect a tendency for evolutionary innovations to occur by altering the proteins and interactions between rather than within modules, in a manner somewhat similar to the evolution of new proteins through the shuffling of conserved protein domains.
Affiliation(s)
- Hunter B Fraser
- Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA.
|