Cited by Other Article(s)

Zheng F, Liu Y, Yang Y, Wen Y, Li M. Assessing computational tools for predicting protein stability changes upon missense mutations using a new dataset. Protein Sci 2024;33:e4861. [PMID: 38084013 PMCID: PMC10751734 DOI: 10.1002/pro.4861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Revised: 11/14/2023] [Accepted: 12/06/2023] [Indexed: 12/28/2023]

Wu H, Chen Q, Zhang W, Mu W. Overview of strategies for developing high thermostability industrial enzymes: Discovery, mechanism, modification and challenges. Crit Rev Food Sci Nutr 2021;63:2057-2073. [PMID: 34445912 DOI: 10.1080/10408398.2021.1970508] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 12/21/2022]

Chen Y, Lu H, Zhang N, Zhu Z, Wang S, Li M. PremPS: Predicting the impact of missense mutations on protein stability. PLoS Comput Biol 2020;16:e1008543. [PMID: 33378330 PMCID: PMC7802934 DOI: 10.1371/journal.pcbi.1008543] [Citation(s) in RCA: 93] [Impact Index Per Article: 23.3] [Received: 07/23/2020] [Revised: 01/12/2021] [Accepted: 11/16/2020] [Indexed: 12/12/2022]

Abstract

Computational methods that predict protein stability changes induced by missense mutations have made considerable progress over the past decades. Most of the available methods, however, have very limited accuracy in predicting stabilizing mutations because existing experimental sets are dominated by mutations reducing protein stability. Moreover, few approaches perform consistently well across different test cases. To address these issues, we developed a new computational method, PremPS, to more accurately evaluate the effects of missense mutations on protein stability. The PremPS method is composed of only ten evolutionary- and structure-based features and parameterized on a balanced dataset with an equal number of stabilizing and destabilizing mutations. A comprehensive comparison of the predictive performance of PremPS with other available methods on nine benchmark datasets confirms that our approach consistently outperforms other methods and shows considerable improvement in estimating the impacts of stabilizing mutations. A protein could have multiple structures available, and if another structure of the same protein is used, the predicted change in stability for structure-based methods might be different. Thus, we further estimated the impact of using different structures on prediction accuracy, and demonstrate that our method performs well across different types of structures except for low-resolution structures and models built based on templates with low sequence identity. PremPS can be used for finding functionally important variants, revealing the molecular mechanisms of functional influences, and protein design. PremPS is freely available at https://lilab.jysw.suda.edu.cn/research/PremPS/ and supports large-scale mutational scanning: a calculation takes about four minutes for a single mutation in a protein with ~300 residues and ~0.4 seconds for each additional mutation.
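The timing figures quoted at the end of this abstract can be turned into a rough runtime estimate for a mutational scan. The sketch here is only arithmetic on those two numbers (about four minutes for the first mutation, about 0.4 seconds for each additional one, for a protein of roughly 300 residues); the function name and its parameters are illustrative assumptions and are not part of the PremPS server.

```python
def estimated_runtime_seconds(n_mutations, first_mutation_s=240.0, extra_mutation_s=0.4):
    """Rough wall-clock estimate from the figures quoted in the abstract.

    Assumes a protein of about 300 residues; the linear cost model is an
    assumption made for illustration only.
    """
    if n_mutations < 1:
        raise ValueError("n_mutations must be at least 1")
    return first_mutation_s + (n_mutations - 1) * extra_mutation_s


# A full single-site scan tries the 19 alternative amino acids at each position.
full_scan = 300 * 19
seconds = estimated_runtime_seconds(full_scan)
print(f"{full_scan} mutations: about {seconds / 60:.0f} minutes")
```

Under these assumptions a complete scan of a 300-residue protein costs on the order of 40 minutes, most of which is per-mutation time rather than the fixed start-up cost.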

Zaucha J, Heinzinger M, Kulandaisamy A, Kataka E, Salvádor ÓL, Popov P, Rost B, Gromiha MM, Zhorov BS, Frishman D. Mutations in transmembrane proteins: diseases, evolutionary insights, prediction and comparison with globular proteins. Brief Bioinform 2020;22:5872174. [PMID: 32672331 DOI: 10.1093/bib/bbaa132] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/14/2020] [Revised: 05/26/2020] [Accepted: 05/28/2020] [Indexed: 12/18/2022]

Marabotti A, Scafuri B, Facchiano A. Predicting the stability of mutant proteins by computational approaches: an overview. Brief Bioinform 2020;22:5850907. [PMID: 32496523 DOI: 10.1093/bib/bbaa074] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 02/12/2020] [Revised: 04/07/2020] [Accepted: 04/10/2020] [Indexed: 01/06/2023]

Broom A, Trainor K, Jacobi Z, Meiering EM. Computational Modeling of Protein Stability: Quantitative Analysis Reveals Solutions to Pervasive Problems. Structure 2020;28:717-726.e3. [DOI: 10.1016/j.str.2020.04.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/22/2019] [Revised: 03/26/2020] [Accepted: 04/06/2020] [Indexed: 12/20/2022]

Pandurangan AP, Blundell TL. Prediction of impacts of mutations on protein structure and interactions: SDM, a statistical approach, and mCSM, using machine learning. Protein Sci 2020;29:247-257. [PMID: 31693276 PMCID: PMC6933854 DOI: 10.1002/pro.3774] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 08/30/2019] [Revised: 10/31/2019] [Accepted: 10/31/2019] [Indexed: 02/02/2023]

Abstract

Next-generation sequencing methods have not only allowed an understanding of genome sequence variation during the evolution of organisms but have also provided invaluable information about genetic variants in inherited disease and the emergence of resistance to drugs in cancers and infectious disease. A challenge is to distinguish mutations that are drivers of disease or drug resistance, from passengers that are neutral or even selectively advantageous to the organism. This requires an understanding of impacts of missense mutations in gene expression and regulation, and on the disruption of protein function by modulating protein stability or disturbing interactions with proteins, nucleic acids, small molecule ligands, and other biological molecules. Experimental approaches to understanding differences between wild-type and mutant proteins are most accurate but are also time-consuming and costly. Computational tools used to predict the impacts of mutations can provide useful information more quickly. Here, we focus on two widely used structure-based approaches, originally developed in the Blundell lab: site-directed mutator (SDM), a statistical approach to analyze amino acid substitutions, and mutation cutoff scanning matrix (mCSM), which uses graph-based signatures to represent the wild-type structural environment and machine learning to predict the effect of mutations on protein stability. Here, we describe DUET that uses machine learning to combine the two approaches. We discuss briefly the development of mCSM for understanding the impacts of mutations on interfaces with other proteins, nucleic acids, and ligands, and we exemplify the wide application of these approaches to understand human genetic disorders and drug resistance mutations relevant to cancer and mycobacterial infections. 
STATEMENT FOR A BROADER AUDIENCE: Genetic or somatic changes in genes can lead to mutations in human proteins, which give rise to genetic disorders or cancer, or to genes of pathogens leading to drug resistance. Computer software described here, using statistical approaches or machine learning, uses the information from genome sequencing of humans and pathogens, together with experimental or modeled 3D structures of gene products, the proteins, to predict impacts of mutations in genetic disease, cancer and drug resistance.
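To make the phrase "cutoff scanning" in this abstract more concrete, here is a minimal sketch of the general idea: count how many atom pairs fall within each of a series of increasing distance cutoffs, giving a cumulative profile of the local environment. The function name, the cutoff values and the toy coordinates are assumptions for illustration; the sketch ignores atom typing and every other detail of the published mCSM signatures.

```python
import math
from itertools import combinations


def cutoff_scanning_signature(coords, cutoffs=(4.0, 6.0, 8.0)):
    """Count atom pairs whose distance is within each cutoff (in angstroms).

    coords is a list of (x, y, z) tuples. The result is cumulative: every
    pair counted at a small cutoff is also counted at the larger ones.
    """
    distances = [math.dist(a, b) for a, b in combinations(coords, 2)]
    return [sum(d <= cutoff for d in distances) for cutoff in cutoffs]


# Three toy atoms on a line; pairwise distances are 3, 4 and 7 angstroms.
toy_atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (7.0, 0.0, 0.0)]
print(cutoff_scanning_signature(toy_atoms))
```

A vector like this, computed around the mutated residue, is the kind of fixed-length input a machine learning model can be trained on.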

Hu Z, Yu C, Furutsuki M, Andreoletti G, Ly M, Hoskins R, Adhikari AN, Brenner SE. VIPdb, a genetic Variant Impact Predictor Database. Hum Mutat 2019;40:1202-1214. [PMID: 31283070 PMCID: PMC7288905 DOI: 10.1002/humu.23858] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 06/17/2019] [Accepted: 06/27/2019] [Indexed: 12/30/2022]

The state-of-the-art strategies of protein engineering for enzyme stabilization. Biotechnol Adv 2018;37:530-537. [PMID: 31138425 DOI: 10.1016/j.biotechadv.2018.10.011] [Citation(s) in RCA: 89] [Impact Index Per Article: 14.8] [Received: 11/30/2017] [Revised: 10/12/2018] [Accepted: 10/25/2018] [Indexed: 12/11/2022]

Glusman G, Rose PW, Prlić A, Dougherty J, Duarte JM, Hoffman AS, Barton GJ, Bendixen E, Bergquist T, Bock C, Brunk E, Buljan M, Burley SK, Cai B, Carter H, Gao J, Godzik A, Heuer M, Hicks M, Hrabe T, Karchin R, Leman JK, Lane L, Masica DL, Mooney SD, Moult J, Omenn GS, Pearl F, Pejaver V, Reynolds SM, Rokem A, Schwede T, Song S, Tilgner H, Valasatava Y, Zhang Y, Deutsch EW. Mapping genetic variations to three-dimensional protein structures to enhance variant interpretation: a proposed framework. Genome Med 2017;9:113. [PMID: 29254494 PMCID: PMC5735928 DOI: 10.1186/s13073-017-0509-y] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Indexed: 11/10/2022]

Abstract

The translation of personal genomics to precision medicine depends on the accurate interpretation of the multitude of genetic variants observed for each individual. However, even when genetic variants are predicted to modify a protein, their functional implications may be unclear. Many diseases are caused by genetic variants affecting important protein features, such as enzyme active sites or interaction interfaces. The scientific community has catalogued millions of genetic variants in genomic databases and thousands of protein structures in the Protein Data Bank. Mapping mutations onto three-dimensional (3D) structures enables atomic-level analyses of protein positions that may be important for the stability or formation of interactions; these may explain the effect of mutations and in some cases even open a path for targeted drug development. To accelerate progress in the integration of these data types, we held a two-day Gene Variation to 3D (GVto3D) workshop to report on the latest advances and to discuss unmet needs. The overarching goal of the workshop was to address the question: what can be done together as a community to advance the integration of genetic variants and 3D protein structures that could not be done by a single investigator or laboratory? Here we describe the workshop outcomes, review the state of the field, and propose the development of a framework with which to promote progress in this arena. The framework will include a set of standard formats, common ontologies, a common application programming interface to enable interoperation of the resources, and a Tool Registry to make it easy to find and apply the tools to specific analysis problems. Interoperability will enable integration of diverse data sources and tools and collaborative development of variant effect prediction methods.

Affiliation(s)

Gustavo Glusman: Institute for Systems Biology, Seattle, WA, 98109, USA
Peter W Rose: San Diego Supercomputer Center, University of California San Diego, La Jolla, CA, 98093, USA
Andreas Prlić: San Diego Supercomputer Center, University of California San Diego, La Jolla, CA, 98093, USA; RCSB Protein Data Bank, University of California San Diego, La Jolla, CA, 98093, USA
Jennifer Dougherty: Institute for Systems Biology, Seattle, WA, 98109, USA
José M Duarte: RCSB Protein Data Bank, University of California San Diego, La Jolla, CA, 98093, USA
Andrew S Hoffman: Human Centered Design & Engineering, University of Washington, Seattle, WA, 98195, USA
Geoffrey J Barton: Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee, DD1 5EH, UK
Emøke Bendixen: Department of Molecular Biology and Genetics, Aarhus University, 8000, Aarhus, Denmark
Timothy Bergquist: Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
Christian Bock: Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
Elizabeth Brunk: University of California San Diego, La Jolla, CA, 92093, USA
Marija Buljan: Institute of Molecular Systems Biology, ETH Zurich, CH-8093, Zurich, Switzerland
Stephen K Burley: San Diego Supercomputer Center, University of California San Diego, La Jolla, CA, 98093, USA; RCSB Protein Data Bank, University of California San Diego, La Jolla, CA, 98093, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
Binghuang Cai: Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
Hannah Carter: University of California San Diego, La Jolla, CA, 92093, USA
JianJiong Gao: Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
Adam Godzik: SBP Medical Discovery Institute, La Jolla, CA, 92037, USA
Michael Heuer: AMPLab, University of California, Berkeley, CA, 94720, USA
Michael Hicks: Human Longevity, Inc, San Diego, CA, 92121, USA
Thomas Hrabe: SBP Medical Discovery Institute, La Jolla, CA, 92037, USA
Rachel Karchin: Department of Biomedical Engineering, Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD, 21218, USA; Department of Oncology, Johns Hopkins Medicine, Baltimore, MD, 21287, USA
Julia Koehler Leman: Flatiron Institute, Center for Computational Biology, Simons Foundation, New York, NY, 10010, USA; Department of Biology and Center for Genomics and Systems Biology, New York University, New York, NY, 10003, USA
Lydie Lane: SIB Swiss Institute of Bioinformatics and University of Geneva, CH-1211, Geneva, Switzerland
David L Masica: Department of Biomedical Engineering, Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD, 21218, USA
Sean D Mooney: Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
John Moult: Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD, 20850, USA; Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD, 20742, USA
Gilbert S Omenn: Institute for Systems Biology, Seattle, WA, 98109, USA; Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, 48109-2218, USA
Frances Pearl: School of Life Sciences, University of Sussex, Brighton, BN1 9QG, UK
Vikas Pejaver: Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA; The University of Washington eScience Institute, Seattle, WA, 98195, USA
Sheila M Reynolds: Institute for Systems Biology, Seattle, WA, 98109, USA
Ariel Rokem: The University of Washington eScience Institute, Seattle, WA, 98195, USA
Torsten Schwede: SIB Swiss Institute of Bioinformatics and Biozentrum University of Basel, CH-4056, Basel, Switzerland
Sicheng Song: Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, 98109, USA
Hagen Tilgner: Brain and Mind Research Institute, Weill Cornell Medicine, New York City, NY, 10021, USA
Yana Valasatava: RCSB Protein Data Bank, University of California San Diego, La Jolla, CA, 98093, USA
Yang Zhang: Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, 48109-2218, USA
Eric W Deutsch: Institute for Systems Biology, Seattle, WA, 98109, USA

Schomburg KT, Nittinger E, Meyder A, Bietz S, Schneider N, Lange G, Klein R, Rarey M. Prediction of protein mutation effects based on dehydration and hydrogen bonding - A large-scale study. Proteins 2017;85:1550-1566. [DOI: 10.1002/prot.25315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2017] [Revised: 04/29/2017] [Accepted: 05/02/2017] [Indexed: 11/11/2022]

Broom A, Jacobi Z, Trainor K, Meiering EM. Computational tools help improve protein stability but with a solubility tradeoff. J Biol Chem 2017;292:14349-14361. [PMID: 28710274 DOI: 10.1074/jbc.m117.784165] [Citation(s) in RCA: 77] [Impact Index Per Article: 11.0] [Received: 03/03/2017] [Revised: 07/11/2017] [Indexed: 01/18/2023]

Abstract

Accurately predicting changes in protein stability upon amino acid substitution is a much sought after goal. Destabilizing mutations are often implicated in disease, whereas stabilizing mutations are of great value for industrial and therapeutic biotechnology. Increasing protein stability is an especially challenging task, with random substitution yielding stabilizing mutations in only ∼2% of cases. To overcome this bottleneck, computational tools that aim to predict the effect of mutations have been developed; however, achieving accuracy and consistency remains challenging. Here, we combined 11 freely available tools into a meta-predictor (meieringlab.uwaterloo.ca/stabilitypredict/). Validation against ∼600 experimental mutations indicated that our meta-predictor has improved performance over any of the individual tools. The meta-predictor was then used to recommend 10 mutations in a previously designed protein of moderate thermodynamic stability, ThreeFoil. Experimental characterization showed that four mutations increased protein stability and could be amplified through ThreeFoil's structural symmetry to yield several multiple mutants with >2-kcal/mol stabilization. By avoiding residues within functional ties, we could maintain ThreeFoil's glycan-binding capacity. Despite successfully achieving substantial stabilization, however, almost all mutations decreased protein solubility, the most common cause of protein design failure. Examination of the 600-mutation data set revealed that stabilizing mutations on the protein surface tend to increase hydrophobicity and that the individual tools favor this approach to gain stability. Thus, whereas currently available tools can increase protein stability and combining them into a meta-predictor yields enhanced reliability, improvements to the potentials/force fields underlying these tools are needed to avoid gaining protein stability at the cost of solubility.
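The abstract does not say how the 11 tools are combined. As one simple possibility, a meta-predictor can be sketched as an unweighted average of the per-tool ΔΔG predictions, skipping tools that fail on a given mutation. The function name and the tool labels are hypothetical, the averaging scheme is an assumption rather than the authors' method, and the sketch presumes all tools report values in the same units and sign convention.

```python
from statistics import mean


def meta_predict(predictions):
    """Average per-tool ddG predictions (kcal/mol) for one mutation.

    predictions maps a tool label to a float, or to None when that tool
    produced no result. Unweighted averaging is an illustrative assumption.
    """
    available = [value for value in predictions.values() if value is not None]
    if not available:
        raise ValueError("no tool produced a prediction")
    return mean(available)


# Hypothetical tool outputs for a single substitution.
example = {"tool_a": -1.2, "tool_b": -0.8, "tool_c": None, "tool_d": -1.0}
print(round(meta_predict(example), 2))
```

Averaging tends to cancel uncorrelated errors of individual tools, which is one plausible reason a combined predictor can beat each of its components; it cannot remove a bias that all tools share, such as the preference for hydrophobic surface mutations noted in the abstract.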

Li M, Goncearenco A, Panchenko AR. Annotating Mutational Effects on Proteins and Protein Interactions: Designing Novel and Revisiting Existing Protocols. Methods Mol Biol 2017;1550:235-260. [PMID: 28188534 PMCID: PMC5388446 DOI: 10.1007/978-1-4939-6747-6_17] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 05/02/2023]

Striegel DA, Wojtowicz D, Przytycka TM, Periwal V. Correlated rigid modes in protein families. Phys Biol 2016;13:025003. [PMID: 27063781 DOI: 10.1088/1478-3975/13/2/025003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 02/06/2023]

Modarres HP, Mofrad MR, Sanati-Nezhad A. Protein thermostability engineering. RSC Adv 2016. [DOI: 10.1039/c6ra16992a] [Citation(s) in RCA: 70] [Impact Index Per Article: 8.8] [Indexed: 01/04/2023]

Rohani L, Morton DJ, Wang XQ, Chaudhary J. Relative Stability of Wild-Type and Mutant p53 Core Domain: A Molecular Dynamic Study. J Comput Biol 2015;23:80-89. [PMID: 26675082 DOI: 10.1089/cmb.2015.0163] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/31/2022]

Feature-based multiple models improve classification of mutation-induced stability changes. BMC Genomics 2014;15 Suppl 4:S6. [PMID: 25057118 PMCID: PMC4083411 DOI: 10.1186/1471-2164-15-s4-s6] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 02/07/2023]

Abstract

Background

Reliable prediction of stability changes in protein variants is an important aspect of computational protein design. A number of machine learning methods have emerged that classify stability changes knowing only the sequence of the protein. However, their performance on amino acid substitutions of previously unseen non-homologous proteins is rather limited. Moreover, the performance varies for different types of mutations based on the secondary structure or accessible surface area of the mutation site.

Results

We proposed feature-based multiple models with each model designed for a specific type of mutations. The new method is composed of five models trained for mutations in exposed, buried, helical, sheet, and coil residues. The classification of a mutation as stabilising or destabilising is made as a consensus of two models, one selected based on the predicted accessible surface area and the other based on the predicted secondary structure of the mutation site. We refer to our new method as Evolutionary, Amino acid, and Structural Encodings with Multiple Models (EASE-MM). Cross-validation results show that EASE-MM provides a notable improvement to our previous work reaching a Matthews correlation coefficient of 0.44. EASE-MM was able to correctly classify 73% and 75% of stabilising and destabilising protein variants, respectively. Using an independent test set of 238 mutations, we confirmed our results in a comparison with related work.

Conclusions

EASE-MM not only outperformed other related methods but achieved more balanced results for different types of mutations based on the accessible surface area, secondary structure, or magnitude of stability changes. This can be attributed to using multiple models with the most relevant features selected for the given type of mutations. Therefore, our results support the presumption that different interactions govern stability changes in the exposed and buried residues or in residues with a different secondary structure.
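The model-selection step described in the Results, and the Matthews correlation coefficient used to report performance, can be sketched as follows. Only the five model categories and the idea of a two-model consensus come from the abstract; the 0.25 relative-accessibility threshold, the averaging of the two scores, the sign convention and all function names are assumptions made for illustration.

```python
import math

SS_TO_MODEL = {"H": "helical", "E": "sheet", "C": "coil"}


def select_models(rel_asa, sec_struct, asa_threshold=0.25):
    """Pick one accessibility-based and one secondary-structure-based model."""
    asa_model = "exposed" if rel_asa >= asa_threshold else "buried"
    return asa_model, SS_TO_MODEL[sec_struct]


def consensus_class(model_scores, rel_asa, sec_struct):
    """Average the two selected model scores; positive means stabilising here."""
    asa_model, ss_model = select_models(rel_asa, sec_struct)
    averaged = (model_scores[asa_model] + model_scores[ss_model]) / 2.0
    return "stabilising" if averaged > 0 else "destabilising"


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denominator == 0 else (tp * tn - fp * fn) / denominator


# Hypothetical scores from the five models for one buried, helical site.
scores = {"exposed": 0.3, "buried": -0.6, "helical": -0.2, "sheet": 0.1, "coil": 0.4}
print(select_models(0.10, "H"), consensus_class(scores, 0.10, "H"))
```

The Matthews coefficient ranges from -1 to 1 and stays informative when the two classes are imbalanced, which is why it suits datasets dominated by destabilising mutations.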

Stefl S, Nishi H, Petukh M, Panchenko AR, Alexov E. Molecular mechanisms of disease-causing missense mutations. J Mol Biol 2013;425:3919-36. [PMID: 23871686 DOI: 10.1016/j.jmb.2013.07.014] [Citation(s) in RCA: 187] [Impact Index Per Article: 17.0] [Received: 05/06/2013] [Revised: 07/04/2013] [Accepted: 07/10/2013] [Indexed: 12/23/2022]

Sánchez-González G, Kim JK, Kim DS, Garduño-Juárez R. A beta-complex statistical four body contact potential combined with a hydrogen bond statistical potential recognizes the correct native structure from protein decoy sets. Proteins 2013;81:1420-33. [PMID: 23568277 DOI: 10.1002/prot.24293] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 11/07/2012] [Revised: 03/04/2013] [Accepted: 03/22/2013] [Indexed: 11/10/2022]

Structure-based mutant stability predictions on proteins of unknown structure. J Biotechnol 2012;161:287-93. [DOI: 10.1016/j.jbiotec.2012.06.020] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 03/21/2012] [Revised: 06/19/2012] [Accepted: 06/22/2012] [Indexed: 11/23/2022]

Toll-Riera M, Bostick D, Albà MM, Plotkin JB. Structure and age jointly influence rates of protein evolution. PLoS Comput Biol 2012;8:e1002542. [PMID: 22693443 PMCID: PMC3364943 DOI: 10.1371/journal.pcbi.1002542] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 01/27/2012] [Accepted: 04/17/2012] [Indexed: 12/01/2022]

Abstract

What factors determine a protein's rate of evolution are actively debated. Especially unclear is the relative role of intrinsic factors of present-day proteins versus historical factors such as protein age. Here we study the interplay of structural properties and evolutionary age, as determinants of protein evolutionary rate. We use a large set of one-to-one orthologs between human and mouse proteins, with mapped PDB structures. We report that previously observed structural correlations also hold within each age group, including relationships between solvent accessibility, designability, and evolutionary rates. However, age also plays a crucial role: age modulates the relationship between solvent accessibility and rate. Additionally, younger proteins, despite being less designable, tend to evolve faster than older proteins. We show that previously reported relationships between age and rate cannot be explained by structural biases among age groups. Finally, we introduce a knowledge-based potential function to study the stability of proteins through large-scale computation. We find that older proteins are more stable for their native structure, and more robust to mutations, than younger ones. Our results underscore that several determinants, both intrinsic and historical, can interact to determine rates of protein evolution.

Rates of protein evolution vary dramatically within and between organisms. But the factors that determine a protein's evolutionary rate are still under debate, despite extensive studies over the past decade. Several determinants have been proposed, for example gene expression, the importance of the gene for the organism, the number of physical or genetic interactions it has, its structural characteristics, or when it originated. Here we study how age and structural characteristics interact with one another to influence evolutionary rates. We use a set of one-to-one orthologs of human and mouse proteins, with known crystal structures. We find that these two determinants interact: for example, the age of protein modulates how its structure correlates with evolutionary rate. Nonetheless, the influence of age on evolutionary rate cannot be explained by its interplay with structure.

Capriotti E, Nehrt NL, Kann MG, Bromberg Y. Bioinformatics for personal genome interpretation. Brief Bioinform 2012;13:495-512. [PMID: 22247263 DOI: 10.1093/bib/bbr070] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.8] [Indexed: 01/02/2023]

Li Y, Zhang J, Tai D, Middaugh CR, Zhang Y, Fang J. PROTS: a fragment based protein thermo-stability potential. Proteins 2011;80:81-92. [PMID: 21976375 DOI: 10.1002/prot.23163] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 05/20/2011] [Revised: 07/18/2011] [Accepted: 07/31/2011] [Indexed: 12/30/2022]

Dehouck Y, Kwasigroch JM, Gilis D, Rooman M. PoPMuSiC 2.1: a web server for the estimation of protein stability changes upon mutation and sequence optimality. BMC Bioinformatics 2011;12:151. [PMID: 21569468 PMCID: PMC3113940 DOI: 10.1186/1471-2105-12-151] [Citation(s) in RCA: 367] [Impact Index Per Article: 28.2] [Received: 12/27/2010] [Accepted: 05/13/2011] [Indexed: 12/31/2022]

Abstract

Background

The rational design of modified proteins with controlled stability is of extreme importance in a whole range of applications, notably in the biotechnological and environmental areas, where proteins are used for their catalytic or other functional activities. Future breakthroughs in medical research may also be expected from an improved understanding of the effect of naturally occurring disease-causing mutations on the molecular level.

Results

PoPMuSiC-2.1 is a web server that predicts the thermodynamic stability changes caused by single-site mutations in proteins, using a linear combination of statistical potentials whose coefficients depend on the solvent accessibility of the mutated residue. PoPMuSiC shows good prediction performance (correlation coefficient of 0.8 between predicted and measured stability changes, in cross-validation, after exclusion of 10% outliers). It is moreover very fast, allowing the prediction of the stability changes resulting from all possible mutations in a medium-size protein in less than a minute. This unique functionality is implemented in a user-friendly way in PoPMuSiC and is particularly easy to exploit. Another new functionality of our server concerns the estimation of the optimality of each amino acid in the sequence, with respect to the stability of the structure. It may be used to detect structural weaknesses, i.e. clusters of non-optimal residues, which represent particularly interesting sites for introducing targeted mutations. These sequence optimality data are also expected to have significant implications in the prediction and the analysis of particular structural or functional protein regions. To illustrate the interest of this new functionality, we apply it to a dataset of known catalytic sites, and show that a much larger than average concentration of structural weaknesses is detected, quantifying how these sites have been optimized for function rather than stability.

Conclusion

The freely available PoPMuSiC-2.1 web server is highly useful for identifying very rapidly a list of possibly relevant mutations with the desired stability properties, on which subsequent experimental studies can be focused. It can also be used to detect sequence regions corresponding to structural weaknesses, which could be functionally important or structurally delicate regions, with obvious applications in rational protein design.
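The functional form named in the Results, a linear combination of statistical potentials with accessibility-dependent coefficients, can be written as ΔΔG ≈ Σ_i α_i(A) ΔW_i, where A is the solvent accessibility of the mutated residue and ΔW_i is the change in the i-th potential. The sketch here assumes a sigmoid shape for each α_i(A); the term names, parameter values and function names are hypothetical and do not reproduce the server's fitted model.

```python
import math


def accessibility_weight(accessibility, buried_value, exposed_value,
                         midpoint=0.5, steepness=10.0):
    """Coefficient that moves smoothly from buried_value to exposed_value.

    accessibility is a relative solvent accessibility in [0, 1]; the
    sigmoid shape and its parameters are illustrative assumptions.
    """
    s = 1.0 / (1.0 + math.exp(-steepness * (accessibility - midpoint)))
    return buried_value + (exposed_value - buried_value) * s


def predict_ddg(delta_potentials, coefficients, accessibility):
    """Weighted sum of potential changes, with accessibility-dependent weights."""
    total = 0.0
    for name, delta_w in delta_potentials.items():
        buried_value, exposed_value = coefficients[name]
        total += accessibility_weight(accessibility, buried_value, exposed_value) * delta_w
    return total


# Hypothetical potential changes and (buried, exposed) coefficient limits.
delta_potentials = {"distance_term": 1.5, "torsion_term": -0.4}
coefficients = {"distance_term": (1.0, 0.2), "torsion_term": (0.5, 0.9)}
for acc in (0.0, 0.5, 1.0):
    print(acc, round(predict_ddg(delta_potentials, coefficients, acc), 3))
```

Because each prediction is a handful of multiplications over precomputed potential differences, a model of this form is cheap to evaluate, which is consistent with the abstract's claim that all mutations in a medium-size protein can be scanned in under a minute.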

Sun W, He J. From isotropic to anisotropic side chain representations: comparison of three models for residue contact estimation. PLoS One 2011;6:e19238. [PMID: 21552527 PMCID: PMC3084275 DOI: 10.1371/journal.pone.0019238] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 11/10/2010] [Accepted: 03/29/2011] [Indexed: 11/19/2022]

Ackerman SH, Gatti DL. The contribution of coevolving residues to the stability of KDO8P synthase. PLoS One 2011;6:e17459. [PMID: 21408011 PMCID: PMC3052366 DOI: 10.1371/journal.pone.0017459] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 10/06/2010] [Accepted: 02/03/2011] [Indexed: 12/03/2022]

Abstract

Background

The evolutionary tree of 3-deoxy-D-manno-octulosonate 8-phosphate (KDO8P) synthase (KDO8PS), a bacterial enzyme that catalyzes a key step in the biosynthesis of bacterial endotoxin, is evenly divided between metal and non-metal forms, both having similar structures but diverging to various degrees in amino acid sequence. Mutagenesis, crystallographic and computational studies have established that only a few residues determine whether or not KDO8PS requires a metal for function. The remaining divergence in the amino acid sequence of KDO8PSs is apparently unrelated to the underlying catalytic mechanism.

Methodology/Principal Findings

The multiple alignment of all known KDO8PS sequences reveals that several residue pairs coevolved, an indication of their possible linkage to a structural constraint. In this study we investigated by computational means the contribution of coevolving residues to the stability of KDO8PS. We found that about 1/4 of all strongly coevolving pairs probably originated from cycles of mutation (decreasing stability) and suppression (restoring it), while the remaining pairs are best explained by a succession of neutral or nearly neutral covarions.

Conclusions/Significance

Both sequence conservation and coevolution are involved in the preservation of the core structure of KDO8PS, but the contribution of coevolving residues is, in proportion, smaller. This is because small stability gains or losses associated with selection of certain residues in some regions of the stability landscape of KDO8PS are easily offset by a large number of possible changes in other regions. While this effect increases the tolerance of KDO8PS to deleterious mutations, it also decreases the probability that specific pairs of residues could have a strong contribution to the thermodynamic stability of the protein.

Dong Q, Zhou S. Novel nonlinear knowledge-based mean force potentials based on machine learning. IEEE/ACM Trans Comput Biol Bioinform 2011;8:476-486. [PMID: 20820079 DOI: 10.1109/tcbb.2010.86] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 05/29/2023]

Abstract

The prediction of 3D structures of proteins from amino acid sequences is one of the most challenging problems in molecular biology. An essential task for solving this problem with coarse-grained models is to deduce effective interaction potentials. The development and evaluation of new energy functions is critical to accurately modeling the properties of biological macromolecules. Knowledge-based mean force potentials are derived from statistical analysis of proteins of known structures. Current knowledge-based potentials almost all take the form of a weighted linear sum of interaction pairs. In this study, a class of novel nonlinear knowledge-based mean force potentials is presented. The potential parameters are obtained by nonlinear classifiers, instead of relative frequencies of interaction pairs against a reference state or linear classifiers. The support vector machine is used to derive the potential parameters on data sets that contain both native structures and decoy structures. Five knowledge-based mean force Boltzmann-based or linear potentials are introduced and their corresponding nonlinear potentials are implemented. They are the DIH potential (single-body residue-level Boltzmann-based potential), the DFIRE-SCM potential (two-body residue-level Boltzmann-based potential), the FS potential (two-body atom-level Boltzmann-based potential), the HR potential (two-body residue-level linear potential), and the T32S3 potential (two-body atom-level linear potential). Experiments are performed on well-established decoy sets, including the LKF data set, the CASP7 data set, and the Decoys “R”Us data set. The evaluation metrics include the energy Z score and the ability of each potential to discriminate native structures from a set of decoy structures. Experimental results show that all nonlinear potentials significantly outperform the corresponding Boltzmann-based or linear potentials, and the proposed discriminative framework is effective in developing knowledge-based mean force potentials. The nonlinear potentials can be widely used for ab initio protein structure prediction, model quality assessment, protein docking, and other challenging problems in computational biology.
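The two evaluation metrics named in this abstract can be illustrated with a minimal sketch. It assumes the common convention in which the energy Z score is the native energy minus the mean decoy energy, divided by the decoy standard deviation, and in which discrimination is measured by the rank of the native structure among the decoys. The function names and the toy energies are hypothetical and are not taken from the paper, which may use a different normalization or sign convention.

```python
from statistics import mean, pstdev


def energy_z_score(native_energy, decoy_energies):
    """Z score of the native energy relative to the decoy energies.

    Assumed convention: (E_native - mean(E_decoys)) / std(E_decoys),
    using the population standard deviation. More negative is better.
    """
    mu = mean(decoy_energies)
    sigma = pstdev(decoy_energies)
    if sigma == 0:
        raise ValueError("decoy energies have zero spread")
    return (native_energy - mu) / sigma


def native_rank(native_energy, decoy_energies):
    """1-based rank of the native structure by increasing energy.

    A rank of 1 means no decoy has a lower energy than the native.
    """
    return 1 + sum(1 for e in decoy_energies if e < native_energy)


# Toy energies in arbitrary units (illustrative only).
decoys = [-90.0, -100.0, -110.0]
z = energy_z_score(-130.0, decoys)
rank = native_rank(-130.0, decoys)
```

With these toy values the native structure lies well below every decoy, so the Z score is strongly negative and the native rank is 1.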

Masso M, Vaisman II. A structure-based computational mutagenesis elucidates the spectrum of stability-activity relationships in proteins. Annu Int Conf IEEE Eng Med Biol Soc 2011;2011:3225-3228. [PMID: 22255026 DOI: 10.1109/iembs.2011.6090877] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/31/2023]

Tian Y, Deutsch C, Krishnamoorthy B. Scoring function to predict solubility mutagenesis. Algorithms Mol Biol 2010;5:33. [PMID: 20929563 PMCID: PMC2958853 DOI: 10.1186/1748-7188-5-33] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 06/01/2010] [Accepted: 10/07/2010] [Indexed: 11/16/2022] Open

Masso M, Vaisman II. Knowledge-based computational mutagenesis for predicting the disease potential of human non-synonymous single nucleotide polymorphisms. J Theor Biol 2010;266:560-8. [DOI: 10.1016/j.jtbi.2010.07.026] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 01/07/2010] [Revised: 04/25/2010] [Accepted: 07/21/2010] [Indexed: 10/19/2022]

Esque J, Oguey C, de Brevern AG. A novel evaluation of residue and protein volumes by means of Laguerre tessellation. J Chem Inf Model 2010;50:947-60. [PMID: 20392096 DOI: 10.1021/ci9004892] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Indexed: 11/28/2022]

Sun W, He J. Understanding on the residue contact network using the log-normal cluster model and the multilevel wheel diagram. Biopolymers 2010;93:904-16. [DOI: 10.1002/bip.21494] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/06/2022]

Khan S, Vihinen M. Performance of protein stability predictors. Hum Mutat 2010;31:675-84. [DOI: 10.1002/humu.21242] [Citation(s) in RCA: 240] [Impact Index Per Article: 17.1] [Indexed: 12/31/2022]

Aita T, Nishigaki K, Husimi Y. Toward the fast blind docking of a peptide to a target protein by using a four-body statistical pseudo-potential. Comput Biol Chem 2010;34:53-62. [DOI: 10.1016/j.compbiolchem.2009.10.005] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 02/03/2009] [Revised: 09/27/2009] [Accepted: 10/20/2009] [Indexed: 11/26/2022]

Betancourt MR. Another look at the conditions for the extraction of protein knowledge-based potentials. Proteins 2009;76:72-85. [PMID: 19089977 DOI: 10.1002/prot.22320] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 01/28/2023]

Abstract

Protein knowledge-based potentials are effective free energies obtained from databases of known protein structures. They are used to parameterize coarse-grained protein models in many folding simulation and structure prediction methods. Two common approaches are used in the derivation of knowledge-based potentials. One assumes that the energy parameters optimize the native structure stability. The other assumes that interaction events are related to their energies according to the Boltzmann distribution, and that they are distributed independently of other events, that is, the quasi-chemical approximation. Here, these assumptions are systematically tested by extracting contact energies from artificial databases of lattice proteins with predefined pairwise contact energies. Databases of protein sequences are designed to either satisfy the Boltzmann distribution at high or low temperatures, or to simultaneously optimize the native stability and folding kinetics. It is found that the quasi-chemical approximation, with the ideal reference state, accurately reproduces the true energies for high temperature Boltzmann distributed sequences (weakly interacting residues), but less accurately at low temperatures, where the sequences correspond to energy minima and the residues are strongly interacting. To overcome this problem, an iterative procedure for Boltzmann distributed sequences is introduced, which accounts for interacting residue correlations and eliminates the need for the quasi-chemical approximation. In this case, the energies are accurately reproduced at any ensemble temperature. However, when the database of sequences designed for optimal stability and kinetics is used, the energy correlation is less than optimal using either method, exhibiting random and systematic deviations from linearity. Therefore, the assumption that native structures are maximally stable or that sequences are determined according to the Boltzmann distribution seems to be inadequate for obtaining accurate energies. The limited number of sequences in the database and the inhomogeneous concentration of amino acids from one structure to another do not seem to be major obstacles for improving the quality of the extracted pairwise energies, with the exception of repulsive interactions.
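As a rough illustration of the quasi-chemical approximation discussed in this abstract, the sketch below extracts pairwise contact energies as the negative logarithm of observed over expected contact frequencies. It assumes a simple random-mixing reference state built from how often each residue type takes part in a contact. This is not necessarily the "ideal reference state" used in the paper, and the function name, the `kT` parameter and the two-letter toy alphabet are hypothetical.

```python
import math
from collections import Counter


def contact_energies(contacts, kT=1.0):
    """Estimate contact energies from a list of (type_a, type_b) contacts.

    Assumed form: e_ab = -kT * ln(observed_ab / expected_ab), where the
    expected frequency treats the two contact partners as independent
    (random-mixing reference state).
    """
    pair_counts = Counter(tuple(sorted(pair)) for pair in contacts)
    total = sum(pair_counts.values())

    # Count how often each residue type appears as one end of a contact.
    end_counts = Counter()
    for (a, b), n in pair_counts.items():
        end_counts[a] += n
        end_counts[b] += n

    energies = {}
    for (a, b), n in pair_counts.items():
        frac_a = end_counts[a] / (2 * total)
        frac_b = end_counts[b] / (2 * total)
        # Unlike pairs can occur in two orders, hence the factor of 2.
        expected = frac_a * frac_b * (1 if a == b else 2)
        observed = n / total
        energies[(a, b)] = -kT * math.log(observed / expected)
    return energies


# Toy contact list in which like pairs are over-represented.
toy_contacts = [("H", "H")] * 4 + [("H", "P")] * 2 + [("P", "P")] * 4
energies = contact_energies(toy_contacts)
```

In this toy example the like pairs occur more often than random mixing predicts, so they receive negative (attractive) energies, while the under-represented H-P pair receives a positive energy. The independence assumption built into `expected` is the one the abstract reports as breaking down for strongly interacting residues.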
