Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Gutin AM, Abkevich VI, Shakhnovich EI. Evolution-like selection of fast-folding model proteins. Proc Natl Acad Sci U S A 1995;92:1282-6. [PMID: 7877968 PMCID: PMC42503 DOI: 10.1073/pnas.92.5.1282] [Citation(s) in RCA: 121] [Impact Index Per Article: 4.2] [Indexed: 01/27/2023]

Cited by Other Article(s)

Yan Z, Wang J. Superfunneled Energy Landscape of Protein Evolution Unifies the Principles of Protein Evolution, Folding, and Design. PHYSICAL REVIEW LETTERS 2019;122:018103. [PMID: 31012725 DOI: 10.1103/physrevlett.122.018103] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/06/2017] [Revised: 11/08/2018] [Indexed: 06/09/2023]

Ferreira DC, van der Linden MG, de Oliveira LC, Onuchic JN, de Araújo AFP. Information and redundancy in the burial folding code of globular proteins within a wide range of shapes and sizes. Proteins 2016;84:515-31. [PMID: 26815167 DOI: 10.1002/prot.24998] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 10/22/2015] [Revised: 12/28/2015] [Accepted: 01/19/2016] [Indexed: 11/09/2022]

The universal statistical distributions of the affinity, equilibrium constants, kinetics and specificity in biomolecular recognition. PLoS Comput Biol 2015;11:e1004212. [PMID: 25885453 PMCID: PMC4401658 DOI: 10.1371/journal.pcbi.1004212] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 05/29/2014] [Accepted: 02/24/2015] [Indexed: 01/01/2023]

Abstract

We uncovered the universal statistical laws for the biomolecular recognition/binding process. We quantified the statistical energy landscapes for binding, from which we can characterize the distributions of the binding free energy (affinity), the equilibrium constants, the kinetics and the specificity by exploring the different ligands binding with a particular receptor. The results of the analytical studies are confirmed by the microscopic flexible docking simulations. The distribution of binding affinity is Gaussian around the mean and becomes exponential near the tail. The equilibrium constants of the binding follow a log-normal distribution around the mean and a power law distribution in the tail. The intrinsic specificity for biomolecular recognition measures the degree of discrimination of native versus non-native binding and the optimization of which becomes the maximization of the ratio of the free energy gap between the native state and the average of non-native states versus the roughness measured by the variance of the free energy landscape around its mean. The intrinsic specificity obeys a Gaussian distribution near the mean and an exponential distribution near the tail. Furthermore, the kinetics of binding follows a log-normal distribution near the mean and a power law distribution at the tail. Our study provides new insights into the statistical nature of thermodynamics, kinetics and function from different ligands binding with a specific receptor or equivalently specific ligand binding with different receptors. The elucidation of distributions of the kinetics and free energy has guiding roles in studying biomolecular recognition and function through small-molecule evolution and chemical genetics.

Uncovering the principles and underlying mechanisms of biomolecular recognition and the molecular binding process is crucial for understanding function and evolution, yet challenging. We meet the challenge by quantifying the statistical nature of the relevant physical variables of biomolecular recognition using an analytical model combined with microscopic flexible docking simulations. We uncovered the universal statistical laws obeyed by the affinity, equilibrium constant, intrinsic specificity and kinetics for biomolecular recognition. The general statistical laws based on energy landscape theory can serve as a conceptual framework for molecular recognition in biological repertoires. They can be applied to molecular selection, in vitro evolution, high-throughput screening and virtual screening for drug discovery. The statistical laws in combination with experiments provide quantitative signatures of a specific ligand binding to a specific receptor; as a guideline, these laws will contribute to drug design against a specific target. Our statistical methodology is general and applicable to all other biomolecular recognition processes.
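The pairing of a Gaussian affinity distribution with a log-normal equilibrium-constant distribution, as reported in this abstract, can be illustrated numerically. The sketch below assumes the standard thermodynamic relation K = exp(-ΔG/kT) and draws ΔG from a Gaussian. The mean, spread, kT value, and function name are arbitrary illustrative choices, not taken from the paper, and the sketch does not model the exponential or power-law tails.

```python
import math
import random
import statistics

def sample_equilibrium_constants(n, mean_dg=-8.0, sd_dg=1.5, kT=0.593, seed=0):
    """Draw n binding free energies (kcal/mol) from a Gaussian and convert
    each one to an equilibrium constant via K = exp(-dG / kT).
    All default parameter values are hypothetical."""
    rng = random.Random(seed)
    dgs = [rng.gauss(mean_dg, sd_dg) for _ in range(n)]
    ks = [math.exp(-dg / kT) for dg in dgs]
    return dgs, ks

dgs, ks = sample_equilibrium_constants(10000)

# ln K is a linear function of dG, so ln K is Gaussian and K is log-normal.
log_ks = [math.log(k) for k in ks]
mean_log_k = statistics.fmean(log_ks)  # close to -mean_dg / kT
sd_log_k = statistics.stdev(log_ks)    # close to sd_dg / kT
```

Because ln K is just a rescaled ΔG, its sample mean and standard deviation track -mean_dg/kT and sd_dg/kT, which is the sense in which a Gaussian affinity implies a log-normal equilibrium constant near the mean.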

Arenas M, Sánchez-Cobos A, Bastolla U. Maximum-Likelihood Phylogenetic Inference with Selection on Protein Folding Stability. Mol Biol Evol 2015;32:2195-207. [PMID: 25837579 DOI: 10.1093/molbev/msv085] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Indexed: 12/15/2022]

Abstract

Despite intense work, incorporating constraints on protein native structures into the mathematical models of molecular evolution remains difficult, because most models and programs assume that protein sites evolve independently, whereas protein stability is maintained by interactions between sites. Here, we address this problem by developing a new mean-field substitution model that generates independent site-specific amino acid distributions with constraints on the stability of the native state against both unfolding and misfolding. The model depends on a background distribution of amino acids and one selection parameter that we fix maximizing the likelihood of the observed protein sequence. The analytic solution of the model shows that the main determinant of the site-specific distributions is the number of native contacts of the site and that the most variable sites are those with an intermediate number of native contacts. The mean-field models obtained, taking into account misfolded conformations, yield larger likelihood than models that only consider the native state, because their average hydrophobicity is more realistic, and they produce on the average stable sequences for most proteins. We evaluated the mean-field model with respect to empirical substitution models on 12 test data sets of different protein families. In all cases, the observed site-specific sequence profiles presented smaller Kullback-Leibler divergence from the mean-field distributions than from the empirical substitution model. Next, we obtained substitution rates combining the mean-field frequencies with an empirical substitution model. The resulting mean-field substitution model assigns larger likelihood than the empirical model to all studied families when we consider sequences with identity larger than 0.35, plausibly a condition that enforces conservation of the native structure across the family. 
We found that the mean-field model performs better than other structurally constrained models with similar or higher complexity. With respect to the much more complex model recently developed by Bordner and Mittelmann, which takes into account pairwise terms in the amino acid distributions and also optimizes the exchangeability matrix, our model performed worse for data with small sequence divergence but better for data with larger sequence divergence. The mean-field model has been implemented into the computer program Prot_Evol that is freely available at http://ub.cbm.uam.es/software/Prot_Evol.php.
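The Kullback-Leibler comparison mentioned in this abstract can be sketched as follows. This is a minimal illustration of the divergence itself, not the Prot_Evol implementation: the three-letter toy profiles, the pseudocount, and the function name are all hypothetical.

```python
import math

def kl_divergence(p_obs, q_model, pseudocount=1e-9):
    """Return D_KL(p_obs || q_model) = sum_a p(a) * ln(p(a) / q(a)).
    Inputs are dicts mapping amino-acid letters to frequencies.
    A small pseudocount (an assumption of this sketch) avoids log(0)."""
    symbols = sorted(set(p_obs) | set(q_model))
    p = [p_obs.get(s, 0.0) + pseudocount for s in symbols]
    q = [q_model.get(s, 0.0) + pseudocount for s in symbols]
    p_total, q_total = sum(p), sum(q)
    return sum(
        (pi / p_total) * math.log((pi / p_total) / (qi / q_total))
        for pi, qi in zip(p, q)
    )

# Toy site profile and two candidate model distributions (made-up numbers).
observed = {"L": 0.5, "V": 0.3, "A": 0.2}
mean_field = {"L": 0.45, "V": 0.35, "A": 0.2}
empirical = {"L": 0.2, "V": 0.2, "A": 0.6}

kl_mean_field = kl_divergence(observed, mean_field)
kl_empirical = kl_divergence(observed, empirical)
```

A smaller divergence means the model distribution is closer to the observed site profile, so in this toy case the first candidate fits better than the second.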

Wolynes PG. Evolution, energy landscapes and the paradoxes of protein folding. Biochimie 2014;119:218-30. [PMID: 25530262 DOI: 10.1016/j.biochi.2014.12.007] [Citation(s) in RCA: 110] [Impact Index Per Article: 11.0] [Received: 09/15/2014] [Accepted: 12/11/2014] [Indexed: 01/25/2023]

Detecting selection on protein stability through statistical mechanical models of folding and evolution. Biomolecules 2014;4:291-314. [PMID: 24970217 PMCID: PMC4030984 DOI: 10.3390/biom4010291] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 12/25/2013] [Revised: 02/13/2014] [Accepted: 02/14/2014] [Indexed: 12/31/2022]

Krobath H, Shakhnovich EI, Faísca PFN. Structural and energetic determinants of co-translational folding. J Chem Phys 2014;138:215101. [PMID: 23758397 DOI: 10.1063/1.4808044] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Indexed: 11/14/2022]

Abstract

We performed extensive lattice Monte Carlo simulations of ribosome-bound stalled nascent chains (RNCs) to explore the relative roles of native topology and non-native interactions in co-translational folding of small proteins. We found that the formation of a substantial part of the native structure generally occurs towards the end of protein synthesis. However, multi-domain structures, which are rich in local interactions, are able to develop gradually during chain elongation, while those with proximate chain termini require full protein synthesis to fold. A detailed assessment of the conformational ensembles populated by RNCs with different lengths reveals that the directionality of protein synthesis has a fine-tuning effect on the probability of populating low-energy conformations. In particular, if the participation of non-native interactions in folding energetics is mild, the formation of native-like conformations is largely determined by the properties of the contact map around the tethering terminus. Likewise, a pair of RNCs differing by only 1-2 residues can populate structurally well-resolved low-energy conformations with significantly different probabilities. An interesting structural feature of these low-energy conformations is that, irrespective of native structure, their non-native interactions are always long-ranged and marginally stabilizing. A comparison between the conformational spectra of RNCs and chain fragments folding freely in the bulk reveals drastic changes between the two set-ups depending on the native structure. Furthermore, the comparison shows that the ribosome may enhance (up to 20%) the population of low-energy conformations for chains folding to native structures dominated by local interactions. In contrast, an RNC folding to a non-local topology is forced to remain largely unstructured but can attain low-energy conformations in bulk.

Minning J, Porto M, Bastolla U. Detecting selection for negative design in proteins through an improved model of the misfolded state. Proteins 2013;81:1102-12. [PMID: 23280507 DOI: 10.1002/prot.24244] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 11/07/2012] [Accepted: 12/17/2012] [Indexed: 11/05/2022]

Krobath H, Faísca PFN. Interplay between native topology and non-native interactions in the folding of tethered proteins. Phys Biol 2013;10:016002. [DOI: 10.1088/1478-3975/10/1/016002] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 11/11/2022]

Bastolla U, Bruscolini P, Velasco JL. Sequence determinants of protein folding rates: Positive correlation between contact energy and contact range indicates selection for fast folding. Proteins 2012;80:2287-304. [DOI: 10.1002/prot.24118] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 12/16/2011] [Revised: 05/14/2012] [Accepted: 05/17/2012] [Indexed: 11/12/2022]

Dal Molin JP, da Silva MAA, Caliri A. Effect of local thermal fluctuations on folding kinetics: a study from the perspective of nonextensive statistical mechanics. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2011;84:041903. [PMID: 22181171 DOI: 10.1103/physreve.84.041903] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 12/08/2010] [Revised: 08/04/2011] [Indexed: 05/31/2023]

Abstract

The search through the protein conformational space is thought of as an early, independent stage of the folding process, governed mainly by the hydrophobic effect. Because of the nanoscopic size of proteins, we assume that the effects of local thermal fluctuations work like folding assistants, managed by the nonextensive parameter q. Using a 27-mer heteropolymer on a cubic lattice, we obtained, by Monte Carlo simulations, kinetic and thermodynamic quantities (such as the characteristic folding time and the native stability) as a function of temperature T and q for a few distinct native targets. We found that for each native structure, at a specific system temperature T, there exists an optimum q* that minimizes the characteristic folding time τ(min); for T=1, it is found that q* lies in the interval 1.15±0.05, even for native structures presenting significantly different topological complexities. The distribution of τ(min) obtained for specific q>1 (nonextensive approach) and temperature T can be fully reproduced for q=1 (Boltzmann approach), but only at higher temperatures T'>T. However, assuming that the complete set of proteins of each organism is optimized to work in a narrow range of temperature, we conclude that, for the present problem, the two approaches, namely, (T,q>1) and (T'>T,q=1), cannot be equivalent; it is not a simple matter of reparametrization. Finally, by associating the nonextensive parameter q with the instantaneous degree of compactness of the globule, q becomes a dynamic variable, self-adjusted along the simulation. The results obtained through the q-variable approach are fully consistent with those obtained by using a target-tuned parameter q*. However, in the former approach, q is automatically adjusted by the chain conformational evolution, eliminating the need to seek a specific optimized value of q for each case. 
Besides, using the q-variable approach, different target structures are promptly characterized by inherent distributions of q, which reflect the overall complexity of their corresponding native topologies and energy landscapes.

Collet O. How does the first water shell fold proteins so fast? J Chem Phys 2011;134:085107. [DOI: 10.1063/1.3554731] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022]

Faísca PFN, Nunes A, Travasso RDM, Shakhnovich EI. Non-native interactions play an effective role in protein folding dynamics. Protein Sci 2011;19:2196-209. [PMID: 20836137 DOI: 10.1002/pro.498] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Indexed: 01/11/2023]

Deeds EJ, Shakhnovich EI. A structure-centric view of protein evolution, design, and adaptation. ADVANCES IN ENZYMOLOGY AND RELATED AREAS OF MOLECULAR BIOLOGY 2010;75:133-91, xi-xii. [PMID: 17124867 DOI: 10.1002/9780471224464.ch2] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 01/28/2023]

Abstract

Proteins, by virtue of their central role in most biological processes, represent one of the key subjects of the study of molecular evolution. Inherent in the indispensability of proteins for living cells is the fact that a given protein can adopt a specific three-dimensional shape that is specified solely by the protein's sequence of amino acids. Over the past several decades, structural biologists have demonstrated that the array of structures that proteins may adopt is quite astounding, and this has led to a strong interest in understanding how protein structures change and evolve over time. In this review we consider a large body of recent work that attempts to illuminate this structure-centric picture of protein evolution. Much of this work has focused on the question of how completely new protein structures (i.e., new folds or topologies) are discovered by protein sequences as they evolve. Pursuant to this question of structural innovation has been a desire to describe and understand the observation that certain types of protein structures are far more abundant than others, and what this uneven distribution of proteins implies about the process through which new shapes are discovered. We consider a number of theoretical models that have been successful at explaining this heterogeneity in protein populations and discuss the increasing amount of evidence that indicates that the process of structural evolution involves the divergence of protein sequences and structures from one another. We also consider the topic of protein designability, which concerns itself with understanding how a protein's structure influences the number of sequences that can fold successfully into that structure. 
Understanding and quantifying the relationship between the physical feature of a structure and its designability has been a long-standing goal of the study of protein structure and evolution, and we discuss a number of recent advances that have yielded a promising answer to this question. Finally, we review the relatively new field of protein structural phylogeny, an area of study in which information about the distribution of protein structures among different organisms is used to reconstruct the evolutionary relationships between them. Taken together, the work that we review presents an increasingly coherent picture of how these unique polymers have evolved over the course of life on Earth.

Universal distribution of protein evolution rates as a consequence of protein folding physics. Proc Natl Acad Sci U S A 2010;107:2983-8. [PMID: 20133769 DOI: 10.1073/pnas.0910445107] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.1] [Indexed: 02/04/2023]

Betancourt MR. Another look at the conditions for the extraction of protein knowledge-based potentials. Proteins 2009;76:72-85. [PMID: 19089977 DOI: 10.1002/prot.22320] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 01/28/2023]

Abstract

Protein knowledge-based potentials are effective free energies obtained from databases of known protein structures. They are used to parameterize coarse-grained protein models in many folding simulation and structure prediction methods. Two common approaches are used in the derivation of knowledge-based potentials. One assumes that the energy parameters optimize the native structure stability. The other assumes that interaction events are related to their energies according to the Boltzmann distribution, and that they are distributed independently of other events, that is, the quasi-chemical approximation. Here, these assumptions are systematically tested by extracting contact energies from artificial databases of lattice proteins with predefined pairwise contact energies. Databases of protein sequences are designed either to satisfy the Boltzmann distribution at high or low temperatures, or to simultaneously optimize the native stability and folding kinetics. It is found that the quasi-chemical approximation, with the ideal reference state, accurately reproduces the true energies for high-temperature Boltzmann-distributed sequences (weakly interacting residues), but less accurately at low temperatures, where the sequences correspond to energy minima and the residues are strongly interacting. To overcome this problem, an iterative procedure for Boltzmann-distributed sequences is introduced, which accounts for interacting residue correlations and eliminates the need for the quasi-chemical approximation. In this case, the energies are accurately reproduced at any ensemble temperature. However, when the database of sequences designed for optimal stability and kinetics is used, the energy correlation is less than optimal using either method, exhibiting random and systematic deviations from linearity. 
Therefore, the assumption that native structures are maximally stable or that sequences are determined according to the Boltzmann distribution seems to be inadequate for obtaining accurate energies. The limited number of sequences in the database and the inhomogeneous concentration of amino acids from one structure to another do not seem to be major obstacles for improving the quality of the extracted pairwise energies, with the exception of repulsive interactions.

Lai Z, Su J, Chen W, Wang C. Uncovering the properties of energy-weighted conformation space networks with a hydrophobic-hydrophilic model. Int J Mol Sci 2009;10:1808-1823. [PMID: 19468340 PMCID: PMC2680648 DOI: 10.3390/ijms10041808] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 01/14/2009] [Revised: 03/30/2009] [Accepted: 04/07/2009] [Indexed: 11/16/2022]

Peto M, Kloczkowski A, Honavar V, Jernigan RL. Use of machine learning algorithms to classify binary protein sequences as highly-designable or poorly-designable. BMC Bioinformatics 2008;9:487. [PMID: 19014713 PMCID: PMC2655094 DOI: 10.1186/1471-2105-9-487] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 04/05/2008] [Accepted: 11/18/2008] [Indexed: 11/10/2022]

Collet O. Folding kinetics of proteins and cold denaturation. J Chem Phys 2008;129:155101. [DOI: 10.1063/1.2992556] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/14/2022]

Zeldovich KB, Chen P, Shakhnovich BE, Shakhnovich EI. A first-principles model of early evolution: emergence of gene families, species, and preferred protein folds. PLoS Comput Biol 2008;3:e139. [PMID: 17630830 PMCID: PMC1914367 DOI: 10.1371/journal.pcbi.0030139] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.3] [Received: 04/04/2007] [Accepted: 06/04/2007] [Indexed: 11/19/2022]

Abstract

In this work we develop a microscopic physical model of early evolution where phenotype—organism life expectancy—is directly related to genotype—the stability of its proteins in their native conformations—which can be determined exactly in the model. Simulating the model on a computer, we consistently observe the “Big Bang” scenario whereby exponential population growth ensues as soon as favorable sequence–structure combinations (precursors of stable proteins) are discovered. Upon that, random diversity of the structural space abruptly collapses into a small set of preferred proteins. We observe that protein folds remain stable and abundant in the population at timescales much greater than mutation or organism lifetime, and the distribution of the lifetimes of dominant folds in a population approximately follows a power law. The separation of evolutionary timescales between discovery of new folds and generation of new sequences gives rise to emergence of protein families and superfamilies whose sizes are power-law distributed, closely matching the same distributions for real proteins. On the population level we observe emergence of species—subpopulations that carry similar genomes. Further, we present a simple theory that relates stability of evolving proteins to the sizes of emerging genomes. Together, these results provide a microscopic first-principles picture of how first-gene families developed in the course of early evolution.

Here, we address the question of how Darwinian evolution of organisms determines molecular evolution of their proteins and genomes. We developed a microscopic ab initio model of early biological evolution where the fitness (essentially lifetime) of an organism is explicitly related to the evolving sequences of its proteins. The main assumption of the model is that the death rate of an organism is determined by the stability of the least stable of their proteins. A lattice model is used to calculate stability of all proteins in a genome from their amino acid sequence. The simulation of the model starts from 100 identical organisms, each carrying the same random gene, and proceeds via random mutations, gene duplication, organism births via replication, and organism deaths. We find that exponential population growth is possible only after the discovery of a very small number of specific advantageous protein structures. The number of genes in the evolving organisms depends on the mutation rate, demonstrating the intricate relationship between the genome sizes and protein stability requirements. Further, the model explains the observed power-law distributions of protein family and superfamily sizes, as well as the scale-free character of protein structural similarity graphs. Together, these results and their analysis suggest a plausible comprehensive scenario of emergence of the protein universe in early biological evolution.

Zeldovich KB, Shakhnovich EI. Understanding protein evolution: from protein physics to Darwinian selection. Annu Rev Phys Chem 2008;59:105-27. [PMID: 17937598 DOI: 10.1146/annurev.physchem.58.032806.104449] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.2] [Indexed: 11/09/2022]

Analyzing pathogenic mutations of C5 domain from cardiac myosin binding protein C through MD simulations. EUROPEAN BIOPHYSICS JOURNAL: EBJ 2008;37:683-91. [DOI: 10.1007/s00249-008-0308-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 09/26/2007] [Revised: 02/04/2008] [Accepted: 03/10/2008] [Indexed: 11/26/2022]

Franzosa E, Xia Y. Structural Perspectives on Protein Evolution. ANNUAL REPORTS IN COMPUTATIONAL CHEMISTRY 2008. [DOI: 10.1016/s1574-1400(08)00001-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 12/29/2022]

Wales DJ, Bogdan TV. Potential energy and free energy landscapes. J Phys Chem B 2007;110:20765-76. [PMID: 17048885 DOI: 10.1021/jp0680544] [Citation(s) in RCA: 133] [Impact Index Per Article: 7.8] [Indexed: 11/29/2022]

Peto M, Kloczkowski A, Jernigan RL. Shape-dependent designability studies of lattice proteins. JOURNAL OF PHYSICS. CONDENSED MATTER : AN INSTITUTE OF PHYSICS JOURNAL 2007;19:285220-285230. [PMID: 18079979 PMCID: PMC2134837 DOI: 10.1088/0953-8984/19/28/285220] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 05/25/2023]

The Structurally Constrained Neutral Model of Protein Evolution. ACTA ACUST UNITED AC 2007. [DOI: 10.1007/978-3-540-35306-5_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]

Matysiak S, Clementi C. Minimalist protein model as a diagnostic tool for misfolding and aggregation. J Mol Biol 2006;363:297-308. [PMID: 16959265 DOI: 10.1016/j.jmb.2006.07.088] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.6] [Received: 06/02/2006] [Revised: 07/25/2006] [Accepted: 07/28/2006] [Indexed: 11/24/2022]

Bastolla U, Porto M, Roman HE, Vendruscolo M. A protein evolution model with independent sites that reproduces site-specific amino acid distributions from the Protein Data Bank. BMC Evol Biol 2006;6:43. [PMID: 16737532 PMCID: PMC1570368 DOI: 10.1186/1471-2148-6-43] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.8] [Received: 11/17/2005] [Accepted: 05/31/2006] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Since thermodynamic stability is a global property of proteins that has to be conserved during evolution, the selective pressure at a given site of a protein sequence depends on the amino acids present at other sites. However, models of molecular evolution that aim at reconstructing the evolutionary history of macromolecules become computationally intractable if such correlations between sites are explicitly taken into account.

RESULTS

We introduce an evolutionary model with sites evolving independently under a global constraint on the conservation of structural stability. This model consists of a selection process, which depends on two hydrophobicity parameters that can be computed from protein sequences without any fit, and a mutation process for which we consider various models. It reproduces quantitatively the results of Structurally Constrained Neutral (SCN) simulations of protein evolution in which the stability of the native state is explicitly computed and conserved. We then compare the predicted site-specific amino acid distributions with those sampled from the Protein Data Bank (PDB). The parameters of the mutation model, whose number varies between zero and five, are fitted from the data. The mean correlation coefficient between predicted and observed site-specific amino acid distributions is larger than <r> = 0.70 for a mutation model with no free parameters and no genetic code. In contrast, considering only the mutation process with no selection yields a mean correlation coefficient of <r> = 0.56 with three fitted parameters. The mutation model that best fits the data takes into account increased mutation rate at CpG dinucleotides, yielding <r> = 0.90 with five parameters.

CONCLUSION

The effective selection process that we propose reproduces well amino acid distributions as observed in the protein sequences in the PDB. Its simplicity makes it very promising for likelihood calculations in phylogenetic studies. Interestingly, in this approach the mutation process influences the effective selection process, i.e. selection and mutation must be entangled in order to obtain effectively independent sites. This interdependence between mutation and selection reflects the deep influence that mutation has on the evolutionary process: The bias in the mutation influences the thermodynamic properties of the evolving proteins, in agreement with comparative studies of bacterial proteomes, and it also influences the rate of accepted mutations.

Principal eigenvector of contact matrices and hydrophobicity profiles in proteins. Proteins 2006;58:22-30. [PMID: 15523667 DOI: 10.1002/prot.20240] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Shell MS, Debenedetti PG, Panagiotopoulos AZ. Computational characterization of the sequence landscape in simple protein alphabets. Proteins 2005;62:232-43. [PMID: 16284961 DOI: 10.1002/prot.20714] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Bastolla U, Demetrius L. Stability constraints and protein evolution: the role of chain length, composition and disulfide bonds. Protein Eng Des Sel 2005;18:405-15. [PMID: 16085657 DOI: 10.1093/protein/gzi045] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Abstract

Stability of the native state is an essential requirement in protein evolution and design. Here we investigated the interplay between chain length and stability constraints using a simple model of protein folding and a statistical study of the Protein Data Bank. We distinguish two types of stability of the native state: with respect to the unfolded state (unfolding stability) and with respect to misfolded configurations (misfolding stability). Several contributions to stability are evaluated and their correlations are disentangled through principal components analysis, with the following main results. (1) We show that longer proteins can fulfil more easily the requirements of unfolding and misfolding stability, because they have a higher number of native interactions per residue. Consistently, in longer proteins native interactions are weaker and they are less optimized with respect to non-native interactions. (2) Stability against misfolding is negatively correlated with the strength of native interactions, which is related to hydrophobicity. Hence there is a trade-off between unfolding and misfolding stability. This trade-off is influenced by protein length: less hydrophobic sequences are observed in very long proteins. (3) The number of disulfide bonds is positively correlated with the deficit of free energy stabilizing the native state. Chain length and the number of disulfide bonds per residue are negatively correlated in proteins with short chains and uncorrelated in proteins with long chains. (4) The number of salt bridges per residue and per native contact increases with chain length. We interpret these observations as an indication that the constraints imposed by unfolding stability are less demanding in long proteins and they are further reduced by the competing requirement for stability against misfolding. In particular, disulfide bonds appear to be positively selected in short proteins, whereas they evolve in an effectively neutral way in long proteins.

Das P, Matysiak S, Clementi C. Balancing energy and entropy: a minimalist model for the characterization of protein folding landscapes. Proc Natl Acad Sci U S A 2005;102:10141-6. [PMID: 16006532 PMCID: PMC1177359 DOI: 10.1073/pnas.0409471102] [Citation(s) in RCA: 82] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/06/2005] [Indexed: 11/18/2022] Open

Choi HS, Huh J, Jo WH. Comparison between denaturant- and temperature-induced unfolding pathways of protein: a lattice Monte Carlo simulation. Biomacromolecules 2005;5:2289-96. [PMID: 15530044 DOI: 10.1021/bm049663p] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Wang J, Huang W, Lu H, Wang E. Downhill kinetics of biomolecular interface binding: globally connected scenario. Biophys J 2005;87:2187-94. [PMID: 15454421 PMCID: PMC1304644 DOI: 10.1529/biophysj.104.042747] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Briones C, Bastolla U. Protein evolution in viral quasispecies under selective pressure: A thermodynamic and phylogenetic analysis. Gene 2005;347:237-46. [PMID: 15725390 DOI: 10.1016/j.gene.2004.12.018] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2004] [Revised: 11/23/2004] [Accepted: 12/10/2004] [Indexed: 01/21/2023]

Bastolla U, Porto M, Roman HE, Vendruscolo M. Looking at structure, stability, and evolution of proteins through the principal eigenvector of contact matrices and hydrophobicity profiles. Gene 2005;347:219-30. [PMID: 15777696 DOI: 10.1016/j.gene.2004.12.015] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2004] [Revised: 11/29/2004] [Accepted: 12/10/2004] [Indexed: 11/28/2022]

Chen C, Xiao Y, Zhang L. A directed essential dynamics simulation of peptide folding. Biophys J 2005;88:3276-85. [PMID: 15731383 PMCID: PMC1305476 DOI: 10.1529/biophysj.104.046904] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Porto M, Roman HE, Vendruscolo M, Bastolla U. Prediction of site-specific amino acid distributions and limits of divergent evolutionary changes in protein sequences. Mol Biol Evol 2004;22:630-8. [PMID: 15537801 DOI: 10.1093/molbev/msi048] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Bastolla U, Moya A, Viguera E, van Ham RCHJ. Genomic determinants of protein folding thermodynamics in prokaryotic organisms. J Mol Biol 2004;343:1451-66. [PMID: 15491623 DOI: 10.1016/j.jmb.2004.08.086] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2004] [Revised: 08/24/2004] [Accepted: 08/27/2004] [Indexed: 02/07/2023]

Abstract

Here we investigate how thermodynamic properties of orthologous proteins are influenced by the genomic environment in which they evolve. We performed a comparative computational study of 21 protein families in 73 prokaryotic species and obtained the following main results. (i) Protein stability with respect to the unfolded state and with respect to misfolding are anticorrelated. There appears to be a trade-off between these two properties, which cannot be optimized simultaneously. (ii) Folding thermodynamic parameters are strongly correlated with two genomic features, genome size and G+C composition. In particular, the normalized energy gap, an indicator of folding efficiency in statistical mechanical models of protein folding, is smaller in proteins of organisms with a small genome size and a compositional bias towards A+T. Such genomic features are characteristic for bacteria with an intracellular lifestyle. We interpret these correlations in light of mutation pressure and natural selection. A mutational bias toward A+T at the DNA level translates into a mutational bias toward more hydrophobic (and in general more interactive) proteins, a consequence of the structure of the genetic code. Increased hydrophobicity renders proteins more stable against unfolding but less stable against misfolding. Proteins with high hydrophobicity and low stability against misfolding occur in organisms with reduced genomes, like obligate intracellular bacteria. We argue that they are fixed because these organisms experience weaker purifying selection due to their small effective population sizes. This interpretation is supported by the observation of a high expression level of chaperones in these bacteria. Our results indicate that the mutational spectrum of a genome and the strength of selection significantly influence protein folding thermodynamics.

Gillespie B, Plaxco KW. Using protein folding rates to test protein folding theories. Annu Rev Biochem 2004;73:837-59. [PMID: 15189160 DOI: 10.1146/annurev.biochem.73.011303.073904] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Bloom JD, Wilke CO, Arnold FH, Adami C. Stability and the evolvability of function in a model protein. Biophys J 2004;86:2758-64. [PMID: 15111394 PMCID: PMC1304146 DOI: 10.1016/s0006-3495(04)74329-5] [Citation(s) in RCA: 82] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2003] [Accepted: 01/12/2004] [Indexed: 11/18/2022] Open

Li J, Wang J, Zhang J, Wang W. Thermodynamic stability and kinetic foldability of a lattice protein model. J Chem Phys 2004;120:6274-87. [PMID: 15267515 DOI: 10.1063/1.1651053] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Tiana G, Shakhnovich BE, Dokholyan NV, Shakhnovich EI. Imprint of evolution on protein structures. Proc Natl Acad Sci U S A 2004;101:2846-51. [PMID: 14970345 PMCID: PMC365708 DOI: 10.1073/pnas.0306638101] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2003] [Accepted: 12/22/2003] [Indexed: 11/18/2022] Open

Kolinski A, Skolnick J. Reduced models of proteins and their applications. POLYMER 2004. [DOI: 10.1016/j.polymer.2003.10.064] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]

Ball RC, Fink TMA, Bowler NE. Stochastic annealing. PHYSICAL REVIEW LETTERS 2003;91:030201. [PMID: 12906405 DOI: 10.1103/physrevlett.91.030201] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/14/2003] [Indexed: 05/24/2023]

Qin M, Wang J, Tang Y, Wang W. Folding behaviors of lattice model proteins with three kinds of contact potentials. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2003;67:061905. [PMID: 16241259 DOI: 10.1103/physreve.67.061905] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/30/2003] [Indexed: 05/04/2023]

Fan K, Wang W. What is the minimum number of letters required to fold a protein? J Mol Biol 2003;328:921-6. [PMID: 12729764 DOI: 10.1016/s0022-2836(03)00324-3] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]

Nelson E, Grishin N. Investigation of the folding profiles of evolutionarily selected model proteins. J Chem Phys 2003. [DOI: 10.1063/1.1536621] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

van Ham RCHJ, Kamerbeek J, Palacios C, Rausell C, Abascal F, Bastolla U, Fernández JM, Jiménez L, Postigo M, Silva FJ, Tamames J, Viguera E, Latorre A, Valencia A, Morán F, Moya A. Reductive genome evolution in Buchnera aphidicola. Proc Natl Acad Sci U S A 2003;100:581-6. [PMID: 12522265 PMCID: PMC141039 DOI: 10.1073/pnas.0235981100] [Citation(s) in RCA: 350] [Impact Index Per Article: 16.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2002] [Indexed: 02/07/2023] Open

Ball RC, Fink TMA. Protein design depends on the size of the amino acid alphabet. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2002;66:031902. [PMID: 12366147 DOI: 10.1103/physreve.66.031902] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/26/2001] [Indexed: 05/23/2023]