1
Soranno A. Physical basis of the disorder-order transition. Arch Biochem Biophys 2020; 685:108305. [DOI: 10.1016/j.abb.2020.108305] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/01/2019] [Revised: 02/10/2020] [Accepted: 02/14/2020] [Indexed: 12/29/2022]

2
Abstract
Classically, phenotype is what is observed, and genotype is the genetic makeup. Statistical studies aim to project phenotypic likelihoods of genotypic patterns. The traditional genotype-to-phenotype theory embraces the view that the encoded protein shape together with gene expression level largely determines the resulting phenotypic trait. Here, we point out that the molecular biology revolution at the turn of the century explained that the gene encodes not one but ensembles of conformations, which in turn spell all possible gene-associated phenotypes. The significance of a dynamic ensemble view is in understanding the linkage between genetic change and the gained observable physical or biochemical characteristics. Thus, despite the transformative shift in our understanding of the basis of protein structure and function, the literature still commonly relates to the classical genotype-phenotype paradigm. This is important because an ensemble view clarifies how even seemingly small genetic alterations can lead to pleiotropic traits in adaptive evolution and in disease, why cellular pathways can be modified in monogenic and polygenic traits, and how the environment may tweak protein function.
Affiliation(s)
- Ruth Nussinov
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, Maryland, United States of America
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Chung-Jung Tsai
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, Maryland, United States of America
- Hyunbum Jang
- Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, Maryland, United States of America

3
Kinjo AR. Cooperative "folding transition" in the sequence space facilitates function-driven evolution of protein families. J Theor Biol 2018; 443:18-27. [PMID: 29355538 DOI: 10.1016/j.jtbi.2018.01.019] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/27/2017] [Revised: 01/16/2018] [Accepted: 01/17/2018] [Indexed: 12/23/2022]
Abstract
In the protein sequence space, natural proteins form clusters of families which are characterized by their unique native folds whereas the great majority of random polypeptides are neither clustered nor foldable to unique structures. Since a given polypeptide can be either foldable or unfoldable, a kind of "folding transition" is expected at the boundary of a protein family in the sequence space. By Monte Carlo simulations of a statistical mechanical model of protein sequence alignment that coherently incorporates both short-range and long-range interactions as well as variable-length insertions to reproduce the statistics of the multiple sequence alignment of a given protein family, we demonstrate the existence of such transition between natural-like sequences and random sequences in the sequence subspaces for 15 domain families of various folds. The transition was found to be highly cooperative and two-state-like. Furthermore, enforcing or suppressing consensus residues on a few of the well-conserved sites enhanced or diminished, respectively, the natural-like pattern formation over the entire sequence. In most families, the key sites included ligand binding sites. These results suggest some selective pressure on the key residues, such as ligand binding activity, may cooperatively facilitate the emergence of a protein family during evolution. From a more practical aspect, the present results highlight an essential role of long-range effects in precisely defining protein families, which are absent in conventional sequence models.
Affiliation(s)
- Akira R Kinjo
- Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita, Osaka 565-0871, Japan.

4
Venev SV, Zeldovich KB. Massively parallel sampling of lattice proteins reveals foundations of thermal adaptation. J Chem Phys 2016; 143:055101. [PMID: 26254668 DOI: 10.1063/1.4927565] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/01/2023]
Abstract
Evolution of proteins in bacteria and archaea living in different conditions leads to significant correlations between amino acid usage and environmental temperature. The origins of these correlations are poorly understood, and an important question of protein theory, physics-based prediction of types of amino acids overrepresented in highly thermostable proteins, remains largely unsolved. Here, we extend the random energy model of protein folding by weighting the interaction energies of amino acids by their frequencies in protein sequences and predict the energy gap of proteins designed to fold well at elevated temperatures. To test the model, we present a novel scalable algorithm for simultaneous energy calculation for many sequences in many structures, targeting massively parallel computing architectures such as graphics processing units. The energy calculation is performed by multiplying two matrices, one representing the complete set of sequences, and the other describing the contact maps of all structural templates. An implementation of the algorithm for the CUDA platform is available at http://www.github.com/kzeldovich/galeprot and calculates protein folding energies over 250 times faster than a single central processing unit. Analysis of amino acid usage in 64-mer cubic lattice proteins designed to fold well at different temperatures demonstrates an excellent agreement between theoretical and simulated values of energy gap. The theoretical predictions of temperature trends of amino acid frequencies are significantly correlated with bioinformatics data on 191 bacteria and archaea, and highlight protein folding constraints as a fundamental selection pressure during thermal adaptation in biological evolution.
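The matrix formulation mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' CUDA implementation. It assumes a toy two-letter contact potential (`EPS`), two hypothetical 6-residue contact maps, and illustrative function names. Each sequence becomes a row holding the contact energy of every residue pair, each structure becomes a 0/1 column marking which pairs are in contact, and the energy table is the product of the two.

```python
from itertools import combinations

# Toy 2-letter contact potential (assumption): H-H contacts score -1, all others 0.
EPS = {"H": {"H": -1.0, "P": 0.0}, "P": {"H": 0.0, "P": 0.0}}


def sequence_row(seq, eps):
    """Row of the sequence matrix: pair energy eps[a_i][a_j] for every pair i < j."""
    return [eps[seq[i]][seq[j]] for i, j in combinations(range(len(seq)), 2)]


def structure_column(contacts, length):
    """Column of the contact-map matrix: 1 if pair (i, j) is in contact, else 0."""
    contact_set = {tuple(sorted(pair)) for pair in contacts}
    return [1 if pair in contact_set else 0
            for pair in combinations(range(length), 2)]


def all_energies(sequences, structures, eps=EPS):
    """energy[s][k] = energy of sequence s threaded onto structure k (matrix product)."""
    length = len(sequences[0])
    rows = [sequence_row(seq, eps) for seq in sequences]
    cols = [structure_column(contacts, length) for contacts in structures]
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in rows]


# Hypothetical inputs: two 6-residue sequences and two contact maps.
sequences = ["HPPHHP", "HHHHHH"]
structures = [[(0, 5), (1, 4)], [(0, 3), (2, 5)]]
energy_table = all_energies(sequences, structures)
```

Because every entry of the table is an independent dot product, the same layout maps naturally onto a dense matrix multiplication on parallel hardware, which is the property the abstract relies on.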
Affiliation(s)
- Sergey V Venev
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation St, Worcester, Massachusetts 01605, USA
- Konstantin B Zeldovich
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation St, Worcester, Massachusetts 01605, USA

5
Rashid MA, Iqbal S, Khatib F, Hoque MT, Sattar A. Guided macro-mutation in a graded energy based genetic algorithm for protein structure prediction. Comput Biol Chem 2016; 61:162-77. [PMID: 26878130 DOI: 10.1016/j.compbiolchem.2016.01.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 05/23/2015] [Revised: 11/29/2015] [Accepted: 01/21/2016] [Indexed: 10/22/2022]
Abstract
Protein structure prediction is considered one of the most challenging and computationally intractable combinatorial problems. Thus, efficient modeling of the convoluted search space, clever use of energy functions and, more importantly, effective sampling algorithms become crucial to address this problem. For protein structure modeling, an off-lattice model provides limited scope to exercise and evaluate algorithmic developments because of its astronomically large set of data points. In contrast, an on-lattice model widens the scope and permits studying relatively larger proteins because of its finite set of data points. In this work, we took full advantage of an on-lattice model by using a face-centered-cube lattice, which has the highest packing density with the maximum degree of freedom. We proposed a genetic algorithm (GA) for conformational search based on a graded energy that strategically mixes the Miyazawa-Jernigan (MJ) energy with the hydrophobic-polar (HP) energy. In our application, we introduced a 2 × 2 HP energy guided macro-mutation operator within the GA to explore the best possible local changes exhaustively. Conversely, the 20 × 20 MJ energy model, the ultimate objective function of our GA that needs to be minimized, considers the impacts amongst the 20 different amino acids and allows searching for globally acceptable conformations. On a set of benchmark proteins, our proposed approach outperformed state-of-the-art approaches in terms of the free energy levels and the root-mean-square deviations.
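To make the 2 × 2 HP energy on a face-centered-cube lattice concrete, here is a minimal sketch. It assumes the common HP convention (each non-consecutive H-H contact contributes -1, every other pair contributes 0). The graded MJ/HP mixture and the macro-mutation operator of the paper are not reproduced, and the function names and the example conformation are illustrative only.

```python
from itertools import product

# The 12 nearest-neighbor vectors of the FCC lattice:
# two coordinates are +1 or -1 and the remaining one is 0.
FCC_NEIGHBORS = {v for v in product((-1, 0, 1), repeat=3)
                 if sorted(map(abs, v)) == [0, 1, 1]}


def difference(a, b):
    """Component-wise vector from lattice site a to lattice site b."""
    return tuple(b[k] - a[k] for k in range(3))


def is_valid_fcc_walk(coords):
    """True if consecutive residues are FCC neighbors and no site is used twice."""
    steps_ok = all(difference(a, b) in FCC_NEIGHBORS
                   for a, b in zip(coords, coords[1:]))
    return steps_ok and len(set(coords)) == len(coords)


def hp_energy(sequence, coords):
    """HP energy (assumed convention): -1 per non-consecutive H-H neighbor pair."""
    energy = 0
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):
            both_h = sequence[i] == "H" and sequence[j] == "H"
            if both_h and difference(coords[i], coords[j]) in FCC_NEIGHBORS:
                energy -= 1
    return energy


# Hypothetical 4-residue conformation in which all four sites are mutual neighbors.
tetrahedron = [(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)]
compact_energy = hp_energy("HHHH", tetrahedron)
```

A search procedure such as a genetic algorithm would call an energy function like `hp_energy` on many candidate conformations; swapping the H/P test for a lookup in a 20 × 20 table would give an MJ-style energy with the same loop structure.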
Affiliation(s)
- Mahmood A Rashid
- SCIMS, University of the South Pacific, Laucala Bay, Suva, Fiji; IIIS, Griffith University, Brisbane, QLD, Australia.
- Firas Khatib
- CIS, University of Massachusetts Dartmouth, MA, USA.
- Abdul Sattar
- IIIS, Griffith University, Brisbane, QLD, Australia.

6
Sikosek T, Chan HS. Biophysics of protein evolution and evolutionary protein biophysics. J R Soc Interface 2015; 11:20140419. [PMID: 25165599 DOI: 10.1098/rsif.2014.0419] [Citation(s) in RCA: 150] [Impact Index Per Article: 16.7] [Indexed: 01/05/2023]
Abstract
The study of molecular evolution at the level of protein-coding genes often entails comparing large datasets of sequences to infer their evolutionary relationships. Despite the importance of a protein's structure and conformational dynamics to its function and thus its fitness, common phylogenetic methods embody minimal biophysical knowledge of proteins. To underscore the biophysical constraints on natural selection, we survey effects of protein mutations, highlighting the physical basis for marginal stability of natural globular proteins and how requirement for kinetic stability and avoidance of misfolding and misinteractions might have affected protein evolution. The biophysical underpinnings of these effects have been addressed by models with an explicit coarse-grained spatial representation of the polypeptide chain. Sequence-structure mappings based on such models are powerful conceptual tools that rationalize mutational robustness, evolvability, epistasis, promiscuous function performed by 'hidden' conformational states, resolution of adaptive conflicts and conformational switches in the evolution from one protein fold to another. Recently, protein biophysics has been applied to derive more accurate evolutionary accounts of sequence data. Methods have also been developed to exploit sequence-based evolutionary information to predict biophysical behaviours of proteins. The success of these approaches demonstrates a deep synergy between the fields of protein biophysics and protein evolution.
Affiliation(s)
- Tobias Sikosek
- Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Physics, University of Toronto, Toronto, Ontario, Canada M5S 1A8
- Hue Sun Chan
- Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Physics, University of Toronto, Toronto, Ontario, Canada M5S 1A8

7
Ferrada E. The amino acid alphabet and the architecture of the protein sequence-structure map. I. Binary alphabets. PLoS Comput Biol 2014; 10:e1003946. [PMID: 25473967 PMCID: PMC4256021 DOI: 10.1371/journal.pcbi.1003946] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/07/2014] [Accepted: 09/26/2014] [Indexed: 11/19/2022]
Abstract
The correspondence between protein sequences and structures, or sequence-structure map, relates to fundamental aspects of structural, evolutionary and synthetic biology. The specifics of the mapping, such as the fraction of accessible sequences and structures, or the sequences' ability to fold fast, are dictated by the type of interactions between the monomers that compose the sequences. The set of possible interactions between monomers is encapsulated by the potential energy function. In this study, I explore the impact of the relative forces of the potential on the architecture of the sequence-structure map. My observations rely on simple exact models of proteins and random samples of the space of potential energy functions of binary alphabets. I adopt a graph perspective and study the distribution of viable sequences and the structures they produce, as networks of sequences connected by point mutations. I observe that the relative proportion of attractive, neutral and repulsive forces defines types of potentials, that induce sequence-structure maps of vastly different architectures. I characterize the properties underlying these differences and relate them to the structure of the potential. Among these properties are the expected number and relative distribution of sequences associated to specific structures and the diversity of structures as a function of sequence divergence. I study the types of binary potentials observed in natural amino acids and show that there is a strong bias towards only some types of potentials, a bias that seems to characterize the folding code of natural proteins. I discuss implications of these observations for the architecture of the sequence-structure map of natural proteins, the construction of random libraries of peptides, and the early evolution of the natural amino acid alphabet.
Affiliation(s)
- Evandro Ferrada
- Santa Fe Institute, Santa Fe, New Mexico, United States of America

8
Sikosek T, Bornberg-Bauer E, Chan HS. Evolutionary dynamics on protein bi-stability landscapes can potentially resolve adaptive conflicts. PLoS Comput Biol 2012; 8:e1002659. [PMID: 23028272 PMCID: PMC3441461 DOI: 10.1371/journal.pcbi.1002659] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 02/08/2012] [Accepted: 07/12/2012] [Indexed: 11/18/2022]
Abstract
Experimental studies have shown that some proteins exist in two alternative native-state conformations. It has been proposed that such bi-stable proteins can potentially function as evolutionary bridges at the interface between two neutral networks of protein sequences that fold uniquely into the two different native conformations. Under adaptive conflict scenarios, bi-stable proteins may be of particular advantage if they simultaneously provide two beneficial biological functions. However, computational models that simulate protein structure evolution do not yet recognize the importance of bi-stability. Here we use a biophysical model to analyze sequence space to identify bi-stable or multi-stable proteins with two or more equally stable native-state structures. The inclusion of such proteins enhances phenotype connectivity between neutral networks in sequence space. Consideration of the sequence space neighborhood of bridge proteins revealed that bi-stability decreases gradually with each mutation that takes the sequence further away from an exactly bi-stable protein. With relaxed selection pressures, we found that bi-stable proteins in our model are highly successful under simulated adaptive conflict. Inspired by these model predictions, we developed a method to identify real proteins in the PDB with bridge-like properties, and have verified a clear bi-stability gradient for a series of mutants studied by Alexander et al. (Proc Nat Acad Sci USA 2009, 106:21149–21154) that connect two sequences that fold uniquely into two different native structures via a bridge-like intermediate mutant sequence. Based on these findings, new testable predictions for future studies on protein bi-stability and evolution are discussed.
Proteins are essential molecules for performing a majority of functions in all biological systems. These functions often depend on the three-dimensional structures of proteins. Here, we investigate a fundamental question in molecular evolution: how can proteins acquire new advantageous structures via mutations while not sacrificing their existing structures that are still needed? Some authors have suggested that the same protein may adopt two or more alternative structures, switch between them and thus perform different functions with each of the alternative structures. Intuitively, such a protein could provide an evolutionary compromise between conflicting demands for existing and new protein structures. Yet no theoretical study has systematically tackled the biophysical basis of such compromises during evolutionary processes. Here we devise a model of evolution that specifically recognizes protein molecules that can exist in several different stable structures. Our model demonstrates that proteins can indeed utilize multiple structures to satisfy conflicting evolutionary requirements. In light of these results, we identify data from known protein structures that are consistent with our predictions and suggest novel directions for future investigation.
Affiliation(s)
- Tobias Sikosek
- Evolutionary Bioinformatics Group, Institute for Evolution and Biodiversity, University of Münster, Münster, Germany.

9
Mittenthal J, Caetano-Anollés D, Caetano-Anollés G. Biphasic patterns of diversification and the emergence of modules. Front Genet 2012; 3:147. [PMID: 22891076 PMCID: PMC3413098 DOI: 10.3389/fgene.2012.00147] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 05/22/2012] [Accepted: 07/19/2012] [Indexed: 01/08/2023]
Abstract
The intricate molecular and cellular structure of organisms converts energy to work, which builds and maintains structure. Evolving structure implements modules, in which parts are tightly linked. Each module performs characteristic functions. In this work we propose that a module can emerge through two phases of diversification of parts. Early in the first phase of this biphasic pattern, the parts have weak linkage: they interact weakly and associate variously. The parts diversify and compete. Under selection for performance, interactions among the parts increasingly constrain their structure and associations. As many variants are eliminated, parts self-organize into modules with tight linkage. Linkage may increase in response to exogenous stresses as well as endogenous processes. In the second phase of diversification, variants of the module and its functions evolve and become new parts for a new cycle of generation of higher-level modules. This linkage hypothesis can interpret biphasic patterns in the diversification of protein domain structure, RNA and protein shapes, and networks in metabolism, codes, and embryos, and can explain hierarchical levels of structural organization that are widespread in biology.
Affiliation(s)
- Jay Mittenthal
- Department of Cell and Developmental Biology, University of Illinois, Urbana-Champaign, IL, USA
- Institute for Genomic Biology, University of Illinois, Urbana-Champaign, IL, USA
- Derek Caetano-Anollés
- Department of Cell and Developmental Biology, University of Illinois, Urbana-Champaign, IL, USA
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL, USA
- Institute for Genomic Biology, University of Illinois, Urbana-Champaign, IL, USA

10
Burke S, Elber R. Super folds, networks, and barriers. Proteins 2012; 80:463-70. [PMID: 22095563 PMCID: PMC3290721 DOI: 10.1002/prot.23212] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 05/10/2011] [Revised: 08/31/2011] [Accepted: 09/22/2011] [Indexed: 11/06/2022]
Abstract
Exhaustive enumeration of sequences and folds is conducted for a simple lattice model of conformations, sequences, and energies. Examination of all foldable sequences and their nearest connected neighbors (sequences that differ by no more than a point mutation) illustrates the following: (i) There exists an unusually large number of sequences that fold into a few structures (super-folds). The same observation was made experimentally and computationally using stochastic sampling and exhaustive enumeration of related models. (ii) There exist only a few large networks of connected sequences that are not restricted to one fold. These networks cover a significant fraction of fold spaces (super-networks). (iii) There exist barriers in sequence space that prevent foldable sequences of the same structure from "connecting" through a series of single point mutations (super-barrier), even in the presence of the sequence connection between folds. While there is ample experimental evidence for the existence of super-folds, evidence for a super-network is just starting to emerge. The prediction of a sequence barrier is an intriguing characteristic of sequence space, suggesting that the overall sequence space may be disconnected. The implications and limitations of these observations for evolution of protein structures are discussed.
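The kind of exhaustive enumeration described in this abstract can be sketched for a much smaller system. The example below is not the authors' model. It assumes a two-dimensional square-lattice HP chain (H-H contacts score -1), calls a sequence foldable when it has exactly one lowest-energy conformation, and links same-fold sequences that differ by one point mutation. All function names are illustrative.

```python
from collections import Counter
from itertools import product

MOVES = [(1, 0), (0, 1), (-1, 0), (0, -1)]


def conformations(n):
    """Self-avoiding walks of n residues (n >= 2), one per rotation/mirror class."""
    results = []

    def grow(path, turned):
        if len(path) == n:
            results.append(tuple(path))
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            if not turned and dy < 0:
                continue  # drop mirror images: the first off-axis step must go up
            nxt = (x + dx, y + dy)
            if nxt not in path:
                grow(path + [nxt], turned or dy != 0)

    grow([(0, 0), (1, 0)], False)  # first bond fixed along +x removes rotations
    return results


def contacts(conf):
    """Non-bonded residue pairs (i, j) that sit on adjacent lattice sites."""
    return frozenset(
        (i, j)
        for i in range(len(conf))
        for j in range(i + 3, len(conf))
        if abs(conf[i][0] - conf[j][0]) + abs(conf[i][1] - conf[j][1]) == 1
    )


def fold_map(n):
    """Map every HP sequence with a unique ground state to its conformation index."""
    contact_sets = [contacts(c) for c in conformations(n)]
    native = {}
    for seq in map("".join, product("HP", repeat=n)):
        energies = [-sum(seq[i] == seq[j] == "H" for i, j in cs)
                    for cs in contact_sets]
        best = min(energies)
        if energies.count(best) == 1:
            native[seq] = energies.index(best)
    return native


def neutral_networks(native):
    """Connected components of same-fold sequences linked by point mutations."""
    seen, networks = set(), []
    for start in native:
        if start in seen:
            continue
        seen.add(start)
        component, stack = [], [start]
        while stack:
            seq = stack.pop()
            component.append(seq)
            for k in range(len(seq)):
                mutant = seq[:k] + ("P" if seq[k] == "H" else "H") + seq[k + 1:]
                if mutant not in seen and native.get(mutant) == native[start]:
                    seen.add(mutant)
                    stack.append(mutant)
        networks.append(component)
    return networks


native = fold_map(8)                      # all 2**8 sequences of an 8-residue chain
designability = Counter(native.values())  # number of sequences per fold
networks = neutral_networks(native)       # same-fold mutational networks
```

In this toy setting, a fold with a high count in `designability` plays the role of a super-fold, and a fold whose sequences fall into more than one entry of `networks` illustrates a barrier between same-structure sequences.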
Affiliation(s)
- Sean Burke
- Institute for Computational Engineering and Sciences, University of Texas at Austin, Austin TX 78712
- Ron Elber
- Institute for Computational Engineering and Sciences, University of Texas at Austin, Austin TX 78712
- Department of Chemistry and Biochemistry, University of Texas at Austin, Austin TX 78712

11
Holzgräfe C, Irbäck A, Troein C. Mutation-induced fold switching among lattice proteins. J Chem Phys 2011; 135:195101. [DOI: 10.1063/1.3660691] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Indexed: 11/14/2022]

12
Fernández JD, Vico FJ. Automating the search of molecular motor templates by evolutionary methods. Biosystems 2011; 106:82-93. [PMID: 21784125 DOI: 10.1016/j.biosystems.2011.07.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/29/2011] [Revised: 06/30/2011] [Accepted: 07/06/2011] [Indexed: 01/10/2023]
Abstract
Biological molecular motors are nanoscale devices capable of transforming chemical energy into mechanical work, which are being researched in many scientific disciplines. From a computational point of view, the characteristics and dynamics of these motors are studied at multiple time scales, ranging from very detailed and complex molecular dynamics simulations spanning a few microseconds, to extremely simple and coarse-grained theoretical models of their working cycles. However, this research is performed only in the (relatively few) instances known from molecular biology. In this work, results from elastic network analysis and behaviour-finding methods are applied to explore a subset of the configuration space of template molecular structures that are able to transform chemical energy into directed movement, for a fixed instance of working cycle. While using methods based on elastic networks limits the scope of our results, it enables the implementation of computationally lightweight methods, in a way that evolutionary search techniques can be applied to discover novel molecular motor templates. The results show that molecular motion can be attained from a variety of structural configurations, when a functional working cycle is provided. Additionally, these methods enable a new computational way to test hypotheses about molecular motors.
Affiliation(s)
- Jose D Fernández
- Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga, Severo Ochoa 4, 29590 Málaga, Spain.

13
Saunders R, Mann M, Deane CM. Signatures of co-translational folding. Biotechnol J 2011; 6:742-51. [DOI: 10.1002/biot.201000330] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 12/29/2010] [Revised: 03/01/2011] [Accepted: 03/03/2011] [Indexed: 12/11/2022]

14
Caetano-Anollés G, Mittenthal J. Exploring the interplay of stability and function in protein evolution: new methods further elucidate why protein stability is necessarily so tenuous and stability-increasing mutations compromise biological function. Bioessays 2010; 32:655-8. [PMID: 20658703 DOI: 10.1002/bies.201000038] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Indexed: 11/10/2022]
Abstract
A new split beta-lactamase assay promises experimental testing of the interplay of protein stability and function. Proteins are sufficiently stable to act effectively within cells. However, mutations generally destabilize structure, with effects on free energy that are comparable to the free energy of folding. Assays of protein functionality and stability in vivo enable a quick study of factors that influence these properties in response to targeted mutations. These assays can help molecular engineering but can also be used to target important questions, including why most proteins are marginally stable, how mutations alter structural makeup, and how thermodynamics, function, and environment shape molecular change. Processes of self-organization and natural selection are determinants of stability and function. Non-equilibrium thermodynamics provides crucial concepts, e.g., cells as emergent energy-dissipating entities that do work and build their own parts, and a framework to study the sculpting role of evolution at different scales.

15
Wroe R, Chan HS, Bornberg-Bauer E. A structural model of latent evolutionary potentials underlying neutral networks in proteins. HFSP Journal 2010; 1:79-87. [DOI: 10.2976/1.2739116] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.4] [Received: 12/21/2006] [Accepted: 04/20/2007] [Indexed: 11/19/2022]
Affiliation(s)
- Richard Wroe
- Faculty of Life Sciences, University of Manchester, United Kingdom
- MRC Centre for Neurodegeneration Research, Kings College, London, United Kingdom
- Hue Sun Chan
- Department of Biochemistry, and Department of Medical Genetics & Microbiology, Faculty of Medicine, University of Toronto, Toronto, Canada
- Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, School of Biological Sciences, University of Münster, Huefferstrasse 1, Münster, D48 149, Germany

16

17
Chen T, Vernazobres D, Yomo T, Bornberg-Bauer E, Chan HS. Evolvability and single-genotype fluctuation in phenotypic properties: a simple heteropolymer model. Biophys J 2010; 98:2487-96. [PMID: 20513392 PMCID: PMC2877360 DOI: 10.1016/j.bpj.2010.02.046] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 12/09/2009] [Revised: 02/15/2010] [Accepted: 02/26/2010] [Indexed: 11/26/2022]
Abstract
Experiments showed that the response of a genotype to mutation, i.e., the magnitude of mutational change in a phenotypic property, can be correlated with the extent of phenotypic fluctuation among genetic clones. To address a possible statistical mechanical basis for such phenomena at the protein level, we consider a simple hydrophobic-polar lattice protein-chain model with an exhaustive mapping between sequence (genotype) and conformational (phenotype) spaces. Using squared end-to-end distance, $R_N^2$, as an example conformational property, we study how the thermal fluctuation of a sequence's $R_N^2$ may be predictive of the changes in the Boltzmann average $R_N^2$ caused by single-point mutations on that sequence. We found that sequences with the same ground-state $(R_N^2)_0$ exhibit a funnel-like organization under conditions favorable to chain collapse or folding: fluctuation (standard deviation $\sigma$) of $R_N^2$ tends to increase with mutational distance from a prototype sequence whose $R_N^2$ deviates little from its $(R_N^2)_0$. In general, large mutational decreases in $R_N^2$ or in $\sigma$ are only possible for some, though not all, sequences with large $\sigma$ values. This finding suggests that single-genotype phenotypic fluctuation is a necessary, though not sufficient, indicator of evolvability toward genotypes with less phenotypic fluctuations.
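The two quantities discussed in this abstract, the Boltzmann average of the squared end-to-end distance and its thermal standard deviation, can be computed exactly for a short chain. The sketch below is an illustration, not the authors' code. It assumes a two-dimensional square lattice, an HP energy of -1 per H-H contact, and temperature in units where the Boltzmann constant equals 1. The names `walks`, `hp_energy` and `end_to_end_stats` are hypothetical.

```python
import math

MOVES = [(1, 0), (0, 1), (-1, 0), (0, -1)]


def walks(n):
    """Yield every self-avoiding walk of n residues that starts at the origin."""
    def grow(path):
        if len(path) == n:
            yield path
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in path:
                yield from grow(path + [nxt])

    yield from grow([(0, 0)])


def hp_energy(seq, conf):
    """HP energy (assumed convention): -1 per non-bonded H-H pair on adjacent sites."""
    energy = 0
    for i in range(len(conf)):
        for j in range(i + 3, len(conf)):
            adjacent = (abs(conf[i][0] - conf[j][0])
                        + abs(conf[i][1] - conf[j][1])) == 1
            if adjacent and seq[i] == "H" and seq[j] == "H":
                energy -= 1
    return energy


def end_to_end_stats(seq, temperature):
    """Boltzmann mean and standard deviation of the squared end-to-end distance."""
    z = first_moment = second_moment = 0.0
    for conf in walks(len(seq)):
        weight = math.exp(-hp_energy(seq, conf) / temperature)
        r2 = (conf[-1][0] - conf[0][0]) ** 2 + (conf[-1][1] - conf[0][1]) ** 2
        z += weight
        first_moment += weight * r2
        second_moment += weight * r2 * r2
    mean = first_moment / z
    variance = max(second_moment / z - mean * mean, 0.0)
    return mean, math.sqrt(variance)


# Compare a sequence with one of its single-point mutants (hypothetical inputs).
mean_wild, sigma_wild = end_to_end_stats("HPPH", temperature=0.5)
mean_mutant, sigma_mutant = end_to_end_stats("PPPH", temperature=0.5)
```

Repeating the last two lines over all single-point mutants of a sequence gives the mutational change in the average alongside the thermal fluctuation of the parent sequence, which are the two quantities the abstract relates.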
Affiliation(s)
- Tao Chen
- Departments of Biochemistry and of Molecular Genetics, Faculty of Medicine, and Department of Physics, University of Toronto, Toronto, Ontario, Canada
- David Vernazobres
- Institute for Evolution and Biodiversity, School of Biological Sciences, University of Münster, Münster, Germany
- Tetsuya Yomo
- Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, and the Graduate School of Frontier Bioscience, Osaka University, Osaka, Japan
- Exploratory Research for Advanced Technology, Japan Science and Technology Agency, Osaka, Japan
- Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, School of Biological Sciences, University of Münster, Münster, Germany
- Hue Sun Chan
- Departments of Biochemistry and of Molecular Genetics, Faculty of Medicine, and Department of Physics, University of Toronto, Toronto, Ontario, Canada

18
Abstract
Contemporary protein architectures can be regarded as molecular fossils, historical imprints that mark important milestones in the history of life. Whereas sequences change at a considerable pace, higher-order structures are constrained by the energetic landscape of protein folding, the exploration of sequence and structure space, and complex interactions mediated by the proteostasis and proteolytic machineries of the cell. The survey of architectures in the living world that was fuelled by recent structural genomic initiatives has been summarized in protein classification schemes, and the overall structure of fold space explored with novel bioinformatic approaches. However, metrics of general structural comparison have not yet unified architectural complexity using the 'shared and derived' tenet of evolutionary analysis. In contrast, a shift of focus from molecules to proteomes and a census of protein structure in fully sequenced genomes were able to uncover global evolutionary patterns in the structure of proteins. Timelines of discovery of architectures and functions unfolded episodes of specialization, reductive evolutionary tendencies of architectural repertoires in proteomes and the rise of modularity in the protein world. They revealed a biologically complex ancestral proteome and the early origin of the archaeal lineage. Studies also identified an origin of the protein world in enzymes of nucleotide metabolism harbouring the P-loop-containing triphosphate hydrolase fold and the explosive discovery of metabolic functions that recapitulated well-defined prebiotic shells and involved the recruitment of structures and functions. These observations have important implications for origins of modern biochemistry and diversification of life.
|
19
|
Hoque T, Chetty M, Sattar A. Extended HP model for protein structure prediction. J Comput Biol 2009; 16:85-103. [PMID: 19119994 DOI: 10.1089/cmb.2008.0082] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Indexed: 11/13/2022]
Abstract
This paper describes a detailed investigation of a lattice-based HP (hydrophobic-hydrophilic) model for ab initio protein structure prediction (PSP). The simplified HP lattice model yields highly degenerate outcomes, which can mislead the prediction. The HPNX model was proposed to address the degeneracy problem and to avoid the conformational deformity associated with the hydrophilic (P) residues. We have shown experimentally that the existing HPNX model needs further improvement. We have also identified and corrected a critical error in another existing model, YhHX. By extracting the significant features of YhHX for the HPNX model, we have proposed a novel hHPNX model. A Hybrid Genetic Algorithm (HGA) was used to compare the predictability of these models, and hHPNX outperformed the other models. We preferred the 3D face-centered-cubic (FCC) lattice configuration because it most closely resembles the real folded 3D protein.
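As a rough illustration of the basic HP energy function that these models extend, the sketch below counts hydrophobic contacts for a chain placed on a lattice. It is not the authors' hHPNX model: the 2D square lattice (instead of FCC), the function name `hp_energy`, and the contact energy of -1 per H-H pair are assumptions chosen for brevity.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]


def hp_energy(sequence: str, coords: List[Coord]) -> int:
    """Energy of an HP chain on a 2D square lattice (illustrative assumption).

    Each pair of H residues that sit on adjacent lattice sites but are not
    consecutive along the chain contributes -1.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")

    index_at: Dict[Coord, int] = {c: i for i, c in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that every contact is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


# A four-residue chain folded into a square: the two terminal H residues touch.
print(hp_energy("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)]))  # -1
```

Many different conformations can share the same contact count under this two-letter scheme, which is the degeneracy problem that the extended alphabets described in the abstract aim to reduce.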
Affiliation(s)
- Tamjidul Hoque
- Institute for Integrated and Intelligent Systems (IIIS), Griffith University, Nathan, QLD, Australia
- Madhu Chetty
- Gippsland School of Information Technology (GSIT), Monash University, Churchill, VIC, Australia
- Abdul Sattar
- Institute for Integrated and Intelligent Systems (IIIS), Griffith University, Nathan, QLD, Australia
|
20
|
Noirel J, Simonson T. Neutral evolution of proteins: The superfunnel in sequence space and its relation to mutational robustness. J Chem Phys 2009; 129:185104. [PMID: 19045432 DOI: 10.1063/1.2992853] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/14/2022]
Abstract
Following Kimura's neutral theory of molecular evolution [M. Kimura, The Neutral Theory of Molecular Evolution (Cambridge University Press, Cambridge, 1983) (reprinted in 1986)], it has become common to assume that the vast majority of viable mutations of a gene confer little or no functional advantage. Yet, in silico models of protein evolution have shown that mutational robustness of sequences could be selected for, even in the context of neutral evolution. The evolution of a biological population can be seen as a diffusion on the network of viable sequences. This network is called a "neutral network." Depending on the mutation rate μ and the population size N, the biological population can evolve purely randomly (μN ≪ 1) or it can evolve in such a way as to select for sequences of higher mutational robustness (μN ≫ 1). The stringency of the selection depends not only on the product μN but also on the exact topology of the neutral network, the special arrangement of which was named "superfunnel." Even though the relation between mutation rate, population size, and selection was thoroughly investigated, a study of the salient topological features of the superfunnel that could affect the strength of the selection was wanting. This question is addressed in this study. We use two different models of proteins: on lattice and off lattice. We compare neutral networks computed using these models to random networks. From this, we identify two important factors of the topology that determine the stringency of the selection for mutationally robust sequences. First, the presence of highly connected nodes ("hubs") in the network increases the selection for mutationally robust sequences. Second, the stringency of the selection increases when the correlation between a sequence's mutational robustness and its neighbors' increases. The latter finding relates a global characteristic of the neutral network to a local one, which is attainable through experiments or molecular modeling.
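The local quantity discussed in this abstract, a sequence's mutational robustness, can be sketched as the fraction of its single-point mutants that remain viable. The code below is only an illustration under stated assumptions: the function name `mutational_robustness`, the two-letter alphabet, and the toy viable set are hypothetical and are not the lattice or off-lattice models used in the study.

```python
from typing import Dict, Set


def mutational_robustness(viable: Set[str], alphabet: str = "HP") -> Dict[str, float]:
    """Fraction of single-point mutants of each viable sequence that are viable.

    The count of viable mutants is also the degree of the sequence in the
    neutral network, so highly robust sequences are the highly connected nodes.
    """
    robustness: Dict[str, float] = {}
    for seq in viable:
        total = len(seq) * (len(alphabet) - 1)  # all possible point mutants
        neutral = 0
        for pos, current in enumerate(seq):
            for letter in alphabet:
                if letter == current:
                    continue
                mutant = seq[:pos] + letter + seq[pos + 1:]
                if mutant in viable:
                    neutral += 1
        robustness[seq] = neutral / total if total else 0.0
    return robustness


# Toy viable set (hypothetical): "HP" has two viable neighbors, the others one.
toy = {"HH", "HP", "PP"}
for seq, value in sorted(mutational_robustness(toy).items()):
    print(seq, value)
```

Comparing each sequence's value with the mean value of its viable neighbors would give a simple handle on the robustness correlation that the abstract identifies as the second topological factor.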
Affiliation(s)
- Josselin Noirel
- Laboratoire de Biochimie, Ecole Polytechnique, Route de Saclay, Palaiseau 91128 Cedex, France.
|
21
|
Patel BA, Debenedetti PG, Stillinger FH, Rossky PJ. The effect of sequence on the conformational stability of a model heteropolymer in explicit water. J Chem Phys 2008; 128:175102. [PMID: 18465941 DOI: 10.1063/1.2909974] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022]
Abstract
We investigate the properties of a two-dimensional lattice heteropolymer model for a protein in which water is explicitly represented. The model protein distinguishes between hydrophobic and polar monomers through the effect of the hydrophobic monomers on the entropy and enthalpy of the hydrogen bonding of solvation shell water molecules. As experimentally observed, model heteropolymer sequences fold into stable native states characterized by a hydrophobic core to avoid unfavorable interactions with the solvent. These native states undergo cold, pressure, and thermal denaturation into distinct configurations for each type of unfolding transition. However, the heteropolymer sequence is an important element, since not all sequences will fold into stable native states at positive pressures. Simulation of a large collection of sequences indicates that these fall into two general groups, those exhibiting highly stable native structures and those that do not. Statistical analysis of important patterns in sequences shows a strong tendency for observing long blocks of hydrophobic or polar monomers in the most stable sequences. Statistical analysis also shows that alternation of hydrophobic and polar monomers appears infrequently among the most stable sequences. These observations are not absolute design rules and, in practice, these are not sufficient to rationally design very stable heteropolymers. We also study the effect of mutations on improving the stability of the model proteins, and demonstrate that it is possible to obtain a very stable heteropolymer from directed evolution of an initially unstable heteropolymer.
Affiliation(s)
- Bryan A Patel
- Department of Chemical Engineering, Princeton University, Princeton, New Jersey 08544, USA
|
22
|
Noirel J, Simonson T. Neutral evolution of protein-protein interactions: a computational study using simple models. BMC Struct Biol 2007; 7:79. [PMID: 18021454 PMCID: PMC2248192 DOI: 10.1186/1472-6807-7-79] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 04/23/2007] [Accepted: 11/19/2007] [Indexed: 11/30/2022]
Abstract
Background: Protein-protein interactions are central to cellular organization, and must have appeared at an early stage of evolution. To better understand their role, we consider a simple model of protein evolution and determine the effect of an explicit selection for protein-protein interactions. Results: In the model, viable sequences all have the same fitness, following the neutral evolution theory. A very simple, two-dimensional lattice representation of the protein structures is used, and the model only considers two kinds of amino acids: hydrophobic and polar. With these approximations, exact calculations are performed. The results do not depend too strongly on these assumptions, since a model using a 3D, off-lattice representation of the proteins gives results in qualitative agreement with the 2D one. With both models, the evolutionary dynamics lead to a steady state population that is enriched in sequences that dimerize with a high affinity, well beyond the minimal level needed to survive. Correspondingly, sequences close to the viability threshold are less abundant in the steady state, being subject to a larger proportion of lethal mutations. The set of viable sequences has a "funnel" shape, consistent with earlier studies: sequences that are highly populated in the steady state are "close" to each other (with proximity being measured by the number of amino acids that differ). Conclusion: This bias in the steady state sequences should lead to an increased resistance of the population to environmental change and an increased ability to evolve.
Affiliation(s)
- Josselin Noirel
- Laboratoire de Biochimie, Ecole polytechnique, route de Saclay, 91128 Palaiseau Cedex, France.
|
23
|
Wroe R, Chan HS, Bornberg-Bauer E. A structural model of latent evolutionary potentials underlying neutral networks in proteins. HFSP J 2007. [PMID: 19404462 DOI: 10.2976/1.2739116] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 12/26/2022]
Abstract
A central question in molecular evolution concerns the nature of phenotypic transitions, in particular, if neutral mutations hamper or somehow facilitate adaptability of proteins to new requirements. Proteins have been found to fluctuate between different structures, with frequencies of structures being proportional to their stability. Therefore, functional promiscuity may correspond to different structures with energies close to the ground state which then represent multiple selectable traits. We here postulate that these near-ground-state structures facilitate smooth transitions between phenotypes. Using a biophysical heteropolymer model with exhaustive mappings of sequences onto structures, we demonstrate that this is indeed possible because of a smooth gradient of stability along which any structural phenotype can be optimized and also because of mutational proximity of similar phenotypes in genotype space. Our model provides a biophysical rationalization of the intriguing, and otherwise puzzling experimental observation that adaptation to new requirements, e.g., latent function of a promiscuous enzyme, can proceed while the "old," phenotypically dominant function is maintained along a series of seemingly neutral mutations (see accompanying article). Thus pleiotropy may facilitate adaptation of latent traits before gene duplications and increase the effective adaptability of proteins.
|
24
|
Roth C, Rastogi S, Arvestad L, Dittmar K, Light S, Ekman D, Liberles DA. Evolution after gene duplication: models, mechanisms, sequences, systems, and organisms. J Exp Zool B Mol Dev Evol 2007; 308:58-73. [PMID: 16838295 DOI: 10.1002/jez.b.21124] [Citation(s) in RCA: 120] [Impact Index Per Article: 7.1] [Indexed: 11/05/2022]
Abstract
Gene duplication is postulated to have played a major role in the evolution of biological novelty. Here, gene duplication is examined across levels of biological organization in an attempt to create a unified picture of the mechanistic process by which gene duplication may have played a role in generating biodiversity. Neofunctionalization and subfunctionalization have been proposed as important processes driving the retention of duplicate genes. These models have foundations in population genetic theory, which is now being refined by explicit consideration of the structural constraints placed upon genes encoding proteins through physical chemistry. Further, such models can be examined in the context of comparative genomics, where an integration of gene-level evolution and species-level evolution allows an assessment of the frequency of duplication and the fate of duplicate genes. This process, of course, is dependent upon the biochemical role that duplicated genes play in biological systems, which is in turn dependent upon the mechanism of duplication: whole genome duplication involving a co-duplication of interacting partners vs. single gene duplication. Lastly, the role that these processes may have played in driving speciation is examined.
Affiliation(s)
- Christian Roth
- Department of Molecular Biology, University of Wyoming, Laramie, Wyoming 82071, USA
|
25
|
Rashin AA, Rashin AHL. Surface hydrophobic groups, stability, and flip-flopping in lattice proteins. Proteins 2007; 66:321-41. [PMID: 17096417 DOI: 10.1002/prot.21169] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/11/2022]
Abstract
Two-dimensional lattice protein models were studied in two approximations of the conformational equilibrium to elucidate the role of surface hydrophobic groups in their stabilities. We demonstrate that stability of any compactly folded sequence is determined by its ability to "flip-flop" (refold) into alternative compact structures. The degree of stability required for folded sequences determines the average numbers of surface hydrophobic groups in stable lattice structures which are in good agreement with ratios of core to surface hydrophobic groups in real proteins. However, the average destabilization of the native structure per surface hydrophobic group is small (0-0.25 kcal/mol), often disagrees with the free energies derived from the ratios of core to surface hydrophobic groups in the same structures, and has a combinatorial entropic nature independent of the strength of structure stabilizing interactions. This suggests that the free energies derived from the core to surface ratios of hydrophobic groups in real proteins have little to do with folding thermodynamics. On average, sequences with highly stable native structures are the least hydrophobic. The results suggest that in designing novel stable proteins hydrophobic groups on the surface should be avoided to reduce the possibility of flip-flopping. The average stability of highly designable structures is never higher than that of some low designability structures, contrary to the accepted view. In the equilibrium approximation with alternative compact and partially unfolded structures, the requirement of high stability selects a unique 5 x 5 structure formed by only a few sequences, suggesting much stronger sequence selectivity than commonly thought.
|
26
|
Moghaddam MS, Chan HS. Selective adsorption of block copolymers on patterned surfaces. J Chem Phys 2006; 125:164909. [PMID: 17092141 DOI: 10.1063/1.2359437] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/15/2022]
Abstract
Adsorption of copolymers on patterned surfaces is studied using lattice modeling and multiple Markov chain Monte Carlo methods. The copolymer is composed of alternating blocks of A and B monomers, and the adsorbing surface is composed of alternating square blocks containing C and D sites. Effects of interaction specificity on the adsorbed pattern of the copolymer and the sharpness of the adsorption transition are investigated by comparing three different models of copolymer-surface interactions. Analyses of the underlying energy distribution indicate that adsorption transitions in our models are not two-state-like. We show how the corresponding experimental question may be addressed by calorimetric measurements as have been applied to protein folding. Although the adsorption transitions are not "first order" or two-state-like, the sharpness of the transition increases when interaction specificity is enhanced by either including more attractive interaction types or by introducing repulsive interactions. Uniformity of the pattern of the adsorbed copolymer is also sensitive to the interaction scheme. Ramifications of the results from the present minimalist models of pattern recognition on the energetic and statistical mechanical origins of undesirable nonspecific adsorption of synthetic biopolymers in cellular environments are discussed.
Affiliation(s)
- Maria Sabaye Moghaddam
- Department of Biochemistry, Faculty of Medicine, University of Toronto, Toronto, Ontario M5S 1A8, Canada.
|
27
|
Bloom JD, Drummond DA, Arnold FH, Wilke CO. Structural determinants of the rate of protein evolution in yeast. Mol Biol Evol 2006; 23:1751-61. [PMID: 16782762 DOI: 10.1093/molbev/msl040] [Citation(s) in RCA: 148] [Impact Index Per Article: 8.2] [Indexed: 11/14/2022]
Abstract
We investigate how a protein's structure influences the rate at which its sequence evolves. Our basic hypothesis is that proteins with highly designable structures (structures that are encoded by many sequences) will evolve more rapidly. Recent theoretical advances argue that structures with a higher density of interresidue contacts are more designable, and we show that high contact density is correlated with an increased rate of sequence evolution in yeast. In addition, we investigate the correlations between the rate of sequence evolution and several other structural descriptors, carefully controlling for the strong effect of expression level on evolutionary rate. Overall, we find that the structural descriptors that we consider appear to explain roughly 10% of the variation in rates of protein evolution in yeast. We also show that despite the well-known trend for buried residues to be more conserved, proteins with a higher fraction of buried residues, nonetheless, tend to evolve their sequences more rapidly. We suggest that this effect is due to the increased designability of structures with more buried residues. Our results provide evidence that protein structure plays an important role in shaping the rate of sequence evolution and provide evidence to support recent theoretical advances linking structural designability to contact density.
Affiliation(s)
- Jesse D Bloom
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California, USA
|
28
|
Aynechi T, Kuntz ID. An information theoretic approach to macromolecular modeling: I. Sequence alignments. Biophys J 2005; 89:2998-3007. [PMID: 16254389 PMCID: PMC1366797 DOI: 10.1529/biophysj.104.054072] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 10/12/2004] [Accepted: 08/15/2005] [Indexed: 11/18/2022]
Abstract
We are interested in applying the principles of information theory to structural biology calculations. In this article, we explore the information content of an important computational procedure: sequence alignment. Using a reference state developed from exhaustive sequences, we measure alignment statistics and evaluate gap penalties based on first-principle considerations and gap distributions. We show that there are different gap penalties for different alphabet sizes and that the gap penalties can depend on the length of the sequences being aligned. In a companion article, we examine the information content of molecular force fields.
Affiliation(s)
- Tiba Aynechi
- Graduate Group in Biophysics, and Department of Pharmaceutical Chemistry, University of California-San Francisco, San Francisco, CA 94143, USA
|