1
Sood A, Zhang B. Preserving condensate structure and composition by lowering sequence complexity. Biophys J 2024:S0006-3495(24)00373-4. [PMID: 38824391 DOI: 10.1016/j.bpj.2024.05.026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 04/25/2024] [Accepted: 05/28/2024] [Indexed: 06/03/2024]
Abstract
Biomolecular condensates play a vital role in organizing cellular chemistry. They selectively partition biomolecules, preventing unwanted cross talk and buffering against chemical noise. Intrinsically disordered proteins (IDPs) serve as primary components of these condensates due to their flexibility and ability to engage in multivalent interactions, leading to spontaneous aggregation. Theoretical advancements are critical for connecting IDP sequences with condensate emergent properties to establish the so-called molecular grammar. We proposed an extension to the stickers and spacers model, incorporating heterogeneous, nonspecific pairwise interactions between spacers alongside specific interactions among stickers. Our investigation revealed that although spacer interactions contribute to phase separation and co-condensation, their nonspecific nature leads to disorganized condensates. Specific sticker-sticker interactions drive the formation of condensates with well-defined networked structures and molecular composition. We discussed how evolutionary pressures might emerge to affect these interactions, leading to the prevalence of low-complexity domains in IDP sequences. These domains suppress spurious interactions and facilitate the formation of biologically meaningful condensates.
Affiliation(s)
- Amogh Sood
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts
- Bin Zhang
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts
2
Sánchez IE, Galpern EA, Ferreiro DU. Solvent constraints for biopolymer folding and evolution in extraterrestrial environments. Proc Natl Acad Sci U S A 2024; 121:e2318905121. [PMID: 38739787 PMCID: PMC11127021 DOI: 10.1073/pnas.2318905121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2023] [Accepted: 04/16/2024] [Indexed: 05/16/2024]
Abstract
We propose that spontaneous folding and molecular evolution of biopolymers are two universal aspects that must concur for life to happen. These aspects are fundamentally related to the chemical composition of biopolymers and crucially depend on the solvent in which they are embedded. We show that molecular information theory and energy landscape theory allow us to explore the limits that solvents impose on biopolymer existence. We consider 54 solvents, including water, alcohols, hydrocarbons, halogenated solvents, aromatic solvents, and low molecular weight substances made up of elements abundant in the universe, which may potentially take part in alternative biochemistries. We find that along with water, there are many solvents for which the liquid regime is compatible with biopolymer folding and evolution. We present a ranking of the solvents in terms of biopolymer compatibility. Many of these solvents have been found in molecular clouds or may be expected to occur in extrasolar planets.
Affiliation(s)
- Ignacio E. Sánchez
  - Laboratorio de Fisiología de Proteínas, Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires CP1428, Argentina
  - Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales, Buenos Aires CP1428, Argentina
- Ezequiel A. Galpern
  - Laboratorio de Fisiología de Proteínas, Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires CP1428, Argentina
  - Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales, Buenos Aires CP1428, Argentina
- Diego U. Ferreiro
  - Laboratorio de Fisiología de Proteínas, Departamento de Química Biológica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires CP1428, Argentina
  - Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales, Buenos Aires CP1428, Argentina
3
Jaafari H, Bueno C, Schafer NP, Martin J, Morcos F, Wolynes PG. The physical and evolutionary energy landscapes of devolved protein sequences corresponding to pseudogenes. Proc Natl Acad Sci U S A 2024; 121:e2322428121. [PMID: 38739795 PMCID: PMC11127006 DOI: 10.1073/pnas.2322428121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Accepted: 03/26/2024] [Indexed: 05/16/2024]
Abstract
Protein evolution is guided by structural, functional, and dynamical constraints ensuring organismal viability. Pseudogenes are genomic sequences identified in many eukaryotes that lack translational activity due to sequence degradation and thus over time have undergone "devolution." Previously pseudogenized genes sometimes regain their protein-coding function, suggesting they may still encode robust folding energy landscapes despite multiple mutations. We study both the physical folding landscapes of protein sequences corresponding to human pseudogenes using the Associative Memory, Water Mediated, Structure and Energy Model, and the evolutionary energy landscapes obtained using direct coupling analysis (DCA) on their parent protein families. We found that generally mutations that have occurred in pseudogene sequences have disrupted their native global network of stabilizing residue interactions, making it harder for them to fold if they were translated. In some cases, however, energetic frustration has apparently decreased when the functional constraints were removed. We analyzed this unexpected situation for Cyclophilin A, Profilin-1, and Small Ubiquitin-like Modifier 2 Protein. Our analysis reveals that when such mutations in the pseudogene ultimately stabilize folding, at the same time, they likely alter the pseudogenes' former biological activity, as estimated by DCA. We localize most of these stabilizing mutations generally to normally frustrated regions required for binding to other partners.
Affiliation(s)
- Hana Jaafari
  - Center for Theoretical Biophysics, Rice University, Houston, TX 77005
  - Applied Physics Graduate Program, Smalley-Curl Institute, Rice University, Houston, TX 77005
  - Department of Chemistry, Rice University, Houston, TX 77005
- Carlos Bueno
  - Center for Theoretical Biophysics, Rice University, Houston, TX 77005
- Jonathan Martin
  - Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080
- Faruck Morcos
  - Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080
  - Department of Bioengineering, University of Texas at Dallas, Richardson, TX 75080
  - Center for Systems Biology, University of Texas at Dallas, Richardson, TX 75080
- Peter G. Wolynes
  - Center for Theoretical Biophysics, Rice University, Houston, TX 77005
  - Department of Chemistry, Rice University, Houston, TX 77005
  - Department of Physics and Astronomy, Rice University, Houston, TX 77005
  - Department of Biochemistry and Cell Biology, Rice University, Houston, TX 77005
4
Pereira de Araújo AF. Sequence-dependent and -independent information in a combined random energy model for protein folding and coding. Proteins 2024; 92:679-687. [PMID: 38158239 DOI: 10.1002/prot.26658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 12/11/2023] [Accepted: 12/15/2023] [Indexed: 01/03/2024]
Abstract
Random energy models (REMs) provide a simple description of the energy landscapes that guide protein folding and evolution. The requirement of a large energy gap between the native structure and unfolded conformations, considered necessary for cooperative, protein-like, folding behavior, indicates that proteins differ markedly from random heteropolymers. It has been suggested, therefore, that natural selection might have acted to choose nonrandom amino acid sequences satisfying this particular condition, implying that a large fraction of possible, unselected random sequences, would not fold to any structure. From an informational perspective, however, this scenario could indicate that protein structures, regarded as messages to be transmitted through a communication channel, would not be efficiently encoded in amino acid sequences, regarded as the communication channel for this transmission, since a large fraction of possible channel states would not be used. Here, we use a combined REM for conformations and sequences, with previously estimated parameters for natural proteins, to explore an alternative possibility in which the appropriate shape of the landscape results mainly from the deviation from randomness of possible native structures instead of sequences. We observe that this situation emerges naturally if the distribution of conformational energies happens to arise from two independent contributions corresponding to sequence-dependent and -independent terms. This construction is consistent with the hypothesis of a protein burial folding code, with native structures being determined by a modest amount of sequence-dependent atomic burial information with sequence-independent constraints imposed by unspecific hydrogen bond formation. 
More generally, an appropriate combination of sequence-dependent and -independent information accommodates the possibility of an efficient structural encoding with the main physical requirement for folding, providing possible insight not only on the folding process but also on several aspects of sequence evolution such as neutral networks, conformational coverage, and de novo gene emergence.
Affiliation(s)
- Antônio F Pereira de Araújo
  - Laboratório de Biofísica Teórica, Departamento de Biologia Celular, Universidade de Brasília, Brasília, Brazil
5
Hayes RL, Nixon CF, Marqusee S, Brooks CL. Selection pressures on evolution of ribonuclease H explored with rigorous free-energy-based design. Proc Natl Acad Sci U S A 2024; 121:e2312029121. [PMID: 38194446 PMCID: PMC10801872 DOI: 10.1073/pnas.2312029121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Accepted: 11/22/2023] [Indexed: 01/11/2024]
Abstract
Understanding natural protein evolution and designing novel proteins are motivating interest in development of high-throughput methods to explore large sequence spaces. In this work, we demonstrate the application of multisite λ dynamics (MSλD), a rigorous free energy simulation method, and chemical denaturation experiments to quantify evolutionary selection pressure from sequence-stability relationships and to address questions of design. This study examines a mesophilic phylogenetic clade of ribonuclease H (RNase H), furthering its extensive characterization in earlier studies, focusing on E. coli RNase H (ecRNH) and a more stable consensus sequence (AncCcons) differing at 15 positions. The stabilities of 32,768 chimeras between these two sequences were computed using the MSλD framework. The most stable and least stable chimeras were predicted and tested along with several other sequences, revealing a designed chimera with approximately the same stability increase as AncCcons, but requiring only half the mutations. Comparing the computed stabilities with experiment for 12 sequences reveals a Pearson correlation of 0.86 and root mean squared error of 1.18 kcal/mol, an unprecedented level of accuracy well beyond less rigorous computational design methods. We then quantified selection pressure using a simple evolutionary model in which sequences are selected according to the Boltzmann factor of their stability. Selection temperatures from 110 to 168 K are estimated in three ways by comparing experimental and computational results to evolutionary models. These estimates indicate selection pressure is high, which has implications for evolutionary dynamics and for the accuracy required for design, and suggests accurate high-throughput computational methods like MSλD may enable more effective protein design.
Affiliation(s)
- Ryan L. Hayes
  - Department of Chemical and Biomolecular Engineering, University of California, Irvine, CA 92697
  - Department of Chemistry, University of Michigan, Ann Arbor, MI 48109
- Charlotte F. Nixon
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720
- Susan Marqusee
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720
  - California Institute for Quantitative Biosciences, University of California, Berkeley, CA 94720
  - Department of Chemistry, University of California, Berkeley, CA 94720
- Charles L. Brooks
  - Department of Chemistry, University of Michigan, Ann Arbor, MI 48109
  - Biophysics Program, University of Michigan, Ann Arbor, MI 48109
6
Sood A, Zhang B. Preserving condensate structure and composition by lowering sequence complexity. bioRxiv 2023:2023.11.29.569249. [PMID: 38076908 PMCID: PMC10705451 DOI: 10.1101/2023.11.29.569249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/22/2023]
Abstract
Biological condensates play a vital role in organizing cellular chemistry. They selectively partition biomolecules, preventing unwanted cross-talk and buffering against chemical noise. Intrinsically disordered proteins (IDPs) serve as primary components of these condensates due to their flexibility and ability to engage in multivalent, non-specific interactions, leading to spontaneous aggregation. Theoretical advancements are critical for connecting IDP sequences with condensate emergent properties to establish the so-called molecular grammar. We proposed an extension to the stickers and spacers model, incorporating non-specific pairwise interactions between spacers alongside specific interactions among stickers. Our investigation revealed that while spacer interactions contribute to phase separation and co-condensation, their non-specific nature leads to disorganized condensates. Specific sticker-sticker interactions drive the formation of condensates with well-defined structures and molecular composition. We discussed how evolutionary pressures might emerge to affect these interactions, leading to the prevalence of low complexity domains in IDP sequences. These domains suppress spurious interactions and facilitate the formation of biologically meaningful condensates.
Affiliation(s)
- Amogh Sood
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, USA
- Bin Zhang
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, USA
7
Rivoire O. How Flexibility Can Enhance Catalysis. Phys Rev Lett 2023; 131:088401. [PMID: 37683166 DOI: 10.1103/physrevlett.131.088401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Accepted: 07/28/2023] [Indexed: 09/10/2023]
Abstract
Conformational changes are observed in many enzymes, but their role in catalysis is highly controversial. Here we present a theoretical model that illustrates how rigid catalysts can be fundamentally limited and how a conformational change induced by substrate binding can overcome this limitation, ultimately enabling barrier-free catalysis. The model is deliberately minimal, but the principle it illustrates is general and consistent with unique features of proteins as well as with previous informal proposals to explain the superiority of enzymes over other classes of catalysts. Implementing the discriminative switch suggested by the model could help overcome limitations currently encountered in the design of artificial catalysts.
Affiliation(s)
- Olivier Rivoire
  - Center for Interdisciplinary Research in Biology (CIRB), Collège de France, CNRS, INSERM, and Gulliver, CNRS, ESPCI, Université Paris Sciences et Lettres, 75005 Paris, France
8
Dietschreit JCB, Diestler DJ, Gómez-Bombarelli R. Entropy and Energy Profiles of Chemical Reactions. J Chem Theory Comput 2023; 19:5369-5379. [PMID: 37535443 DOI: 10.1021/acs.jctc.3c00448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/05/2023]
Abstract
The description of chemical processes at the molecular level is often facilitated by the use of reaction coordinates or collective variables (CVs). The CV measures the progress of the reaction and allows the construction of profiles that track how specific properties evolve as the reaction progresses. Whereas CVs are routinely used, especially alongside enhanced sampling techniques, the links among reaction profiles, thermodynamic state functions, and reaction rate constants are not rigorously exploited. Here, we report a unified treatment of such reaction profiles. Tractable expressions are derived for the free-energy, internal-energy, and entropy profiles as functions of only the CV. We demonstrate the ability of this treatment to extract quantitative insight from the entropy and internal-energy profiles of various real-world physicochemical processes, including intramolecular organic reactions, ionic transport in superionic electrolytes, and molecular transport in nanoporous materials.
Affiliation(s)
- Johannes C B Dietschreit
  - Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Dennis J Diestler
  - University of Nebraska-Lincoln, Lincoln, Nebraska 68583, United States
- Rafael Gómez-Bombarelli
  - Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
9
Singh TV, Shagolsem LS. Universality and Identity Ordering in Heteropolymer Coil–Globule Transition. Macromolecules 2022. [DOI: 10.1021/acs.macromol.2c01559] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Thoudam Vilip Singh
  - Department of Physics, National Institute of Technology Manipur, Imphal 795004, India
- Lenin S. Shagolsem
  - Department of Physics, National Institute of Technology Manipur, Imphal 795004, India
10
Sánchez IE, Galpern EA, Garibaldi MM, Ferreiro DU. Molecular Information Theory Meets Protein Folding. J Phys Chem B 2022; 126:8655-8668. [PMID: 36282961 DOI: 10.1021/acs.jpcb.2c04532] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/11/2023]
Abstract
We propose an application of molecular information theory to analyze the folding of single domain proteins. We analyze results from various areas of protein science, such as sequence-based potentials, reduced amino acid alphabets, backbone configurational entropy, secondary structure content, residue burial layers, and mutational studies of protein stability changes. We found that the average information contained in the sequences of evolved proteins is very close to the average information needed to specify a fold ∼2.2 ± 0.3 bits/(site·operation). The effective alphabet size in evolved proteins equals the effective number of conformations of a residue in the compact unfolded state at around 5. We calculated an energy-to-information conversion efficiency upon folding of around 50%, lower than the theoretical limit of 70%, but much higher than human-built macroscopic machines. We propose a simple mapping between molecular information theory and energy landscape theory and explore the connections between sequence evolution, configurational entropy, and the energetics of protein folding.
Affiliation(s)
- Ignacio E Sánchez
  - Facultad de Ciencias Exactas y Naturales, Laboratorio de Fisiología de Proteínas, Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales (IQUIBICEN), Universidad de Buenos Aires, Buenos Aires CP1428, Argentina
- Ezequiel A Galpern
  - Facultad de Ciencias Exactas y Naturales, Laboratorio de Fisiología de Proteínas, Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales (IQUIBICEN), Universidad de Buenos Aires, Buenos Aires CP1428, Argentina
- Martín M Garibaldi
  - Facultad de Ciencias Exactas y Naturales, Laboratorio de Fisiología de Proteínas, Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales (IQUIBICEN), Universidad de Buenos Aires, Buenos Aires CP1428, Argentina
- Diego U Ferreiro
  - Facultad de Ciencias Exactas y Naturales, Laboratorio de Fisiología de Proteínas, Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales (IQUIBICEN), Universidad de Buenos Aires, Buenos Aires CP1428, Argentina
11
Phase Transition of Gels—A Review of Toyoichi Tanaka’s Research. Gels 2022; 8:gels8090550. [PMID: 36135263 PMCID: PMC9498857 DOI: 10.3390/gels8090550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Revised: 08/18/2022] [Accepted: 08/18/2022] [Indexed: 11/28/2022]
Abstract
In the 1970s, extensive study of gel science began with the discovery of the volume phase transition of gels at the Physics Department of the Massachusetts Institute of Technology. After this discovery, the phenomenon was extensively studied and advanced by its discoverer, the late Professor Toyoichi Tanaka, who died on 20 May 2000, midway through his research. In this paper, we review his research to clarify his deep insight into the science of gels.
12
Magi Meconi G, Sasselli IR, Bianco V, Onuchic JN, Coluzza I. Key aspects of the past 30 years of protein design. Rep Prog Phys 2022; 85:086601. [PMID: 35704983 DOI: 10.1088/1361-6633/ac78ef] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/03/2021] [Accepted: 06/15/2022] [Indexed: 06/15/2023]
Abstract
Proteins are the workhorse of life. They are the building infrastructure of living systems; they are the most efficient molecular machines known, and their enzymatic activity is still unmatched in versatility by any artificial system. Perhaps proteins' most remarkable feature is their modularity. The large amount of information required to specify each protein's function is analogically encoded with an alphabet of just ∼20 letters. The protein folding problem is how to encode all such information in a sequence of 20 letters. In this review, we go through the last 30 years of research to summarize the state of the art and highlight some applications related to fundamental problems of protein evolution.
Affiliation(s)
- Giulia Magi Meconi
  - Computational Biophysics Lab, Center for Cooperative Research in Biomaterials (CIC biomaGUNE), Basque Research and Technology Alliance (BRTA), Paseo de Miramon 182, 20014, Donostia-San Sebastián, Spain
- Ivan R Sasselli
  - Computational Biophysics Lab, Center for Cooperative Research in Biomaterials (CIC biomaGUNE), Basque Research and Technology Alliance (BRTA), Paseo de Miramon 182, 20014, Donostia-San Sebastián, Spain
- Jose N Onuchic
  - Center for Theoretical Biological Physics, Department of Physics & Astronomy, Department of Chemistry, Department of Biosciences, Rice University, Houston, TX 77251, United States of America
- Ivan Coluzza
  - BCMaterials, Basque Center for Materials, Applications and Nanostructures, Bld. Martina Casiano, UPV/EHU Science Park, Barrio Sarriena s/n, 48940 Leioa, Spain
  - Basque Foundation for Science, Ikerbasque, 48009, Bilbao, Spain
13
Hayes RL, Vilseck JZ, Brooks CL. Addressing Intersite Coupling Unlocks Large Combinatorial Chemical Spaces for Alchemical Free Energy Methods. J Chem Theory Comput 2022; 18:2114-2123. [PMID: 35255214 PMCID: PMC9700482 DOI: 10.1021/acs.jctc.1c00948] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/30/2022]
Abstract
Alchemical free energy methods are playing a growing role in molecular design, both for computer-aided drug design of small molecules and for computational protein design. Multisite λ dynamics (MSλD) is a uniquely scalable alchemical free energy method that enables more efficient exploration of combinatorial alchemical spaces encountered in molecular design, but simulations have typically been limited to a few hundred ligands or sequences. Here, we focus on coupling between sites to enable scaling to larger alchemical spaces. We first discuss updates to the biasing potentials that facilitate MSλD sampling to include coupling terms and show that this can provide more thorough sampling of alchemical states. We then harness coupling between sites by developing a new free energy estimator based on the Potts models underlying direct coupling analysis, a method for predicting contacts from sequence coevolution, and find it yields more accurate free energies than previous estimators. The sampling requirements of the Potts model estimator scale with the square of the number of sites, a substantial improvement over the exponential scaling of the standard estimator. This opens up exploration of much larger alchemical spaces with MSλD for molecular design.
Affiliation(s)
- Ryan L Hayes
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
  - Biophysics Program, University of Michigan, Ann Arbor, Michigan 48109, United States
- Jonah Z Vilseck
  - Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, Indiana 46202, United States
  - Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, Indiana 46202, United States
- Charles L Brooks
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
  - Biophysics Program, University of Michigan, Ann Arbor, Michigan 48109, United States
14
Bheemireddy S, Srinivasan N. Computational Study on the Dynamics of Mycobacterium Tuberculosis RNA Polymerase Assembly. Methods Mol Biol 2022; 2516:61-79. [PMID: 35922622 DOI: 10.1007/978-1-0716-2413-5_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Gene regulation is an intricate phenomenon involving the precise function of many macromolecular complexes. The molecular basis of this phenomenon is highly complex and cannot be fully understood using a single technique. Computational approaches can play a crucial role in the overall understanding of the functional and mechanistic features of a protein or an assembly. Large amounts of structural data pertaining to these complexes are publicly available. In this project, we took advantage of the availability of this structural information to unravel the functional intricacies of Mycobacterium tuberculosis RNA polymerase upon interaction with RbpA. In this article, we discuss how knowledge of protein structure and dynamics can be exploited to study function using various computational tools and resources. Overall, this article provides an overview of various computational methods that can be used efficiently to understand the role of any protein. We hope that nonexperts in the field, in particular, will benefit from our article.
Affiliation(s)
- Sneha Bheemireddy
  - Molecular Biophysics Unit, Indian Institute of Science, Bengaluru, Karnataka, India
15
Miyazawa S. Boltzmann Machine Learning and Regularization Methods for Inferring Evolutionary Fields and Couplings From a Multiple Sequence Alignment. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:328-342. [PMID: 32396099 DOI: 10.1109/tcbb.2020.2993232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
The inverse Potts problem of inferring a Boltzmann distribution for homologous protein sequences from their single-site and pairwise amino acid frequencies has recently attracted a great deal of attention in studies of protein structure and evolution. We study regularization and learning methods and how to tune regularization parameters to correctly infer interactions in Boltzmann machine learning. Using L2 regularization for fields, group L1 for couplings is shown to be very effective for sparse couplings in comparison with L2 and L1. Two regularization parameters are tuned to yield equal values for both the sample and ensemble averages of evolutionary energy. Both averages smoothly change and converge, but their learning profiles are very different between learning methods. The Adam method is modified to make stepsize proportional to the gradient for sparse couplings and to use a soft-thresholding function for group L1. It is shown by first inferring interactions from protein sequences and then from Monte Carlo samples that the fields and couplings can be well recovered, but that recovering the pairwise correlations in the resolution of a total energy is harder for the natural proteins than for the protein-like sequences. Selective temperature for folding/structural constraints in protein evolution is also estimated.
16
Pretti E, Shell MS. A microcanonical approach to temperature-transferable coarse-grained models using the relative entropy. J Chem Phys 2021; 155:094102. [PMID: 34496595 DOI: 10.1063/5.0057104] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 11/14/2022]
Abstract
Bottom-up coarse-graining methods provide systematic tools for creating simplified models of molecular systems. However, coarse-grained (CG) models produced with such methods frequently fail to accurately reproduce all thermodynamic properties of the reference atomistic systems they seek to model and, moreover, can fail in even more significant ways when used at thermodynamic state points different from the reference conditions. These related problems of representability and transferability limit the usefulness of CG models, especially those of strongly state-dependent systems. In this work, we present a new strategy for creating temperature-transferable CG models using a single reference system and temperature. The approach is based on two complementary concepts. First, we switch to a microcanonical basis for formulating CG models, focusing on effective entropy functions rather than energy functions. This allows CG models to naturally represent information about underlying atomistic energy fluctuations, which would otherwise be lost. Such information not only reproduces energy distributions of the reference model but also successfully predicts the correct temperature dependence of the CG interactions, enabling temperature transferability. Second, we show that relative entropy minimization provides a direct and systematic approach to parameterize such classes of temperature-transferable CG models. We calibrate the approach initially using idealized model systems and then demonstrate its ability to create temperature-transferable CG models for several complex molecular liquids.
Affiliation(s)
- Evan Pretti
- Department of Chemical Engineering, Engineering II Building, University of California, Santa Barbara, Santa Barbara, California 93106-5080, USA
| | - M Scott Shell
- Department of Chemical Engineering, Engineering II Building, University of California, Santa Barbara, Santa Barbara, California 93106-5080, USA
| |
|
17
|
Tang QY, Kaneko K. Dynamics-Evolution Correspondence in Protein Structures. Physical Review Letters 2021; 127:098103. [PMID: 34506164 DOI: 10.1103/physrevlett.127.098103] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/30/2021] [Accepted: 07/28/2021] [Indexed: 06/13/2023]
Abstract
The genotype-phenotype mapping of proteins is a fundamental question in structural biology. In this Letter, with the analysis of a large dataset of proteins from hundreds of protein families, we quantitatively demonstrate the correlations between the noise-induced protein dynamics and mutation-induced variations of native structures, indicating the dynamics-evolution correspondence of proteins. Based on the investigations of the linear responses of native proteins, the origin of such a correspondence is elucidated. It is essential that the noise- and mutation-induced deformations of the proteins are restricted on a common low-dimensional subspace, as confirmed from the data. These results suggest an evolutionary mechanism of the proteins gaining both dynamical flexibility and evolutionary structural variability.
Affiliation(s)
- Qian-Yuan Tang
- Center for Complex Systems Biology, Universal Biology Institute, University of Tokyo, Komaba 3-8-1, Meguro-ku, Tokyo 153-8902, Japan
- Lab for Neural Computation and Adaptation, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
| | - Kunihiko Kaneko
- Center for Complex Systems Biology, Universal Biology Institute, University of Tokyo, Komaba 3-8-1, Meguro-ku, Tokyo 153-8902, Japan
| |
|
18
|
Nerattini F, Tubiana L, Cardelli C, Bianco V, Dellago C, Coluzza I. Protein design under competing conditions for the availability of amino acids. Sci Rep 2020; 10:2684. [PMID: 32060385 PMCID: PMC7021711 DOI: 10.1038/s41598-020-59401-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/27/2019] [Accepted: 12/08/2019] [Indexed: 11/09/2022] Open
Abstract
Isolating the properties of proteins that allow them to convert sequence into the structure is a long-lasting biophysical problem. In particular, studies focused extensively on the effect of a reduced alphabet size on the folding properties. However, the natural alphabet is a compromise between versatility and optimisation of the available resources. Here, for the first time, we include the impact of the relative availability of the amino acids to extract from the 20 letters the core necessary for protein stability. We present a computational protein design scheme that involves the competition for resources between a protein and a potential interaction partner that, additionally, gives us the chance to investigate the effect of the reduced alphabet on protein-protein interactions. We devise a scheme that automatically identifies the optimal reduced set of letters for the design of the protein, and we observe that even alphabets reduced down to 4 letters allow for single protein folding. However, it is only with 6 letters that we achieve optimal folding, thus recovering experimental observations. Additionally, we notice that the binding between the protein and a potential interaction partner could not be avoided with the investigated reduced alphabets. Therefore, we suggest that aggregation could have been a driving force in the evolution of the large protein alphabet.
Affiliation(s)
- Francesca Nerattini
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090, Vienna, Austria
| | - Luca Tubiana
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090, Vienna, Austria
| | - Chiara Cardelli
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090, Vienna, Austria
| | - Valentino Bianco
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090, Vienna, Austria
| | - Christoph Dellago
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090, Vienna, Austria
| | - Ivan Coluzza
- Center for Cooperative Research in Biomaterials (CIC biomaGUNE), Basque Research and Technology Alliance (BRTA), Paseo Miramon 182, 20014, San Sebastian, Spain
- IKERBASQUE, Basque Foundation for Science, 48013, Bilbao, Spain
| |
|
19
|
Rivoire O. Geometry and Flexibility of Optimal Catalysts in a Minimal Elastic Model. J Phys Chem B 2020; 124:807-813. [PMID: 31990545 DOI: 10.1021/acs.jpcb.0c00244] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
Abstract
We have general knowledge of the principles by which catalysts accelerate the rate of chemical reactions but no precise understanding of the geometrical and physical constraints to which their design is subject. To analyze these constraints, we introduce a minimal model of catalysis based on elastic networks where the implications of the geometry and flexibility of a catalyst can be studied systematically. The model demonstrates the relevance and limitations of the principle of transition-state stabilization: optimal catalysts are found to have a geometry complementary to the transition state but a degree of flexibility that nontrivially depends on the parameters of the reaction as well as on external parameters such as the concentrations of reactants and products. The results illustrate how simple physical models can provide valuable insights into the design of catalysts.
Affiliation(s)
- Olivier Rivoire
- Center for Interdisciplinary Research in Biology (CIRB), Collège de France, CNRS, INSERM, PSL Research University, 75005 Paris, France
| |
|
20
|
Kaushik AC, Mehmood A, Khan MT, Kumar A, Dai X, Wei DQ. RETRACTED ARTICLE: Protein blueprint and their interactions while approachability struggle for amino acids. J Biomol Struct Dyn 2020; 39:i-ix. [PMID: 31914855 DOI: 10.1080/07391102.2020.1713894] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
Affiliation(s)
| | - Aamir Mehmood
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Muhammad Tahir Khan
- Department of Bioinformatics and Biosciences, Capital University of Science and Technology, Islamabad, Pakistan
| | - Ajay Kumar
- Institute of Biomedical Sciences, National Sun Yat-Sen University, Kaohsiung City, Taiwan
| | - Xiaofeng Dai
- Wuxi School of Medicine, Jiangnan University, Wuxi, China
| | - Dong-Qing Wei
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| |
|
21
|
Hayes RL, Vilseck JZ, Brooks CL. Approaching protein design with multisite λ dynamics: Accurate and scalable mutational folding free energies in T4 lysozyme. Protein Sci 2019; 27:1910-1922. [PMID: 30175503 DOI: 10.1002/pro.3500] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 05/17/2018] [Revised: 08/06/2018] [Accepted: 08/15/2018] [Indexed: 12/14/2022]
Abstract
The estimation of changes in free energy upon mutation is central to the problem of protein design. Modern protein design methods have had remarkable success over a wide range of design targets, but are reaching their limits in ligand binding and enzyme design due to insufficient accuracy in mutational free energies. Alchemical free energy calculations have the potential to supplement modern design methods through more accurate molecular dynamics based prediction of free energy changes, but suffer from high computational cost. Multisite λ dynamics (MSλD) is a particularly efficient and scalable free energy method with potential to explore combinatorially large sequence spaces inaccessible with other free energy methods. This work aims to quantify the accuracy of MSλD and demonstrate its scalability. We apply MSλD to the classic problem of calculating folding free energies in T4 lysozyme, a system with a wealth of experimental measurements. Single site mutants considering 32 mutations show remarkable agreement with experiment with a Pearson correlation of 0.914 and mean unsigned error of 1.19 kcal/mol. Multisite mutants in systems with up to five concurrent mutations spanning 240 different sequences show comparable agreement with experiment. These results demonstrate the promise of MSλD in exploring large sequence spaces for protein design.
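The two agreement metrics quoted in this abstract, the Pearson correlation and the mean unsigned error, can be sketched as follows. This is a minimal illustration only: the function names and the four data points are hypothetical and are not values from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def mean_unsigned_error(predicted, reference):
    """Average absolute deviation between predictions and reference values."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)

# Hypothetical folding free energy changes in kcal/mol (illustrative only).
predicted = [1.0, 2.0, 3.0, 4.0]
experiment = [1.5, 1.5, 3.5, 3.5]

r = pearson_r(predicted, experiment)
mue = mean_unsigned_error(predicted, experiment)
print(f"Pearson r = {r:.3f}, MUE = {mue:.2f} kcal/mol")
```

The two numbers answer different questions: the correlation measures whether mutations are ranked correctly, while the unsigned error measures how far individual predictions are from experiment in absolute terms.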
Affiliation(s)
- Ryan L Hayes
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Jonah Z Vilseck
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Charles L Brooks
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
- Biophysics Program, University of Michigan, Ann Arbor, Michigan, 48109
| |
|
22
|
Cardelli C, Nerattini F, Tubiana L, Bianco V, Dellago C, Sciortino F, Coluzza I. General Methodology to Identify the Minimum Alphabet Size for Heteropolymer Design. Advanced Theory and Simulations 2019. [DOI: 10.1002/adts.201900031] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/20/2022]
Affiliation(s)
- Chiara Cardelli
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
| | | | - Luca Tubiana
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
| | - Valentino Bianco
- Faculty of Chemistry, Chemical Physics Department, Universidad Complutense de Madrid, Plaza de las Ciencias, Ciudad Universitaria, Madrid 28040, Spain
| | - Christoph Dellago
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
| | - Francesco Sciortino
- Dipartimento di Fisica, Sapienza Università di Roma, Piazzale Aldo Moro 2, 00185 Rome, Italy
| | - Ivan Coluzza
- CIC biomaGUNE, Paseo Miramon 182, 20014 San Sebastian, Spain
- IKERBASQUE, Basque Foundation for Science, 48013 Bilbao, Spain
| |
|
23
|
Herrera-Zúñiga LD, Millán-Pacheco C, Viniegra-González G, Villegas E, Arregui L, Rojo-Domínguez A. Molecular dynamics on laccase from Trametes versicolor to examine thermal stability induced by salt bridges. Chem Phys 2019. [DOI: 10.1016/j.chemphys.2018.10.019] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 02/08/2023]
|
24
|
Cardelli C, Tubiana L, Bianco V, Nerattini F, Dellago C, Coluzza I. Heteropolymer Design and Folding of Arbitrary Topologies Reveals an Unexpected Role of Alphabet Size on the Knot Population. Macromolecules 2018. [DOI: 10.1021/acs.macromol.8b01359] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/14/2022]
Affiliation(s)
- Chiara Cardelli
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
| | - Luca Tubiana
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
| | - Valentino Bianco
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
| | - Francesca Nerattini
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
| | - Christoph Dellago
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
| | - Ivan Coluzza
- CIC biomaGUNE, Paseo Miramon 182, 20014 San Sebastian, Spain
- IKERBASQUE, Basque Foundation for Science, Maria Diaz de Haro 3, 48013 Bilbao, Spain
| |
|
25
|
Selection originating from protein stability/foldability: Relationships between protein folding free energy, sequence ensemble, and fitness. J Theor Biol 2017; 433:21-38. [DOI: 10.1016/j.jtbi.2017.08.018] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 03/27/2017] [Revised: 07/27/2017] [Accepted: 08/21/2017] [Indexed: 11/19/2022]
|
26
|
Madge J, Miller MA. Optimising minimal building blocks for addressable self-assembly. Soft Matter 2017; 13:7780-7792. [PMID: 29018850 DOI: 10.1039/c7sm01646h] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 06/07/2023]
Abstract
Addressable structures are characterised by the set of unique components from which they are built and by the specific location that each component occupies. For an addressable structure to self-assemble, its constituent building blocks must be encoded with sufficient information to define their positions with respect to each other and to enable them to navigate to those positions. DNA, with its vast scope for encoding specific interactions, has been successfully used to synthesise addressable systems of several hundred components. In this work we examine the complementary question of the minimal requirements for building blocks to undergo addressable self-assembly driven by a controlled temperature quench. Our testbed is an idealised model of cubic particles patterned with attractive interactions. We introduce a scheme for optimising the interactions using a variant of basin-hopping and a negative design principle. The designed building blocks are tested dynamically in simple target structures to establish how their complexity affects the limits of reliable self-assembly.
Affiliation(s)
- Jim Madge
- Department of Chemistry, Durham University, South Road, Durham DH1 3LE, UK.
| | | |
|
27
|
Coluzza I. Computational protein design: a review. Journal of Physics: Condensed Matter 2017; 29:143001. [PMID: 28140371 DOI: 10.1088/1361-648x/aa5c76] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Indexed: 06/06/2023]
Abstract
Proteins are one of the most versatile modular assembling systems in nature. Experimentally, more than 110 000 protein structures have been identified and more are deposited every day in the Protein Data Bank. Such an enormous structural variety is to a first approximation controlled by the sequence of amino acids along the peptide chain of each protein. Understanding how the structural and functional properties of the target can be encoded in this sequence is the main objective of protein design. Unfortunately, rational protein design remains one of the major challenges across the disciplines of biology, physics and chemistry. The implications of solving this problem are enormous and branch into materials science, drug design, evolution and even cryptography. For instance, in the field of drug design an effective computational method to design protein-based ligands for biological targets such as viruses, bacteria or tumour cells, could give a significant boost to the development of new therapies with reduced side effects. In materials science, self-assembly is a highly desired property and soon artificial proteins could represent a new class of designable self-assembling materials. The scope of this review is to describe the state of the art in computational protein design methods and give the reader an outline of what developments could be expected in the near future.
Affiliation(s)
- Ivan Coluzza
- Computational Physics, Faculty of Physics, University of Vienna, Vienna, Austria
| |
|
28
|
Molchanov S, Faizullin DA, Nesmelova IV. Theoretical and Experimental Investigation of the Translational Diffusion of Proteins in the Vicinity of Temperature-Induced Unfolding Transition. J Phys Chem B 2016; 120:10192-10198. [PMID: 27628181 DOI: 10.1021/acs.jpcb.6b05834] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
Abstract
Translational diffusion is the most fundamental form of transport in chemical and biological systems. The diffusion coefficient is highly sensitive to changes in the size of the diffusing species; hence, it provides important information on the variety of macromolecular processes, such as self-assembly or folding-unfolding. Here, we investigate the behavior of the diffusion coefficient of a macromolecule in the vicinity of heat-induced transition from folded to unfolded state. We derive the equation that describes the diffusion coefficient of the macromolecule in the vicinity of the transition and use it to fit the experimental data from pulsed-field-gradient nuclear magnetic resonance (PFG NMR) experiments acquired for two globular proteins, lysozyme and RNase A, undergoing temperature-induced unfolding. A very good qualitative agreement between the theoretically derived diffusion coefficient and experimental data is observed.
Affiliation(s)
- Stanislav Molchanov
- National Research University "Higher School of Economics", Moscow 101000, Russia
| | | | | |
|
29
|
Cheng RR, Nordesjö O, Hayes RL, Levine H, Flores SC, Onuchic JN, Morcos F. Connecting the Sequence-Space of Bacterial Signaling Proteins to Phenotypes Using Coevolutionary Landscapes. Mol Biol Evol 2016; 33:3054-3064. [PMID: 27604223 PMCID: PMC5100047 DOI: 10.1093/molbev/msw188] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Indexed: 01/08/2023] Open
Abstract
Two-component signaling (TCS) is the primary means by which bacteria sense and respond to the environment. TCS involves two partner proteins working in tandem, which interact to perform cellular functions while limiting interactions with non-partners (i.e., cross-talk). We construct a Potts model for TCS that can quantitatively predict how mutating amino acid identities affect the interaction between TCS partners and non-partners. The parameters of this model are inferred directly from protein sequence data. This approach drastically reduces the computational complexity of exploring the sequence-space of TCS proteins. As a stringent test, we compare its predictions to a recent comprehensive mutational study, which characterized the functionality of 204 mutational variants of the PhoQ kinase in Escherichia coli. We find that our best predictions accurately reproduce the amino acid combinations found in experiment, which enable functional signaling with its partner PhoP. These predictions demonstrate the evolutionary pressure to preserve the interaction between TCS partners as well as prevent unwanted cross-talk. Further, we calculate the mutational change in the binding affinity between PhoQ and PhoP, providing an estimate of the amount of destabilization needed to disrupt TCS.
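A Potts model of the kind this abstract describes assigns each sequence an energy built from per-site fields and pairwise couplings, and scores a mutation by the resulting energy change. The sketch below assumes the common convention H = -Σ h_i(a_i) - Σ J_ij(a_i, a_j); the three-residue sequences and all parameter values are hypothetical and are not inferred from any sequence data.

```python
def potts_energy(seq, fields, couplings):
    """Potts energy H = -sum_i h_i(a_i) - sum_{i<j} J_ij(a_i, a_j).

    fields[i] maps a residue letter to h_i; couplings[(i, j)] maps a
    residue pair to J_ij. Missing entries are treated as zero.
    """
    energy = -sum(fields[i].get(a, 0.0) for i, a in enumerate(seq))
    for (i, j), table in couplings.items():
        energy -= table.get((seq[i], seq[j]), 0.0)
    return energy

# Hypothetical toy parameters for a three-residue "interface".
fields = [{"A": 0.5}, {"L": 0.25}, {}]
couplings = {(0, 2): {("A", "L"): 1.0, ("L", "A"): 1.0}}

wild_type = "ALL"
mutant = "ALA"

# A positive change means the mutant is scored as less favorable.
delta_h = potts_energy(mutant, fields, couplings) - potts_energy(wild_type, fields, couplings)
print(delta_h)
```

In this toy case the mutation at the third position breaks the only favorable coupling, so the energy rises by exactly the value of that coupling.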
Affiliation(s)
- R R Cheng
- Center for Theoretical Biological Physics, Rice University, Houston, TX
| | - O Nordesjö
- Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
| | - R L Hayes
- Department of Biophysics, University of Michigan, Ann Arbor, MI
| | - H Levine
- Center for Theoretical Biological Physics, Rice University, Houston, TX
- Department of Bioengineering, Rice University, Houston, TX
| | - S C Flores
- Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
| | - J N Onuchic
- Center for Theoretical Biological Physics, Rice University, Houston, TX
- Department of Physics and Astronomy, Rice University, Houston, TX
- Department of Chemistry, and Biosciences, Rice University, Houston, TX
| | - F Morcos
- Department of Biological Sciences and Center for Systems Biology, University of Texas at Dallas, Dallas, TX
| |
|
30
|
Venev SV, Zeldovich KB. Massively parallel sampling of lattice proteins reveals foundations of thermal adaptation. J Chem Phys 2016; 143:055101. [PMID: 26254668 DOI: 10.1063/1.4927565] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/01/2023] Open
Abstract
Evolution of proteins in bacteria and archaea living in different conditions leads to significant correlations between amino acid usage and environmental temperature. The origins of these correlations are poorly understood, and an important question of protein theory, physics-based prediction of types of amino acids overrepresented in highly thermostable proteins, remains largely unsolved. Here, we extend the random energy model of protein folding by weighting the interaction energies of amino acids by their frequencies in protein sequences and predict the energy gap of proteins designed to fold well at elevated temperatures. To test the model, we present a novel scalable algorithm for simultaneous energy calculation for many sequences in many structures, targeting massively parallel computing architectures such as graphics processing units (GPUs). The energy calculation is performed by multiplying two matrices, one representing the complete set of sequences, and the other describing the contact maps of all structural templates. An implementation of the algorithm for the CUDA platform is available at http://www.github.com/kzeldovich/galeprot and calculates protein folding energies over 250 times faster than a single central processing unit. Analysis of amino acid usage in 64-mer cubic lattice proteins designed to fold well at different temperatures demonstrates an excellent agreement between theoretical and simulated values of energy gap. The theoretical predictions of temperature trends of amino acid frequencies are significantly correlated with bioinformatics data on 191 bacteria and archaea, and highlight protein folding constraints as a fundamental selection pressure during thermal adaptation in biological evolution.
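The idea of computing all sequence-structure energies with one matrix product can be sketched on a toy scale. The layout below is one plausible arrangement and an assumption, not the authors' implementation: each row of the sequence matrix lists the pair energies of one sequence over all residue pairs i < j, and each column of the contact matrix marks which pairs touch in one structure. The two-letter alphabet, the HP-style energy table, and the 6-mer contact maps are hypothetical.

```python
from itertools import combinations

def pair_energy_matrix(sequences, eps):
    """Row s holds eps[a_i][a_j] for every residue pair i < j of sequence s."""
    rows = []
    for seq in sequences:
        pairs = combinations(range(len(seq)), 2)
        rows.append([eps[seq[i]][seq[j]] for i, j in pairs])
    return rows

def contact_matrix(contact_maps, length):
    """Entry (p, t) is 1 if residue pair p is in contact in structure t, else 0."""
    pairs = list(combinations(range(length), 2))
    return [[1 if p in cmap else 0 for cmap in contact_maps] for p in pairs]

def matmul(a, b):
    """Plain matrix product; on a GPU this single step is what gets parallelized."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Hypothetical HP-style contact energies: only H-H contacts are favorable.
eps = {"H": {"H": -1, "P": 0}, "P": {"H": 0, "P": 0}}
sequences = ["HPPHPH", "HHPPHH"]
# Two hypothetical 6-mer structures given as sets of contacting pairs (i, j).
contact_maps = [{(0, 5), (1, 4)}, {(0, 3)}]

# energies[s][t] is the energy of sequence s threaded onto structure t.
energies = matmul(pair_energy_matrix(sequences, eps), contact_matrix(contact_maps, 6))
print(energies)
```

The appeal of this layout is that adding more sequences or more structures only adds rows or columns, so the whole calculation stays a single dense product that maps well onto parallel hardware.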
Affiliation(s)
- Sergey V Venev
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation St, Worcester, Massachusetts 01605, USA
| | - Konstantin B Zeldovich
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation St, Worcester, Massachusetts 01605, USA
| |
|
31
|
Ferreira DC, van der Linden MG, de Oliveira LC, Onuchic JN, de Araújo AFP. Information and redundancy in the burial folding code of globular proteins within a wide range of shapes and sizes. Proteins 2016; 84:515-31. [PMID: 26815167 DOI: 10.1002/prot.24998] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 10/22/2015] [Revised: 12/28/2015] [Accepted: 01/19/2016] [Indexed: 11/09/2022]
Abstract
Recent ab initio folding simulations for a limited number of small proteins have corroborated a previous suggestion that atomic burial information obtainable from sequence could be sufficient for tertiary structure determination when combined to sequence-independent geometrical constraints. Here, we use simulations parameterized by native burials to investigate the required amount of information in a diverse set of globular proteins comprising different structural classes and a wide size range. Burial information is provided by a potential term pushing each atom towards one among a small number L of equiprobable concentric layers. An upper bound for the required information is provided by the minimal number of layers L(min) still compatible with correct folding behavior. We obtain L(min) between 3 and 5 for seven small to medium proteins with 50 ≤ Nr ≤ 110 residues while for a larger protein with Nr = 141 we find that L ≥ 6 is required to maintain native stability. We additionally estimate the usable redundancy for a given L ≥ L(min) from the burial entropy associated to the largest folding-compatible fraction of "superfluous" atoms, for which the burial term can be turned off or target layers can be chosen randomly. The estimated redundancy for small proteins with L = 4 is close to 0.8. Our results are consistent with the above-average quality of burial predictions used in previous simulations and indicate that the fraction of approachable proteins could increase significantly with even a mild, plausible, improvement on sequence-dependent burial prediction or on sequence-independent constraints that augment the detectable redundancy during simulations.
Affiliation(s)
- Diogo C Ferreira
- Laboratório de Biofísica Teórica e Computacional, Departamento de Biologia Celular, Universidade de Brasília, Brasília, DF, 70910-900, Brazil
| | - Marx G van der Linden
- Laboratório de Biofísica Teórica e Computacional, Departamento de Biologia Celular, Universidade de Brasília, Brasília, DF, 70910-900, Brazil
| | - Leandro C de Oliveira
- Departamento de Física, IBILCE, Universidade Estadual Paulista - UNESP, São José do Rio Preto, SP, 15054-000, Brazil
| | - José N Onuchic
- Center for Theoretical Biological Physics and Departments of Physics and Astronomy, Chemistry and Biosciences, Rice University, 6100 Main Street, Houston, Texas, 77005
| | - Antônio F Pereira de Araújo
- Laboratório de Biofísica Teórica e Computacional, Departamento de Biologia Celular, Universidade de Brasília, Brasília, DF, 70910-900, Brazil
| |
|
32
|
Wolynes PG. Evolution, energy landscapes and the paradoxes of protein folding. Biochimie 2014; 119:218-30. [PMID: 25530262 DOI: 10.1016/j.biochi.2014.12.007] [Citation(s) in RCA: 110] [Impact Index Per Article: 11.0] [Received: 09/15/2014] [Accepted: 12/11/2014] [Indexed: 01/25/2023]
Abstract
Protein folding has been viewed as a difficult problem of molecular self-organization. The search problem involved in folding however has been simplified through the evolution of folding energy landscapes that are funneled. The funnel hypothesis can be quantified using energy landscape theory based on the minimal frustration principle. Strong quantitative predictions that follow from energy landscape theory have been widely confirmed both through laboratory folding experiments and from detailed simulations. Energy landscape ideas also have allowed successful protein structure prediction algorithms to be developed. The selection constraint of having funneled folding landscapes has left its imprint on the sequences of existing protein structural families. Quantitative analysis of co-evolution patterns allows us to infer the statistical characteristics of the folding landscape. These turn out to be consistent with what has been obtained from laboratory physicochemical folding experiments signaling a beautiful confluence of genomics and chemical physics.
Affiliation(s)
- Peter G Wolynes
- Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, USA.
| |
|
33
|
Coevolutionary information, protein folding landscapes, and the thermodynamics of natural selection. Proc Natl Acad Sci U S A 2014; 111:12408-13. [PMID: 25114242 DOI: 10.1073/pnas.1413575111] [Citation(s) in RCA: 107] [Impact Index Per Article: 10.7] [Indexed: 11/18/2022] Open
Abstract
The energy landscape used by nature over evolutionary timescales to select protein sequences is essentially the same as the one that folds these sequences into functioning proteins, sometimes in microseconds. We show that genomic data, physical coarse-grained free energy functions, and family-specific information theoretic models can be combined to give consistent estimates of energy landscape characteristics of natural proteins. One such characteristic is the effective temperature T(sel) at which these foldable sequences have been selected in sequence space by evolution. T(sel) quantifies the importance of folded-state energetics and structural specificity for molecular evolution. Across all protein families studied, our estimates for T(sel) are well below the experimental folding temperatures, indicating that the energy landscapes of natural foldable proteins are strongly funneled toward the native state.
|
34
|
Alvarez-Lorenzo C, Concheiro A. From Drug Dosage Forms to Intelligent Drug-delivery Systems: a Change of Paradigm. Smart Materials for Drug Delivery 2013. [DOI: 10.1039/9781849736800-00001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 01/30/2023]
Abstract
The design of new drug-delivery systems (DDSs) able to regulate the moment and the rate at which the release should take place, and even to target the drug to specific tissues and cell compartments, has opened novel perspectives to improve the efficacy and safety of the therapeutic treatments. Ideally, the drug should only have access to its site of action and the release should follow the evolution of the disease or of certain biorhythms. The advances in the DDSs field are possible because of a better knowledge of the physiological functions and barriers to the drug access to the action site, but also due to the possibility of having “active” excipients that provide novel features. The joint work in a wide range of disciplines, comprising materials science, biomedical engineering and pharmaceutical technology, prompts the design and development of materials (lipids, polymers, hybrids) that can act as sensors of physiological parameters or external variables, and as actuators able to trigger or tune the release process. Such smart excipients lead to an advanced generation of DDSs designed as intelligent or stimuli-responsive. This chapter provides an overview of how the progress in DDSs is intimately linked to the evolution of the excipients, understood as a specific category of biomaterials. The phase transitions, the stimuli that can trigger them and the mechanisms behind the performance of the intelligent DDSs are analyzed as a whole, to serve as an introduction to the topics that are comprehensively discussed in the subsequent chapters of the book. A look to the future is also provided.
Affiliation(s)
- C. Alvarez-Lorenzo
- Departamento de Farmacia y Tecnología Farmacéutica, Facultad de Farmacia, Universidad de Santiago de Compostela, 15782 Santiago de Compostela, Spain
| | - A. Concheiro
- Departamento de Farmacia y Tecnología Farmacéutica, Facultad de Farmacia, Universidad de Santiago de Compostela, 15782 Santiago de Compostela, Spain
| |
|
35
|
Alvarez-Lorenzo C, González-Chomón C, Concheiro A. Molecularly Imprinted Hydrogels for Affinity-controlled and Stimuli-responsive Drug Delivery. Smart Materials for Drug Delivery 2013. [DOI: 10.1039/9781849734318-00228] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 01/29/2023]
Abstract
The performance of smart or intelligent hydrogels as drug-delivery systems (DDSs) can be notably improved if the network is endowed with high-affinity receptors for the therapeutic molecule. Conventional molecular imprinting technology aims to create tailored binding pockets (artificial receptors) in the structure of rigid polymers by means of a template polymerization, in which the target molecules themselves induce a specific arrangement of the functional monomers during polymer synthesis. Adaptation of this technology to hydrogel synthesis implicates the optimization of the imprinting pocket to be able to recover the high-affinity conformation when distorted by swelling or after the action of a stimulus. This chapter analyzes the implementation of the molecular imprinting technology to the synthesis of both non-responsive and responsive loosely cross-linked hydrogels, and provides recent examples of the suitability of the imprinted networks to attain affinity-controlled, activation-controlled or stimuli-triggered drug and protein release.
Affiliation(s)
- C. Alvarez-Lorenzo
- Departamento de Farmacia y Tecnología Farmacéutica, Facultad de Farmacia, Universidad de Santiago de Compostela, Spain
- C. González-Chomón
- Departamento de Farmacia y Tecnología Farmacéutica, Facultad de Farmacia, Universidad de Santiago de Compostela, Spain
- A. Concheiro
- Departamento de Farmacia y Tecnología Farmacéutica, Facultad de Farmacia, Universidad de Santiago de Compostela, Spain
|
36
|
Aita T, Husimi Y. Biophysical connection between evolutionary dynamics and thermodynamics in in vitro evolution. J Theor Biol 2012; 294:122-9. [PMID: 22085736 DOI: 10.1016/j.jtbi.2011.10.036] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 06/14/2011] [Revised: 08/26/2011] [Accepted: 10/31/2011] [Indexed: 11/15/2022]
Abstract
We analyzed a mathematical model of in vitro evolution conducted by repetition of mutagenesis and selection processes. The selection process consists of the selective enrichment and subsequent sampling as follows: each mutant with fitness W is amplified by the Boltzmann factor exp(rW/(k_B T_the)), where the fitness W is defined as the negative Gibbs free energy (-ΔG) in a reaction of the phenotypic molecules and r is the round number of the selective enrichment; then, an arbitrary mutant is randomly chosen from the resulting mutant population and it becomes a new parent in the next generation. As a result, we found that the evolutionary dynamics is described in a mathematical framework similar to thermodynamics: the "evolution constant" k_E and "evolutionary temperature" T_evo play key roles similar to the Boltzmann constant k_B and thermodynamic temperature T_the, respectively. In the stationary state of the evolutionary dynamics, the attractor of the fitness is in inverse proportion to k_E T_evo. Furthermore, beyond the mathematical analogy, we obtained a biophysical connection between evolutionary dynamics and thermodynamics. Particularly, we found that T_evo and T_the are connected by k_E T_evo ≈ k_B T_the/2r. These results suggest that we can predict the fitness value in the stationary state by the thermodynamic temperature T_the in the experimental setup.
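The selection step described in this abstract can be sketched in a few lines of Python. This is only an illustrative reading of the abstract, not the authors' code: the function names, the choice of kcal/mol units for W, and the interpretation of "k_B T_the/2r" as k_B T_the divided by 2r are all assumptions made here.

```python
import math
import random

# Boltzmann constant in kcal/(mol*K); the unit choice is an assumption
# so that fitness W = -ΔG can be given in kcal/mol.
K_B = 0.0019872041


def selection_probabilities(fitnesses, rounds, temperature):
    """Probability of sampling each mutant after r rounds of enrichment.

    Each mutant is weighted by the Boltzmann factor exp(r * W / (k_B * T)),
    then the weights are normalized. Subtracting the maximum fitness before
    exponentiating avoids overflow and does not change the result.
    """
    beta = rounds / (K_B * temperature)
    w_max = max(fitnesses)
    weights = [math.exp(beta * (w - w_max)) for w in fitnesses]
    total = sum(weights)
    return [x / total for x in weights]


def pick_parent(fitnesses, rounds, temperature, rng=random):
    """Randomly choose the index of the next parent from the enriched pool."""
    probs = selection_probabilities(fitnesses, rounds, temperature)
    return rng.choices(range(len(fitnesses)), weights=probs, k=1)[0]


def evolutionary_energy_scale(temperature, rounds):
    """k_E * T_evo, read here (as an assumption) as k_B * T_the / (2 * r)."""
    return K_B * temperature / (2 * rounds)
```

Under this reading, increasing the number of enrichment rounds r lowers k_E T_evo, which is consistent with the abstract's statement that the stationary fitness is inversely proportional to k_E T_evo.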
Affiliation(s)
- Takuyo Aita
- Graduate School of Science and Engineering, Saitama University, Saitama 338-8570, Japan.
|
37
|
Ueki T, Yamaguchi A, Watanabe M. Unlocking of interlocked heteropolymer gel by light: photoinduced volume phase transition in an ionic liquid from a metastable state to an equilibrium phase. Chem Commun (Camb) 2012; 48:5133-5. [DOI: 10.1039/c2cc30830d] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 11/21/2022]
|
38
|
Fitzpatrick AW, Knowles TPJ, Waudby CA, Vendruscolo M, Dobson CM. Inversion of the balance between hydrophobic and hydrogen bonding interactions in protein folding and aggregation. PLoS Comput Biol 2011; 7:e1002169. [PMID: 22022239 PMCID: PMC3192805 DOI: 10.1371/journal.pcbi.1002169] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.7] [Received: 12/05/2010] [Accepted: 07/06/2011] [Indexed: 12/25/2022] Open
Abstract
Identifying the forces that drive proteins to misfold and aggregate, rather than to fold into their functional states, is fundamental to our understanding of living systems and to our ability to combat protein deposition disorders such as Alzheimer's disease and the spongiform encephalopathies. We report here the finding that the balance between hydrophobic and hydrogen bonding interactions is different for proteins in the processes of folding to their native states and misfolding to the alternative amyloid structures. We find that the minima of the protein free energy landscape for folding and misfolding tend to be respectively dominated by hydrophobic and by hydrogen bonding interactions. These results characterise the nature of the interactions that determine the competition between folding and misfolding of proteins by revealing that the stability of native proteins is primarily determined by hydrophobic interactions between side-chains, while the stability of amyloid fibrils depends more on backbone intermolecular hydrogen bonding interactions. In order to carry out their biological functions, most proteins fold into well-defined conformations known as native states. Failure to fold, or to remain folded correctly, may result in misfolding and aggregation, which are processes associated with a wide range of highly debilitating, and so far incurable, human conditions that include Alzheimer's and Parkinson's diseases and type II diabetes. In our work we investigate the nature of the fundamental interactions that are responsible for the folding and misfolding behaviour of proteins, finding that interactions between protein side-chains play a major role in stabilising native states, whilst backbone hydrogen bonding interactions are key in determining the stability of amyloid fibrils.
Affiliation(s)
- Christopher M. Dobson
- Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
|
39
|
On the role of frustration in the energy landscapes of allosteric proteins. Proc Natl Acad Sci U S A 2011; 108:3499-503. [PMID: 21273505 DOI: 10.1073/pnas.1018980108] [Citation(s) in RCA: 140] [Impact Index Per Article: 10.8] [Indexed: 12/27/2022] Open
Abstract
Natural protein domains must be sufficiently stable to fold but often need to be locally unstable to function. Overall, strong energetic conflicts are minimized in native states satisfying the principle of minimal frustration. Local violations of this principle open up possibilities to form the complex multifunnel energy landscapes needed for large-scale conformational changes. We survey the local frustration patterns of allosteric domains and show that the regions that reconfigure are often enriched in patches of highly frustrated interactions, consistent both with the idea that these locally frustrated regions may act as specific hinges or that proteins may "crack" in these locations. On the other hand, the symmetry of multimeric protein assemblies allows near degeneracy by reconfiguring while maintaining minimally frustrated interactions. We also anecdotally examine some specific examples of complex conformational changes and speculate on the role of frustration in the kinetics of allosteric change.
|
40
|
Tanaka T, Enoki T, Grosberg AYU, Masamune S, Oya T, Takaoka Y, Tanaka K, Wang C, Wang G. Reversible molecular adsorption as a tool to observe freezing and to perform design of heteropolymer gels. ACTA ACUST UNITED AC 2010. [DOI: 10.1002/bbpc.19981021103] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 11/10/2022]
|
41
|
Biomolecular information gained through in vitro evolution. Biophys Rev 2010; 2:1-11. [DOI: 10.1007/s12551-009-0021-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 04/26/2009] [Accepted: 10/22/2009] [Indexed: 11/30/2022] Open
|
42
|
Bonnard C, Kleinman CL, Rodrigue N, Lartillot N. Fast optimization of statistical potentials for structurally constrained phylogenetic models. BMC Evol Biol 2009; 9:227. [PMID: 19740424 PMCID: PMC2754480 DOI: 10.1186/1471-2148-9-227] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 04/09/2009] [Accepted: 09/09/2009] [Indexed: 11/16/2022] Open
Abstract
Background: Statistical approaches for protein design are relevant in the field of molecular evolutionary studies. In recent years, new, so-called structurally constrained (SC) models of protein-coding sequence evolution have been proposed, which use statistical potentials to assess sequence-structure compatibility. In a previous work, we defined a statistical framework for optimizing knowledge-based potentials especially suited to SC models. Our method used the maximum likelihood principle and provided what we call the joint potentials. However, the method required numerical estimations by the use of computationally heavy Markov Chain Monte Carlo sampling algorithms. Results: Here, we develop an alternative optimization procedure, based on a leave-one-out argument coupled to fast gradient descent algorithms. We assess that the leave-one-out potential yields very similar results to the joint approach developed previously, both in terms of the resulting potential parameters, and by Bayes factor evaluation in a phylogenetic context. On the other hand, the leave-one-out approach results in a considerable computational benefit (up to a 1,000-fold decrease in computational time for the optimization procedure). Conclusion: Due to its computational speed, the optimization method we propose offers an attractive alternative for the design and empirical evaluation of alternative forms of potentials, using large data sets and high-dimensional parameterizations.
Affiliation(s)
- Cécile Bonnard
- Département d'Informatique, LIRMM, 161 rue Ada, 34392 Montpellier Cedex 5, France.
|
43
|
Zhuravlev PI, Materese CK, Papoian GA. Deconstructing the native state: energy landscapes, function, and dynamics of globular proteins. J Phys Chem B 2009; 113:8800-12. [PMID: 19453123 DOI: 10.1021/jp810659u] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.7] [Indexed: 01/15/2023]
Abstract
Proteins are highly complex molecules with features exquisitely selected by nature to carry out essential biological functions. Physical chemistry and polymer physics provide us with the tools needed to make sense of this complexity. Upon translation, many proteins fold to a thermodynamically stable form known as the native state. The native state is not static, but consists of a hierarchy of conformations, that are continuously explored through dynamics. In this review we provide a brief introduction to some of the core concepts required in the discussion of the protein native dynamics using energy landscapes ideas. We first discuss recent works which have challenged the structure-function paradigm by demonstrating function in disordered proteins. Next we examine the hierarchical organization in the energy landscapes using atomistic molecular dynamics simulations and principal component analysis. In particular, the role of direct and water-mediated contacts in sculpting the landscape is elaborated. Another approach to studying the native state ensemble is based on choosing high-resolution order parameters for computing one- or two-dimensional free energy surfaces. We demonstrate that 2D free energy surfaces provide rich thermodynamic and kinetic information about the native state ensemble. Brownian dynamics simulations on such a surface indicate that protein conformational dynamics is weakly activated. Finally, we briefly discuss implicit and coarse-grained protein models and emphasize the solvent role in determining native state structure and dynamics.
Affiliation(s)
- Pavel I Zhuravlev
- Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599-3290, USA
|
44
|
Molecular imprinting within hydrogels II: Progress and analysis of the field. Int J Pharm 2008; 364:188-212. [DOI: 10.1016/j.ijpharm.2008.09.002] [Citation(s) in RCA: 134] [Impact Index Per Article: 8.4] [Received: 06/17/2008] [Revised: 08/30/2008] [Accepted: 09/01/2008] [Indexed: 11/22/2022]
|
45
|
Evaluating and optimizing computational protein design force fields using fixed composition-based negative design. Proc Natl Acad Sci U S A 2008; 105:12242-7. [PMID: 18708527 DOI: 10.1073/pnas.0805858105] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/18/2022] Open
Abstract
An accurate force field is essential to computational protein design and protein fold prediction studies. Proper force field tuning is problematic, however, due in part to the incomplete modeling of the unfolded state. Here, we evaluate and optimize a protein design force field by constraining the amino acid composition of the designed sequences to that of a well behaved model protein. According to the random energy model, unfolded state energies are dependent only on amino acid composition and not the specific arrangement of amino acids. Therefore, energy discrepancies between computational predictions and experimental results, for sequences of identical composition, can be directly attributed to flaws in the force field's ability to properly account for folded state sequence energies. This aspect of fixed composition design allows for force field optimization by focusing solely on the interactions in the folded state. Several rounds of fixed composition optimization of the 56-residue beta1 domain of protein G yielded force field parameters with significantly greater predictive power: Optimized sequences exhibited higher wild-type sequence identity in critical regions of the structure, and the wild-type sequence showed an improved Z-score. Experimental studies revealed a designed 24-fold mutant to be stably folded with a melting temperature similar to that of the wild-type protein. Sequence designs using engrailed homeodomain as a scaffold produced similar results, suggesting the tuned force field parameters were not specific to protein G.
|
46
|
Aita T, Husimi Y. Fitting protein-folding free energy landscape for a certain conformation to an NK fitness landscape. J Theor Biol 2008; 253:151-61. [DOI: 10.1016/j.jtbi.2008.02.034] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 08/18/2007] [Revised: 01/22/2008] [Accepted: 02/15/2008] [Indexed: 10/22/2022]
|
47
|
Abstract
The amino acid composition of intrinsically disordered proteins and protein segments characteristically differs from that of ordered proteins. This observation forms the basis of several disorder prediction methods. These, however, usually perform worse for smaller proteins (or segments) than for larger ones. We show that the regions of amino acid composition space corresponding to ordered and disordered proteins overlap with each other, and the extent of the overlap (the "twilight zone") is larger for short than for long chains. To explain this finding, we used two-dimensional lattice model proteins containing hydrophobic, polar, and charged monomers and revealed the relation among chain length, amino acid composition, and disorder. Because the number of chain configurations exponentially grows with chain length, a larger fraction of longer chains can reach a low-energy, ordered state than do shorter chains. The amount of information carried by the amino acid composition about whether a protein or segment is (dis)ordered grows with increasing chain length. Smaller proteins rely more on specific interactions for stability, which limits the possible accuracy of disorder prediction methods. For proteins in the "twilight zone", size can determine order, as illustrated by the example of two-state homodimers.
|
48
|
Pereira de Araújo AF, Gomes ALC, Bursztyn AA, Shakhnovich EI. Native atomic burials, supplemented by physically motivated hydrogen bond constraints, contain sufficient information to determine the tertiary structure of small globular proteins. Proteins 2008; 70:971-83. [PMID: 17847091 DOI: 10.1002/prot.21571] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/09/2022]
Abstract
We investigate the possibility that atomic burials, as measured by their distances from the structural geometrical center, contain sufficient information to determine the tertiary structure of globular proteins. We report Monte Carlo simulated annealing results of all-atom hard-sphere models in continuous space for four small proteins: the all-beta WW-domain 1E0L, the alpha/beta protein-G 1IGD, the all-alpha engrailed homeo-domain 1ENH, and the alpha + beta engineered monomeric form of the Cro protein 1ORC. We used as energy function the sum over all atoms, labeled by i, of |R_i - R_i*|, where R_i is the atomic distance from the center of coordinates, or central distance, and R_i* is the "ideal" central distance obtained from the native structure. Hydrogen bonds were taken into consideration by the assignment of two ideal distances for backbone atoms forming hydrogen bonds in the native structure depending on the formation of a geometrically defined bond, independently of bond partner. Lowest energy final conformations turned out to be very similar to the native structure for the four proteins under investigation and a strong correlation was observed between energy and distance root mean square deviation (DRMS) from the native in the case of all-beta 1E0L and alpha/beta 1IGD. For all-alpha 1ENH and alpha + beta 1ORC the overall correlation between energy and DRMS among final conformations was not as high because some trajectories resulted in high DRMS but low energy final conformations in which alpha-helices adopted a non-native mutual orientation. Comparison between central distances and actual accessible surface areas corroborated the implicit assumption of correlation between these two quantities. The Z-score obtained with this native-centric potential in the discrimination of native 1ORC from a set of random compact structures confirmed that it contains a much smaller amount of native information when compared to a traditional contact Go potential but indicated that simple sequence-dependent burial potentials still need some improvement in order to attain a similar discriminability. Taken together, our results suggest that central distances, in conjunction with physically motivated hydrogen bond constraints, contain sufficient information to determine the native conformation of these small proteins and that a solution to the folding problem for globular proteins could arise from sufficiently accurate burial predictions from sequence followed by minimization of a burial-dependent energy function.
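The burial energy function stated in this abstract, the sum over atoms of the absolute difference between each central distance and its native ("ideal") value, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the center of coordinates is taken as the unweighted mean of the atomic positions, and the hydrogen-bond refinement (two ideal distances for some backbone atoms) is omitted.

```python
import math


def central_distances(coords):
    """Distance of every atom from the center of coordinates.

    `coords` is a sequence of (x, y, z) tuples. The center is assumed here
    to be the unweighted mean of all atomic positions.
    """
    n = len(coords)
    center = [sum(atom[k] for atom in coords) / n for k in range(3)]
    return [math.dist(atom, center) for atom in coords]


def burial_energy(coords, ideal_distances):
    """Sum over atoms i of |R_i - R_i*|.

    `ideal_distances` holds the native central distances R_i*, one per atom,
    in the same order as `coords`.
    """
    current = central_distances(coords)
    return sum(abs(r - r_star) for r, r_star in zip(current, ideal_distances))
```

In this form the energy is zero for the native coordinates and is unchanged by rigid translation, since only distances from the conformation's own center enter the sum.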
Affiliation(s)
- Antônio F Pereira de Araújo
- Laboratório de Biologia Teórica, Departamento de Biologia Celular, Universidade de Brasília, Brasília-DF 70910-900, Brazil.
|
49
|
Behringer H, Degenhard A, Schmid F. Coarse-grained lattice model for investigating the role of cooperativity in molecular recognition. Phys Rev E Stat Nonlin Soft Matter Phys 2007; 76:031914. [PMID: 17930278 DOI: 10.1103/physreve.76.031914] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2007] [Indexed: 05/25/2023]
Abstract
Equilibrium aspects of the molecular recognition of rigid biomolecules are investigated using coarse-grained lattice models. The analysis is carried out in two stages. First, an ensemble of probe molecules is designed with respect to the target biomolecule. The recognition ability of the probe ensemble is then investigated by calculating the free energy of association. The influence of cooperative and anticooperative effects accompanying the association of the target and probe molecules is studied. Numerical findings are presented and compared to analytical results which can be obtained in the limit of dominating cooperativity and in the mean-field formulation of the models.
Affiliation(s)
- Hans Behringer
- Fakultät für Physik, Universität Bielefeld, D-33615 Bielefeld, Germany
|
50
|
Mamasakhlisov YS, Hayryan S, Hu CK. Random sequences with power-law correlations exhibit proteinlike behavior. J Chem Phys 2007; 126:145103. [PMID: 17444752 DOI: 10.1063/1.2714944] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/14/2022] Open
Abstract
We use a replica approach to investigate the thermodynamic properties of random heteropolymers with persistent power-law correlations in the monomer sequence. We show that sequences of this type possess proteinlike properties. In particular, we show that they can fold into a stable, unique three-dimensional structure (the "native" structure, in protein terminology) through two different types of pathways. One is a fast folding pathway and leads directly to the native structure. The other, a slower pathway, passes through the microphase separated (MPS) state and includes a number of intermediate glassy states. The scale and the magnitude of the MPS are calculated. The frozen state can be reached only by sequences with weak long-range correlations. The critical value for the correlation exponent is found, above which (strong correlations) freezing is impossible.
|