1
Heat-induced structural and chemical changes to a computationally designed miniprotein. Protein Sci 2024; 33:e4991. [PMID: 38757381 PMCID: PMC11099715 DOI: 10.1002/pro.4991] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Revised: 03/22/2024] [Accepted: 03/28/2024] [Indexed: 05/18/2024]
Abstract
The de novo design of miniprotein inhibitors has recently emerged as a new technology to create proteins that bind with high affinity to specific therapeutic targets. Their size, ease of expression, and apparent high stability make them excellent candidates for a new class of protein drugs. However, beyond circular dichroism melts and hydrogen/deuterium exchange experiments, little is known about their dynamics, especially at the elevated temperatures they seemingly tolerate quite well. To address that and gain insight for future designs, we have focused on identifying unintended and previously overlooked heat-induced structural and chemical changes in a particularly stable model miniprotein, EHEE_rd2_0005. Nuclear magnetic resonance (NMR) studies suggest the presence of dynamics on multiple time and temperature scales. Transiently elevating the temperature results in spontaneous chemical deamidation visible in the NMR spectra, which we validate using both capillary electrophoresis and mass spectrometry (MS) experiments. High temperatures also result in greatly accelerated intrinsic rates of hydrogen exchange and signal loss in NMR heteronuclear single quantum coherence spectra from local unfolding. These losses are in excellent agreement with both room temperature hydrogen exchange experiments and hydrogen bond disruption in replica exchange molecular dynamics simulations. Our analysis reveals important principles for future miniprotein designs and the potential for high stability to result in long-lived alternate conformational states.
2
In vivo selection of synthetic nucleocapsids for tissue targeting. Proc Natl Acad Sci U S A 2023; 120:e2306129120. [PMID: 37939083 PMCID: PMC10655225 DOI: 10.1073/pnas.2306129120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2023] [Accepted: 09/21/2023] [Indexed: 11/10/2023]
Abstract
Controlling the biodistribution of protein- and nanoparticle-based therapeutic formulations remains challenging. In vivo library selection is an effective method for identifying constructs that exhibit desired distribution behavior; library variants can be selected based on their ability to localize to the tissue or compartment of interest despite complex physiological challenges. Here, we describe further development of an in vivo library selection platform based on self-assembling protein nanoparticles encapsulating their own mRNA genomes (synthetic nucleocapsids or synNCs). We tested two distinct libraries: a low-diversity library composed of synNC surface mutations (45 variants) and a high-diversity library composed of synNCs displaying miniproteins with binder-like properties (6.2 million variants). While we did not identify any variants from the low-diversity surface library that yielded therapeutically relevant changes in biodistribution, the high-diversity miniprotein display library yielded variants that shifted accumulation toward lungs or muscles in just two rounds of in vivo selection. Our approach should contribute to achieving specific tissue homing patterns and identifying targeting ligands for diseases of interest.
3
Mega-scale experimental analysis of protein folding stability in biology and design. Nature 2023; 620:434-444. [PMID: 37468638 PMCID: PMC10412457 DOI: 10.1038/s41586-023-06328-6] [Citation(s) in RCA: 29] [Impact Index Per Article: 29.0] [Received: 01/05/2023] [Accepted: 06/14/2023] [Indexed: 07/21/2023]
Abstract
Advances in DNA sequencing and machine learning are providing insights into protein sequences and structures on an enormous scale [1]. However, the energetics driving folding are invisible in these structures and remain largely unknown [2]. The hidden thermodynamics of folding can drive disease [3,4], shape protein evolution [5-7] and guide protein engineering [8-10], and new approaches are needed to reveal these thermodynamics for every sequence and structure. Here we present cDNA display proteolysis, a method for measuring thermodynamic folding stability for up to 900,000 protein domains in a one-week experiment. From 1.8 million measurements in total, we curated a set of around 776,000 high-quality folding stabilities covering all single amino acid variants and selected double mutants of 331 natural and 148 de novo designed protein domains 40-72 amino acids in length. Using this extensive dataset, we quantified (1) environmental factors influencing amino acid fitness, (2) thermodynamic couplings (including unexpected interactions) between protein sites, and (3) the global divergence between evolutionary amino acid usage and protein folding stability. We also examined how our approach could identify stability determinants in designed proteins and evaluate design methods. The cDNA display proteolysis method is fast, accurate and uniquely scalable, and promises to reveal the quantitative rules for how amino acid sequences encode folding stability.
4
Mega-scale analysis of protein folding stability in biology and protein design. Biophys J 2023; 122:17a-18a. [PMID: 36782850 DOI: 10.1016/j.bpj.2022.11.321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/12/2023]
5
Large-scale design and refinement of stable proteins using sequence-only models. PLoS One 2022; 17:e0265020. [PMID: 35286324 PMCID: PMC8920274 DOI: 10.1371/journal.pone.0265020] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/18/2021] [Accepted: 02/18/2022] [Indexed: 12/25/2022]
Abstract
Engineered proteins generally must possess a stable structure in order to achieve their designed function. Stable designs, however, are astronomically rare within the space of all possible amino acid sequences. As a consequence, many designs must be tested computationally and experimentally in order to find stable ones, which is expensive in terms of time and resources. Here we use a high-throughput, low-fidelity assay to experimentally evaluate the stability of approximately 200,000 novel proteins. These include a wide range of sequence perturbations, providing a baseline for future work in the field. We build a neural network model that predicts protein stability given only sequences of amino acids, and compare its performance to the assayed values. We also report another network model that is able to generate the amino acid sequences of novel stable proteins given requested secondary sequences. Finally, we show that the predictive model—despite weaknesses including a noisy data set—can be used to substantially increase the stability of both expert-designed and model-generated proteins.
6
High throughput experimental determination of mini-protein local stability and fluctuation. Biophys J 2022. [DOI: 10.1016/j.bpj.2021.11.1121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
7
Prediction and Validation of a Protein's Free Energy Surface Using Hydrogen Exchange and (Importantly) Its Denaturant Dependence. J Chem Theory Comput 2021; 18:550-561. [PMID: 34936354 PMCID: PMC8757463 DOI: 10.1021/acs.jctc.1c00960] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
Abstract
The denaturant dependence of hydrogen-deuterium exchange (HDX) is a powerful measurement to identify the breaking of individual H-bonds and map the free energy surface (FES) of a protein including the very rare states. Molecular dynamics (MD) can identify each partial unfolding event with atomic-level resolution. Hence, their combination provides a great opportunity to test the accuracy of simulations and to verify the interpretation of HDX data. For this comparison, we use Upside, our new and extremely fast MD package that is capable of folding proteins with an accuracy comparable to that of all-atom methods. The FESs of two naturally occurring and two designed proteins are so generated and compared to our NMR/HDX data. We find that Upside's accuracy is considerably improved upon modifying the energy function using a new machine-learning procedure that trains for proper protein behavior including realistic denatured states in addition to stable native states. The resulting increase in cooperativity is critical for replicating the HDX data and protein stability, indicating that we have properly encoded the underlying physicochemical interactions into an MD package. We did observe some mismatch, however, underscoring the ongoing challenges faced by simulations in calculating accurate FESs. Nevertheless, our ensembles can identify the properties of the fluctuations that lead to HDX, whether they be small-, medium-, or large-scale openings, and can speak to the breadth of the native ensemble that has been a matter of debate.
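As a rough illustration of how HDX rates are commonly turned into free energies, the sketch below uses the standard EX2-limit relation, in which the apparent opening free energy is RT ln(k_int / k_obs), together with a linear denaturant-dependence model. The function names, the EX2 assumption, and the linear m-value model are illustrative choices made here; the abstract does not state that the authors use exactly this procedure.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def hx_free_energy(k_obs, k_int, temperature_k=298.15):
    """Apparent opening free energy (kcal/mol) in the EX2 limit.

    k_obs is the observed exchange rate and k_int the intrinsic
    (unprotected) rate, in the same units. Their ratio k_int / k_obs
    is the protection factor. (Hypothetical helper, not from the paper.)
    """
    if k_obs <= 0 or k_int <= 0:
        raise ValueError("rates must be positive")
    protection_factor = k_int / k_obs
    return R_KCAL * temperature_k * math.log(protection_factor)


def linear_denaturant_dependence(dg_water, m_value, denaturant_molar):
    """Linear model: dG([den]) = dG(water) - m * [den].

    A large m-value suggests a large-scale opening; a value near zero
    suggests a small, local fluctuation. (Assumed model for illustration.)
    """
    return dg_water - m_value * denaturant_molar


if __name__ == "__main__":
    # A site protected 1000-fold at 25 degrees C.
    dg = hx_free_energy(k_obs=1e-3, k_int=1.0)
    print(f"dG_HX = {dg:.2f} kcal/mol")
    print(linear_denaturant_dependence(dg, m_value=1.0, denaturant_molar=2.0))
```

Comparing how the fitted free energy of each amide changes with denaturant is one way to separate local from global opening events, which is the kind of distinction the abstract draws between small-, medium-, and large-scale openings.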
8
Prediction of a Protein's Free Energy Surface and Validation with HD Exchange. Biophys J 2021. [DOI: 10.1016/j.bpj.2020.11.917] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
9
Global analysis of protein folding using massively parallel design, synthesis, and testing. Science 2017; 357:168-175. [PMID: 28706065 PMCID: PMC5568797 DOI: 10.1126/science.aan0693] [Citation(s) in RCA: 266] [Impact Index Per Article: 44.3] [Received: 02/28/2017] [Accepted: 06/09/2017] [Indexed: 12/18/2022]
Abstract
Proteins fold into unique native structures stabilized by thousands of weak interactions that collectively overcome the entropic cost of folding. Although these forces are "encoded" in the thousands of known protein structures, "decoding" them is challenging because of the complexity of natural proteins that have evolved for function, not stability. We combined computational protein design, next-generation gene synthesis, and a high-throughput protease susceptibility assay to measure folding and stability for more than 15,000 de novo designed miniproteins, 1000 natural proteins, 10,000 point mutants, and 30,000 negative control sequences. This analysis identified more than 2500 stable designed proteins in four basic folds, a number sufficient to enable us to systematically examine how sequence determines folding and stability in uncharted protein space. Iteration between design and experiment increased the design success rate from 6% to 47%, produced stable proteins unlike those found in nature for topologies where design was initially unsuccessful, and revealed subtle contributions to stability as designs became increasingly optimized. Our approach achieves the long-standing goal of a tight feedback cycle between computation and experiment and has the potential to transform computational protein design into a data-driven science.
10
Massively parallel de novo protein design for targeted therapeutics. Nature 2017; 550:74-79. [PMID: 28953867 DOI: 10.1038/nature23912] [Citation(s) in RCA: 268] [Impact Index Per Article: 38.3] [Received: 01/25/2017] [Accepted: 08/17/2017] [Indexed: 12/24/2022]
Abstract
De novo protein design holds promise for creating small stable proteins with shapes customized to bind therapeutic targets. We describe a massively parallel approach for designing, manufacturing and screening mini-protein binders, integrating large-scale computational design, oligonucleotide synthesis, yeast display screening and next-generation sequencing. We designed and tested 22,660 mini-proteins of 37-43 residues that target influenza haemagglutinin and botulinum neurotoxin B, along with 6,286 control sequences to probe contributions to folding and binding, and identified 2,618 high-affinity binders. Comparison of the binding and non-binding design sets, which are two orders of magnitude larger than any previously investigated, enabled the evaluation and improvement of the computational model. Biophysical characterization of a subset of the binder designs showed that they are extremely stable and, unlike antibodies, do not lose activity after exposure to high temperatures. The designs elicit little or no immune response and provide potent prophylactic and therapeutic protection against influenza, even after extensive repeated dosing.
11
High-Throughput Protein Design Reveals Quantitative Protein Stability Requirements. Biophys J 2017. [DOI: 10.1016/j.bpj.2016.11.1076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
12
Abstract
Naturally occurring, pharmacologically active peptides constrained with covalent crosslinks generally have shapes that have evolved to fit precisely into binding pockets on their targets. Such peptides can have excellent pharmaceutical properties, combining the stability and tissue penetration of small-molecule drugs with the specificity of much larger protein therapeutics. The ability to design constrained peptides with precisely specified tertiary structures would enable the design of shape-complementary inhibitors of arbitrary targets. Here we describe the development of computational methods for accurate de novo design of conformationally restricted peptides, and the use of these methods to design 18-47 residue, disulfide-crosslinked peptides, a subset of which are heterochiral and/or N-C backbone-cyclized. Both genetically encodable and non-canonical peptides are exceptionally stable to thermal and chemical denaturation, and 12 experimentally determined X-ray and NMR structures are nearly identical to the computational design models. The computational design methods and stable scaffolds presented here provide the basis for development of a new generation of peptide-based drugs.
13
Predicting Charged-Ligand Binding from Molecular Simulations. Biophys J 2014. [DOI: 10.1016/j.bpj.2013.11.1462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
14
Calculating the binding free energies of charged species based on explicit-solvent simulations employing lattice-sum methods: an accurate correction scheme for electrostatic finite-size effects. J Chem Phys 2013; 139:184103. [PMID: 24320250 PMCID: PMC3838431 DOI: 10.1063/1.4826261] [Citation(s) in RCA: 169] [Impact Index Per Article: 15.4] [Received: 07/26/2013] [Accepted: 09/30/2013] [Indexed: 01/12/2023]
Abstract
The calculation of a protein-ligand binding free energy based on molecular dynamics (MD) simulations generally relies on a thermodynamic cycle in which the ligand is alchemically inserted into the system, both in the solvated protein and free in solution. The corresponding ligand-insertion free energies are typically calculated in nanoscale computational boxes simulated under periodic boundary conditions and considering electrostatic interactions defined by a periodic lattice-sum. This is distinct from the ideal bulk situation of a system of macroscopic size simulated under non-periodic boundary conditions with Coulombic electrostatic interactions. This discrepancy results in finite-size effects, which affect primarily the charging component of the insertion free energy, are dependent on the box size, and can be large when the ligand bears a net charge, especially if the protein is charged as well. This article investigates finite-size effects on calculated charging free energies using as a test case the binding of the ligand 2-amino-5-methylthiazole (net charge +1 e) to a mutant form of yeast cytochrome c peroxidase in water. Considering different charge isoforms of the protein (net charges -5, 0, +3, or +9 e), either in the absence or the presence of neutralizing counter-ions, and sizes of the cubic computational box (edges ranging from 7.42 to 11.02 nm), the potentially large magnitude of finite-size effects on the raw charging free energies (up to 17.1 kJ mol⁻¹) is demonstrated. Two correction schemes are then proposed to eliminate these effects, a numerical and an analytical one. Both schemes are based on a continuum-electrostatics analysis and require performing Poisson-Boltzmann (PB) calculations on the protein-ligand system. While the numerical scheme requires PB calculations under both non-periodic and periodic boundary conditions, the latter at the box size considered in the MD simulations, the analytical scheme only requires three non-periodic PB calculations for a given system, its dependence on the box size being analytical. The latter scheme also provides insight into the physical origin of the finite-size effects. These two schemes also encompass a correction for discrete solvent effects that persists even in the limit of infinite box sizes. Application of either scheme essentially eliminates the size dependence of the corrected charging free energies (maximal deviation of 1.5 kJ mol⁻¹). Because it is simple to apply, the analytical correction scheme offers a general solution to the problem of finite-size effects in free-energy calculations involving charged solutes, as encountered in calculations concerning, e.g., protein-ligand binding, biomolecular association, residue mutation, pKa and redox potential estimation, substrate transformation, solvation, and solvent-solvent partitioning.
15
Separated topologies--a method for relative binding free energy calculations using orientational restraints. J Chem Phys 2013; 138:085104. [PMID: 23464180 DOI: 10.1063/1.4792251] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Indexed: 11/15/2022]
Abstract
Orientational restraints can improve the efficiency of alchemical free energy calculations, but they are not typically applied in relative binding calculations, which compute the affinity difference between two ligands. Here, we describe a new "separated topologies" method, which computes relative binding free energies using orientational restraints and which has several advantages over existing methods. While standard approaches maintain the initial and final ligand in a shared orientation, the separated topologies approach allows the initial and final ligands to have distinct orientations. This avoids a slowly converging reorientation step in the calculation. The separated topologies approach can also be applied to determine the relative free energies of multiple orientations of the same ligand. We illustrate the approach by calculating the relative binding free energies of two compounds to an engineered site in Cytochrome C Peroxidase.
16
Calculating the sensitivity and robustness of binding free energy calculations to force field parameters. J Chem Theory Comput 2013; 9:3072-3083. [PMID: 24015114 PMCID: PMC3763860 DOI: 10.1021/ct400315q] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 11/29/2022]
Abstract
Binding free energy calculations offer a thermodynamically rigorous method to compute protein-ligand binding, and they depend on empirical force fields with hundreds of parameters. We examined the sensitivity of computed binding free energies to the ligand's electrostatic and van der Waals parameters. Dielectric screening and cancellation of effects between ligand-protein and ligand-solvent interactions reduce the parameter sensitivity of binding affinity by 65%, compared with interaction strengths computed in the gas-phase. However, multiple changes to parameters combine additively on average, which can lead to large changes in overall affinity from many small changes to parameters. Using these results, we estimate that random, uncorrelated errors in force field nonbonded parameters must be smaller than 0.02 e per charge, 0.06 Å per radius, and 0.01 kcal/mol per well depth in order to obtain 68% (one standard deviation) confidence that a computed affinity for a moderately-sized lead compound will fall within 1 kcal/mol of the true affinity, if these are the only sources of error considered.
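The error-budget reasoning in this abstract can be sketched with first-order error propagation: if parameter errors are random and uncorrelated and their effects add linearly, the standard deviation of the computed affinity is the root sum of squares of each sensitivity times its parameter error. The helper names and the example sensitivities below are hypothetical and are not taken from the paper; the sketch only assumes a linear response and normally distributed errors, so that "68% confidence within 1 kcal/mol" corresponds to a combined standard deviation of 1 kcal/mol.

```python
import math


def combined_affinity_error(sensitivities, parameter_sigmas):
    """Standard deviation of the affinity from uncorrelated parameter errors.

    sensitivities[i] is d(affinity)/d(parameter i), in kcal/mol per
    parameter unit; parameter_sigmas[i] is the standard deviation of the
    error in parameter i. Assumes a linear response (illustrative only).
    """
    if len(sensitivities) != len(parameter_sigmas):
        raise ValueError("inputs must have the same length")
    variance = sum((s * sig) ** 2
                   for s, sig in zip(sensitivities, parameter_sigmas))
    return math.sqrt(variance)


def max_uniform_sigma(sensitivities, target_sigma=1.0):
    """Largest per-parameter error (same for all parameters) that keeps
    the combined affinity error at or below target_sigma (kcal/mol)."""
    norm = math.sqrt(sum(s * s for s in sensitivities))
    if norm == 0:
        raise ValueError("at least one sensitivity must be nonzero")
    return target_sigma / norm


if __name__ == "__main__":
    # Hypothetical ligand: 25 partial charges, each with an assumed
    # sensitivity of 10 kcal/mol per unit charge (made-up number).
    charge_sensitivities = [10.0] * 25
    print(max_uniform_sigma(charge_sensitivities))  # tolerable error per charge, in e
```

Because the tolerance shrinks with the square root of the number of parameters, many individually small errors can still add up to a large affinity error, which is the point the abstract makes about additive combination.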
17
Predicting ligand binding affinity with alchemical free energy methods in a polar model binding site. J Mol Biol 2009; 394:747-63. [PMID: 19782087 DOI: 10.1016/j.jmb.2009.09.049] [Citation(s) in RCA: 120] [Impact Index Per Article: 8.0] [Received: 06/30/2009] [Revised: 09/16/2009] [Accepted: 09/18/2009] [Indexed: 10/20/2022]
Abstract
We present a combined experimental and modeling study of organic ligand molecules binding to a slightly polar engineered cavity site in T4 lysozyme (L99A/M102Q). For modeling, we computed alchemical absolute binding free energies. These were blind tests performed prospectively on 13 diverse, previously untested candidate ligand molecules. We predicted that eight compounds would bind to the cavity and five would not; 11 of 13 predictions were correct at this level. The RMS error to the measurable absolute binding energies was 1.8 kcal/mol. In addition, we computed "relative" binding free energies for six phenol derivatives starting from two known ligands: phenol and catechol. The average RMS error in the relative free energy prediction was 2.5 kcal/mol (phenol) and 1.1 kcal/mol (catechol). To understand these results at atomic resolution, we obtained X-ray co-complex structures for nine of the diverse ligands and for all six phenol analogs. The average RMSD of the predicted pose to the experiment was 2.0 Å (diverse set), 1.8 Å (phenol-derived predictions), and 1.2 Å (catechol-derived predictions). We found that predicting accurate affinities and rank-orderings required near-native starting orientations of the ligand in the binding site. Unanticipated binding modes, multiple ligand binding, and protein conformational change all proved challenging for the free energy methods. We believe that these results can help guide future improvements in physics-based absolute binding free energy methods.
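For readers unfamiliar with the RMS error metric quoted in this abstract, a minimal sketch of the calculation follows. The function name and the example values are hypothetical and are not data from the study; the sketch only shows the standard root-mean-square deviation between predicted and measured binding free energies.

```python
import math


def rms_error(predicted, measured):
    """Root-mean-square error between paired predictions and measurements.

    Both inputs are sequences of binding free energies in the same units
    (for example kcal/mol), ordered so that element i of each refers to
    the same ligand.
    """
    if len(predicted) != len(measured):
        raise ValueError("inputs must have the same length")
    if not predicted:
        raise ValueError("inputs must not be empty")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Made-up free energies (kcal/mol) for three ligands.
    predicted = [-5.0, -4.0, -6.5]
    measured = [-4.0, -5.0, -6.5]
    print(f"RMS error = {rms_error(predicted, measured):.2f} kcal/mol")
```

Unlike a mean signed error, the RMS error does not let over- and under-predictions cancel, so it reflects the typical size of a prediction miss.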