1
Pan X, Kortemme T. De novo protein fold families expand the designable ligand binding site space. PLoS Comput Biol 2021; 17:e1009620. [PMID: 34807909 PMCID: PMC8648124 DOI: 10.1371/journal.pcbi.1009620] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/13/2021] [Revised: 12/06/2021] [Accepted: 11/08/2021] [Indexed: 11/19/2022]
Abstract
A major challenge in designing proteins de novo to bind user-defined ligands with high affinity is finding backbone structures into which a new binding site geometry can be engineered with high precision. Recent advances in methods to generate protein fold families de novo have expanded the space of accessible protein structures, but it is not clear to what extent de novo proteins with diverse geometries also expand the space of designable ligand binding functions. We constructed a library of 25,806 high-quality ligand binding sites and developed a fast protocol to place (“match”) these binding sites into both naturally occurring and de novo protein families with two fold topologies: Rossmann and NTF2. Each matching step involves engineering new binding site residues into each protein “scaffold”, which is distinct from the problem of comparing already existing binding pockets. 5,896 and 7,475 binding sites could be matched to the Rossmann and NTF2 fold families, respectively. De novo designed Rossmann and NTF2 protein families can support 1,791 and 678 binding sites, respectively, that cannot be matched to naturally existing structures with the same topologies. While the number of protein residues in ligand binding sites is the major determinant of matching success, ligand size and primary sequence separation of binding site residues also play important roles. The number of matched binding sites is a power law function of the number of members in a fold family. Our results suggest that de novo sampling of geometric variations on diverse fold topologies can significantly expand the space of designable ligand binding sites for a wealth of possible new protein functions.
De novo design of proteins that can bind to novel and highly diverse user-defined small molecule ligands could have broad biomedical and synthetic biology applications. Because ligand binding site geometries need to be accommodated by protein backbone scaffolds at high accuracy, the diversity of scaffolds is a major limitation for designing new ligand binding functions. Advances in computational protein structure design methods have significantly increased the number of accessible stable scaffold structures. Understanding how many new ligand binding sites can be designed into the de novo scaffolds is important for engineering novel ligand binding proteins. To answer this question, we constructed a large library of ligand binding sites from the Protein Data Bank (PDB). We tested the number of ligand binding sites that can be designed into de novo scaffolds and naturally existing scaffolds with the same fold topologies. The results showed that de novo scaffolds significantly expanded the potential ligand binding space of their respective fold topologies. We also identified factors that affect the difficulty of binding site accommodation, as well as the relationship between the number of scaffolds and the accessible ligand binding site space. We believe our findings will benefit future method development and applications of ligand binding protein design.
Affiliation(s)
- Xingjie Pan
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California, United States of America
- UC Berkeley–UCSF Graduate Program in Bioengineering, University of California San Francisco, San Francisco, California, United States of America
- * E-mail: (XP); (TK)
- Tanja Kortemme
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California, United States of America
- UC Berkeley–UCSF Graduate Program in Bioengineering, University of California San Francisco, San Francisco, California, United States of America
- Quantitative Biosciences Institute (QBI), University of California San Francisco, San Francisco, California, United States of America
- Chan Zuckerberg Biohub, San Francisco, California, United States of America
- * E-mail: (XP); (TK)
2
On the possible origin of protein homochirality, structure, and biochemical function. Proc Natl Acad Sci U S A 2019; 116:26571-26579. [PMID: 31822617 DOI: 10.1073/pnas.1908241116] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 01/12/2023]
Abstract
Living systems have chiral molecules, e.g., native proteins that almost entirely contain L-amino acids. How protein homochirality emerged from a background of equal numbers of L and D amino acids is among many questions about life's origin. The origin of homochirality and its implications are explored in computer simulations examining the stability and structural and functional properties of an artificial library of compact proteins containing 1:1 (termed demi-chiral), 3:1, and 1:3 ratios of D:L and purely L or D amino acids generated without functional selection. Demi-chiral proteins have shorter secondary structures and fewer internal hydrogen bonds and are less stable than homochiral proteins. Selection for hydrogen bonding yields a preponderance of L or D amino acids. Demi-chiral proteins have native global folds, including similarity to early ribosomal proteins, similar small molecule ligand binding pocket geometries, and many constellations of L-chiral amino acids with a 1.0-Å RMSD to native enzyme active sites. For a representative subset containing 550 active site geometries matching 457 (2) 4-digit (3-digit) enzyme classification (E.C.) numbers, native active site amino acids were generated at random for 472 of 550 cases. This increases to 548 of 550 cases when similar residues are allowed. The most frequently generated sequences correspond to ancient enzymatic functions, e.g., glycolysis, replication, and nucleotide biosynthesis. Surprisingly, even without selection, demi-chiral proteins possess the requisite marginal biochemical function and structure of modern proteins, but are thermodynamically less stable. If demi-chiral proteins were present, they could engage in early metabolism, which created the feedback loop for transcription and cell formation.
3
Hecht MH, Zarzhitsky S, Karas C, Chari S. Are natural proteins special? Can we do that? Curr Opin Struct Biol 2018; 48:124-132. [DOI: 10.1016/j.sbi.2017.11.009] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 09/25/2017] [Accepted: 11/29/2017] [Indexed: 12/23/2022]
4
Kalinowska B, Banach M, Wiśniowski Z, Konieczny L, Roterman I. Is the hydrophobic core a universal structural element in proteins? J Mol Model 2017. [PMID: 28623601 PMCID: PMC5487895 DOI: 10.1007/s00894-017-3367-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 01/08/2023]
Abstract
The hydrophobic core, when subjected to analysis based on the fuzzy oil drop model, appears to be a universal structural component of proteins irrespective of their secondary, supersecondary, and tertiary conformations. A study has been performed on a set of nonhomologous proteins representing a variety of CATH categories. The presence of a well-ordered hydrophobic core has been confirmed in each case, regardless of the protein’s biological function, chain length or source organism. In light of fuzzy oil drop (FOD) analysis, various supersecondary forms seem to share a common structural factor in the form of a hydrophobic core, emerging either as part of the whole protein or a specific domain. The variable status of individual folds with respect to the FOD model reflects their propensity for conformational changes, frequently associated with biological function. Such flexibility is expressed as variable stability of the hydrophobic core, along with specific encoding of potential conformational changes which depend on the properties of helices and β-folds.
Affiliation(s)
- Barbara Kalinowska
- Department of Bioinformatics and Telemedicine, Jagiellonian University - Medical College, Lazarza 16, 31-530, Krakow, Poland; Faculty of Physics, Astronomy and Applied Computer Science, Jagiellonian University, Łojasiewicza 11, 30-348, Krakow, Poland
- Mateusz Banach
- Department of Bioinformatics and Telemedicine, Jagiellonian University - Medical College, Lazarza 16, 31-530, Krakow, Poland; Faculty of Physics, Astronomy and Applied Computer Science, Jagiellonian University, Łojasiewicza 11, 30-348, Krakow, Poland
- Zdzisław Wiśniowski
- Department of Bioinformatics and Telemedicine, Jagiellonian University - Medical College, Lazarza 16, 31-530, Krakow, Poland
- Leszek Konieczny
- Chair of Medical Biochemistry, Jagiellonian University - Medical College, Kopernika 7, 31-034, Krakow, Poland
- Irena Roterman
- Department of Bioinformatics and Telemedicine, Jagiellonian University - Medical College, Lazarza 16, 31-530, Krakow, Poland
5
Tonddast-Navaei S, Srinivasan B, Skolnick J. On the importance of composite protein multiple ligand interactions in protein pockets. J Comput Chem 2017; 38:1252-1259. [PMID: 27864975 PMCID: PMC5403588 DOI: 10.1002/jcc.24523] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 08/11/2016] [Revised: 09/26/2016] [Accepted: 10/11/2016] [Indexed: 01/08/2023]
Abstract
Conventional small molecule drug-discovery approaches target protein pockets. However, the limited number of geometrically distinct pockets leads to widespread promiscuity and deleterious side-effects. Here, the idea of COmposite protein LIGands (COLIG) that interact with each other as well as the protein within a single ligand binding pocket is examined. As a practical illustration, experimental evidence that E. coli dihydrofolate reductase inhibitors are COLIGs is presented. Then, analysis of a non-redundant set of all holo PDB structures indicates that approximately 47-76% of proteins (depending on the sequence identity threshold) can simultaneously bind multiple, interacting ligands in the same pocket. Moreover, most ligands, whether Singletons or COLIGs, bind at the bottom of the ligand binding pocket and occupy 30% and 43%, respectively, of the volume of the bottom of the pocket. This suggests the use of COLIGs as a potential new class of small molecule drugs. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Sam Tonddast-Navaei
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 950 Atlantic Drive, Atlanta, Georgia 30332, United States
- Bharath Srinivasan
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 950 Atlantic Drive, Atlanta, Georgia 30332, United States
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 950 Atlantic Drive, Atlanta, Georgia 30332, United States
6
Skolnick J, Zhou H. Why Is There a Glass Ceiling for Threading Based Protein Structure Prediction Methods? J Phys Chem B 2016; 121:3546-3554. [PMID: 27748116 DOI: 10.1021/acs.jpcb.6b09517] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 12/29/2022]
Abstract
Despite their different implementations, comparison of the best threading approaches to the prediction of evolutionarily distant protein structures reveals that they tend to succeed or fail on the same protein targets. This is true despite the fact that the structural template library has good templates for all cases. Thus, a key question is why certain protein structures are threadable while others are not. Comparison with threading results on a set of artificial sequences selected for stability further argues that the failure of threading is due to the nature of the protein structures themselves. Using a new contact map based alignment algorithm, we demonstrate that certain folds are highly degenerate in that they can have very similar coarse grained fractions of native contacts aligned and yet differ significantly from the native structure. For threadable proteins, this is not the case. Thus, contemporary threading approaches appear to have reached a plateau, and new approaches to structure prediction are required.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive Northwest, Atlanta, Georgia 30318, United States
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive Northwest, Atlanta, Georgia 30318, United States
7
Skolnick J. Perspective: On the importance of hydrodynamic interactions in the subcellular dynamics of macromolecules. J Chem Phys 2016; 145:100901. [PMID: 27634243 PMCID: PMC5018002 DOI: 10.1063/1.4962258] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 06/23/2016] [Accepted: 08/01/2016] [Indexed: 12/30/2022]
Abstract
An outstanding challenge in computational biophysics is the simulation of a living cell at molecular detail. Over the past several years, using Stokesian dynamics, progress has been made in simulating coarse grained molecular models of the cytoplasm. Since macromolecules comprise 20%-40% of the volume of a cell, one would expect that steric interactions dominate macromolecular diffusion. However, the reduction in cellular diffusion rates relative to infinite dilution is due, roughly equally, to steric and hydrodynamic interactions (HI), with nonspecific attractive interactions likely playing a rather minor role. HI not only serve to slow down long time diffusion rates but also cause a considerable reduction in the magnitude of the short time diffusion coefficient relative to that at infinite dilution. More importantly, the long range contribution of the Rotne-Prager-Yamakawa diffusion tensor results in temporal and spatial correlations that persist up to microseconds and for intermolecular distances on the order of protein radii. While HI slow down the bimolecular association rate in the early stages of lipid bilayer formation, they accelerate the rate of large scale assembly of lipid aggregates. This is suggestive of an important role for HI in the self-assembly kinetics of large macromolecular complexes such as tubulin. Since HI are important, two issues need to be addressed: whether continuum models of HI are adequate, and how to improve simulation methodologies so that simulations of more complex cellular processes become practical. Nevertheless, the stage is set for the molecular simulations of ever more complex subcellular processes.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 950 Atlantic Dr., NW, Atlanta, Georgia 30332, USA
8
Catalytic and substrate promiscuity: distinct multiple chemistries catalysed by the phosphatase domain of receptor protein tyrosine phosphatase. Biochem J 2016; 473:2165-77. [PMID: 27208174 DOI: 10.1042/bcj20160289] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 04/04/2016] [Accepted: 05/16/2016] [Indexed: 02/04/2023]
Abstract
The presence of latent activities in enzymes is posited to underlie the natural evolution of new catalytic functions. However, the prevalence and extent of such substrate and catalytic ambiguity in evolved enzymes is difficult to address experimentally given the order-of-magnitude difference in the activities for native and, sometimes, promiscuous substrates. Further, such latent functions are of special interest when the activities concerned do not fall into the domain of substrate promiscuity. In the present study, we show a special case of such latent enzyme activity by demonstrating the presence of two mechanistically distinct reactions catalysed by the catalytic domain of receptor protein tyrosine phosphatase isoform δ (PTPRδ). The primary catalytic activity involves the hydrolysis of a phosphomonoester bond (C─O─P) with high catalytic efficiency, whereas the secondary activity is the hydrolysis of a glycosidic bond (C─O─C) with poorer catalytic efficiency. This enzyme also displays substrate promiscuity by hydrolysing diester bonds while being highly discriminative for its monoester substrates. To confirm these activities, we also demonstrated their presence in the catalytic domain of protein tyrosine phosphatase Ω (PTPRΩ), a homologue of PTPRδ. Studies on the rate, metal-ion dependence, pH dependence and inhibition of the respective activities showed that they are markedly different. This is the first study that demonstrates a novel sugar hydrolase and diesterase activity for the phosphatase domain (PD) of PTPRδ and PTPRΩ. This work has significant implications for understanding both the evolution of enzymatic activity and the possible physiological role of this new chemistry. Our findings suggest that the genome might harbour a wealth of such alternative latent enzyme activities in the same protein domain, which would render our knowledge of metabolic networks incomplete.