1
McPartlon M, Xu J. An end-to-end deep learning method for protein side-chain packing and inverse folding. Proc Natl Acad Sci U S A 2023; 120:e2216438120. [PMID: 37253017 PMCID: PMC10266014 DOI: 10.1073/pnas.2216438120] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 09/28/2022] [Accepted: 04/24/2023] [Indexed: 06/01/2023]
Abstract
Protein side-chain packing (PSCP), the task of determining amino acid side-chain conformations given only backbone atom positions, has important applications to protein structure prediction, refinement, and design. Many methods have been proposed to tackle this problem, but their speed or accuracy is still unsatisfactory. To address this, we present AttnPacker, a deep learning (DL) method for directly predicting protein side-chain coordinates. Unlike existing methods, AttnPacker directly incorporates backbone 3D geometry to simultaneously compute all side-chain coordinates without delegating to a discrete rotamer library or performing expensive conformational search and sampling steps. This enables a significant increase in computational efficiency, decreasing inference time by over 100× compared to the DL-based method DLPacker and physics-based RosettaPacker. Tested on the CASP13 and CASP14 native and nonnative protein backbones, AttnPacker computes physically realistic side-chain conformations, reducing steric clashes and improving both rmsd and dihedral accuracy compared to state-of-the-art methods SCWRL4, FASPR, RosettaPacker, and DLPacker. Different from traditional PSCP approaches, AttnPacker can also codesign sequences and side chains, producing designs with subnative Rosetta energy and high in silico consistency.
Affiliation(s)
- Matthew McPartlon
- Department of Computer Science, Physical Sciences, The University of Chicago, Chicago, IL 60637
- Jinbo Xu
- Toyota Technical Institute of Chicago, Chicago, IL 60637
- MoleculeMind Inc., Beijing 100086, China
2
Misiura M, Shroff R, Thyer R, Kolomeisky AB. DLPacker: Deep learning for prediction of amino acid side chain conformations in proteins. Proteins 2022; 90:1278-1290. [PMID: 35122328 DOI: 10.1002/prot.26311] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 08/04/2021] [Revised: 12/03/2021] [Accepted: 12/07/2021] [Indexed: 12/20/2022]
Abstract
Prediction of side chain conformations of amino acids in proteins (also termed "packing") is an important and challenging part of protein structure prediction with many interesting applications in protein design. A variety of methods for packing have been developed but more accurate ones are still needed. Machine learning (ML) methods have recently become a powerful tool for solving various problems in diverse areas of science, including structural biology. In this study, we evaluate the potential of deep neural networks (DNNs) for prediction of amino acid side chain conformations. We formulate the problem as image-to-image transformation and train a U-net style DNN to solve the problem. We show that our method outperforms other physics-based methods by a significant margin: reconstruction RMSDs for most amino acids are about 20% smaller compared to SCWRL4 and Rosetta Packer with RMSDs for bulky hydrophobic amino acids Phe, Tyr, and Trp being up to 50% smaller.
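The reconstruction RMSD mentioned in this abstract can be illustrated with a short Python sketch. It assumes that predicted and reference side-chain atoms are already matched one-to-one and share the same frame (the backbone is held fixed in packing, so no superposition is applied). The function name `side_chain_rmsd` and the toy coordinates are illustrative assumptions, not part of DLPacker.

```python
import math

def side_chain_rmsd(predicted, reference):
    """Root-mean-square deviation between matched atom coordinates (in angstroms)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(predicted, reference)
    )
    return math.sqrt(squared / len(predicted))

# Hypothetical two-atom side chain: the second atom is displaced by 2 angstroms along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(round(side_chain_rmsd(predicted, reference), 3))  # 1.414
```

A relative statement such as "about 20% smaller" then compares two such values computed for the same residues with different packers.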
Affiliation(s)
- Mikita Misiura
- Department of Chemistry, Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA
- Ross Thyer
- Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas, USA
- Anatoly B Kolomeisky
- Department of Chemistry, Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA; Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas, USA; Department of Physics and Astronomy, Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA
3
Bouchiba Y, Cortés J, Schiex T, Barbe S. Molecular flexibility in computational protein design: an algorithmic perspective. Protein Eng Des Sel 2021; 34:6271252. [PMID: 33959778 DOI: 10.1093/protein/gzab011] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/21/2020] [Revised: 03/12/2021] [Accepted: 03/29/2021] [Indexed: 12/19/2022]
Abstract
Computational protein design (CPD) is a powerful technique for engineering new proteins, with both great fundamental implications and diverse practical interests. However, the approximations usually made for computational efficiency, using a single fixed backbone and a discrete set of side chain rotamers, tend to produce rigid and hyper-stable folds that may lack functionality. These approximations contrast with the demonstrated importance of molecular flexibility and motions in a wide range of protein functions. The integration of backbone flexibility and multiple conformational states in CPD, in order to relieve the inaccuracies resulting from these simplifications and to improve design reliability, is attracting increased attention. However, the greatly increased search space that needs to be explored in these extensions defines extremely challenging computational problems. In this review, we outline the principles of CPD and discuss recent efforts in algorithmic developments for incorporating molecular flexibility in the design process.
Affiliation(s)
- Younes Bouchiba
- Toulouse Biotechnology Institute, TBI, CNRS, INRAE, INSA, ANITI, Toulouse 31400, France; Laboratoire d'Analyse et d'Architecture des Systèmes, LAAS CNRS, Université de Toulouse, CNRS, Toulouse 31400, France
- Juan Cortés
- Laboratoire d'Analyse et d'Architecture des Systèmes, LAAS CNRS, Université de Toulouse, CNRS, Toulouse 31400, France
- Thomas Schiex
- Université de Toulouse, ANITI, INRAE, UR MIAT, F-31320, Castanet-Tolosan, France
- Sophie Barbe
- Toulouse Biotechnology Institute, TBI, CNRS, INRAE, INSA, ANITI, Toulouse 31400, France
4
Pereira JM, Vieira M, Santos SM. Step-by-step design of proteins for small molecule interaction: A review on recent milestones. Protein Sci 2021; 30:1502-1520. [PMID: 33934427 DOI: 10.1002/pro.4098] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/25/2021] [Revised: 04/21/2021] [Accepted: 04/23/2021] [Indexed: 01/01/2023]
Abstract
Protein design is the field of synthetic biology that aims at developing de novo custom-made proteins and peptides for specific applications. Despite exploring an ambitious goal, recent computational advances in both hardware and software technologies have paved the way to high-throughput screening and detailed design of novel folds and improved functionalities. Modern advances in the field of protein design for small molecule targeting are described in this review, organized in a step-by-step fashion: from the conception of a new or upgraded active binding site, to scaffold design, sequence optimization, and experimental expression of the custom protein. In each step, contemporary examples are described, and state-of-the-art software is briefly explored.
Affiliation(s)
- José M Pereira
- CICECO & Departamento de Química, Universidade de Aveiro, Aveiro, Portugal
- Maria Vieira
- CICECO & Departamento de Química, Universidade de Aveiro, Aveiro, Portugal
- Sérgio M Santos
- CICECO & Departamento de Química, Universidade de Aveiro, Aveiro, Portugal
5
Abstract
This chapter describes two computational methods for PDZ-peptide binding: high-throughput computational protein design (CPD) and a medium-throughput approach combining molecular dynamics for conformational sampling with a Poisson-Boltzmann (PB) Linear Interaction Energy for scoring. A new CPD method is outlined, which uses adaptive Monte Carlo simulations to efficiently sample peptide variants that tightly bind a PDZ domain, and provides at the same time precise estimates of their relative binding free energies. A detailed protocol is described based on the Proteus CPD software. The medium-throughput approach can be performed with standard MD and PB software, such as NAMD and Charmm. For 40 complexes between Tiam1 and peptide ligands, it gave high accuracy, with mean errors of around 0.5 kcal/mol for relative binding free energies and no large errors. It requires a moderate amount of parameter fitting before it can be applied, and its transferability to other protein families is still untested.
Affiliation(s)
- Nicolas Panel
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Francesco Villa
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Vaitea Opuu
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- David Mignon
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Thomas Simonson
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
6
Zsidó BZ, Hetényi C. Molecular Structure, Binding Affinity, and Biological Activity in the Epigenome. Int J Mol Sci 2020; 21:ijms21114134. [PMID: 32531926 PMCID: PMC7311975 DOI: 10.3390/ijms21114134] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/28/2020] [Revised: 06/07/2020] [Accepted: 06/08/2020] [Indexed: 02/07/2023]
Abstract
Development of valid structure–activity relationships (SARs) is a key to the elucidation of pathomechanisms of epigenetic diseases and the development of efficient, new drugs. The present review is based on selected methodologies and applications supplying molecular structure, binding affinity and biological activity data for the development of new SARs. An emphasis is placed on emerging trends and permanent challenges of new discoveries of SARs in the context of proteins as epigenetic drug targets. The review gives a brief overview and classification of the molecular background of epigenetic changes, and surveys both experimental and theoretical approaches in the field. Besides the results of sophisticated, cutting edge techniques such as cryo-electron microscopy, protein crystallography, and isothermal titration calorimetry, examples of frequently used assays and fast screening techniques are also selected. The review features how different experimental methods and theoretical approaches complement each other and result in valid SARs of the epigenome.
7
MacLeod-Carey D, Solis-Céspedes E, Lamazares E, Mena-Ulecia K. Evaluation of new antihypertensive drugs designed in silico using Thermolysin as a target. Saudi Pharm J 2020; 28:582-592. [PMID: 32435139 PMCID: PMC7229335 DOI: 10.1016/j.jsps.2020.03.010] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/07/2019] [Accepted: 03/18/2020] [Indexed: 12/18/2022]
Abstract
The search for new therapies for the treatment of arterial hypertension is a major concern in the scientific community. Here, we employ a computational biochemistry protocol to evaluate the performance of six compounds (Lig783, Lig1022, Lig1392, Lig2177, Lig3444 and Lig6199) to act as antihypertensive agents. This protocol consists of docking experiments, ligand efficiency calculations, molecular dynamics simulations, free energy calculations, and predictions of pharmacological and toxicological properties (ADME-Tox) of the six ligands against Thermolysin. Our results show that the docked structures had an adequate orientation in the pocket of the Thermolysin enzymes, reproducing the X-ray crystal structure of Inhibitor-Thermolysin complexes in an acceptable way. The most promising candidates to act as antihypertensive agents among the series are Lig2177 and Lig3444. These compounds form the most stable ligand-Thermolysin complexes according to their binding free energy values obtained in the docking experiments as well as MM-GBSA decomposition analysis calculations. They present the lowest values of Ki, indicating that these ligands bind strongly to Thermolysin. Lig2177 was oriented in the pocket of Thermolysin in such a way that both OH of the dihydroxyl-amino groups establish hydrogen bond interactions with Glu146 and Glu166. In the same way, Lig3444 interacts with Asp150, Glu143 and Tyr157. Additionally, Lig2177 and Lig3444 fulfill all the requirements established by the Lipinski, Veber, and Pfizer 3/75 rules, indicating that these compounds could be safe to use as antihypertensive agents. We are confident that our computational biochemistry protocol can be used to evaluate and predict the behavior of a broad range of compounds designed in silico against a protein target.
Affiliation(s)
- Desmond MacLeod-Carey
- Universidad Autónoma de Chile, Facultad de Ingeniería, Instituto de Ciencias Químicas Aplicadas, Inorganic Chemistry and Molecular Materials Center, El Llano Subercaseaux 2801, San Miguel, Santiago, Chile
- Eduardo Solis-Céspedes
- Vicerrectoría de Investigación y Postgrado, Universidad Católica del Maule, 3460000 Talca, Chile
- Emilio Lamazares
- Universidad de Concepción, Biotechnology and Biopharmaceutical Laboratory, Pathophysiology Department, School of Biological Sciences, Victor Lamas 1290, P.O. Box 160-C, Concepción, Chile
- Karel Mena-Ulecia
- Universidad Católica de Temuco, Facultad de Recursos Naturales, Departamento de Ciencias Biológicas y Químicas, Ave. Rudecindo Ortega #02950, Temuco, Chile
- Corresponding author at: Universidad Católica de Temuco, Facultad de Recursos Naturales, Departamento de Ciencias Biológicas y Químicas, Ave. Rudecindo Ortega #02950, Temuco, Región de la Araucanía, Chile.
8
Adaptive landscape flattening allows the design of both enzyme:substrate binding and catalytic power. PLoS Comput Biol 2020; 16:e1007600. [PMID: 31917825 PMCID: PMC7041857 DOI: 10.1371/journal.pcbi.1007600] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/20/2019] [Revised: 02/25/2020] [Accepted: 12/11/2019] [Indexed: 01/30/2023]
Abstract
Designed enzymes are of fundamental and technological interest. Experimental directed evolution still has significant limitations, and computational approaches are a complementary route. A designed enzyme should satisfy multiple criteria: stability, substrate binding, transition state binding. Such multi-objective design is computationally challenging. Two recent studies used adaptive importance sampling Monte Carlo to redesign proteins for ligand binding. By first flattening the energy landscape of the apo protein, they obtained positive design for the bound state and negative design for the unbound. We have now extended the method to design an enzyme for specific transition state binding, i.e., for its catalytic power. We considered methionyl-tRNA synthetase (MetRS), which attaches methionine (Met) to its cognate tRNA, establishing codon identity. Previously, MetRS and other synthetases have been redesigned by experimental directed evolution to accept noncanonical amino acids as substrates, leading to genetic code expansion. Here, we have redesigned MetRS computationally to bind several ligands: the Met analog azidonorleucine, methionyl-adenylate (MetAMP), and the activated ligands that form the transition state for MetAMP production. Enzyme mutants known to have azidonorleucine activity were recovered by the design calculations, and 17 mutants predicted to bind MetAMP were characterized experimentally and all found to be active. Mutants predicted to have low activation free energies for MetAMP production were found to be active and the predicted reaction rates agreed well with the experimental values. We suggest the present method should become the paradigm for computational enzyme design.
Designed enzymes are of major interest. Experimental directed evolution still has significant limitations, and computational approaches are another route. Enzymes must be stable, bind substrates, and be powerful catalysts. It is challenging to design for all these properties. A method to design substrate binding was proposed recently. It used an adaptive Monte Carlo method to explore mutations of a few amino acids near the substrate. A bias energy was gradually “learned” such that, in the absence of the ligand, the simulation visited most of the possible protein mutations with comparable probabilities. Remarkably, a simulation of the protein:ligand complex, including the bias, will then preferentially sample tight-binding sequences. We generalized the method to design binding specificity. We tested it for the methionyl-tRNA synthetase enzyme, which has been engineered in order to expand the genetic code. We redesigned the enzyme to obtain variants with low activation free energies for the catalytic step. The variants proposed by the simulations were shown experimentally to be active, and the predicted activation free energies were in reasonable agreement with the experimental values. We expect the new method will become the paradigm for computational enzyme design.
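The idea of gradually learning a bias until all variants are visited with comparable probabilities can be sketched on a toy problem. The Python example below is a Wang-Landau-style illustration under stated assumptions, not the protocol used in the paper: the four "variants", their apo-state energies, the constant bias increment, the temperature factor `kT`, and the function name `learn_flattening_bias` are all hypothetical.

```python
import math
import random

def learn_flattening_bias(energies, n_steps=50_000, increment=0.05, kT=0.6, seed=0):
    """Learn a per-variant bias that makes all variants roughly equally visited.

    energies: hypothetical apo-state energies (kcal/mol), one per sequence variant.
    Returns the bias list, shifted so that its minimum is zero.
    """
    rng = random.Random(seed)
    bias = [0.0] * len(energies)
    state = 0
    for _ in range(n_steps):
        trial = rng.randrange(len(energies))  # propose a random variant
        delta = (energies[trial] + bias[trial]) - (energies[state] + bias[state])
        # Metropolis acceptance test on the biased energy
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            state = trial
        bias[state] += increment  # penalize whichever variant is currently occupied
    lowest = min(bias)
    return [b - lowest for b in bias]

energies = [0.0, 1.0, 2.5, 4.0]  # toy apo-state energies, not real data
bias = learn_flattening_bias(energies)
flattened = [e + b for e, b in zip(energies, bias)]
print([round(f, 2) for f in flattened])
```

Once the biased apo landscape is nearly flat, the learned bias would be held fixed while simulating the complex, so that the sampling is driven by differences in binding free energy, as the summary describes.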
9
Charpentier A, Mignon D, Barbe S, Cortes J, Schiex T, Simonson T, Allouche D. Variable Neighborhood Search with Cost Function Networks To Solve Large Computational Protein Design Problems. J Chem Inf Model 2018; 59:127-136. [DOI: 10.1021/acs.jcim.8b00510] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
Affiliation(s)
- David Mignon
- Laboratoire de Biochimie (CNRS UMR 7654), École Polytechnique, 91128 Palaiseau, France
- Sophie Barbe
- Laboratoire d’Ingénierie des Systèmes Biologiques et Procédés, LISBP, Université de Toulouse, CNRS, INRA, INSA, 31077 Toulouse, France
- Juan Cortes
- LAAS-CNRS, Université de Toulouse, CNRS, 31400 Toulouse, France
- Thomas Schiex
- MIAT, Université de Toulouse, INRA, 31326 Castanet-Tolosan, France
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR 7654), École Polytechnique, 91128 Palaiseau, France
- David Allouche
- MIAT, Université de Toulouse, INRA, 31326 Castanet-Tolosan, France
10
Colbes J, Corona RI, Lezcano C, Rodríguez D, Brizuela CA. Protein side-chain packing problem: is there still room for improvement? Brief Bioinform 2018; 18:1033-1043. [PMID: 27567382 DOI: 10.1093/bib/bbw079] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 05/23/2016] [Indexed: 11/12/2022]
Abstract
The protein side-chain packing problem (PSCPP) is an important subproblem of both protein structure prediction and protein design. During the past two decades, a large number of methods have been proposed to tackle this problem. These methods consist of three main components: a rotamer library, a scoring function and a search strategy. The average overall accuracy level obtained by these methods is approximately 87%. Whether a better accuracy level could be achieved remains to be answered. To address this question, we calculated the maximum accuracy level attainable using a simple rotamer library, independently of the energy function or the search method. Using 2883 different structures from the Protein Data Bank, we compared this accuracy level with the accuracy level of five state-of-the-art methods. These comparisons indicated that, for buried residues in the protein, we are already close to the best possible accuracy results. In addition, for exposed residues, we found that a significant gap exists between the possible improvement and the maximum accuracy level achievable with current methods. After determining that an improvement is possible, the next step is to understand what limitations are preventing us from obtaining such an improvement. Previous works on protein structure prediction and protein design have shown that scoring function inaccuracies may represent the main obstacle to achieving better results for these problems. To show that the same is true for the PSCPP, we evaluated the quality of two scoring functions used by some state-of-the-art algorithms. Our results indicate that neither of these scoring functions can guide the search method correctly, thereby reinforcing the idea that efforts to solve the PSCPP must also focus on developing better scoring functions.
11
Gorska-Ponikowska M, Kuban-Jankowska A, Eisler SA, Perricone U, Lo Bosco G, Barone G, Nussberger S. 2-Methoxyestradiol Affects Mitochondrial Biogenesis Pathway and Succinate Dehydrogenase Complex Flavoprotein Subunit A in Osteosarcoma Cancer Cells. Cancer Genomics Proteomics 2018; 15:73-89. [PMID: 29275365 DOI: 10.21873/cgp.20067] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 10/07/2017] [Revised: 11/19/2017] [Accepted: 11/29/2017] [Indexed: 12/15/2022]
Abstract
BACKGROUND/AIM: Dysregulation of mitochondrial pathways is implicated in several diseases, including cancer. Notably, mitochondrial respiration and mitochondrial biogenesis are favored in some invasive cancer cells, such as osteosarcoma. Hence, the aim of the current work was to investigate the effects of 2-methoxyestradiol (2-ME), a potent anticancer agent, on the mitochondrial biogenesis of osteosarcoma cells.
MATERIALS AND METHODS: Highly metastatic osteosarcoma 143B cells were treated with 2-ME separately or in combination with L-lactate, or with the solvent (non-treated control cells). Protein levels of α-syntrophin and peroxisome proliferator-activated receptor gamma, coactivator 1 alpha (PGC-1α) were determined by western blotting. Impact of 2-ME on mitochondrial mass, regulation of cytochrome c oxidase I (COXI) expression, and succinate dehydrogenase complex flavoprotein subunit A (SDHA) was determined by immunofluorescence analyses. Inhibition of sirtuin 3 (SIRT3) activity by 2-ME was investigated by fluorescence assay and also, using molecular docking and molecular dynamics simulations.
RESULTS: L-lactate induced mitochondrial biogenesis pathway via up-regulation of COXI. 2-ME inhibited mitochondrial biogenesis via regulation of PGC-1α, COXI, and SIRT3 in a concentration-dependent manner as a consequence of nuclear recruitment of neuronal nitric oxide synthase and nitric oxide generation. It was also proved that 2-ME inhibited SIRT3 activity by binding to both the canonical and allosteric inhibitor binding sites. Moreover, regardless of the mitochondrial biogenesis pathway, 2-ME affected the expression of SDHA.
CONCLUSION: Herein, mitochondrial biogenesis pathway regulation and SDHA were presented as novel targets of 2-ME, and moreover, 2-ME was demonstrated as a potent inhibitor of SIRT3. L-lactate was confirmed to exert pro-carcinogenic effects on osteosarcoma cells via the induction of the mitochondrial biogenesis pathway. Thus, L-lactate level may be considered as a prognostic biomarker for osteosarcoma.
Affiliation(s)
- Magdalena Gorska-Ponikowska
- Department of Medical Chemistry, Medical University of Gdansk, Gdansk, Poland; Department of Biophysics, Institute of Biomaterials and Biomolecular Systems, University of Stuttgart, Stuttgart, Germany
- Stephan A Eisler
- Stuttgart Research Center Systems Biology, University of Stuttgart, Stuttgart, Germany
- Giosuè Lo Bosco
- Department of Mathematics and Computer Science, Università degli Studi di Palermo, Palermo, Italy; Euro-Mediterranean Institute of Science and Technology, Palermo, Italy
- Giampaolo Barone
- Department of Biological, Chemical and Pharmaceutical Sciences and Technologies, Università degli Studi di Palermo, Palermo, Italy
- Stephan Nussberger
- Department of Biophysics, Institute of Biomaterials and Biomolecular Systems, University of Stuttgart, Stuttgart, Germany
12
Colbes J, Aguila SA, Brizuela CA. Scoring of Side-Chain Packings: An Analysis of Weight Factors and Molecular Dynamics Structures. J Chem Inf Model 2018; 58:443-452. [PMID: 29368924 DOI: 10.1021/acs.jcim.7b00679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
The protein side-chain packing problem (PSCPP) is a central task in computational protein design. The problem is usually modeled as a combinatorial optimization problem, which consists of searching for a set of rotamers, from a given rotamer library, that minimizes a scoring function (SF). The SF is a weighted sum of terms that can be decomposed into physics-based and knowledge-based terms. Although there are many methods to obtain approximate solutions for this problem, all of them have similar performance, and there has not been a significant improvement in recent years. Studies on protein structure prediction and protein design revealed the limitations of current SFs to achieve further improvements for these two problems. Along the same lines, a recent work reported a similar result for the PSCPP. In this work, we ask whether this negative result regarding further improvements in performance is due to (i) an incorrect weighting of the SF terms or (ii) the constrained conformation resulting from the protein crystallization process. To analyze these questions, we (i) modeled the PSCPP as a bi-objective combinatorial optimization problem, optimizing, at the same time, the two most important terms of two SFs of state-of-the-art algorithms and (ii) performed a preprocessing relaxation of the crystal structure through molecular dynamics to simulate the protein in the solvent and evaluated the performance of these two state-of-the-art SFs under these conditions. Our results indicate that (i) no matter what combination of weight factors we use, the current SFs will not lead to better performance and (ii) the evaluated SFs will not be able to improve performance on relaxed structures. Furthermore, the experiments revealed that the SFs and the methods are biased toward crystallized structures.
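The combinatorial formulation in this abstract (choose one rotamer per residue so that a weighted sum of score terms is minimal) can be written as a minimal Python sketch. Everything in it is an illustrative assumption: the split into a self term and a pairwise term, the weights `w_self` and `w_pair`, the toy energy tables, and the exhaustive search, which is only feasible for very small problems and is not the search strategy of any specific packer.

```python
from itertools import product

def pack_side_chains(self_energy, pair_energy, w_self=1.0, w_pair=1.0):
    """Exhaustively search rotamer assignments that minimize a weighted score.

    self_energy[i][r]: score of rotamer r at residue i.
    pair_energy[(i, j)][r][s]: interaction score of rotamer r at i with rotamer s at j.
    """
    n = len(self_energy)
    best_score, best_assignment = float("inf"), None
    # Enumerate every combination of one rotamer index per residue.
    for assignment in product(*(range(len(e)) for e in self_energy)):
        score = w_self * sum(self_energy[i][assignment[i]] for i in range(n))
        score += w_pair * sum(
            table[assignment[i]][assignment[j]]
            for (i, j), table in pair_energy.items()
        )
        if score < best_score:
            best_score, best_assignment = score, assignment
    return best_assignment, best_score

# Hypothetical three-residue problem with two rotamers per residue.
self_energy = [[0.0, 1.0], [0.5, 0.0], [0.2, 0.3]]
pair_energy = {
    (0, 1): [[2.0, 0.0], [0.0, 0.0]],  # rotamer 0 at residue 0 clashes with rotamer 0 at residue 1
    (1, 2): [[0.0, 0.0], [0.0, 1.5]],  # rotamer 1 at residue 1 clashes with rotamer 1 at residue 2
}
assignment, score = pack_side_chains(self_energy, pair_energy)
print(assignment, round(score, 3))  # (0, 1, 0) 0.2
```

Changing `w_self` and `w_pair` changes which assignment wins, which is the kind of weight-factor dependence the study examines.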
Affiliation(s)
- Jose Colbes
- Computer Science Department, CICESE Research Center, 22860 Ensenada, Mexico
- Sergio A Aguila
- Centro de Nanociencias y Nanotecnologia, Universidad Nacional Autonoma de Mexico, Km. 107 Carretera Tijuana-Ensenada, Ensenada, Baja California, Mexico, C.P. 22860
- Carlos A Brizuela
- Computer Science Department, CICESE Research Center, 22860 Ensenada, Mexico
13
Shimizu M, Takada S. Reconstruction of Atomistic Structures from Coarse-Grained Models for Protein-DNA Complexes. J Chem Theory Comput 2018; 14:1682-1694. [PMID: 29397721 DOI: 10.1021/acs.jctc.7b00954] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 11/28/2022]
Abstract
While coarse-grained (CG) simulations have been widely used to accelerate structure sampling of large biomolecular complexes, they are unavoidably less accurate, and thus the reconstruction of all-atom (AA) structures and the subsequent refinement are desirable. In this study, we developed an efficient method to reconstruct AA structures from sampled CG protein-DNA complex models, which attempts to model the protein-DNA interface accurately. First, we developed a method to reconstruct atomic details of DNA structures from a three-site per nucleotide CG model, which uses a DNA fragment library. Next, for the protein-DNA interface, we referred to the side chain orientations in the known structure of the target interface when available. The other parts are modeled by existing tools. We confirmed the accuracy of the protocol in various aspects including the structure deviation in the self-reproduction, the base pair reproducibility, atomic contacts at the protein-DNA interface, and feasibility of the posterior AA simulations.
Affiliation(s)
- Masahiro Shimizu
- Department of Biophysics, Graduate School of Science, Kyoto University, Sakyo, Kyoto 606-8502, Japan
- Shoji Takada
- Department of Biophysics, Graduate School of Science, Kyoto University, Sakyo, Kyoto 606-8502, Japan
14
Leem J, Georges G, Shi J, Deane CM. Antibody side chain conformations are position-dependent. Proteins 2018; 86:383-392. [PMID: 29318667 DOI: 10.1002/prot.25453] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 09/26/2017] [Revised: 12/15/2017] [Accepted: 01/05/2018] [Indexed: 11/11/2022]
Abstract
Side chain prediction is an integral component of computational antibody design and structure prediction. Current antibody modelling tools use backbone-dependent rotamer libraries with conformations taken from general proteins. Here we present our antibody-specific rotamer library, where rotamers are binned according to their immunogenetics (IMGT) position, rather than their local backbone geometry. We find that for some amino acid types at certain positions, only a restricted number of side chain conformations are ever observed. Using this information, we are able to reduce the breadth of the rotamer sampling space. Based on our rotamer library, we built a side chain predictor, position-dependent antibody rotamer swapper (PEARS). On a blind test set of 95 antibody model structures, PEARS had the highest average χ1 and χ1+2 accuracy (78.7% and 64.8%) compared to three leading backbone-dependent side chain predictors. Our use of IMGT position, rather than backbone ϕ/ψ, meant that PEARS was more robust to errors in the backbone of the model structure. PEARS also achieved the lowest number of side chain-side chain clashes. PEARS is freely available as a web application at http://opig.stats.ox.ac.uk/webapps/pears.
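The χ1 and χ1+2 accuracies quoted in this abstract can be illustrated with a small Python sketch. It rests on assumptions the abstract does not state: a dihedral counts as correct when it lies within 40° of the native value (a commonly used tolerance), χ1+2 is evaluated only on residues that have a χ2 angle, and side-chain symmetry is ignored. The function names and the toy angles are hypothetical and are not taken from PEARS.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def chi_accuracy(predicted, native, tolerance=40.0):
    """Return (chi1 accuracy, chi1+2 accuracy) as fractions.

    predicted, native: one tuple of chi angles per residue, e.g. (chi1,) or (chi1, chi2).
    """
    chi1_hits = chi1_total = chi12_hits = chi12_total = 0
    for pred, nat in zip(predicted, native):
        chi1_ok = angle_diff(pred[0], nat[0]) <= tolerance
        chi1_total += 1
        chi1_hits += chi1_ok
        if len(nat) > 1:  # residue has a chi2 angle
            chi12_total += 1
            chi12_hits += chi1_ok and angle_diff(pred[1], nat[1]) <= tolerance
    return chi1_hits / chi1_total, chi12_hits / chi12_total

# Hypothetical angles for four residues (degrees).
native = [(-60.0, 180.0), (175.0,), (65.0, -85.0), (-170.0, 60.0)]
predicted = [(-70.0, 170.0), (-178.0,), (-60.0, -80.0), (-165.0, 120.0)]
chi1, chi12 = chi_accuracy(predicted, native)
print(chi1, round(chi12, 3))  # 0.75 0.333
```

The second residue shows why the periodic difference matters: 175° and -178° are only 7° apart, so that prediction counts as correct.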
Affiliation(s)
- Jinwoo Leem
- Department of Statistics, University of Oxford, 24-29 St Giles, Oxford, OX1 3LB, United Kingdom
- Guy Georges
- Pharma Research and Early Development, Large Molecule Research, Roche Innovation Center Munich, Nonnenwald 2, Penzberg, 82377, Germany
- Jiye Shi
- Chemistry Department, UCB, 208 Bath Road, Slough, SL1 3WE, United Kingdom
- Charlotte M Deane
- Department of Statistics, University of Oxford, 24-29 St Giles, Oxford, OX1 3LB, United Kingdom
15
Gaillard T, Simonson T. Full Protein Sequence Redesign with an MMGBSA Energy Function. J Chem Theory Comput 2017; 13:4932-4943. [DOI: 10.1021/acs.jctc.7b00202] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/28/2022]
Affiliation(s)
- Thomas Gaillard
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
16
Mignon D, Panel N, Chen X, Fuentes EJ, Simonson T. Computational Design of the Tiam1 PDZ Domain and Its Ligand Binding. J Chem Theory Comput 2017; 13:2271-2289. [PMID: 28394603 DOI: 10.1021/acs.jctc.6b01255] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022]
Abstract
PDZ domains direct protein-protein interactions and serve as models for protein design. Here, we optimized a protein design energy function for the Tiam1 and Cask PDZ domains that combines a molecular mechanics energy, Generalized Born solvent, and an empirical unfolded state model. Designed sequences were recognized as PDZ domains by the Superfamily fold recognition tool and had similarity scores comparable to natural PDZ sequences. The optimized model was used to redesign the two PDZ domains, by gradually varying the chemical potential of hydrophobic amino acids; the tendency of each position to lose or gain a hydrophobic character represents a novel hydrophobicity index. We also redesigned four positions in the Tiam1 PDZ domain involved in peptide binding specificity. The calculated affinity differences between designed variants reproduced experimental data and suggest substitutions with altered specificities.
Affiliation(s)
- David Mignon
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Nicolas Panel
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Xingyu Chen
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Ernesto J Fuentes
- Department of Biochemistry, Roy J. & Lucille A. Carver College of Medicine and Holden Comprehensive Cancer Center, University of Iowa, Iowa City, Iowa 52242-1109, United States
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France