1
Principles and Methods in Computational Membrane Protein Design. J Mol Biol 2021; 433:167154. [PMID: 34271008 DOI: 10.1016/j.jmb.2021.167154] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 05/02/2021] [Revised: 07/03/2021] [Accepted: 07/06/2021] [Indexed: 01/13/2023]
Abstract
After decades of progress in computational protein design, the design of proteins folding and functioning in lipid membranes appears today as the next frontier. Some notable successes in the de novo design of simplified model membrane protein systems have helped articulate fundamental principles of protein folding, architecture and interaction in the hydrophobic lipid environment. These principles are reviewed here, together with the computational methods and approaches that were used to identify them. We provide an overview of the methodological innovations in the generation of new protein structures and functions and in the development of membrane-specific energy functions. We highlight the opportunities offered by new machine learning approaches applied to protein design, and by new experimental characterization techniques applied to membrane proteins. Although membrane protein design is in its infancy, it appears more reachable than previously thought.
2
Sacchi S, Rabattoni V, Miceli M, Pollegioni L. Yin and Yang in Post-Translational Modifications of Human D-Amino Acid Oxidase. Front Mol Biosci 2021; 8:684934. [PMID: 34041270 PMCID: PMC8141710 DOI: 10.3389/fmolb.2021.684934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2021] [Accepted: 04/23/2021] [Indexed: 11/30/2022]
Abstract
In the central nervous system, the flavoprotein D-amino acid oxidase is responsible for catabolizing D-serine, the main endogenous coagonist of the N-methyl-D-aspartate receptor. Dysregulation of D-serine brain levels in humans has been associated with neurodegenerative and psychiatric disorders. This D-amino acid is synthesized by the enzyme serine racemase, starting from the corresponding L-enantiomer, and degraded by both serine racemase (via an elimination reaction) and the flavoenzyme D-amino acid oxidase. To shed light on the role of human D-amino acid oxidase (hDAAO) in D-serine metabolism, the structural/functional relationships of this enzyme have been investigated in depth and several strategies aimed at controlling the enzymatic activity have been identified. Here, we focused on the effect of post-translational modifications: by using a combination of structural analyses, biochemical methods, and cellular studies, we investigated whether hDAAO is subject to nitrosylation, sulfhydration, and phosphorylation. hDAAO is S-nitrosylated and this negatively affects its activity. In contrast, the hydrogen sulfide donor NaHS seems to alter the enzyme conformation, stabilizing a species with higher affinity for the flavin adenine dinucleotide cofactor and thus positively affecting enzymatic activity. Moreover, hDAAO is phosphorylated in cerebellum; however, the protein kinase involved is still unknown. Taken together, these findings indicate that D-serine levels can also be modulated by post-translational modifications of hDAAO, as is also known for the D-serine synthetic enzyme serine racemase.
Affiliation(s)
- Silvia Sacchi
  - "The Protein Factory 2.0", Dipartimento di Biotecnologie e Scienze della Vita, Università degli Studi Dell'Insubria, Varese, Italy
- Valentina Rabattoni
  - "The Protein Factory 2.0", Dipartimento di Biotecnologie e Scienze della Vita, Università degli Studi Dell'Insubria, Varese, Italy
- Matteo Miceli
  - "The Protein Factory 2.0", Dipartimento di Biotecnologie e Scienze della Vita, Università degli Studi Dell'Insubria, Varese, Italy
- Loredano Pollegioni
  - "The Protein Factory 2.0", Dipartimento di Biotecnologie e Scienze della Vita, Università degli Studi Dell'Insubria, Varese, Italy
3
Mignon D, Druart K, Michael E, Opuu V, Polydorides S, Villa F, Gaillard T, Panel N, Archontis G, Simonson T. Physics-Based Computational Protein Design: An Update. J Phys Chem A 2020; 124:10637-10648. [DOI: 10.1021/acs.jpca.0c07605] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 02/06/2023]
Affiliation(s)
- David Mignon
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
- Karen Druart
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
- Eleni Michael
  - Department of Physics, University of Cyprus, PO20537, CY1678 Nicosia, Cyprus
- Vaitea Opuu
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
- Savvas Polydorides
  - Department of Physics, University of Cyprus, PO20537, CY1678 Nicosia, Cyprus
- Francesco Villa
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
- Thomas Gaillard
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
- Nicolas Panel
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
- Georgios Archontis
  - Department of Physics, University of Cyprus, PO20537, CY1678 Nicosia, Cyprus
- Thomas Simonson
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128 Palaiseau, France
4
Opuu V, Sun YJ, Hou T, Panel N, Fuentes EJ, Simonson T. A physics-based energy function allows the computational redesign of a PDZ domain. Sci Rep 2020; 10:11150. [PMID: 32636412 PMCID: PMC7341745 DOI: 10.1038/s41598-020-67972-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/18/2020] [Accepted: 06/08/2020] [Indexed: 11/30/2022]
Abstract
Computational protein design (CPD) can address the inverse folding problem, exploring a large space of sequences and selecting ones predicted to fold. CPD was used previously to redesign several proteins, employing a knowledge-based energy function for both the folded and unfolded states. We show that a PDZ domain can be entirely redesigned using a "physics-based" energy for the folded state and a knowledge-based energy for the unfolded state. Thousands of sequences were generated by Monte Carlo simulation. Three were chosen for experimental testing, based on their low energies and several empirical criteria. All three could be overexpressed and had native-like circular dichroism spectra and 1D-NMR spectra typical of folded structures. Two had upshifted thermal denaturation curves when a peptide ligand was present, indicating binding and suggesting folding to a correct, PDZ structure. Evidently, the physical principles that govern folded proteins, with a dash of empirical post-filtering, can allow successful whole-protein redesign.
Affiliation(s)
- Vaitea Opuu
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Young Joo Sun
  - Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, USA
- Titus Hou
  - Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, USA
- Nicolas Panel
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Ernesto J Fuentes
  - Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, USA
- Thomas Simonson
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
5
Marashiyan M, Kalhor H, Ganji M, Rahimi H. Effects of tosyl-l-arginine methyl ester (TAME) on the APC/c subunits: An in silico investigation for inhibiting cell cycle. J Mol Graph Model 2020; 97:107563. [PMID: 32066079 DOI: 10.1016/j.jmgm.2020.107563] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/27/2019] [Revised: 01/11/2020] [Accepted: 02/01/2020] [Indexed: 11/28/2022]
Abstract
The anaphase-promoting complex/cyclosome (APC/c) is required for controlling mitosis and is activated by the Cdh1 and Cdc20 co-activators. Dysregulation of APC/c is observed in many cancers, and APC/c is regarded as a drug target, particularly in cancer drug resistance. It was shown that tosyl-l-arginine methyl ester (TAME), by mimicking the isoleucine-arginine (IR) tail of the co-activators, inhibits APC/c functions. However, the structural details and the interaction of TAME with APC/c are poorly defined. In the current study, a well-established set of computational methods was used to identify the best binding pocket in order to inhibit APC activity. The interactions of the IR tail and Cbox of the co-activators, as well as of TAME as an inhibitor, with the APC3 and APC8 subunits of APC/c were analyzed with regard to structure, molecular docking, molecular dynamics, and binding free energy. The results indicated that TAME bound to APC3 with a higher binding affinity (∼-7.3 kcal/mol) than to APC8 (∼-5.7 kcal/mol). Also, the binding free energy value obtained for APC3-TAME was -22.25 ± 1.12 kcal/mol. According to the binding free energies, the van der Waals energy was the major favorable contributor to ligand binding. These results suggest that TAME has more affinity for the APC3 subunit, at the IR binding pocket, than for the APC8 subunit, at the Cbox binding pocket. In conclusion, the IR binding pocket can serve as an appropriate potential target for TAME as an inhibitor of APC/c.
Affiliation(s)
- Mahya Marashiyan
  - Molecular Medicine Department, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran
- Hourieh Kalhor
  - Cellular and Molecular Research Center, Qom University of Medical Sciences, Qom, Iran
- Maziar Ganji
  - Department of Medical Genetics, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Hamzeh Rahimi
  - Molecular Medicine Department, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran
6
Design and structural characterisation of monomeric water-soluble α-helix and β-hairpin peptides: State-of-the-art. Arch Biochem Biophys 2019; 661:149-167. [DOI: 10.1016/j.abb.2018.11.014] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 09/24/2018] [Revised: 11/06/2018] [Accepted: 11/14/2018] [Indexed: 02/06/2023]
7
Villa F, Panel N, Chen X, Simonson T. Adaptive landscape flattening in amino acid sequence space for the computational design of protein:peptide binding. J Chem Phys 2018; 149:072302. [PMID: 30134674 DOI: 10.1063/1.5022249] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 12/25/2022]
Abstract
For the high-throughput design of protein:peptide binding, one must explore a vast space of amino acid sequences in search of low binding free energies. This complex problem is usually addressed with either simple heuristic scoring or expensive sequence enumeration schemes. Far more efficient than enumeration is a recent Monte Carlo approach that adaptively flattens the energy landscape in sequence space of the unbound peptide and provides formally exact binding free energy differences. The method allows the binding free energy to be used directly as the design criterion. We propose several improvements that allow still more efficient sampling and can address larger design problems. They include the use of Replica Exchange Monte Carlo and landscape flattening for both the unbound and bound peptides. We used the method to design peptides that bind to the PDZ domain of the Tiam1 signaling protein and could serve as inhibitors of its activity. Four peptide positions were allowed to mutate freely. Almost 75 000 peptide variants were processed in two simulations of 10^9 steps each that used 1 CPU hour on a desktop machine. 96% of the theoretical sequence space was sampled. The relative binding free energies agreed qualitatively with values from experiment. The sampled sequences agreed qualitatively with an experimental library of Tiam1-binding peptides. The main assumption limiting accuracy is the fixed backbone approximation, which could be alleviated in future work by using increased computational resources and multi-backbone designs.
Affiliation(s)
- Francesco Villa
  - Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Nicolas Panel
  - Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Xingyu Chen
  - Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Thomas Simonson
  - Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
8
Structural and functional dissection of differentially expressed tomato WRKY transcripts in host defense response against the vascular wilt pathogen (Fusarium oxysporum f. sp. lycopersici). PLoS One 2018; 13:e0193922. [PMID: 29709017 PMCID: PMC5927432 DOI: 10.1371/journal.pone.0193922] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 10/11/2017] [Accepted: 02/21/2018] [Indexed: 11/24/2022]
Abstract
The WRKY transcription factors have an indispensable role in plant growth, development and defense responses. The differential expression of WRKY genes following stress conditions has been well demonstrated. We investigated the temporal and tissue-specific (root and leaf tissues) differential expression of plant defense-related WRKY genes following infection by Fusarium oxysporum f. sp. lycopersici (Fol) in tomato. The genome-wide computational analysis revealed that during the Fol infection in tomato, 16 different members of the WRKY gene superfamily were involved, of which only three WRKYs (SolyWRKY4, SolyWRKY33, and SolyWRKY37) showed clear-cut differential gene expression. The quantitative real time PCR (qRT-PCR) studies revealed different gene expression profile changes in tomato root and leaf tissues. In root tissues infected with Fol, an increased expression of the SolyWRKY33 gene (2.76 fold), followed by SolyWRKY37 (1.93 fold), was found at 24 hrs, which further increased at 48 hrs (5.0 fold). In contrast, in the leaf tissues, the expression was more pronounced at an earlier stage of infection (24 hrs). However, in both cases, we found repression of the SolyWRKY4 gene, which decreased further at longer time intervals. The biochemical defense programming against Fol pathogenesis was characterized by the highest accumulation of H2O2 (at 48 hrs) and enhanced lignification. The functional diversity across the characterized WRKYs was explored through motif scanning using the MEME suite, and the WRKY-specific gene regulation was assessed through DNA-protein docking studies. The modeled functional WRKY domain had a β-sheet-like topology with coils and turns. The DNA-protein interaction results revealed the importance of core residues (Tyr, Arg, and Lys) in a feasible WRKY-W-box DNA interaction.
The protein interaction network analysis revealed that SolyWRKY33 could interact with other proteins, such as mitogen-activated protein kinase 5 (MAPK) and sigma factor binding protein1 (SIB1), and with other WRKY members including WRKY70, WRKY1, and WRKY40, to respond to various biotic and abiotic stresses. The STRING results were further validated through the Predicted Tomato Interactome Resource (PTIR) database. The CELLO2GO web server provided the functional gene ontology annotation and protein subcellular localization, which predicted that SolyWRKY33 is involved in amelioration of biological stress (39.3%) and other metabolic processes (39.3%). The protein (SolyWRKY33) is most probably located inside the nucleus (91.3%), with transcription factor binding activity. We conclude that the defense response following the Fol challenge was accompanied by differential expression of the SolyWRKY4(↓), SolyWRKY33(↑) and SolyWRKY37(↑) transcripts. The biochemical changes are accompanied by elicitation of H2O2 generation and accumulation and by enhanced tissue lignification.
9
Setiawan D, Brender J, Zhang Y. Recent advances in automated protein design and its future challenges. Expert Opin Drug Discov 2018; 13:587-604. [PMID: 29695210 DOI: 10.1080/17460441.2018.1465922] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 01/07/2023]
Abstract
Introduction: Protein function is determined by protein structure, which is in turn determined by the corresponding protein sequence. If the rules that cause a protein to adopt a particular structure are understood, it should be possible to refine or even redefine the function of a protein by working backwards from the desired structure to the sequence. Automated protein design attempts to calculate the effects of mutations computationally with the goal of more radical or complex transformations than are accessible by experimental techniques. Areas covered: The authors give a brief overview of the recent methodological advances in computer-aided protein design, showing how methodological choices affect final design and how automated protein design can be used to address problems considered beyond traditional protein engineering, including the creation of novel protein scaffolds for drug development. Also, the authors address specifically the future challenges in the development of automated protein design. Expert opinion: Automated protein design holds potential as a protein engineering technique, particularly in cases where screening by combinatorial mutagenesis is problematic. Considering solubility and immunogenicity issues, automated protein design is initially more likely to make an impact as a research tool for exploring basic biology in drug discovery than in the design of protein biologics.
Affiliation(s)
- Dani Setiawan
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Jeffrey Brender
  - Radiation Biology Branch, Center for Cancer Research, National Cancer Institute - NIH, Bethesda, MD, USA
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA; Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, USA
10
Duran AM, Meiler J. Computational design of membrane proteins using RosettaMembrane. Protein Sci 2018; 27:341-355. [PMID: 29090504 PMCID: PMC5734395 DOI: 10.1002/pro.3335] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 06/29/2017] [Revised: 10/27/2017] [Accepted: 10/30/2017] [Indexed: 11/11/2022]
Abstract
Computational membrane protein design is challenging due to the small number of high-resolution structures available to elucidate the physical basis of membrane protein structure, multiple functionally important conformational states, and a limited number of high-throughput biophysical assays to monitor function. However, structural determination of membrane proteins has made tremendous progress in the past years. Concurrently the field of soluble computational design has made impressive inroads. These developments allow us to tackle the formidable challenge of designing functional membrane proteins. Herein, Rosetta is benchmarked for membrane protein design. We evaluate strategies to cope with the often reduced quality of experimental membrane protein structures. Further, we test the usage of symmetry in design protocols, which is particularly important as many membrane proteins exist as homo-oligomers. We compare a soluble scoring function with a scoring function optimized for membrane proteins, RosettaMembrane. Both scoring functions recovered around half of the native sequence when completely redesigning membrane proteins. However, RosettaMembrane recovered the most native-like amino acid property composition. While leucine was overrepresented in the inner and outer-hydrophobic regions of RosettaMembrane designs, it resulted in a native-like surface hydrophobicity indicating that it is currently the best option for designing membrane proteins with Rosetta.
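The "native sequence recovery" figure quoted in this abstract is commonly computed as the fraction of designed positions whose residue matches the native one. The following is a minimal sketch of that metric, not the benchmark code used by the authors; the function name and the two 10-residue sequences are hypothetical and chosen only for illustration.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of positions where the designed residue equals the native residue.

    Assumes a fixed-backbone design, so both sequences have the same length
    and position i in one corresponds to position i in the other.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for a, b in zip(native, designed) if a == b)
    return matches / len(native)


# Hypothetical sequences, for illustration only (7 of 10 positions match).
native_seq = "LIVAGFLLTW"
designed_seq = "LLVAGLLLSW"
print(f"recovery = {sequence_recovery(native_seq, designed_seq):.2f}")
```

A recovery of about 0.5, as reported for both scoring functions, would mean that roughly every second position reverts to the native amino acid after full redesign.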
Affiliation(s)
- Amanda M. Duran
  - Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235
  - Center for Structural Biology, Vanderbilt University, Nashville, Tennessee 37240
- Jens Meiler
  - Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235
  - Center for Structural Biology, Vanderbilt University, Nashville, Tennessee 37240
11
Gaillard T, Simonson T. Full Protein Sequence Redesign with an MMGBSA Energy Function. J Chem Theory Comput 2017; 13:4932-4943. [DOI: 10.1021/acs.jctc.7b00202] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/28/2022]
Affiliation(s)
- Thomas Gaillard
  - Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
- Thomas Simonson
  - Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
12
Sun Z, Wang X, Song J. Extensive Assessment of Various Computational Methods for Aspartate's pKa Shift. J Chem Inf Model 2017. [PMID: 28644624 DOI: 10.1021/acs.jcim.7b00177] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Indexed: 12/22/2022]
Abstract
A series of computational methods for pKa shift prediction are extensively tested on a set of benchmark protein systems, aiming at identifying pitfalls and evaluating their performance on high variants. Including 19 ASP residues in 10 protein systems, the benchmark set consists of both residues with highly shifted pKa values and residues varying little from the reference value, with an experimental RMS free energy difference of 2.49 kcal/mol with respect to the blocked amino acid, corresponding to an RMS pKa shift of 1.82 pKa units. The constant pH molecular dynamics (MD), alchemical methods, PROPKA3.1, and multiconformation continuum electrostatics give RMSDs of 1.52, 2.58, 1.37, and 3.52 pKa units, respectively, on the benchmark set. The empirical scoring method is the most accurate one with extremely low computational cost, and the pH-dependent model is also able to provide accurate results, while the accuracy of MD sampling incorporating alchemical free energy simulation is limited by the difficulty of achieving convergence, and the performance of conformational search incorporating multiconformation continuum electrostatics is poor. Former research works did not define statistical uncertainty with care and yielded the questionable conclusion that alchemical methods perform well in most benchmarks. In this work the traditional alchemical methods are thoroughly tested for high variants. We also performed the first application of nonequilibrium alchemical methods to the pKa cases.
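The two RMS figures in this abstract (2.49 kcal/mol and 1.82 pKa units) are linked by the standard thermodynamic relation ΔG = RT ln(10) · ΔpKa. The sketch below illustrates that conversion; the temperature of 298.15 K is an assumption, since the abstract does not state the value the authors used, and the function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def pka_shift_to_free_energy(delta_pka: float, temperature_k: float = 298.15) -> float:
    """Convert a pKa shift into a free energy difference in kcal/mol.

    Uses dG = R * T * ln(10) * dpKa. The default temperature is an assumed
    value, not one taken from the abstract.
    """
    return R_KCAL_PER_MOL_K * temperature_k * math.log(10) * delta_pka


# One pKa unit is worth roughly 1.36 kcal/mol near room temperature.
print(f"{pka_shift_to_free_energy(1.0):.3f} kcal/mol per pKa unit")
# RMS shift quoted in the abstract, converted under the assumed temperature.
print(f"{pka_shift_to_free_energy(1.82):.2f} kcal/mol")
```

Because the relation is linear, it applies to the RMS values as well as to individual shifts; the result lands close to the 2.49 kcal/mol quoted above, with the small remaining difference depending on the temperature assumed.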
Affiliation(s)
- Zhaoxi Sun
  - State Key Laboratory of Precision Spectroscopy, School of Physics and Material Science, East China Normal University, Shanghai 200062, China
- Xiaohui Wang
  - State Key Laboratory of Precision Spectroscopy, School of Physics and Material Science, East China Normal University, Shanghai 200062, China
- Jianing Song
  - NYU-ECNU Center for Computational Chemistry, NYU Shanghai, Shanghai 200062, China; School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
13
Mignon D, Panel N, Chen X, Fuentes EJ, Simonson T. Computational Design of the Tiam1 PDZ Domain and Its Ligand Binding. J Chem Theory Comput 2017; 13:2271-2289. [PMID: 28394603 DOI: 10.1021/acs.jctc.6b01255] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022]
Abstract
PDZ domains direct protein-protein interactions and serve as models for protein design. Here, we optimized a protein design energy function for the Tiam1 and Cask PDZ domains that combines a molecular mechanics energy, Generalized Born solvent, and an empirical unfolded state model. Designed sequences were recognized as PDZ domains by the Superfamily fold recognition tool and had similarity scores comparable to natural PDZ sequences. The optimized model was used to redesign the two PDZ domains, by gradually varying the chemical potential of hydrophobic amino acids; the tendency of each position to lose or gain a hydrophobic character represents a novel hydrophobicity index. We also redesigned four positions in the Tiam1 PDZ domain involved in peptide binding specificity. The calculated affinity differences between designed variants reproduced experimental data and suggest substitutions with altered specificities.
Affiliation(s)
- David Mignon
  - Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Nicolas Panel
  - Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Xingyu Chen
  - Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Ernesto J Fuentes
  - Department of Biochemistry, Roy J. & Lucille A. Carver College of Medicine and Holden Comprehensive Cancer Center, University of Iowa, Iowa City, Iowa 52242-1109, United States
- Thomas Simonson
  - Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
14
Yang R, Jain T, Lynaugh H, Nobrega RP, Lu X, Boland T, Burnina I, Sun T, Caffry I, Brown M, Zhi X, Lilov A, Xu Y. Rapid assessment of oxidation via middle-down LCMS correlates with methionine side-chain solvent-accessible surface area for 121 clinical stage monoclonal antibodies. MAbs 2017; 9:646-653. [PMID: 28281887 DOI: 10.1080/19420862.2017.1290753] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Indexed: 01/07/2023]
Abstract
Susceptibility of methionine to oxidation is an important concern for chemical stability during the development of a monoclonal antibody (mAb) therapeutic. To minimize downstream risks, leading candidates are usually screened under forced oxidation conditions to identify oxidation-labile molecules. Here we report results of forced oxidation on a large set of in-house expressed and purified mAbs with variable region sequences corresponding to 121 clinical stage mAbs. These mAb samples were treated with 0.1% H2O2 for 24 hours before enzymatic cleavage below the hinge, followed by reduction of inter-chain disulfide bonds for the detection of the light chain, Fab portion of heavy chain (Fd) and Fc by liquid chromatography-mass spectrometry. This high-throughput, middle-down approach allows detection of oxidation site(s) at the resolution of 3 distinct segments. The experimental oxidation data correlates well with theoretical predictions based on the solvent-accessible surface area of the methionine side-chains within these segments. These results validate the use of upstream computational modeling to predict mAb oxidation susceptibility at the sequence level.
Affiliation(s)
- Rong Yang
  - Protein Analytics, Adimab, Lebanon, NH, USA
- Tushar Jain
  - Computational Biology, Adimab, Palo Alto, CA, USA
- Xiaojun Lu
  - Protein Analytics, Adimab, Lebanon, NH, USA
- Todd Boland
  - Computational Biology, Adimab, Palo Alto, CA, USA
- Tingwan Sun
  - Protein Analytics, Adimab, Lebanon, NH, USA
- Xiaoyong Zhi
  - Protein Analytics, Adimab, Lebanon, NH, USA
- Yingda Xu
  - Protein Analytics, Adimab, Lebanon, NH, USA
15
Topham CM, Barbe S, André I. An Atomistic Statistically Effective Energy Function for Computational Protein Design. J Chem Theory Comput 2016; 12:4146-68. [PMID: 27341125 DOI: 10.1021/acs.jctc.6b00090] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/28/2022]
Abstract
Shortcomings in the definition of effective free-energy surfaces of proteins are recognized to be a major contributory factor responsible for the low success rates of existing automated methods for computational protein design (CPD). The formulation of an atomistic statistically effective energy function (SEEF) suitable for a wide range of CPD applications and its derivation from structural data extracted from protein domains and protein-ligand complexes are described here. The proposed energy function comprises nonlocal atom-based and local residue-based SEEFs, which are coupled using a novel atom connectivity number factor to scale short-range, pairwise, nonbonded atomic interaction energies and a surface-area-dependent cavity energy term. This energy function was used to derive additional SEEFs describing the unfolded-state ensemble of any given residue sequence based on computed average energies for partially or fully solvent-exposed fragments in regions of irregular structure in native proteins. Relative thermal stabilities of 97 T4 bacteriophage lysozyme mutants were predicted from calculated energy differences for folded and unfolded states with an average unsigned error (AUE) of 0.84 kcal/mol when compared to experiment. To demonstrate the utility of the energy function for CPD, further validation was carried out in tests of its capacity to recover cognate protein sequences and to discriminate native and near-native protein folds, loop conformers, and small-molecule ligand binding poses from non-native benchmark decoys. Experimental ligand binding free energies for a diverse set of 80 protein complexes could be predicted with an AUE of 2.4 kcal/mol using an additional energy term to account for the loss in ligand configurational entropy upon binding.
The atomistic SEEF is expected to improve the accuracy of residue-based coarse-grained SEEFs currently used in CPD and to extend the range of applications of extant atom-based protein statistical potentials.
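The average unsigned error (AUE) reported in this abstract is the mean of the absolute differences between predicted and experimental values. A minimal sketch follows; the function name and the three data points are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def average_unsigned_error(predicted: Sequence[float], experimental: Sequence[float]) -> float:
    """Mean absolute difference between paired predicted and experimental values."""
    if len(predicted) != len(experimental):
        raise ValueError("inputs must have the same length")
    if not predicted:
        raise ValueError("inputs must be non-empty")
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


# Hypothetical stability changes in kcal/mol, for illustration only.
predicted_ddg = [1.0, -0.5, 2.0]
experimental_ddg = [0.5, 0.5, 2.5]
print(f"AUE = {average_unsigned_error(predicted_ddg, experimental_ddg):.2f} kcal/mol")
```

Unlike a signed mean error, the AUE does not let over- and under-predictions cancel, which is why it is a common summary for stability and binding free energy benchmarks.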
Affiliation(s)
- Christopher M Topham
  - Université de Toulouse; INSA, UPS, INP; LISBP, 135 Avenue de Rangueil, F-31077 Toulouse, France; CNRS, UMR5504, F-31400 Toulouse, France; INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés, F-31400 Toulouse, France
- Sophie Barbe
  - Université de Toulouse; INSA, UPS, INP; LISBP, 135 Avenue de Rangueil, F-31077 Toulouse, France; CNRS, UMR5504, F-31400 Toulouse, France; INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés, F-31400 Toulouse, France
- Isabelle André
  - Université de Toulouse; INSA, UPS, INP; LISBP, 135 Avenue de Rangueil, F-31077 Toulouse, France; CNRS, UMR5504, F-31400 Toulouse, France; INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés, F-31400 Toulouse, France
16
|
Toluene promotes lid 2 interfacial activation of cold active solvent tolerant lipase from Pseudomonas fluorescens strain AMS8. J Mol Graph Model 2016; 68:224-235. [DOI: 10.1016/j.jmgm.2016.07.003] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2016] [Revised: 06/16/2016] [Accepted: 07/17/2016] [Indexed: 11/30/2022]
|
17
|
Mignon D, Simonson T. Comparing three stochastic search algorithms for computational protein design: Monte Carlo, replica exchange Monte Carlo, and a multistart, steepest-descent heuristic. J Comput Chem 2016; 37:1781-93. [PMID: 27197555 DOI: 10.1002/jcc.24393] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2015] [Revised: 02/26/2016] [Accepted: 03/27/2016] [Indexed: 01/11/2023]
Abstract
Computational protein design depends on an energy function and an algorithm to search the sequence/conformation space. We compare three stochastic search algorithms: a heuristic, Monte Carlo (MC), and a Replica Exchange Monte Carlo method (REMC). The heuristic performs a steepest-descent minimization starting from thousands of random starting points. The methods are applied to nine test proteins from three structural families, with a fixed backbone structure, a molecular mechanics energy function, and with 1, 5, 10, 20, 30, or all amino acids allowed to mutate. Results are compared to an exact, "Cost Function Network" method that identifies the global minimum energy conformation (GMEC) in favorable cases. The designed sequences accurately reproduce experimental sequences in the hydrophobic core. The heuristic and REMC agree closely and reproduce the GMEC when it is known, with a few exceptions. Plain MC performs well for most cases, occasionally departing from the GMEC by 3-4 kcal/mol. With REMC, the diversity of the sequences sampled agrees with exact enumeration where the latter is possible: up to 2 kcal/mol above the GMEC. Beyond, room temperature replicas sample sequences up to 10 kcal/mol above the GMEC, providing thermal averages and a solution to the inverse protein folding problem. © 2016 Wiley Periodicals, Inc.
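As a rough illustration of the plain Monte Carlo search compared in this abstract, the sketch below runs Metropolis sampling over a toy design problem and checks the result against exhaustive enumeration. Everything here is an assumption made for illustration: the random single and pairwise energy tables, the problem size, the temperature `kT`, and the function names. It is not the authors' implementation, and the exact reference here is brute-force enumeration rather than a Cost Function Network solver.

```python
import itertools
import math
import random

def total_energy(seq, single, pair):
    """Sum of single-position and pairwise terms for one sequence."""
    n = len(seq)
    e = sum(single[i][seq[i]] for i in range(n))
    e += sum(pair[i][j][seq[i]][seq[j]] for i in range(n) for j in range(i + 1, n))
    return e

def metropolis(single, pair, n_types, steps, kT, rng):
    """Single-position moves accepted with the Metropolis criterion."""
    n = len(single)
    seq = [rng.randrange(n_types) for _ in range(n)]
    e = total_energy(seq, single, pair)
    best_seq, best_e = list(seq), e
    for _ in range(steps):
        i = rng.randrange(n)
        old = seq[i]
        seq[i] = rng.randrange(n_types)
        e_new = total_energy(seq, single, pair)
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / kT):
            e = e_new  # accept the move
            if e < best_e:
                best_seq, best_e = list(seq), e
        else:
            seq[i] = old  # reject and restore the previous type
    return best_seq, best_e

# Toy problem: 5 positions, 4 residue types, random energies (hypothetical values).
rng = random.Random(0)
n_pos, n_types = 5, 4
single = [[rng.uniform(-1, 1) for _ in range(n_types)] for _ in range(n_pos)]
pair = [[[[rng.uniform(-1, 1) for _ in range(n_types)] for _ in range(n_types)]
         for _ in range(n_pos)] for _ in range(n_pos)]

# Exhaustive enumeration gives the exact global minimum for this small system.
gmec_e = min(total_energy(s, single, pair)
             for s in itertools.product(range(n_types), repeat=n_pos))
mc_seq, mc_e = metropolis(single, pair, n_types, steps=5000, kT=0.6, rng=rng)
print(f"GMEC energy {gmec_e:.3f}, best MC energy {mc_e:.3f}")
```

The gap between `mc_e` and `gmec_e` is the quantity the abstract reports when it says plain MC occasionally departs from the GMEC. A replica exchange variant would run several such chains at different `kT` values and periodically attempt to swap their states.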
Affiliation(s)
- David Mignon
- Laboratoire De Biochimie (UMR CNRS 7654), Department Of Biology, Ecole Polytechnique, Palaiseau, France
- Thomas Simonson
- Laboratoire De Biochimie (UMR CNRS 7654), Department Of Biology, Ecole Polytechnique, Palaiseau, France
|
18
|
Kim MO, McCammon JA. Computation of pH-dependent binding free energies. Biopolymers 2016; 105:43-9. [PMID: 26202905 PMCID: PMC4623928 DOI: 10.1002/bip.22702] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2015] [Accepted: 07/20/2015] [Indexed: 01/21/2023]
Abstract
Protein-ligand binding is accompanied by changes in the surrounding electrostatic environments of the two binding partners and may lead to changes in protonation upon binding. In cases where complex formation results in a net transfer of protons, the binding process is pH-dependent. However, conventional free energy computations or molecular docking protocols typically employ fixed protonation states for the titratable groups in both binding partners, set a priori and identical for the free and bound states. In this review, we draw attention to these important yet largely ignored binding-induced protonation changes in protein-ligand association by outlining the physical origins and prevalence of protonation changes upon binding. Following a summary of various theoretical methods for pKa prediction, we discuss the theoretical framework to examine the pH dependence of protein-ligand binding processes.
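The pH dependence mentioned in this abstract is commonly expressed through the standard proton linkage relation. The form below is a textbook sketch for a single titratable group, not necessarily the notation used in the review. The symbols are assumptions introduced here: $\Delta\nu_{\mathrm{H^+}}$ is the net number of protons taken up on binding, $\mathrm{p}K_a^{\mathrm{f}}$ and $\mathrm{p}K_a^{\mathrm{c}}$ are the group's values in the free and complexed states, and $\Delta G^{\circ}_{\mathrm{ref}}$ is the binding free energy with the group deprotonated in both states.

```latex
\frac{\partial \Delta G^{\circ}_{\mathrm{bind}}}{\partial\,\mathrm{pH}}
  = 2.303\,RT\,\Delta\nu_{\mathrm{H^+}},
\qquad
\Delta G^{\circ}_{\mathrm{bind}}(\mathrm{pH})
  = \Delta G^{\circ}_{\mathrm{ref}}
  - RT \ln \frac{1 + 10^{\,\mathrm{p}K_a^{\mathrm{c}} - \mathrm{pH}}}
                {1 + 10^{\,\mathrm{p}K_a^{\mathrm{f}} - \mathrm{pH}}}
```

Differentiating the second expression with respect to pH gives $2.303\,RT$ times the difference in protonated fraction between the complexed and free states, which is $\Delta\nu_{\mathrm{H^+}}$ for this single group, so the two expressions agree.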
Affiliation(s)
- M. Olivia Kim
- Department of Pharmacology, University of California San Diego, La Jolla, CA 92093, USA
- J. Andrew McCammon
- Department of Pharmacology, University of California San Diego, La Jolla, CA 92093, USA
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, CA 92093, USA
- Howard Hughes Medical Institute, University of California San Diego, La Jolla, CA 92093, USA
- National Biomedical Computation Resource, University of California San Diego, La Jolla, CA 92093, USA
|
19
|
Stanton CL, Houk KN. Benchmarking pKa Prediction Methods for Residues in Proteins. J Chem Theory Comput 2015; 4:951-66. [PMID: 26621236 DOI: 10.1021/ct8000014] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
Methods for estimation of pKa values of residues in proteins were tested on a set of benchmark proteins with experimentally known pKa values. The benchmark set includes 80 different residues (20 each for Asp, Glu, Lys, and His), half of which consists of significantly variant cases (ΔpKa ≥ 1 pKa unit from the amino acid in solution). The method introduced by Case and co-workers [J. Am. Chem. Soc. 2004, 126, 4167-4180], referred to as the molecular dynamics/generalized-Born/thermodynamic integration (MD/GB/TI) technique, gives a root-mean-square deviation (rmsd) of 1.4 pKa units on the benchmark set. The use of explicit waters in the immediate region surrounding the residue was shown to generally reduce high errors for this method. Longer simulation time was also shown to increase the accuracy of this method. The empirical approach developed by Jensen and co-workers [Proteins 2005, 61, 704-721], PROPKA, also gives an overall rmsd of 1.4 pKa units and is more or less accurate based on residue type: the method does very well for Lys and Glu, but less so for Asp and His. Likewise, the absolute deviation is quite similar for the two methods: 5.2 for PROPKA and 5.1 for MD/GB/TI. A comparison of these results with several prediction methods from the literature is presented. The error in pKa prediction is analyzed as a function of variation of the pKa from that in water and the solvent accessible surface area (SASA) of the residue. A case study of the catalytic lysine residue in 2-deoxyribose-5-phosphate aldolase (DERA) is also presented.
Affiliation(s)
- Courtney L Stanton
- Department of Chemistry and Biochemistry, University of California Los Angeles, 607 Charles E. Young Drive East, Los Angeles, California 90095
- Kendall N Houk
- Department of Chemistry and Biochemistry, University of California Los Angeles, 607 Charles E. Young Drive East, Los Angeles, California 90095
|
20
|
Computational identification, homology modelling and docking analysis of phytase protein from Fusarium oxysporum. Biologia (Bratisl) 2014. [DOI: 10.2478/s11756-014-0447-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
|
21
|
Gaillard T, Simonson T. Pairwise decomposition of an MMGBSA energy function for computational protein design. J Comput Chem 2014; 35:1371-87. [PMID: 24854675 DOI: 10.1002/jcc.23637] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2014] [Revised: 04/14/2014] [Accepted: 05/01/2014] [Indexed: 02/02/2023]
Abstract
Computational protein design (CPD) aims at predicting new proteins or modifying existing ones. The computational challenge is huge as it requires exploring an enormous sequence and conformation space. The difficulty can be reduced by considering a fixed backbone and a discrete set of sidechain conformations. Another common strategy consists in precalculating a pairwise energy matrix, from which the energy of any sequence/conformation can be quickly obtained. In this work, we examine the pairwise decomposition of protein MMGBSA energy functions from a general theoretical perspective, and an implementation proposed earlier for CPD. It includes a Generalized Born term, whose many-body character is overcome using an effective dielectric environment, and a Surface Area term, for which we present an improved pairwise decomposition. A detailed evaluation of the error introduced by the decomposition on the different energy components is performed. We show that the error remains reasonable, compared to other uncertainties.
Affiliation(s)
- Thomas Gaillard
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, 91128, Palaiseau, France
|
22
|
Longo LM, Blaber M. Symmetric protein architecture in protein design: top-down symmetric deconstruction. Methods Mol Biol 2014; 1216:161-182. [PMID: 25213415 DOI: 10.1007/978-1-4939-1486-9_8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]
Abstract
Top-down symmetric deconstruction (TDSD) is a joint experimental and computational approach to generate a highly stable, functionally benign protein scaffold for intended application in subsequent functional design studies. By focusing on symmetric protein folds, TDSD can leverage the dramatic reduction in sequence space achieved by applying a primary structure symmetric constraint to the design process. Fundamentally, TDSD is an iterative symmetrization process, in which the goal is to maintain or improve properties of thermodynamic stability and folding cooperativity inherent to a starting sequence (the "proxy"). As such, TDSD does not attempt to solve the inverse protein folding problem directly, which is computationally intractable. The present chapter will take the reader through all of the primary steps of TDSD (selecting a proxy, identifying potential mutations, establishing a stability/folding cooperativity screen), relying heavily on a successful TDSD solution for the common β-trefoil fold.
Affiliation(s)
- Liam M Longo
- Department of Biomedical Sciences, College of Medicine, Florida State University, 1115 West Call Street, Tallahassee, FL, 32306-4300, USA
|
23
|
Abstract
Recent studies have elucidated key principles governing folding and stability of α-helices in short peptides and globular proteins. In this chapter we briefly review those principles and describe a protocol for the de novo design of highly stable α-helices using the SEQOPT algorithm. This algorithm is based on AGADIR, the statistical mechanical theory for helix-coil transitions in monomeric peptides, and the tunneling algorithm for global sequence optimization.
|
24
|
Protein engineering and the use of molecular modeling and simulation: the case of heterodimeric Fc engineering. Methods 2013; 65:77-94. [PMID: 24211748 DOI: 10.1016/j.ymeth.2013.10.016] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2013] [Revised: 10/12/2013] [Accepted: 10/25/2013] [Indexed: 11/23/2022] Open
Abstract
Computational and structure-guided methods can make significant contributions to the development of solutions for difficult protein engineering problems, including the optimization of the next generation of engineered antibodies. In this paper, we describe a contemporary industrial antibody engineering program, based on a hypothesis-driven in silico protein optimization method. The foundational concepts and methods of computational protein engineering are discussed, and an example of a computational modeling and structure-guided protein engineering workflow is provided for the design of best-in-class heterodimeric Fc with high purity and favorable biophysical properties. We present the engineering rationale as well as structural and functional characterization data on these engineered designs.
|
25
|
Simonson T, Gaillard T, Mignon D, Schmidt am Busch M, Lopes A, Amara N, Polydorides S, Sedano A, Druart K, Archontis G. Computational protein design: the Proteus software and selected applications. J Comput Chem 2013; 34:2472-84. [PMID: 24037756 DOI: 10.1002/jcc.23418] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2013] [Revised: 07/08/2013] [Accepted: 07/28/2013] [Indexed: 12/13/2022]
Abstract
We describe an automated procedure for protein design, implemented in a flexible software package, called Proteus. System setup and calculation of an energy matrix are done with the XPLOR modeling program and its sophisticated command language, supporting several force fields and solvent models. A second program provides algorithms to search sequence space. It allows a decomposition of the system into groups, which can be combined in different ways in the energy function, for both positive and negative design. The whole procedure can be controlled by editing 2-4 scripts. Two applications consider the tyrosyl-tRNA synthetase enzyme and its successful redesign to bind both O-methyl-tyrosine and D-tyrosine. For the latter, we present Monte Carlo simulations where the D-tyrosine concentration is gradually increased, displacing L-tyrosine from the binding pocket and yielding the binding free energy difference, in good agreement with experiment. Complete redesign of the Crk SH3 domain is presented. The top 10000 sequences are all assigned to the correct fold by the SUPERFAMILY library of Hidden Markov Models. Finally, we report the acid/base behavior of the SNase protein. Sidechain protonation is treated as a form of mutation; it is then straightforward to perform constant-pH Monte Carlo simulations, which yield good agreement with experiment. Overall, the software can be used for a wide range of applications, producing not only native-like sequences but also thermodynamic properties with errors that appear comparable to other current software packages.
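The idea of treating protonation as a mutation in a constant-pH Monte Carlo run can be sketched for the simplest possible case: one isolated titratable site with no interactions. This is an illustrative toy under stated assumptions, not the Proteus implementation; the function names, the reduced energy expression, and the pKa value are all introduced here. For an isolated site, the sampled protonated fraction should approach the Henderson-Hasselbalch value.

```python
import math
import random

LN10 = math.log(10.0)

def protonated_fraction_mc(pH, pKa, steps, rng):
    """Metropolis sampling of a single site; energies are in units of kT."""
    # Reduced energy of the protonated state relative to the deprotonated one.
    e_prot = LN10 * (pH - pKa)
    protonated = False
    count = 0
    for _ in range(steps):
        # Energy change for flipping the current protonation state.
        delta = -e_prot if protonated else e_prot
        if delta <= 0 or rng.random() < math.exp(-delta):
            protonated = not protonated
        count += protonated
    return count / steps

def henderson_hasselbalch(pH, pKa):
    """Analytical protonated fraction of an isolated site."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))

rng = random.Random(1)
pKa = 4.0  # hypothetical value, for illustration only
for pH in (3.0, 4.0, 5.0):
    sampled = protonated_fraction_mc(pH, pKa, 200_000, rng)
    exact = henderson_hasselbalch(pH, pKa)
    print(f"pH {pH}: sampled {sampled:.3f}, analytical {exact:.3f}")
```

In a full protein calculation, the flip energy would also include the interaction of the site with its surroundings, taken from the same energy matrix used for sequence design, which is what shifts pKa values away from their model-compound values.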
Affiliation(s)
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, Palaiseau, 91128, France
|
26
|
Chitranshi N, Tiwari AK, Somvanshi P, Tripathi PK, Seth PK. Investigating the function of single nucleotide polymorphisms in the CTSB gene: a computational approach. FUTURE NEUROLOGY 2013. [DOI: 10.2217/fnl.13.26] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
Abstract
Aim: Recent genome-wide association studies have revealed large numbers of single nucleotide polymorphisms (SNPs) related to Alzheimer’s disease. Here, we have investigated the gene CTSB, which encodes CTSB, a lysosomal cysteine proteinase. CTSB is also involved in the proteolytic processing of amyloid precursor protein (APP), which is believed to be a causative factor in Alzheimer’s disease. Materials & methods: Several bioinformatics algorithms, such as Sorting Intolerant from Tolerant (SIFT), Polymorphism Phenotyping (PolyPhen) and CUPSAT, could identify the synonymous SNPs and nonsynonymous SNPs (nsSNPs), which are predicted to be deleterious and nondeleterious, respectively. Similar tools were used to predict the impact of single amino acid substitutions on CTSB protein activity. The FASTSNP server and UTRscan were used to predict the influence on splicing regulations. The stability and solvent-accessible surface area of modeled mutated proteins were analyzed using PBEQ solver and NetASA view. Furthermore, the DSP program was used to determine the secondary structures of the modeled protein. Results: A total of 999 SNPs in CTSB were retrieved from the SNP database: 55 nsSNPs, 35 synonymous SNPs, 165 mRNA 3′ untranslated region SNPs, 12 5′ untranslated region SNPs and 732 intronic SNPs. Potential functions of SNPs in the CTSB gene were identified using different web servers. For example, SIFT, PolyPhen and CUPSAT servers predicted ten nsSNPs to be intolerant, three nsSNPs to be damaging and eight nsSNPs to have the potential to destabilize protein structure. The FASTSNP server predicted 12 SNPs to influence splicing regulation, whereas two SNPs could predict a risk in the range of 3–4 (medium to high). Furthermore, mutant proteins were modeled and the total energy values were compared with the native CTSB protein. It was observed that on the surface of the protein, a mutation from threonine to serine at position 235 (rs17573) caused the greatest impact on stability. Conclusion: The genome-wide association studies database already reports rs7003814 of the CTSB gene in association with Alzheimer’s disease. Our study demonstrates the presence of other deleterious nsSNPs, which may play a crucial role in predicting Alzheimer’s disease risk.
Affiliation(s)
- Nitin Chitranshi
- Gautam Buddh Technical University, Lucknow 227202, Uttar Pradesh, India
- Bioinformatics Centre, Biotech Park, Sector-G, Jankipuram, Lucknow-226021, Uttar Pradesh, India
- Amit K Tiwari
- Department of Biomedical Sciences, College of Veterinary Medicine, Nursing & Allied Health, Tuskegee University, Tuskegee, AL 36088, USA
- Pallavi Somvanshi
- Department of Biotechnology, TERI University, 10, Institutional Area, Vasantkunj, New Delhi 110070, India
- Prahlad K Seth
- Bioinformatics Centre, Biotech Park, Sector-G, Jankipuram, Lucknow-226021, Uttar Pradesh, India
|
27
|
Tan KP, Nguyen TB, Patel S, Varadarajan R, Madhusudhan MS. Depth: a web server to compute depth, cavity sizes, detect potential small-molecule ligand-binding cavities and predict the pKa of ionizable residues in proteins. Nucleic Acids Res 2013; 41:W314-21. [PMID: 23766289 PMCID: PMC3692129 DOI: 10.1093/nar/gkt503] [Citation(s) in RCA: 125] [Impact Index Per Article: 11.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/30/2023] Open
Abstract
Residue depth accurately measures burial and parameterizes local protein environment. Depth is the distance of any atom/residue to the closest bulk water. We consider the non-bulk waters to occupy cavities, whose volumes are determined using a Voronoi procedure. Our estimation of cavity sizes is statistically superior to estimates made by CASTp and VOIDOO, and on par with McVol over a data set of 40 cavities. Our calculated cavity volumes correlated best with the experimentally determined destabilization of 34 mutants from five proteins. Some of the cavities identified are capable of binding small molecule ligands. In this study, we have enhanced our depth-based predictions of binding sites by including evolutionary information. We have demonstrated that on a database (LigASite) of ∼200 proteins, we perform on par with ConCavity and better than MetaPocket 2.0. Our predictions, while less sensitive, are more specific and precise. Finally, we use depth (and other features) to predict pKas of GLU, ASP, LYS and HIS residues. Our results produce an average error of just <1 pH unit over 60 predictions. Our simple empirical method is statistically on par with two and superior to three other methods while inferior to only one. The DEPTH server (http://mspc.bii.a-star.edu.sg/depth/) is an ideal tool for rapid yet accurate structural analyses of protein structures.
Affiliation(s)
- Kuan Pern Tan
- Bioinformatics Institute, 30 Biopolis Street, #07-01, Matrix, Singapore 138671
|
28
|
Liu M, He H, Su J. Is it possible to stabilize a thermophilic protein further using sequences and structures of mesophilic proteins: a theoretical case study concerning DgAS. Theor Biol Med Model 2013; 10:26. [PMID: 23575217 PMCID: PMC3639903 DOI: 10.1186/1742-4682-10-26] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2013] [Accepted: 03/29/2013] [Indexed: 11/13/2022] Open
Abstract
Incorporating structural elements of thermostable homologs can greatly improve the thermostability of a mesophilic protein. Despite the effectiveness of this method, applying it is often hampered. First, it requires alignment of the target mesophilic protein sequence with those of thermophilic homologs, but not every mesophilic protein has a thermophilic homolog. Second, not all favorable features of a thermophilic protein can be incorporated into the structure of a mesophilic protein. Furthermore, even the most stable native protein is not sufficiently stable for industrial applications. Therefore, creating an industrially applicable protein on the basis of the thermophilic protein could prove advantageous. Amylosucrase (AS) can catalyze the synthesis of an amylose-like polysaccharide composed of only α-1,4-linkages using sucrose as the lone energy source. However, industrial development of AS has been hampered owing to its low thermostability. To facilitate potential industrial applications, the aim of the current study was to improve the thermostability of Deinococcus geothermalis amylosucrase (DgAS) further; this is the most stable AS discovered to date. By integrating ideas from mesophilic AS with well-established protein design protocols, three useful design protocols are proposed, and several promising substitutions were identified using these protocols. The successful application of this hybrid design method indicates that it is possible to stabilize a thermostable protein further by incorporating structural elements of less-stable homologs.
Affiliation(s)
- Ming Liu
- Institute of Materia Medica, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
|
29
|
Li Z, Yang Y, Zhan J, Dai L, Zhou Y. Energy functions in de novo protein design: current challenges and future prospects. Annu Rev Biophys 2013; 42:315-35. [PMID: 23451890 DOI: 10.1146/annurev-biophys-083012-130315] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
Abstract
In the past decade, a concerted effort to successfully capture specific tertiary packing interactions produced specific three-dimensional structures for many de novo designed proteins that are validated by nuclear magnetic resonance and/or X-ray crystallographic techniques. However, the success rate of computational design remains low. In this review, we provide an overview of experimentally validated, de novo designed proteins and compare four available programs, RosettaDesign, EGAD, Liang-Grishin, and RosettaDesign-SR, by assessing designed sequences computationally. Computational assessment includes the recovery of native sequences, the calculation of sizes of hydrophobic patches and total solvent-accessible surface area, and the prediction of structural properties such as intrinsic disorder, secondary structures, and three-dimensional structures. This computational assessment, together with a recent community-wide experiment in assessing scoring functions for interface design, suggests that the next-generation protein-design scoring function will come from the right balance of complementary interaction terms. Such balance may be found when more negative experimental data become available as part of a training set.
Affiliation(s)
- Zhixiu Li
- School of Informatics, Indiana University-Purdue University, Indianapolis, Indiana 46202, USA
|
30
|
Designing electrostatic interactions in biological systems via charge optimization or combinatorial approaches: insights and challenges with a continuum electrostatic framework. Theor Chem Acc 2012. [DOI: 10.1007/s00214-012-1252-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
|
31
|
Subramaniam S, Senes A. An energy-based conformer library for side chain optimization: improved prediction and adjustable sampling. Proteins 2012; 80:2218-34. [PMID: 22576292 DOI: 10.1002/prot.24111] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2011] [Revised: 04/22/2012] [Accepted: 05/02/2012] [Indexed: 11/11/2022]
Abstract
Side chain optimization is a fundamental component of protein modeling applications such as docking, structural prediction, and design. In these applications side chain flexibility is often provided by rotamer or conformer libraries, which are collections of representative side chain conformations. Here we demonstrate that the sampling provided by the library can be substantially improved by adding an energetic criterion to its creation. The result of the new procedure is the Energy-Based library, a conformer library selected according to the propensity of its elements to fit energetically into natural protein environments. The new library performs outstandingly well in side chain optimization, producing structures with significantly lower energies and resulting in improved side chain conformation prediction. In addition, because the library was created as an ordered list, its size can be adjusted to any desired level. This feature provides unprecedented versatility in tuning sampling. It allows precise balancing of the number of conformers required by each amino acid type, equalizing their chances to fit into structural environments. It also allows the amount of sampling to be scaled to the specific requirement of any given side chain optimization problem. A rotameric version of the library was also produced with the same method to support applications that require a dihedral-only description of side chain conformation. The libraries are available at http://seneslab.org/EBL.
Affiliation(s)
- Sabareesh Subramaniam
- Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
|
32
|
Fogolari F, Corazza A, Yarra V, Jalaru A, Viglino P, Esposito G. Bluues: a program for the analysis of the electrostatic properties of proteins based on generalized Born radii. BMC Bioinformatics 2012; 13 Suppl 4:S18. [PMID: 22536964 PMCID: PMC3434445 DOI: 10.1186/1471-2105-13-s4-s18] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022] Open
Abstract
BACKGROUND The Poisson-Boltzmann (PB) equation and its linear approximation have been widely used to describe biomolecular electrostatics. Generalized Born (GB) models offer a convenient computational approximation for the more fundamental approach based on the Poisson-Boltzmann equation, and allow estimation of pairwise contributions to electrostatic effects in the molecular context. RESULTS We have implemented in a single program the most common analyses of the electrostatic properties of proteins. The program first computes generalized Born radii via a surface integral, and then it uses the generalized Born radii (using a finite radius test particle) to perform electrostatic analyses. In particular, the output of the program entails, depending on the user's requirements: 1) the generalized Born radius of each atom; 2) the electrostatic solvation free energy; 3) the electrostatic forces on each atom (currently in a developmental stage); 4) the pH-dependent properties (total charge and pH-dependent free energy of folding in the pH range -2 to 18); 5) the pKa of all ionizable groups; 6) the electrostatic potential at the surface of the molecule; 7) the electrostatic potential in a volume surrounding the molecule. CONCLUSIONS Although at the expense of limited flexibility, the program provides the most common analyses while requiring only a single input file in PQR format. The results obtained are comparable to those obtained using state-of-the-art Poisson-Boltzmann solvers. A Linux executable with example input and output files is provided as supplementary material.
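For orientation, the pairwise character of GB models mentioned in this abstract is visible in the widely used functional form introduced by Still and co-workers, sketched below. This is the generic textbook expression, written in units where the Coulomb constant is 1, and is not necessarily the exact variant implemented in the program. The symbols are assumptions introduced here: $q_i$ are atomic charges, $r_{ij}$ interatomic distances, $R_i$ generalized Born radii, and $\varepsilon_{\mathrm{in}}$, $\varepsilon_{\mathrm{out}}$ the solute and solvent dielectric constants.

```latex
\Delta G_{\mathrm{solv}}^{\mathrm{el}}
  = -\frac{1}{2}\left(\frac{1}{\varepsilon_{\mathrm{in}}}
                     - \frac{1}{\varepsilon_{\mathrm{out}}}\right)
    \sum_{i,j} \frac{q_i\, q_j}{f_{\mathrm{GB}}(r_{ij})},
\qquad
f_{\mathrm{GB}}(r_{ij})
  = \sqrt{\, r_{ij}^{2} + R_i R_j
      \exp\!\left(-\frac{r_{ij}^{2}}{4 R_i R_j}\right)}
```

The double sum includes the $i = j$ terms, for which $f_{\mathrm{GB}} = R_i$ and the expression reduces to the Born self-energy of each atom; the $i \neq j$ terms are the pairwise contributions.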
Affiliation(s)
- Federico Fogolari
- Dipartimento di Scienze Mediche e Biologiche. Università di Udine, Piazzale Kolbe, 4, Udine 33100, Italy
- Istituto Nazionale Biostrutture e Biosistemi, Viale medaglie d'Oro 305, Roma 00136, Italy
- Alessandra Corazza
- Dipartimento di Scienze Mediche e Biologiche. Università di Udine, Piazzale Kolbe, 4, Udine 33100, Italy
- Istituto Nazionale Biostrutture e Biosistemi, Viale medaglie d'Oro 305, Roma 00136, Italy
- Vijaylakshmi Yarra
- Dipartimento di Scienze Mediche e Biologiche. Università di Udine, Piazzale Kolbe, 4, Udine 33100, Italy
- Anusha Jalaru
- Dipartimento di Scienze Mediche e Biologiche. Università di Udine, Piazzale Kolbe, 4, Udine 33100, Italy
- Paolo Viglino
- Dipartimento di Scienze Mediche e Biologiche. Università di Udine, Piazzale Kolbe, 4, Udine 33100, Italy
- Istituto Nazionale Biostrutture e Biosistemi, Viale medaglie d'Oro 305, Roma 00136, Italy
- Gennaro Esposito
- Dipartimento di Scienze Mediche e Biologiche. Università di Udine, Piazzale Kolbe, 4, Udine 33100, Italy
- Istituto Nazionale Biostrutture e Biosistemi, Viale medaglie d'Oro 305, Roma 00136, Italy
|
33
|
Williams SL, Blachly PG, McCammon JA. Measuring the successes and deficiencies of constant pH molecular dynamics: a blind prediction study. Proteins 2011; 79:3381-8. [PMID: 22072520 PMCID: PMC3227005 DOI: 10.1002/prot.23136] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2011] [Revised: 05/19/2011] [Accepted: 05/25/2011] [Indexed: 11/23/2022]
Abstract
A constant pH molecular dynamics method has been used in the blind prediction of pKa values of titratable residues in wild type and mutated structures of the Staphylococcal nuclease (SNase) protein. The predicted values have been subsequently compared to experimental values provided by the laboratory of García-Moreno. CpHMD performs well in predicting the pKa of solvent-exposed residues. For residues in the protein interior, the CpHMD method encounters some difficulties in reaching convergence and predicting the pKa values for residues having strong interactions with neighboring residues. These results show the need to accurately and sufficiently sample conformational space in order to obtain pKa values consistent with experimental results.
Affiliation(s)
- Sarah L Williams
- Department of Chemistry & Biochemistry, University of California San Diego, La Jolla, California 92093-0365, USA
|
34
|
Lei Y, Luo W, Zhu Y. A matching algorithm for catalytic residue site selection in computational enzyme design. Protein Sci 2011; 20:1566-75. [PMID: 21714026 DOI: 10.1002/pro.685] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2011] [Accepted: 06/07/2011] [Indexed: 11/07/2022]
Abstract
A loop closure-based sequential algorithm, PRODA_MATCH, was developed to match catalytic residues onto a scaffold for enzyme design in silico. The computational complexity of this algorithm is polynomial with respect to the number of active sites, the number of catalytic residues, and the maximal iteration number of cyclic coordinate descent steps. This matching algorithm does not depend on a rotamer library, which allows the catalytic residue to adopt any required conformation along the reaction coordinate. The catalytic geometric parameters defined between functional groups of the transition state (TS) and the catalytic residues are continuously optimized to identify the accurate position of the TS. Pseudo-spheres are introduced for surrounding residues, which lets the algorithm take binding into account as early as the matching process. Recapitulation of native catalytic residue sites was used as a benchmark to evaluate the novel algorithm. The calculation results for the test set show that the native catalytic residue sites were successfully identified and ranked within the top 10 designs for 7 of the 10 chemical reactions. This indicates that the matching algorithm has the potential to be used for designing industrial enzymes for desired reactions.
Affiliation(s)
- Yulin Lei
- Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
|
35
|
Stroganov OV, Novikov FN, Zeifman AA, Stroylov VS, Chilov GG. TSAR, a new graph-theoretical approach to computational modeling of protein side-chain flexibility: Modeling of ionization properties of proteins. Proteins 2011; 79:2693-710. [DOI: 10.1002/prot.23099] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Received: 12/31/2010] [Revised: 05/16/2011] [Accepted: 05/27/2011] [Indexed: 11/09/2022]
|
36
|
Polydorides S, Amara N, Aubard C, Plateau P, Simonson T, Archontis G. Computational protein design with a generalized Born solvent model: application to Asparaginyl-tRNA synthetase. Proteins 2011; 79:3448-68. [PMID: 21563215 DOI: 10.1002/prot.23042] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 11/29/2010] [Revised: 02/25/2011] [Accepted: 03/03/2011] [Indexed: 12/13/2022]
Abstract
Computational Protein Design (CPD) is a promising method for high throughput protein and ligand mutagenesis. Recently, we developed a CPD method that used a polar-hydrogen energy function for protein interactions and a Coulomb/Accessible Surface Area (CASA) model for solvent effects. We applied this method to engineer aspartyl-adenylate (AspAMP) specificity into Asparaginyl-tRNA synthetase (AsnRS), whose substrate is asparaginyl-adenylate (AsnAMP). Here, we implement a more accurate function, with an all-atom energy for protein interactions and a residue-pairwise generalized Born model for solvent effects. As a first test, we compute amino acid affinities for several point mutants of Aspartyl-tRNA synthetase (AspRS) and Tyrosyl-tRNA synthetase and stability changes for three helical peptides and compare with experiment. As a second test, we readdress the problem of AsnRS amino acid engineering. We compare three design criteria, which optimize the folding free-energy, the absolute AspAMP affinity, and the relative (AspAMP-AsnAMP) affinity. The sequences and conformations are improved with respect to our previous, polar-hydrogen/CASA study: for several designed complexes, the AspAMP carboxylate forms three interactions with a conserved arginine and a designed lysine, as in the active site of the AspRS:AspAMP complex. The conformations and interactions are well maintained in molecular dynamics simulations and the sequences have an inverted specificity, favoring AspAMP over AsnAMP. The method is not fully successful, since experimental measurements with the seven most promising sequences show that they do not catalyze at a detectable level the adenylation of Asp (or Asn) with ATP. This may be due to weak AspAMP binding and/or disruption of transition-state stabilization.
|
37
|
Abstract
The ability to engineer novel proteins using the principles of molecular structure and energetics is a stringent test of our basic understanding of how proteins fold and maintain structure. The design of protein self-assembly has the potential to impact many fields of biology from molecular recognition to cell signaling to biomaterials. Most progress in computational design of protein self-assembly has focused on α-helical systems, exploring ways to concurrently optimize the stability and specificity of a target state. Applying these methods to collagen self-assembly is very challenging, due to fundamental differences in folding and structure of α- versus triple-helices. Here, we explore various computational methods for designing stable and specific oligomeric systems, with a focus on α-helix and collagen self-assembly.
|
38
|
Olsson MHM, Søndergaard CR, Rostkowski M, Jensen JH. PROPKA3: Consistent Treatment of Internal and Surface Residues in Empirical pKa Predictions. J Chem Theory Comput 2011; 7:525-37. [PMID: 26596171 DOI: 10.1021/ct100578z] [Citation(s) in RCA: 2788] [Impact Index Per Article: 214.5] [Indexed: 11/28/2022]
Abstract
In this study, we have revised the rules and parameters for one of the most commonly used empirical pKa predictors, PROPKA, based on better physical description of the desolvation and dielectric response for the protein. We have introduced a new and consistent approach to interpolate the description between the previously distinct classifications into internal and surface residues, which otherwise is found to give rise to an erratic and discontinuous behavior. Since the goal of this study is to lay out the framework and validate the concept, it focuses on Asp and Glu residues where the protein pKa values and structures are assumed to be more reliable. The new and improved implementation is evaluated and discussed; it is found to agree better with experiment than the previous implementation (in parentheses): rmsd = 0.79 (0.91) for Asp and Glu, 0.75 (0.97) for Tyr, 0.65 (0.72) for Lys, and 1.00 (1.37) for His residues. The most significant advance, however, is in reducing the number of outliers and removing unreasonable sensitivity to small structural changes that arise from classifying residues as either internal or surface.
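The rmsd figures quoted in this abstract compare predicted and experimental pKa values. The following minimal sketch shows how such a root-mean-square deviation is typically computed. The function name `rmsd` and the numbers in the usage line are illustrative assumptions, not data or code from the PROPKA study.

```python
import math


def rmsd(predicted, experimental):
    """Root-mean-square deviation between two equally long lists of pKa values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical pKa values, chosen only to exercise the function.
example = rmsd([3.5, 4.4, 6.0], [3.9, 4.1, 6.5])
print(round(example, 2))
```

Because the errors are squared before averaging, a single large outlier raises the rmsd more than several small deviations do, which is one reason the abstract reports outlier reduction separately.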
Affiliation(s)
- Mats H M Olsson
- Department of Chemistry, University of Copenhagen, Universitetsparken 5, Copenhagen, Denmark
- Chresten R Søndergaard
- Department of Chemistry, University of Copenhagen, Universitetsparken 5, Copenhagen, Denmark
- Michal Rostkowski
- Department of Chemistry, University of Copenhagen, Universitetsparken 5, Copenhagen, Denmark
- Jan H Jensen
- Department of Chemistry, University of Copenhagen, Universitetsparken 5, Copenhagen, Denmark
|
39
|
Dai L, Yang Y, Kim HR, Zhou Y. Improving computational protein design by using structure-derived sequence profile. Proteins 2010; 78:2338-48. [PMID: 20544969 DOI: 10.1002/prot.22746] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Indexed: 12/13/2022]
Abstract
Designing a protein sequence that will fold into a predefined structure is of both practical and fundamental interest. Many successful computational designs in the last decade resulted from improved understanding of hydrophobic and polar interactions between side chains of amino acid residues in stabilizing protein tertiary structures. However, the coupling between main-chain backbone structure and local sequence has yet to be fully addressed. Here, we attempt to account for such coupling by using a sequence profile derived from the sequences of five-residue fragments in a fragment library that are structurally matched to the five-residue segments contained in a target structure. We further introduced a term to reduce low-complexity regions of designed sequences. These two terms, together with optimized reference states for amino acid residues, were implemented in the RosettaDesign program. The new method, called RosettaDesign-SR, yields a 12-percentage-point increase (from 34% to 46%) in the fraction of proteins whose designed sequences are more than 35% identical to wild-type sequences. Meanwhile, it reduces by 8 percentage points (from 22% to 14%) the proportion of designed sequences that are not homologous to any known protein sequences according to PSI-BLAST. More importantly, the sequences designed by RosettaDesign-SR have 2-3% more polar residues at the surface and core regions of proteins, and these surface and core polar residues have about 4% higher sequence identity to wild-type sequences than those designed by RosettaDesign. Thus, the proteins designed by RosettaDesign-SR should be less likely to aggregate and more likely to have unique structures due to more specific polar interactions.
Affiliation(s)
- Liang Dai
- School of Informatics, Indiana University Purdue University, Indianapolis, Indiana 46202, USA
|
40
|
Lopes A, Schmidt Am Busch M, Simonson T. Computational design of protein-ligand binding: modifying the specificity of asparaginyl-tRNA synthetase. J Comput Chem 2010; 31:1273-86. [PMID: 19862811 DOI: 10.1002/jcc.21414] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/13/2022]
Abstract
A method for computational design of protein-ligand interactions is implemented and tested on the asparaginyl- and aspartyl-tRNA synthetase enzymes (AsnRS, AspRS). The substrate specificity of these enzymes is crucial for the accurate translation of the genetic code. The method relies on a molecular mechanics energy function and a simple, continuum electrostatic, implicit solvent model. As test calculations, we first compute AspRS-substrate binding free energy changes due to nine point mutations, for which experimental data are available; we also perform large-scale redesign of the entire active site of each enzyme (40 amino acids) and compare to experimental sequences. We then apply the method to engineer an increased binding of aspartyl-adenylate (AspAMP) into AsnRS. Mutants are obtained using several directed evolution protocols, where four or five amino acid positions in the active site are randomized. Promising mutants are subjected to molecular dynamics simulations; Poisson-Boltzmann calculations provide an estimate of the corresponding, AspAMP, binding free energy changes, relative to the native AsnRS. Several of the mutants are predicted to have an inverted binding specificity, preferring to bind AspAMP rather than the natural substrate, AsnAMP. The computed binding affinities are significantly weaker than the native, AsnRS:AsnAMP affinity, and in most cases, the active site structure is significantly changed, compared to the native complex. This almost certainly precludes catalytic activity. One of the designed sequences has a higher affinity and more native-like structure and may represent a valid candidate for Asp activity.
Affiliation(s)
- Anne Lopes
- Laboratoire de Biochimie, Department of Biology, UMR CNRS 7654, Ecole Polytechnique, 91128 Palaiseau, France
|
41
|
Schmidt am Busch M, Sedano A, Simonson T. Computational protein design: validation and possible relevance as a tool for homology searching and fold recognition. PLoS One 2010; 5:e10410. [PMID: 20463972 PMCID: PMC2864755 DOI: 10.1371/journal.pone.0010410] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 10/15/2009] [Accepted: 03/31/2010] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND: Protein fold recognition usually relies on a statistical model of each fold; each model is constructed from an ensemble of natural sequences belonging to that fold. A complementary strategy may be to employ sequence ensembles produced by computational protein design. Designed sequences can be more diverse than natural sequences, possibly avoiding some limitations of experimental databases. METHODOLOGY/PRINCIPAL FINDINGS: We explore this strategy for four SCOP families: small Kunitz-type inhibitors (SKIs), Interleukin-8 chemokines, PDZ domains, and large Caspase catalytic subunits, represented by 43 structures. An automated procedure is used to redesign the 43 proteins. We use the experimental backbones as fixed templates in the folded state and a molecular mechanics model to compute the interaction energies between sidechain and backbone groups. Calculations are done with the Proteins@Home volunteer computing platform. A heuristic algorithm is used to scan the sequence and conformational space, yielding 200,000-300,000 sequences per backbone template. The results confirm and generalize our earlier study of SH2 and SH3 domains. The designed sequences resemble moderately distant, natural homologues of the initial templates; e.g., the SUPERFAMILY profile Hidden-Markov Model library recognizes 85% of the low-energy sequences as native-like. Conversely, Position Specific Scoring Matrices derived from the sequences can be used to detect natural homologues within the SwissProt database: 60% of known PDZ domains are detected and around 90% of known SKIs and chemokines. Energy components and inter-residue correlations are analyzed and ways to improve the method are discussed. CONCLUSIONS/SIGNIFICANCE: For some families, designed sequences can be a useful complement to experimental ones for homologue searching. However, improved tools are needed to extract more information from the designed profiles before the method can be of general use.
Affiliation(s)
- Marcel Schmidt am Busch
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
- Audrey Sedano
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
|
42
|
De novo self-assembling collagen heterotrimers using explicit positive and negative design. Biochemistry 2010; 49:2307-16. [PMID: 20170197 DOI: 10.1021/bi902077d] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Indexed: 11/29/2022]
Abstract
We sought to computationally design model collagen peptides that specifically associate as heterotrimers. Computational design has been successfully applied to the creation of new protein folds and functions. Despite the high abundance of collagen and its key role in numerous biological processes, fibrous proteins have received little attention as computational design targets. Collagens are composed of three polypeptide chains that wind into triple helices. We developed a discrete computational model to design heterotrimer-forming collagen-like peptides. Stability and specificity of oligomerization were concurrently targeted using a combined positive and negative design approach. The sequences of three 30-residue peptides, A, B, and C, were optimized to favor charge-pair interactions in an ABC heterotrimer, while disfavoring the 26 competing oligomers (e.g., AAA, ABB, BCA). Peptides were synthesized and characterized for thermal stability and triple-helical structure by circular dichroism and NMR. A unique A:B:C-type species was not achieved. Negative design was partially successful, with only A + B and B + C competing mixtures formed. Analysis of computed versus experimental stabilities helps to clarify the role of electrostatics and secondary-structure propensities in determining collagen stability and provides important insight into how subsequent designs can be improved.
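The figure of 26 competing oligomers can be reproduced by a simple enumeration. The sketch below assumes that chain order within the triple helix matters, so that BCA counts as a species distinct from the ABC target; this assumption is inferred from the abstract listing BCA among the competitors and is not stated explicitly there. The function name `competing_oligomers` is hypothetical.

```python
from itertools import product

PEPTIDES = ("A", "B", "C")
TARGET = ("A", "B", "C")


def competing_oligomers(peptides=PEPTIDES, target=TARGET):
    """Return every ordered three-chain combination except the target."""
    return [combo for combo in product(peptides, repeat=3) if combo != target]


competitors = competing_oligomers()
print(len(competitors))  # 3**3 ordered triples minus the ABC target = 26
```

Under this counting, negative design has to penalize all 26 alternatives at once, which illustrates why specificity is harder to achieve than stability of the target alone.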
|
43
|
Abstract
Predictive methods for the computational design of proteins search for amino acid sequences adopting desired structures that perform specific functions. Typically, design of 'function' is formulated as engineering new and altered binding activities into proteins. Progress in the design of functional protein-protein interactions is directed toward engineering proteins to precisely control biological processes by specifically recognizing desired interaction partners while avoiding competitors. The field is aiming for strategies to harness recent advances in high-resolution computational modeling, particularly those exploiting protein conformational variability, to engineer new functions and incorporate many functional requirements simultaneously.
Affiliation(s)
- Daniel J Mandell
- Graduate Program in Bioinformatics and Computational Biology, California Institute for Quantitative Biosciences, and Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, USA
|
44
|
Chang L, Zhou C, Xu M, Liu J. Interactions between anti-ErbB2 antibody A21 and the ErbB2 extracellular domain provide a basis for improving A21 affinity. J Comput Aided Mol Des 2009; 24:37-47. [PMID: 20012671 DOI: 10.1007/s10822-009-9312-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 04/17/2009] [Accepted: 11/25/2009] [Indexed: 01/04/2023]
Abstract
Anti-ErbB2 antibodies are well researched for the therapy of ErbB2-overexpressing tumors. The therapeutic potential and efficacy of these antibodies are closely related to their affinities to ErbB2. Previously we reported that an anti-ErbB2 antibody A21 targeting a conformational epitope comprising several loops in ErbB2 extracellular subdomains I and II could inhibit the proliferation of ErbB2-overexpressing cancer cells in vitro and in vivo. Here we found that another structureless and non-conserved loop in subdomain I of ErbB2 extracellular domain (ECD) was important for binding to A21, and then the antigen-contact sites on A21 were determined by site-directed mutation. The loop was constructed by molecular modeling, and a new model of the A21-ErbB2 complex was generated by docking using the crystal structure of the scFv A21 and the model of ErbB2 ECD with the loop built. Based on the complex model, computational design was performed to enhance the affinity of A21 to ErbB2. Two mutants with about 1.7-fold improvement in affinity were obtained. Our study provided a rational molecular basis for affinity improvement and mechanism investigation of A21.
Affiliation(s)
- Liang Chang
- Lab of Cellular and Molecular Immunology, School of Life Sciences, University of Science and Technology of China, 230027 Hefei, People's Republic of China
|
45
|
Abstract
Among the most important physicochemical properties of small molecules and macromolecules are the dissociation constants for any weakly acidic or basic groups, generally expressed as the pKa of each group. These constants are a major factor in the pharmacokinetics of drugs and in the interactions of proteins with other molecules. For both the protein and small molecule cases, we survey the sources of experimental pKa values and then focus on current methods for predicting them. Of particular concern is an analysis of the scope, statistical validity, and predictive power of methods as well as their accuracy.
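To make concrete why the pKa of a group matters, the sketch below applies the Henderson-Hasselbalch relation to estimate the fraction of an acidic group that is deprotonated at a given pH. This is a standard textbook relation rather than a method taken from the reviewed article, and the function name and the example pKa of 4.0 are illustrative assumptions.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch: fraction of an acid in its deprotonated form.

    Derived from pH = pKa + log10([A-]/[HA]).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Illustrative acidic group with an assumed pKa of 4.0.
for ph in (2.0, 4.0, 7.0):
    print(ph, round(fraction_deprotonated(4.0, ph), 3))
```

At pH equal to the pKa the group is half deprotonated, and each pH unit away from the pKa shifts the ratio of the two forms tenfold, so an error of one pKa unit in a prediction can change the predicted charge state substantially.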
Affiliation(s)
- Adam C Lee
- Department of Medicinal Chemistry, College of Pharmacy, University of Michigan, Ann Arbor, Michigan 48109, USA
|
46
|
am Busch MS, Mignon D, Simonson T. Computational protein design as a tool for fold recognition. Proteins 2009; 77:139-58. [PMID: 19408297 DOI: 10.1002/prot.22426] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Indexed: 12/13/2022]
Abstract
Computationally designed protein sequences have been proposed as a basis to perform fold recognition and homology searching. To investigate this possibility, an automated procedure is used to completely redesign 24 SH3 proteins and 22 SH2 proteins. We use the experimental backbone coordinates as fixed templates in the folded state and a molecular mechanics model to compute the pairwise interaction energies between all sidechain types and conformations. Energy calculations are done with the Proteins@Home volunteer computing platform. A heuristic algorithm is then used to scan the sequence and conformational space for optimal solutions. We produced 200,000-450,000 sequences for each backbone template. The designed sequences resemble moderately distant, natural homologues of the initial templates, according to their identity scores and their similarity with respect to the Pfam sets of SH2 and SH3 domains. Standard homology detection tools document their native-like character: the Conserved Domain Database recognizes 61% (52%) of our low-energy sequences as SH3 (SH2) domains; the SUPERFAMILY Hidden-Markov Model library recognizes 81% (84%). Conversely, position specific scoring matrices (PSSMs) derived from our designed sequences can be used to detect natural homologues in sequence databases. Within SwissProt, a set of natural SH3 PSSMs detects 772 SH3 domains, for example; our designed PSSMs detect 67% of these, plus one additional sequence and two false positives. If six amino acids involved in substrate binding (a selective pressure not accounted for in our design) are reset to their experimental types, then 77% of the experimental SH3 domains are detected. Results for the SH2 domains are similar. Several directions to improve the method further are discussed.
Affiliation(s)
- Marcel Schmidt am Busch
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
|
47
|
Marenich AV, Cramer CJ, Truhlar DG. Universal Solvation Model Based on the Generalized Born Approximation with Asymmetric Descreening. J Chem Theory Comput 2009; 5:2447-64. [DOI: 10.1021/ct900312z] [Citation(s) in RCA: 106] [Impact Index Per Article: 7.1] [Indexed: 02/02/2023]
Affiliation(s)
- Aleksandr V. Marenich
- Department of Chemistry and Supercomputing Institute, University of Minnesota, 207 Pleasant Street S.E., Minneapolis, Minnesota 55455-0431
- Christopher J. Cramer
- Department of Chemistry and Supercomputing Institute, University of Minnesota, 207 Pleasant Street S.E., Minneapolis, Minnesota 55455-0431
- Donald G. Truhlar
- Department of Chemistry and Supercomputing Institute, University of Minnesota, 207 Pleasant Street S.E., Minneapolis, Minnesota 55455-0431
|
48
|
Durham E, Dorr B, Woetzel N, Staritzbichler R, Meiler J. Solvent accessible surface area approximations for rapid and accurate protein structure prediction. J Mol Model 2009; 15:1093-108. [PMID: 19234730 PMCID: PMC2712621 DOI: 10.1007/s00894-009-0454-9] [Citation(s) in RCA: 188] [Impact Index Per Article: 12.5] [Received: 11/16/2008] [Accepted: 01/02/2009] [Indexed: 12/01/2022]
Abstract
The burial of hydrophobic amino acids in the protein core is a driving force in protein folding. The extent to which an amino acid interacts with the solvent and the protein core is naturally proportional to the surface area exposed to these environments. However, an accurate calculation of the solvent-accessible surface area (SASA), a geometric measure of this exposure, is numerically demanding as it is not pair-wise decomposable. Furthermore, it depends on a full-atom representation of the molecule. This manuscript introduces a series of four SASA approximations of increasing computational complexity and accuracy as well as knowledge-based environment free energy potentials based on these SASA approximations. Their ability to distinguish correctly folded from incorrectly folded protein models is assessed to balance speed and accuracy for protein structure prediction. We find the newly developed “Neighbor Vector” algorithm provides the best balance between accuracy and speed for exposure measures.
Affiliation(s)
- Elizabeth Durham
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, 465 21st Ave South, Nashville, TN 37232-8725, USA
|
49
|
Lesch BJ, Gehrke AR, Bulyk ML, Bargmann CI. Transcriptional regulation and stabilization of left-right neuronal identity in C. elegans. Genes Dev 2009; 23:345-58. [PMID: 19204119 PMCID: PMC2648548 DOI: 10.1101/gad.1763509] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.1] [Received: 11/18/2008] [Accepted: 12/23/2008] [Indexed: 01/30/2023]
Abstract
At discrete points in development, transient signals are transformed into long-lasting cell fates. For example, the asymmetric identities of two Caenorhabditis elegans olfactory neurons called AWC(ON) and AWC(OFF) are specified by an embryonic signaling pathway, but maintained throughout the life of an animal. Here we show that the DNA-binding protein NSY-7 acts to convert a transient, partially differentiated state into a stable AWC(ON) identity. Expression of an AWC(ON) marker is initiated in nsy-7 loss-of-function mutants, but subsequently lost, so that most adult animals have two AWC(OFF) neurons and no AWC(ON) neurons. nsy-7 encodes a protein with distant similarity to a homeodomain. It is expressed in AWC(ON), and is an early transcriptional target of the embryonic signaling pathway that specifies AWC(ON) and AWC(OFF); its expression anticipates future AWC asymmetry. The NSY-7 protein binds a specific optimal DNA sequence that was identified through a complete biochemical survey of 8-mer DNA sequences. This sequence is present in the promoter of an AWC(OFF) marker and essential for its asymmetric expression. An 11-base-pair (bp) sequence required for AWC(OFF) expression has two activities: One region activates expression in both AWCs, and the overlapping NSY-7-binding site inhibits expression in AWC(ON). Our results suggest that NSY-7 responds to transient embryonic signaling by repressing AWC(OFF) genes in AWC(ON), thus acting as a transcriptional selector for a randomly specified neuronal identity.
Affiliation(s)
- Bluma J. Lesch
- Howard Hughes Medical Institute, Laboratory of Neural Circuits and Behavior, The Rockefeller University, New York, New York 10065, USA
- Andrew R. Gehrke
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
- Martha L. Bulyk
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
- Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
- Harvard-Massachusetts Institute of Technology Division of Health Sciences and Technology (HST), Harvard Medical School, Boston, Massachusetts 02115, USA
- Cornelia I. Bargmann
- Howard Hughes Medical Institute, Laboratory of Neural Circuits and Behavior, The Rockefeller University, New York, New York 10065, USA
|
50
|
Reynolds KA, Hanes MS, Thomson JM, Antczak AJ, Berger JM, Bonomo RA, Kirsch JF, Handel TM. Computational redesign of the SHV-1 beta-lactamase/beta-lactamase inhibitor protein interface. J Mol Biol 2008; 382:1265-75. [PMID: 18775544 PMCID: PMC4085744 DOI: 10.1016/j.jmb.2008.05.051] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.4] [Received: 02/06/2008] [Revised: 04/20/2008] [Accepted: 05/15/2008] [Indexed: 01/07/2023]
Abstract
Beta-lactamases are enzymes that catalyze the hydrolysis of beta-lactam antibiotics. Beta-lactamase/beta-lactamase inhibitor protein (BLIP) complexes are emerging as a well characterized experimental model system for studying protein-protein interactions. BLIP is a 165 amino acid protein that inhibits several class A beta-lactamases with a wide range of affinities: picomolar affinity for K1; nanomolar affinity for TEM-1, SME-1, and BlaI; but only micromolar affinity for SHV-1 beta-lactamase. The large differences in affinity coupled with the availability of extensive mutagenesis data and high-resolution crystal structures for the TEM-1/BLIP and SHV-1/BLIP complexes make them attractive systems for the further development of computational design methodology. We used EGAD, a physics-based computational design program, to redesign BLIP in an attempt to increase affinity for SHV-1. Characterization of several designs and point mutants revealed that in all cases, the mutations stabilize the interface by 10- to 1000-fold relative to wild type BLIP. The calculated changes in binding affinity for the mutants were within a mean absolute error of 0.87 kcal/mol from the experimental values, and comparison of the calculated and experimental values for a set of 30 SHV-1/BLIP complexes yielded a correlation coefficient of 0.77. Structures of the two complexes with the highest affinity, SHV-1/BLIP (E73M) and SHV-1/BLIP (E73M, S130K, S146M), are presented at 1.7 Å resolution. While the predicted structures have much in common with the experimentally determined structures, they do not coincide perfectly; in particular a salt bridge between SHV-1 D104 and BLIP K74 is observed in the experimental structures, but not in the predicted design conformations. This discrepancy highlights the difficulty of modeling salt bridge interactions with a protein design algorithm that approximates side chains as discrete rotamers. Nevertheless, while local structural features of the interface were sometimes miscalculated, EGAD is globally successful in designing complexes with increased affinity.
Affiliation(s)
- Kimberly A. Reynolds
- Biophysics Graduate Group, University of California, Berkeley, Berkeley, CA 94720; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA 92093-0684
- Melinda S. Hanes
- Biophysics Graduate Group, University of California, Berkeley, Berkeley, CA 94720; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA 92093-0684
- Jodi M. Thomson
- Louis Stokes Cleveland Department of Veterans Affairs Medical Center and the Department of Pharmacology, Case Western Reserve University, School of Medicine, Cleveland, Ohio 44106
- Andrew J. Antczak
- Department of Molecular and Cell Biology and QB3 institute, University of California, Berkeley, Berkeley, CA 94720
- James M. Berger
- Department of Molecular and Cell Biology and QB3 institute, University of California, Berkeley, Berkeley, CA 94720
- Robert A. Bonomo
- Louis Stokes Cleveland Department of Veterans Affairs Medical Center and the Department of Pharmacology, Case Western Reserve University, School of Medicine, Cleveland, Ohio 44106
- Jack F. Kirsch
- Department of Molecular and Cell Biology and QB3 institute, University of California, Berkeley, Berkeley, CA 94720
- Tracy M. Handel
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA 92093-0684. Address correspondence to: Dr. Tracy M. Handel, Skaggs School of Pharmacy and Pharmaceutical Sciences, 9500 Gilman Dr. Mail Code 0684, La Jolla, CA 92093-0684; Tel: 858-822-6656; Fax: 858-822-6655
|