1
Hwang W, Austin SL, Blondel A, Boittier ED, Boresch S, Buck M, Buckner J, Caflisch A, Chang HT, Cheng X, Choi YK, Chu JW, Crowley MF, Cui Q, Damjanovic A, Deng Y, Devereux M, Ding X, Feig MF, Gao J, Glowacki DR, Gonzales JE, Hamaneh MB, Harder ED, Hayes RL, Huang J, Huang Y, Hudson PS, Im W, Islam SM, Jiang W, Jones MR, Käser S, Kearns FL, Kern NR, Klauda JB, Lazaridis T, Lee J, Lemkul JA, Liu X, Luo Y, MacKerell AD, Major DT, Meuwly M, Nam K, Nilsson L, Ovchinnikov V, Paci E, Park S, Pastor RW, Pittman AR, Post CB, Prasad S, Pu J, Qi Y, Rathinavelan T, Roe DR, Roux B, Rowley CN, Shen J, Simmonett AC, Sodt AJ, Töpfer K, Upadhyay M, van der Vaart A, Vazquez-Salazar LI, Venable RM, Warrensford LC, Woodcock HL, Wu Y, Brooks CL, Brooks BR, Karplus M. CHARMM at 45: Enhancements in Accessibility, Functionality, and Speed. J Phys Chem B 2024; 128:9976-10042. [PMID: 39303207 PMCID: PMC11492285 DOI: 10.1021/acs.jpcb.4c04100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2024] [Revised: 08/15/2024] [Accepted: 08/22/2024] [Indexed: 09/22/2024]
Abstract
Since its inception nearly a half century ago, CHARMM has been playing a central role in computational biochemistry and biophysics. Commensurate with the developments in experimental research and advances in computer hardware, the range of methods and applicability of CHARMM have also grown. This review summarizes major developments that occurred after 2009 when the last review of CHARMM was published. They include the following: new faster simulation engines, accessible user interfaces for convenient workflows, and a vast array of simulation and analysis methods that encompass quantum mechanical, atomistic, and coarse-grained levels, as well as extensive coverage of force fields. In addition to providing the current snapshot of the CHARMM development, this review may serve as a starting point for exploring relevant theories and computational methods for tackling contemporary and emerging problems in biomolecular systems. CHARMM is freely available for academic and nonprofit research at https://academiccharmm.org/program.
Affiliation(s)
- Wonmuk Hwang
  - Department of Biomedical Engineering, Texas A&M University, College Station, Texas 77843, United States
  - Department of Materials Science and Engineering, Texas A&M University, College Station, Texas 77843, United States
  - Department of Physics and Astronomy, Texas A&M University, College Station, Texas 77843, United States
  - Center for AI and Natural Sciences, Korea Institute for Advanced Study, Seoul 02455, Republic of Korea
- Steven L. Austin
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- Arnaud Blondel
  - Institut Pasteur, Université Paris Cité, CNRS UMR3825, Structural Bioinformatics Unit, 28 rue du Dr. Roux F-75015 Paris, France
- Eric D. Boittier
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Stefan Boresch
  - Faculty of Chemistry, Department of Computational Biological Chemistry, University of Vienna, Wahringerstrasse 17, 1090 Vienna, Austria
- Matthias Buck
  - Department of Physiology and Biophysics, Case Western Reserve University, School of Medicine, Cleveland, Ohio 44106, United States
- Joshua Buckner
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Amedeo Caflisch
  - Department of Biochemistry, University of Zürich, CH-8057 Zürich, Switzerland
- Hao-Ting Chang
  - Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan, ROC
- Xi Cheng
  - Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Yeol Kyo Choi
  - Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Jhih-Wei Chu
  - Institute of Bioinformatics and Systems Biology, Department of Biological Science and Technology, Institute of Molecular Medicine and Bioengineering, and Center for Intelligent Drug Systems and Smart Bio-devices (IDSB), National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan, ROC
- Michael F. Crowley
  - Renewable Resources and Enabling Sciences Center, National Renewable Energy Laboratory, Golden, Colorado 80401, United States
- Qiang Cui
  - Department of Chemistry, Boston University, 590 Commonwealth Avenue, Boston, Massachusetts 02215, United States
  - Department of Physics, Boston University, 590 Commonwealth Avenue, Boston, Massachusetts 02215, United States
  - Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, Massachusetts 02215, United States
- Ana Damjanovic
  - Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, United States
  - Department of Physics and Astronomy, Johns Hopkins University, Baltimore, Maryland 21218, United States
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Yuqing Deng
  - Shanghai R&D Center, DP Technology, Ltd., Shanghai 201210, China
- Mike Devereux
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Xinqiang Ding
  - Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States
- Michael F. Feig
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- Jiali Gao
  - School of Chemical Biology & Biotechnology, Peking University Shenzhen Graduate School, Shenzhen, Guangdong 518055, China
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, Guangdong 518055, China
  - Department of Chemistry and Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455, United States
- David R. Glowacki
  - CiTIUS Centro Singular de Investigación en Tecnoloxías Intelixentes da USC, 15705 Santiago de Compostela, Spain
- James E. Gonzales
  - Department of Biomedical Engineering, Texas A&M University, College Station, Texas 77843, United States
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Mehdi Bagerhi Hamaneh
  - Department of Physiology and Biophysics, Case Western Reserve University, School of Medicine, Cleveland, Ohio 44106, United States
- Ryan L. Hayes
  - Department of Chemical and Biomolecular Engineering, University of California, Irvine, Irvine, California 92697, United States
  - Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, United States
- Jing Huang
  - Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang 310024, China
- Yandong Huang
  - College of Computer Engineering, Jimei University, Xiamen 361021, China
- Phillip S. Hudson
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
  - Medicine Design, Pfizer Inc., Cambridge, Massachusetts 02139, United States
- Wonpil Im
  - Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Shahidul M. Islam
  - Department of Chemistry, Delaware State University, Dover, Delaware 19901, United States
- Wei Jiang
  - Computational Science Division, Argonne National Laboratory, Argonne, Illinois 60439, United States
- Michael R. Jones
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Silvan Käser
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Fiona L. Kearns
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- Nathan R. Kern
  - Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Jeffery B. Klauda
  - Department of Chemical and Biomolecular Engineering, Institute for Physical Science and Technology, Biophysics Program, University of Maryland, College Park, Maryland 20742, United States
- Themis Lazaridis
  - Department of Chemistry, City College of New York, New York, New York 10031, United States
- Jinhyuk Lee
  - Disease Target Structure Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon 34141, Republic of Korea
  - Department of Bioinformatics, KRIBB School of Bioscience, University of Science and Technology, Daejeon 34141, Republic of Korea
- Justin A. Lemkul
  - Department of Biochemistry, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061, United States
- Xiaorong Liu
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Yun Luo
  - Department of Biotechnology and Pharmaceutical Sciences, College of Pharmacy, Western University of Health Sciences, Pomona, California 91766, United States
- Alexander D. MacKerell
  - Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States
- Dan T. Major
  - Department of Chemistry and Institute for Nanotechnology & Advanced Materials, Bar-Ilan University, Ramat-Gan 52900, Israel
- Markus Meuwly
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
  - Department of Chemistry, Brown University, Providence, Rhode Island 02912, United States
- Kwangho Nam
  - Department of Chemistry and Biochemistry, University of Texas at Arlington, Arlington, Texas 76019, United States
- Lennart Nilsson
  - Karolinska Institutet, Department of Biosciences and Nutrition, SE-14183 Huddinge, Sweden
- Victor Ovchinnikov
  - Harvard University, Department of Chemistry and Chemical Biology, Cambridge, Massachusetts 02138, United States
- Emanuele Paci
  - Dipartimento di Fisica e Astronomia, Universitá di Bologna, Bologna 40127, Italy
- Soohyung Park
  - Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Richard W. Pastor
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Amanda R. Pittman
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- Carol Beth Post
  - Borch Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana 47907, United States
- Samarjeet Prasad
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Jingzhi Pu
  - Department of Chemistry and Chemical Biology, Indiana University Indianapolis, Indianapolis, Indiana 46202, United States
- Yifei Qi
  - School of Pharmacy, Fudan University, Shanghai 201203, China
- Daniel R. Roe
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Benoit Roux
  - Department of Chemistry, University of Chicago, Chicago, Illinois 60637, United States
- Jana Shen
  - Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States
- Andrew C. Simmonett
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Alexander J. Sodt
  - Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland 20892, United States
- Kai Töpfer
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Meenu Upadhyay
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Arjan van der Vaart
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- Richard M. Venable
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Luke C. Warrensford
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- H. Lee Woodcock
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- Yujin Wu
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Charles L. Brooks
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Bernard R. Brooks
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Martin Karplus
  - Harvard University, Department of Chemistry and Chemical Biology, Cambridge, Massachusetts 02138, United States
  - Laboratoire de Chimie Biophysique, ISIS, Université de Strasbourg, 67000 Strasbourg, France
2
Opuu V, Nigro G, Lazennec‐Schurdevin C, Mechulam Y, Schmitt E, Simonson T. Redesigning methionyl-tRNA synthetase for β-methionine activity with adaptive landscape flattening and experiments. Protein Sci 2023; 32:e4738. [PMID: 37518893 PMCID: PMC10451022 DOI: 10.1002/pro.4738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 07/21/2023] [Accepted: 07/23/2023] [Indexed: 08/01/2023]
Abstract
Amino acids (AAs) with a noncanonical backbone would be a valuable tool for protein engineering, enabling new structural motifs and building blocks. To incorporate them into an expanded genetic code, the first, key step is to obtain an appropriate aminoacyl-tRNA synthetase. Currently, directed evolution is not available to optimize AAs with noncanonical backbones, since an appropriate selective pressure has not been discovered. Computational protein design (CPD) is an alternative. We used a new CPD method to redesign MetRS and increase its activity towards β-Met, which has an extra backbone methylene. The new method considered a few active site positions for design and used a Monte Carlo exploration of the corresponding sequence space. During the exploration, a bias energy was adaptively learned, such that the free energy landscape of the apo enzyme was flattened. Enzyme variants could then be sampled, in the presence of the ligand and the bias energy, according to their β-Met binding affinities. Eighteen predicted variants were chosen for experimental testing; 10 exhibited detectable activity for β-Met adenylation. Top predicted hits were characterized experimentally in detail. Dissociation constants, catalytic rates, and Michaelis constants for both α-Met and β-Met were measured. The best mutant retained a preference for α-Met over β-Met; however, the preference was reduced, compared to the wildtype, by a factor of 29. For this mutant, high resolution crystal structures were obtained in complex with both α-Met and β-Met, indicating that the predicted, active conformation of β-Met in the active site was retained.
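The two-stage idea in this abstract (learn a bias that flattens the apo landscape, then reuse that bias with the ligand present) can be illustrated with a toy model. The sketch below is not the authors' implementation: the four variants, their energies, the Metropolis walk over whole variants, and the decaying increment schedule are all assumptions made up for illustration.

```python
import math
import random

KT = 0.6  # thermal energy in kcal/mol, roughly room temperature

# Hypothetical energies (kcal/mol) for four variants, invented for illustration.
APO_ENERGY = [0.0, 2.0, -1.5, 3.0]      # enzyme without ligand
HOLO_ENERGY = [-5.0, -4.5, -5.5, -4.5]  # enzyme with ligand bound


def learn_flattening_bias(energies, steps=200_000, increment=0.05, seed=0):
    """Metropolis walk over variants; every visit raises that variant's bias."""
    rng = random.Random(seed)
    bias = [0.0] * len(energies)
    state = 0
    for step in range(steps):
        trial = rng.randrange(len(energies))  # symmetric proposal
        delta = (energies[trial] + bias[trial]) - (energies[state] + bias[state])
        if delta <= 0 or rng.random() < math.exp(-delta / KT):
            state = trial
        # Assumed schedule: the increment decays so the bias settles down.
        bias[state] += increment / (1.0 + step / 1000.0)
    return bias


def biased_populations(energies, bias):
    """Boltzmann populations under energy + bias (shifted for numerical safety)."""
    totals = [e + b for e, b in zip(energies, bias)]
    lowest = min(totals)
    weights = [math.exp(-(t - lowest) / KT) for t in totals]
    norm = sum(weights)
    return [w / norm for w in weights]


bias = learn_flattening_bias(APO_ENERGY)
# After learning, apo energy + bias should be nearly the same for every variant.
flattened = [e + b for e, b in zip(APO_ENERGY, bias)]
# With the same bias, holo populations follow exp(-(E_holo - E_apo) / kT).
holo_populations = biased_populations(HOLO_ENERGY, bias)
best_variant = max(range(len(holo_populations)), key=holo_populations.__getitem__)
```

Because the bias cancels the apo energy up to a constant, the most populated variant in the biased holo ensemble is the one with the most favorable holo-minus-apo energy difference, which is the toy analogue of sampling variants according to binding affinity.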
Affiliation(s)
- Vaitea Opuu
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Giuliano Nigro
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Christine Lazennec‐Schurdevin
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Yves Mechulam
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Emmanuelle Schmitt
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Thomas Simonson
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
3
Michael E, Saint-Jalme R, Mignon D, Simonson T. Computational protein design repurposed to explore enzyme vitality and help predict antibiotic resistance. Front Mol Biosci 2023; 9:905588. [PMID: 36699702 PMCID: PMC9868620 DOI: 10.3389/fmolb.2022.905588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2022] [Accepted: 12/19/2022] [Indexed: 01/11/2023] Open
Abstract
In response to antibiotics that inhibit a bacterial enzyme, resistance mutations inevitably arise. Predicting them ahead of time would aid target selection and drug design. The simplest resistance mechanism would be to reduce antibiotic binding without sacrificing too much substrate binding. The property that reflects this is the enzyme "vitality", defined here as the difference between the inhibitor and substrate binding free energies. To predict such mutations, we borrow methodology from computational protein design. We use a Monte Carlo exploration of mutation space and vitality changes, allowing us to rank thousands of mutations and identify ones that might provide resistance through the simple mechanism considered. As an illustration, we chose dihydrofolate reductase, an essential enzyme targeted by several antibiotics. We simulated its complexes with the inhibitor trimethoprim and the substrate dihydrofolate. 20 active site positions were mutated, or "redesigned" individually, then in pairs or quartets. We computed the resulting binding free energy and vitality changes. Out of seven known resistance mutations involving active site positions, five were correctly recovered. Ten positions exhibited mutations with significant predicted vitality gains. Direct couplings between designed positions were predicted to be small, which reduces the combinatorial complexity of the mutation space to be explored. It also suggests that over the course of evolution, resistance mutations involving several positions do not need the underlying point mutations to arise all at once: they can appear and become fixed one after the other.
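As a small worked illustration of the vitality definition given in this abstract (inhibitor binding free energy minus substrate binding free energy), the sketch below ranks mutations by their vitality change relative to the wild type. The function names, mutation labels, and free energy values are hypothetical and are not taken from the paper.

```python
def vitality(dg_inhibitor, dg_substrate):
    """Vitality as defined in the abstract: inhibitor minus substrate binding free energy."""
    return dg_inhibitor - dg_substrate


def vitality_change(wild_type, mutant):
    """Vitality of the mutant relative to the wild type; positive means a gain."""
    return vitality(*mutant) - vitality(*wild_type)


def rank_mutations(wild_type, mutants):
    """Sort mutations from largest to smallest vitality change."""
    changes = {name: vitality_change(wild_type, dg) for name, dg in mutants.items()}
    return sorted(changes.items(), key=lambda item: item[1], reverse=True)


# Invented (inhibitor, substrate) binding free energies in kcal/mol.
WILD_TYPE = (-10.0, -8.0)
MUTANTS = {
    "mutA": (-7.5, -7.8),   # inhibitor binding weakened, substrate nearly kept
    "mutB": (-10.2, -6.0),  # substrate binding weakened instead
    "mutC": (-9.0, -8.1),
}

ranking = rank_mutations(WILD_TYPE, MUTANTS)
for name, change in ranking:
    print(f"{name}: vitality change {change:+.2f} kcal/mol")
```

With more negative values meaning tighter binding, a positive vitality change corresponds to the simple resistance mechanism the abstract describes: the inhibitor loses more binding free energy than the substrate does.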
4
Magi Meconi G, Sasselli IR, Bianco V, Onuchic JN, Coluzza I. Key aspects of the past 30 years of protein design. Rep Prog Phys 2022; 85:086601. [PMID: 35704983 DOI: 10.1088/1361-6633/ac78ef] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/03/2021] [Accepted: 06/15/2022] [Indexed: 06/15/2023]
Abstract
Proteins are the workhorse of life. They are the building infrastructure of living systems; they are the most efficient molecular machines known, and their enzymatic activity is still unmatched in versatility by any artificial system. Perhaps proteins' most remarkable feature is their modularity. The large amount of information required to specify each protein's function is analogically encoded with an alphabet of just ∼20 letters. The protein folding problem is how to encode all such information in a sequence of 20 letters. In this review, we go through the last 30 years of research to summarize the state of the art and highlight some applications related to fundamental problems of protein evolution.
Affiliation(s)
- Giulia Magi Meconi
  - Computational Biophysics Lab, Center for Cooperative Research in Biomaterials (CIC biomaGUNE), Basque Research and Technology Alliance (BRTA), Paseo de Miramon 182, 20014, Donostia-San Sebastián, Spain
- Ivan R Sasselli
  - Computational Biophysics Lab, Center for Cooperative Research in Biomaterials (CIC biomaGUNE), Basque Research and Technology Alliance (BRTA), Paseo de Miramon 182, 20014, Donostia-San Sebastián, Spain
- Jose N Onuchic
  - Center for Theoretical Biological Physics, Department of Physics & Astronomy, Department of Chemistry, Department of Biosciences, Rice University, Houston, TX 77251, United States of America
- Ivan Coluzza
  - BCMaterials, Basque Center for Materials, Applications and Nanostructures, Bld. Martina Casiano, UPV/EHU Science Park, Barrio Sarriena s/n, 48940 Leioa, Spain
  - Basque Foundation for Science, Ikerbasque, 48009, Bilbao, Spain
5
Vankayala SL, Warrensford LC, Pittman AR, Pollard BC, Kearns FL, Larkin JD, Woodcock HL. CIFDock: A novel CHARMM-based flexible receptor-flexible ligand docking protocol. J Comput Chem 2022; 43:84-95. [PMID: 34741467 DOI: 10.1002/jcc.26759] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/02/2020] [Revised: 01/28/2021] [Accepted: 03/25/2021] [Indexed: 12/13/2022]
Abstract
Docking studies play a critical role in the current workflow of drug discovery. However, limitations may often arise through factors including inadequate ligand sampling, a lack of protein flexibility, scoring function inadequacies (e.g., due to metals, co-factors, etc.), and difficulty in retaining explicit water molecules. Herein, we present a novel CHARMM-based induced fit docking (CIFDock) workflow that can circumvent these limitations by employing all-atom force fields coupled to enhanced sampling molecular dynamics procedures. Self-guided Langevin dynamics (SGLD) simulations are used to effectively sample relevant ligand conformations, side chain orientations, crystal water positions, and active site residue motion. Protein flexibility is further enhanced by dynamic sampling of side chain orientations using an expandable rotamer library. Steps in the procedure that fix individual components (e.g., the ligand) while sampling the other components (e.g., the residues in the active site of the protein) allow the complex to adapt to conformational changes. Ultimately, all components of the complex (the protein, ligand, and waters) are sampled simultaneously and unrestrained with SGLD to capture any induced fit effects. This modular flexible docking procedure is automated using CHARMM scripting, interfaced with SLURM array processing, and parallelized to use the desired number of processors. We validated the CIFDock procedure by performing cross-docking studies using a data set composed of 21 pharmaceutically relevant proteins. Five variants of the CHARMM-based SWISSDOCK scoring functions were created to quantify the results of the final generated poses. Results obtained were comparable to, or in some cases improved upon, commercial docking program data.
Affiliation(s)
- Sai L Vankayala
  - Department of Chemistry, Eckerd College, St. Petersburg, Florida, USA
- Amanda R Pittman
  - Department of Chemistry, Eckerd College, St. Petersburg, Florida, USA
- Benjamin C Pollard
  - Department of Chemistry, University of South Florida, Tampa, Florida, USA
- Fiona L Kearns
  - Department of Chemistry, Eckerd College, St. Petersburg, Florida, USA
- Joseph D Larkin
  - Department of Chemistry, University of South Florida, Tampa, Florida, USA
- H Lee Woodcock
  - Department of Chemistry, Eckerd College, St. Petersburg, Florida, USA
6
Michael E, Polydorides S, Archontis G. Computational Design of Peptides with Improved Recognition of the Focal Adhesion Kinase FAT Domain. Methods Mol Biol 2022; 2405:383-402. [PMID: 35298823 DOI: 10.1007/978-1-0716-1855-4_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
We describe a two-stage computational protein design (CPD) methodology for the design of peptides binding to the FAT domain of the protein focal adhesion kinase. The first stage involves high-throughput CPD calculations with the Proteus software. The energies of the folded state are described by a physics-based energy function, and those of the unfolded peptides by a knowledge-based model that reproduces amino acid compositions consistent with a helicity scale. The obtained sequences are filtered in terms of the affinity and the stability of the complex. In the second stage, designed sequences are further evaluated by all-atom molecular dynamics simulations and binding free energy calculations with a molecular mechanics/implicit solvent free energy function.
Affiliation(s)
- Eleni Michael
  - Department of Physics, University of Cyprus, Nicosia, Cyprus
7
Zhu J, Avakyan N, Kakkis AA, Hoffnagle AM, Han K, Li Y, Zhang Z, Choi TS, Na Y, Yu CJ, Tezcan FA. Protein Assembly by Design. Chem Rev 2021; 121:13701-13796. [PMID: 34405992 PMCID: PMC9148388 DOI: 10.1021/acs.chemrev.1c00308] [Citation(s) in RCA: 112] [Impact Index Per Article: 37.3] [Indexed: 12/11/2022]
Abstract
Proteins are nature's primary building blocks for the construction of sophisticated molecular machines and dynamic materials, ranging from protein complexes such as photosystem II and nitrogenase that drive biogeochemical cycles to cytoskeletal assemblies and muscle fibers for motion. Such natural systems have inspired extensive efforts in the rational design of artificial protein assemblies in the last two decades. As molecular building blocks, proteins are highly complex, in terms of both their three-dimensional structures and chemical compositions. To enable control over the self-assembly of such complex molecules, scientists have devised many creative strategies by combining tools and principles of experimental and computational biophysics, supramolecular chemistry, inorganic chemistry, materials science, and polymer chemistry, among others. Owing to these innovative strategies, what started as a purely structure-building exercise two decades ago has, in short order, led to artificial protein assemblies with unprecedented structures and functions and protein-based materials with unusual properties. Our goal in this review is to give an overview of this exciting and highly interdisciplinary area of research, first outlining the design strategies and tools that have been devised for controlling protein self-assembly, then describing the diverse structures of artificial protein assemblies, and finally highlighting the emergent properties and functions of these assemblies.
Affiliation(s)
- Albert A. Kakkis
  - Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
- Alexander M. Hoffnagle
  - Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
- Kenneth Han
  - Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
- Yiying Li
  - Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
- Zhiyin Zhang
  - Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
- Tae Su Choi
  - Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
- Youjeong Na
  - Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
- Chung-Jui Yu
  - Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
- F. Akif Tezcan
  - Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
8
Polydorides S, Archontis G. Computational optimization of the SARS-CoV-2 receptor-binding-motif affinity for human ACE2. Biophys J 2021; 120:2859-2871. [PMID: 33984310 PMCID: PMC8110322 DOI: 10.1016/j.bpj.2021.02.049] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/20/2020] [Revised: 01/19/2021] [Accepted: 02/15/2021] [Indexed: 01/15/2023] Open
Abstract
The coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which is responsible for the coronavirus disease 2019 pandemic, and the closely related SARS-CoV coronavirus enter cells by binding at the human angiotensin converting enzyme 2 (hACE2). The stronger hACE2 affinity of SARS-CoV-2 has been connected with its higher infectivity. In this work, we study hACE2 complexes with the receptor-binding domains (RBDs) of the human SARS-CoV-2 and human SARS-CoV viruses, using all-atom molecular dynamics simulations and computational protein design with a physics-based energy function. The molecular dynamics simulations identify charge-modifying substitutions between the CoV-2 and CoV RBDs, which either increase or decrease the hACE2 affinity of the SARS-CoV-2 RBD. The combined effect of these mutations is small, and the relative affinity is mainly determined by substitutions at residues in contact with hACE2. Many of these findings are in line with, and help interpret, recent experiments. Our computational protein design calculations redesign positions 455, 493, 494, and 501 of the SARS-CoV-2 receptor binding motif, which contact hACE2 in the complex and are important for ACE2 recognition. Sampling is enhanced by an adaptive importance sampling Monte Carlo method. Sequences with increased affinity replace CoV-2 glutamine by a negative residue at position 493; serine by a nonpolar or aromatic residue or an asparagine at position 494; and asparagine by valine or threonine at position 501. Substitutions at positions 455 and 501 have a smaller effect on affinity. Substitutions suggested by our design are seen in viral sequences encountered in other species, including bat and pangolin. Our results might be used to identify potential virus strains with higher human infectivity and assist in the design of peptide-based or peptidomimetic compounds with the potential to inhibit SARS-CoV-2 binding at hACE2.
9
Michael E, Polydorides S, Simonson T, Archontis G. Hybrid MC/MD for protein design. J Chem Phys 2021; 153:054113. [PMID: 32770896 DOI: 10.1063/5.0013320] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 01/22/2023] Open
Abstract
Computational protein design relies on simulations of a protein structure, where selected amino acids can mutate randomly, and mutations are selected to enhance a target property, such as stability. Often, the protein backbone is held fixed and its degrees of freedom are modeled implicitly to reduce the complexity of the conformational space. We present a hybrid method where short molecular dynamics (MD) segments are used to explore conformations and alternate with Monte Carlo (MC) moves that apply mutations to side chains. The backbone is fully flexible during MD. As a test, we computed side chain acid/base constants or pKa's in five proteins. This problem can be considered a special case of protein design, with protonation/deprotonation playing the role of mutations. The solvent was modeled as a dielectric continuum. Due to cost, in each protein we allowed just one side chain position to change its protonation state and the other position to change its type or mutate. The pKa's were computed with a standard method that scans a range of pH values and with a new method that uses adaptive landscape flattening (ALF) to sample all protonation states in a single simulation. The hybrid method gave notably better accuracy than standard, fixed-backbone MC. ALF decreased the computational cost by a factor of 13.
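The pH-scan route to a pKa mentioned in this abstract can be pictured with a short sketch: record the protonated fraction at each pH and locate the pH where it crosses one half. The code below assumes an ideal Henderson-Hasselbalch titration curve and uses synthetic fractions in place of simulation output; the function names, the pH grid, and the "true" pKa of 4.3 are invented for illustration and do not come from the paper.

```python
def protonated_fraction(ph, pka):
    """Ideal Henderson-Hasselbalch protonated fraction for a single site."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def estimate_pka(ph_values, fractions):
    """Interpolate linearly to the pH where the protonated fraction equals 0.5."""
    points = list(zip(ph_values, fractions))
    for (ph_lo, f_lo), (ph_hi, f_hi) in zip(points, points[1:]):
        if f_lo >= 0.5 >= f_hi:
            if f_lo == f_hi:
                return ph_lo
            return ph_lo + (f_lo - 0.5) * (ph_hi - ph_lo) / (f_lo - f_hi)
    raise ValueError("protonated fraction never crosses 0.5 in the scanned range")


# Synthetic scan: in a real study each fraction would come from one simulation.
TRUE_PKA = 4.3  # hypothetical value used only to generate the test data
ph_grid = [2.0 + 0.5 * i for i in range(11)]  # pH 2.0 to 7.0
fractions = [protonated_fraction(ph, TRUE_PKA) for ph in ph_grid]

pka_estimate = estimate_pka(ph_grid, fractions)
print(f"estimated pKa: {pka_estimate:.2f}")
```

Each grid point costs one simulation in such a scan, which gives some intuition for why a method that samples all protonation states in a single run can be cheaper.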
Affiliation(s)
- Eleni Michael
- Department of Physics, University of Cyprus, P.O 20537, CY678 Nicosia, Cyprus
| | - Savvas Polydorides
- Department of Physics, University of Cyprus, P.O 20537, CY678 Nicosia, Cyprus
| | - Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Georgios Archontis
- Department of Physics, University of Cyprus, P.O 20537, CY678 Nicosia, Cyprus
| |
|
10
|
Abstract
This chapter describes two computational methods for PDZ-peptide binding: high-throughput computational protein design (CPD) and a medium-throughput approach combining molecular dynamics for conformational sampling with a Poisson-Boltzmann (PB) Linear Interaction Energy for scoring. A new CPD method is outlined, which uses adaptive Monte Carlo simulations to efficiently sample peptide variants that tightly bind a PDZ domain, and provides at the same time precise estimates of their relative binding free energies. A detailed protocol is described based on the Proteus CPD software. The medium-throughput approach can be performed with standard MD and PB software, such as NAMD and Charmm. For 40 complexes between Tiam1 and peptide ligands, it gave high accuracy, with mean errors of around 0.5 kcal/mol for relative binding free energies and no large errors. It requires a moderate amount of parameter fitting before it can be applied, and its transferability to other protein families is still untested.
Affiliation(s)
- Nicolas Panel
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Francesco Villa
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Vaitea Opuu
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - David Mignon
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Thomas Simonson
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France.
| |
|
11
|
Roberts J, Song Y, Crocker M, Risko C. A Genetic Algorithmic Approach to Determine the Structure of Li-Al Layered Double Hydroxides. J Chem Inf Model 2020; 60:4845-4855. [PMID: 32794767 DOI: 10.1021/acs.jcim.0c00493] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
Layered double hydroxides (LDH) demonstrate significant potential across a range of applications, including as catalysts, delivery vehicles for pharmaceuticals, environmental remediation, and supercapacitors. Explaining the mechanism of LDH action at the atomic scale in these and other applications is challenging, however, due to the difficulty in precisely defining the bulk and surface structure and chemical compositions. Here, we focus on the determination of the structure of lithium-aluminum (Li-Al) LDH, which has shown promise in the catalytic depolymerization of lignin, both directly as the catalyst and as a support for gold nanoparticles. While the relative positions of the Li and Al metals are generally well resolved by X-ray crystallography, it is the structures of the anionic layers, consisting of water and carbonate, that are less well established. Combinatorial analyses of all possible positions and rotations of the water and carbonate in the three-layered Li-Al LDH polytope reveal that the phase space is much too large to examine in any reasonable time frame in a one-by-one structure exploration. To overcome this limitation, we develop and deploy a genetic algorithm (GA) wherein fitness is determined by matching a calculated X-ray diffraction (XRD) pattern for a given structure to the known experimental XRD pattern. The GA approach results in structures of high fitness that portend the bulk Li-Al LDH structure. Importantly, the GA approach offers the potential to determine the structures of other LDH, and more generally layered materials, which are generally difficult to describe given the large chemical and structural space to be explored.
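The genetic-algorithm idea in this abstract (fitness measured by how well a calculated pattern matches a reference pattern) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `pattern` function is a hypothetical stand-in for an XRD calculation, the genome is a short list of integer "orientations", and the selection, crossover, and mutation settings are arbitrary. It is not the authors' implementation.

```python
import random

def pattern(genome):
    # Hypothetical stand-in for a calculated diffraction pattern:
    # each "bin" depends on one site and its right-hand neighbour.
    n = len(genome)
    return [genome[i] + 0.5 * genome[(i + 1) % n] for i in range(n)]

def fitness(genome, target_pattern):
    # Higher is better; 0 means the calculated pattern matches the target exactly.
    return -sum((a - b) ** 2 for a, b in zip(pattern(genome), target_pattern))

def evolve(target_pattern, n_sites, n_states=4, pop_size=40,
           generations=60, mutation_rate=0.1, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randrange(n_states) for _ in range(n_sites)]
           for _ in range(pop_size)]
    history = []  # best fitness seen at each generation
    for _ in range(generations):
        pop.sort(key=lambda g: fitness(g, target_pattern), reverse=True)
        history.append(fitness(pop[0], target_pattern))
        parents = pop[: pop_size // 2]      # truncation selection
        children = [pop[0][:]]              # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_sites)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [rng.randrange(n_states) if rng.random() < mutation_rate else s
                     for s in child]
            children.append(child)
        pop = children
    best = max(pop, key=lambda g: fitness(g, target_pattern))
    history.append(fitness(best, target_pattern))
    return best, history

# Usage: recover a hidden arrangement from its pattern alone.
hidden = [0, 1, 2, 3, 3, 2, 1, 0]
best, history = evolve(pattern(hidden), n_sites=len(hidden))
```

Because the best individual is always copied into the next generation, the best fitness in `history` can never decrease; whether the search reaches an exact match depends on the toy settings.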
Affiliation(s)
- Josiah Roberts
- Department of Chemistry and Center for Applied Energy Research (CAER), University of Kentucky, Lexington, Kentucky 40506, United States
| | - Yang Song
- Department of Chemistry and Center for Applied Energy Research (CAER), University of Kentucky, Lexington, Kentucky 40506, United States
| | - Mark Crocker
- Department of Chemistry and Center for Applied Energy Research (CAER), University of Kentucky, Lexington, Kentucky 40506, United States
| | - Chad Risko
- Department of Chemistry and Center for Applied Energy Research (CAER), University of Kentucky, Lexington, Kentucky 40506, United States
| |
|
12
|
Adaptive landscape flattening allows the design of both enzyme:substrate binding and catalytic power. PLoS Comput Biol 2020; 16:e1007600. [PMID: 31917825 PMCID: PMC7041857 DOI: 10.1371/journal.pcbi.1007600] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/20/2019] [Revised: 02/25/2020] [Accepted: 12/11/2019] [Indexed: 01/30/2023]
Abstract
Designed enzymes are of fundamental and technological interest. Experimental directed evolution still has significant limitations, and computational approaches are a complementary route. A designed enzyme should satisfy multiple criteria: stability, substrate binding, transition state binding. Such multi-objective design is computationally challenging. Two recent studies used adaptive importance sampling Monte Carlo to redesign proteins for ligand binding. By first flattening the energy landscape of the apo protein, they obtained positive design for the bound state and negative design for the unbound. We have now extended the method to design an enzyme for specific transition state binding, i.e., for its catalytic power. We considered methionyl-tRNA synthetase (MetRS), which attaches methionine (Met) to its cognate tRNA, establishing codon identity. Previously, MetRS and other synthetases have been redesigned by experimental directed evolution to accept noncanonical amino acids as substrates, leading to genetic code expansion. Here, we have redesigned MetRS computationally to bind several ligands: the Met analog azidonorleucine, methionyl-adenylate (MetAMP), and the activated ligands that form the transition state for MetAMP production. Enzyme mutants known to have azidonorleucine activity were recovered by the design calculations, and 17 mutants predicted to bind MetAMP were characterized experimentally and all found to be active. Mutants predicted to have low activation free energies for MetAMP production were found to be active and the predicted reaction rates agreed well with the experimental values. We suggest the present method should become the paradigm for computational enzyme design.
Designed enzymes are of major interest. Experimental directed evolution still has significant limitations, and computational approaches are another route. Enzymes must be stable, bind substrates, and be powerful catalysts. It is challenging to design for all these properties.
A method to design substrate binding was proposed recently. It used an adaptive Monte Carlo method to explore mutations of a few amino acids near the substrate. A bias energy was gradually “learned” such that, in the absence of the ligand, the simulation visited most of the possible protein mutations with comparable probabilities. Remarkably, a simulation of the protein:ligand complex, including the bias, will then preferentially sample tight-binding sequences. We generalized the method to design binding specificity. We tested it for the methionyl-tRNA synthetase enzyme, which has been engineered in order to expand the genetic code. We redesigned the enzyme to obtain variants with low activation free energies for the catalytic step. The variants proposed by the simulations were shown experimentally to be active, and the predicted activation free energies were in reasonable agreement with the experimental values. We expect the new method will become the paradigm for computational enzyme design.
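The flattening step described above (a bias is learned until, without the ligand, all variants are visited with comparable probabilities) can be sketched on a toy problem. The code below is an illustrative assumption, not the scheme used in the cited work: it treats each variant as a single state with a fixed energy, uses uniform trial moves, and adds a constant bias increment to whichever state is occupied. The function name and all parameter values are hypothetical.

```python
import math
import random

def flatten_landscape(energies, n_steps=100_000, kt=0.6, increment=0.01, seed=0):
    """Toy adaptive flattening: learn one bias per state so that a Metropolis
    walk on (energy + bias) visits all states about equally often."""
    rng = random.Random(seed)
    n = len(energies)
    bias = [0.0] * n
    visits = [0] * n
    state = 0
    for _ in range(n_steps):
        trial = rng.randrange(n)  # uniform trial "mutation"
        delta = (energies[trial] + bias[trial]) - (energies[state] + bias[state])
        if delta <= 0 or rng.random() < math.exp(-delta / kt):
            state = trial
        bias[state] += increment  # penalise the state currently occupied
        visits[state] += 1
    return bias, visits

# Usage: four hypothetical variants with unbound-state energies in kcal/mol.
unbound = [0.0, 1.0, 2.0, 3.0]
bias, visits = flatten_landscape(unbound)
flattened = [e + b for e, b in zip(unbound, bias)]
```

Once the bias is learned, `energy + bias` is roughly the same for every state. In the approach the abstract describes, reusing such a bias in a simulation of the complex makes the sampled populations reflect binding free energy differences; that second step is not shown here.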
|
13
|
Rose GD. Ramachandran maps for side chains in globular proteins. Proteins 2019; 87:357-364. [PMID: 30629766 DOI: 10.1002/prot.25656] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 12/08/2018] [Accepted: 12/30/2018] [Indexed: 11/05/2022]
Abstract
The Ramachandran plot for backbone ϕ,ψ-angles in a blocked monopeptide has played a central role in understanding protein structure. Curiously, a similar analysis for side chain χ-angles has been comparatively neglected. Instead, efforts have focused on compiling various types of side chain libraries extracted from proteins of known structure. Departing from this trend, the following analysis presents backbone-based maps of side chains in blocked monopeptides. As in the original ϕ,ψ-plot, these maps are derived solely from hard-sphere steric repulsion. Remarkably, the side chain biases exhibit marked similarities to corresponding biases seen in high-resolution protein structures. Consequently, some of the entropic cost for side chain localization in proteins is prepaid prior to the onset of folding events because conformational bias is built into the chain at the covalent level. Furthermore, side chain conformations are seen to experience fewer steric restrictions for backbone conformations in either the α or β basins, those map regions where repetitive ϕ,ψ-angles result in α-helices or strands of β-sheet, respectively. Here, these α and β basins are entropically favored for steric reasons alone; a blocked monopeptide is too short to accommodate the peptide hydrogen bonds that stabilize repetitive secondary structure. Thus, despite differing energetics, α/β-basins are favored for both monopeptides and repetitive secondary structure, underpinning an energetically unfrustrated compatibility between these two levels of protein structure.
Affiliation(s)
- George D Rose
- T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland
| |
|
14
|
Villa F, Simonson T. Protein pKa’s from Adaptive Landscape Flattening Instead of Constant-pH Simulations. J Chem Theory Comput 2018; 14:6714-6721. [DOI: 10.1021/acs.jctc.8b00970] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/29/2022]
Affiliation(s)
- Francesco Villa
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| |
|
15
|
Charpentier A, Mignon D, Barbe S, Cortes J, Schiex T, Simonson T, Allouche D. Variable Neighborhood Search with Cost Function Networks To Solve Large Computational Protein Design Problems. J Chem Inf Model 2018; 59:127-136. [DOI: 10.1021/acs.jcim.8b00510] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
Affiliation(s)
| | - David Mignon
- Laboratoire de Biochimie (CNRS UMR 7654), École Polytechnique, 91128 Palaiseau, France
| | - Sophie Barbe
- Laboratoire d’Ingénierie des Systèmes Biologiques et Procédés, LISBP, Université de Toulouse, CNRS, INRA, INSA, 31077 Toulouse, France
| | - Juan Cortes
- LAAS-CNRS, Université de Toulouse, CNRS, 31400 Toulouse, France
| | - Thomas Schiex
- MIAT, Université de Toulouse, INRA, 31326 Castanet-Tolosan, France
| | - Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR 7654), École Polytechnique, 91128 Palaiseau, France
| | - David Allouche
- MIAT, Université de Toulouse, INRA, 31326 Castanet-Tolosan, France
| |
|
16
|
Bywater RP. Why twenty amino acid residue types suffice(d) to support all living systems. PLoS One 2018; 13:e0204883. [PMID: 30321190 PMCID: PMC6188899 DOI: 10.1371/journal.pone.0204883] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/04/2017] [Accepted: 09/17/2018] [Indexed: 11/21/2022]
Abstract
It is well known that proteins are built up from an alphabet of 20 different amino acid types. These suffice to enable the protein to fold into its operative form relevant to its required functional roles. For carrying out these allotted functions, there may in some cases be a need for post-translational modifications and it has been established that an additional three types of amino acid have at some point been recruited into this process. But it still remains the case that the 20 residue types referred to are the major building blocks in all terrestrial proteins, and probably "universally". Given this fact, it is surprising that no satisfactory answer has been given to the two questions: "why 20?" and "why just these 20?". Furthermore, a suggestion is made as to how these 20 map to the codon repertoire which in principle has the capacity to cater for 64 different residue types. Attempts are made in this paper to answer these questions by employing a combination of quantum chemical and chemoinformatic tools which are applied to the standard 20 amino acid types as well as 3 “non-standard” types found in nature, a set of fictitious but feasible analog structures designed to test the need for greater coverage of function space and the collection of candidate alternative structures found either on meteorites or in experiments designed to reconstruct pre-life scenarios.
|
17
|
Villa F, Panel N, Chen X, Simonson T. Adaptive landscape flattening in amino acid sequence space for the computational design of protein:peptide binding. J Chem Phys 2018; 149:072302. [PMID: 30134674 DOI: 10.1063/1.5022249] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 12/25/2022]
Abstract
For the high throughput design of protein:peptide binding, one must explore a vast space of amino acid sequences in search of low binding free energies. This complex problem is usually addressed with either simple heuristic scoring or expensive sequence enumeration schemes. Far more efficient than enumeration is a recent Monte Carlo approach that adaptively flattens the energy landscape in sequence space of the unbound peptide and provides formally exact binding free energy differences. The method allows the binding free energy to be used directly as the design criterion. We propose several improvements that allow still more efficient sampling and can address larger design problems. They include the use of Replica Exchange Monte Carlo and landscape flattening for both the unbound and bound peptides. We used the method to design peptides that bind to the PDZ domain of the Tiam1 signaling protein and could serve as inhibitors of its activity. Four peptide positions were allowed to mutate freely. Almost 75 000 peptide variants were processed in two simulations of 10^9 steps each that used 1 CPU hour on a desktop machine. 96% of the theoretical sequence space was sampled. The relative binding free energies agreed qualitatively with values from experiment. The sampled sequences agreed qualitatively with an experimental library of Tiam1-binding peptides. The main assumption limiting accuracy is the fixed backbone approximation, which could be alleviated in future work by using increased computational resources and multi-backbone designs.
Affiliation(s)
- Francesco Villa
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Nicolas Panel
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Xingyu Chen
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| | - Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| |
|
18
|
Wang H, Liu F, Dong T, Du L, Zhang D, Gao J. Charge-Transfer Knowledge Graph among Amino Acids Derived from High-Throughput Electronic Structure Calculations for Protein Database. ACS OMEGA 2018; 3:4094-4104. [PMID: 31458645 PMCID: PMC6641752 DOI: 10.1021/acsomega.8b00336] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 02/24/2018] [Accepted: 03/30/2018] [Indexed: 05/25/2023]
Abstract
The charge-transfer coupling is an important component in tight-binding methods. Because of the highly complex chemical structure of biomolecules, the anisotropic feature of charge-transfer couplings in realistic proteins cannot be ignored. In this work, we have performed the first large-scale quantitative assessment of charge-transfer preference by calculating the charge-transfer couplings in all 20 × 20 possible amino acid side-chain combinations, which are extracted from available high-quality structures of thousands of protein complexes. The charge-transfer database quantitatively shows distinct features of charge-transfer couplings among millions of amino acid side-chain combinations. The overall distribution of charge-transfer couplings reveals that only one average or representative structure cannot be regarded as the typical charge-transfer preference in realistic proteins. This work provides us an alternative route to comprehensively understand the charge-transfer couplings for the overall distribution of realistic proteins in the foreseen big data scenario.
Affiliation(s)
- Hongwei Wang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China
| | - Fang Liu
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China
| | - Tiange Dong
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China
| | - Likai Du
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China
| | - Dongju Zhang
- Institute of Theoretical Chemistry, Shandong University, Jinan 250100, P. R. China
| | - Jun Gao
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P. R. China
| |
|
19
|
Leem J, Georges G, Shi J, Deane CM. Antibody side chain conformations are position-dependent. Proteins 2018; 86:383-392. [PMID: 29318667 DOI: 10.1002/prot.25453] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 09/26/2017] [Revised: 12/15/2017] [Accepted: 01/05/2018] [Indexed: 11/11/2022]
Abstract
Side chain prediction is an integral component of computational antibody design and structure prediction. Current antibody modelling tools use backbone-dependent rotamer libraries with conformations taken from general proteins. Here we present our antibody-specific rotamer library, where rotamers are binned according to their immunogenetics (IMGT) position, rather than their local backbone geometry. We find that for some amino acid types at certain positions, only a restricted number of side chain conformations are ever observed. Using this information, we are able to reduce the breadth of the rotamer sampling space. Based on our rotamer library, we built a side chain predictor, position-dependent antibody rotamer swapper (PEARS). On a blind test set of 95 antibody model structures, PEARS had the highest average χ1 and χ1+2 accuracy (78.7% and 64.8%) compared to three leading backbone-dependent side chain predictors. Our use of IMGT position, rather than backbone ϕ/ψ, meant that PEARS was more robust to errors in the backbone of the model structure. PEARS also achieved the lowest number of side chain-side chain clashes. PEARS is freely available as a web application at http://opig.stats.ox.ac.uk/webapps/pears.
Affiliation(s)
- Jinwoo Leem
- Department of Statistics, University of Oxford, 24-29 St Giles, Oxford, OX1 3LB, United Kingdom
| | - Guy Georges
- Pharma Research and Early Development, Large Molecule Research, Roche Innovation Center Munich, Nonnenwald 2, Penzberg, 82377, Germany
| | - Jiye Shi
- Chemistry Department, UCB, 208 Bath Road, Slough, SL1 3WE, United Kingdom
| | - Charlotte M Deane
- Department of Statistics, University of Oxford, 24-29 St Giles, Oxford, OX1 3LB, United Kingdom
| |
|
20
|
Mignon D, Panel N, Chen X, Fuentes EJ, Simonson T. Computational Design of the Tiam1 PDZ Domain and Its Ligand Binding. J Chem Theory Comput 2017; 13:2271-2289. [PMID: 28394603 DOI: 10.1021/acs.jctc.6b01255] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022]
Abstract
PDZ domains direct protein-protein interactions and serve as models for protein design. Here, we optimized a protein design energy function for the Tiam1 and Cask PDZ domains that combines a molecular mechanics energy, Generalized Born solvent, and an empirical unfolded state model. Designed sequences were recognized as PDZ domains by the Superfamily fold recognition tool and had similarity scores comparable to natural PDZ sequences. The optimized model was used to redesign the two PDZ domains, by gradually varying the chemical potential of hydrophobic amino acids; the tendency of each position to lose or gain a hydrophobic character represents a novel hydrophobicity index. We also redesigned four positions in the Tiam1 PDZ domain involved in peptide binding specificity. The calculated affinity differences between designed variants reproduced experimental data and suggest substitutions with altered specificities.
Affiliation(s)
- David Mignon
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique , Palaiseau, France
| | - Nicolas Panel
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique , Palaiseau, France
| | - Xingyu Chen
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique , Palaiseau, France
| | - Ernesto J Fuentes
- Department of Biochemistry, Roy J. & Lucille A. Carver College of Medicine and Holden Comprehensive Cancer Center, University of Iowa , Iowa City, Iowa 52242-1109, United States
| | - Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique , Palaiseau, France
| |
|
21
|
Abstract
Computational protein design (CPD), a yet evolving field, includes computer-aided engineering for partial or full de novo designs of proteins of interest. Designs are defined by a requested structure, function, or working environment. This chapter describes the birth and maturation of the field by presenting 101 CPD examples in a chronological order emphasizing achievements and pending challenges. Integrating these aspects presents the plethora of CPD approaches with the hope of providing a "CPD 101". These reflect on the broader structural bioinformatics and computational biophysics field and include: (1) integration of knowledge-based and energy-based methods, (2) hierarchical designated approach towards local, regional, and global motifs and the integration of high- and low-resolution design schemes that fit each such region, (3) systematic differential approaches towards different protein regions, (4) identification of key hot-spot residues and the relative effect of remote regions, (5) assessment of shape-complementarity, electrostatics and solvation effects, (6) integration of thermal plasticity and functional dynamics, (7) negative design, (8) systematic integration of experimental approaches, (9) objective cross-assessment of methods, and (10) successful ranking of potential designs. Future challenges also include dissemination of CPD software to the general use of life-sciences researchers and the emphasis of success within an in vivo milieu. CPD increases our understanding of protein structure and function and the relationships between the two along with the application of such know-how for the benefit of mankind. Applied aspects range from biological drugs, via healthier and tastier food products to nanotechnology and environmentally friendly enzymes replacing toxic chemicals utilized in the industry.
|
22
|
Druart K, Bigot J, Audit E, Simonson T. A Hybrid Monte Carlo Scheme for Multibackbone Protein Design. J Chem Theory Comput 2016; 12:6035-6048. [DOI: 10.1021/acs.jctc.6b00421] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 11/30/2022]
Affiliation(s)
- Karen Druart
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Maison de la Simulation, CEA, CNRS, Univ. Paris-Sud, UVSQ, Université Paris-Saclay, 91191 Gif-sur-Yvette, France
| | - Julien Bigot
- Maison de la Simulation, CEA, CNRS, Univ. Paris-Sud, UVSQ, Université Paris-Saclay, 91191 Gif-sur-Yvette, France
| | - Edouard Audit
- Maison de la Simulation, CEA, CNRS, Univ. Paris-Sud, UVSQ, Université Paris-Saclay, 91191 Gif-sur-Yvette, France
| | - Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
| |
|
23
|
Hintze BJ, Lewis SM, Richardson JS, Richardson DC. Molprobity's ultimate rotamer-library distributions for model validation. Proteins 2016; 84:1177-89. [PMID: 27018641 DOI: 10.1002/prot.25039] [Citation(s) in RCA: 69] [Impact Index Per Article: 8.6] [Received: 01/03/2016] [Revised: 03/16/2016] [Accepted: 03/18/2016] [Indexed: 12/22/2022]
Abstract
Here we describe the updated MolProbity rotamer-library distributions derived from an order-of-magnitude larger and more stringently quality-filtered dataset of about 8000 (vs. 500) protein chains, and we explain the resulting changes and improvements to model validation as seen by users. To include only side-chains with satisfactory justification for their given conformation, we added residue-specific filters for electron-density value and model-to-density fit. The combined new protocol retains a million residues of data, while cleaning up false-positive noise in the multi- χ datapoint distributions. It enables unambiguous characterization of conformational clusters nearly 1000-fold less frequent than the most common ones. We describe examples of local interactions that favor these rare conformations, including the role of authentic covalent bond-angle deviations in enabling presumably strained side-chain conformations. Further, along with favored and outlier, an allowed category (0.3-2.0% occurrence in reference data) has been added, analogous to Ramachandran validation categories. The new rotamer distributions are used for current rotamer validation in MolProbity and PHENIX, and for rotamer choice in PHENIX model-building and refinement. The multi-dimensional χ distributions and Top8000 reference dataset are freely available on GitHub. These rotamers are termed "ultimate" because data sampling and quality are now fully adequate for this task, and also because we believe the future of conformational validation should integrate side-chain with backbone criteria. Proteins 2016; 84:1177-1189. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Bradley J Hintze
- Department of Biochemistry, Duke University, Durham North Carolina 27710
| | - Steven M Lewis
- Department of Biochemistry, Duke University, Durham North Carolina 27710
| | - Jane S Richardson
- Department of Biochemistry, Duke University, Durham North Carolina 27710
| | - David C Richardson
- Department of Biochemistry, Duke University, Durham North Carolina 27710
| |
|
24
|
Mignon D, Simonson T. Comparing three stochastic search algorithms for computational protein design: Monte Carlo, replica exchange Monte Carlo, and a multistart, steepest-descent heuristic. J Comput Chem 2016; 37:1781-93. [PMID: 27197555 DOI: 10.1002/jcc.24393] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 12/03/2015] [Revised: 02/26/2016] [Accepted: 03/27/2016] [Indexed: 01/11/2023]
Abstract
Computational protein design depends on an energy function and an algorithm to search the sequence/conformation space. We compare three stochastic search algorithms: a heuristic, Monte Carlo (MC), and a Replica Exchange Monte Carlo method (REMC). The heuristic performs a steepest-descent minimization starting from thousands of random starting points. The methods are applied to nine test proteins from three structural families, with a fixed backbone structure, a molecular mechanics energy function, and with 1, 5, 10, 20, 30, or all amino acids allowed to mutate. Results are compared to an exact, "Cost Function Network" method that identifies the global minimum energy conformation (GMEC) in favorable cases. The designed sequences accurately reproduce experimental sequences in the hydrophobic core. The heuristic and REMC agree closely and reproduce the GMEC when it is known, with a few exceptions. Plain MC performs well for most cases, occasionally departing from the GMEC by 3-4 kcal/mol. With REMC, the diversity of the sequences sampled agrees with exact enumeration where the latter is possible: up to 2 kcal/mol above the GMEC. Beyond, room temperature replicas sample sequences up to 10 kcal/mol above the GMEC, providing thermal averages and a solution to the inverse protein folding problem. © 2016 Wiley Periodicals, Inc.
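As a small illustration of the replica exchange idea named in this abstract, the sketch below shows the standard Metropolis criterion for swapping configurations between two replicas held at different temperatures. The dictionary layout, the function names, and the example energies and sequences are assumptions chosen for illustration; the abstract does not give the implementation details of the cited study.

```python
import math
import random

def swap_probability(e_i, e_j, kt_i, kt_j):
    """Metropolis acceptance probability for exchanging the configurations of
    replica i (energy e_i, temperature kt_i) and replica j (e_j, kt_j)."""
    exponent = (1.0 / kt_i - 1.0 / kt_j) * (e_i - e_j)
    return 1.0 if exponent >= 0 else math.exp(exponent)

def attempt_swap(replicas, i, j, rng):
    """Swap energy and sequence between replicas i and j in place if the move
    is accepted; each replica keeps its own temperature."""
    p = swap_probability(replicas[i]["energy"], replicas[j]["energy"],
                         replicas[i]["kt"], replicas[j]["kt"])
    if rng.random() < p:
        for key in ("energy", "sequence"):
            replicas[i][key], replicas[j][key] = replicas[j][key], replicas[i][key]
        return True
    return False

# Usage: the colder replica currently holds the higher-energy sequence,
# so handing it the lower-energy one is always accepted.
replicas = [
    {"kt": 0.5, "energy": -5.0, "sequence": "AAA"},
    {"kt": 1.0, "energy": -10.0, "sequence": "LLL"},
]
accepted = attempt_swap(replicas, 0, 1, random.Random(0))
```

A swap that moves a lower-energy configuration to the colder replica is always accepted; the reverse move is accepted with a Boltzmann-like probability, which is what lets high-temperature replicas feed new sequences to the low-temperature ones.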
Affiliation(s)
- David Mignon
- Laboratoire De Biochimie (UMR CNRS 7654), Department Of Biology, Ecole Polytechnique, Palaiseau, France
| | - Thomas Simonson
- Laboratoire De Biochimie (UMR CNRS 7654), Department Of Biology, Ecole Polytechnique, Palaiseau, France
| |
|
25
|
Ryu J, Lee M, Cha J, Laskowski RA, Ryu SE, Kim DS. BetaSCPWeb: side-chain prediction for protein structures using Voronoi diagrams and geometry prioritization. Nucleic Acids Res 2016; 44:W416-23. [PMID: 27151195 PMCID: PMC4987919 DOI: 10.1093/nar/gkw368] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 02/05/2016] [Accepted: 04/23/2016] [Indexed: 11/13/2022]
Abstract
Many applications, such as protein design, homology modeling, flexible docking, etc. require the prediction of a protein's optimal side-chain conformations from just its amino acid sequence and backbone structure. Side-chain prediction (SCP) is an NP-hard energy minimization problem. Here, we present BetaSCPWeb which efficiently computes a conformation close to optimal using a geometry-prioritization method based on the Voronoi diagram of spherical atoms. Its outputs are visual, textual and PDB file format. The web server is free and open to all users at http://voronoi.hanyang.ac.kr/betascpweb with no login requirement.
Affiliation(s)
- Joonghyun Ryu
- Vorononi Diagram Research Center, Hanyang University, Korea
| | - Mokwon Lee
- Vorononi Diagram Research Center, Hanyang University, Korea
| | - Jehyun Cha
- Vorononi Diagram Research Center, Hanyang University, Korea
| | | | - Seong Eon Ryu
- Department of Bioengineering, Hanyang University, Korea
| | - Deok-Soo Kim
- School of Mechanical Engineering, Hanyang University, Korea
| |
Collapse
|
26
|
Gaillard T, Panel N, Simonson T. Protein side chain conformation predictions with an MMGBSA energy function. Proteins 2016; 84:803-19. [PMID: 26948696 DOI: 10.1002/prot.25030] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 10/02/2015] [Revised: 02/22/2016] [Accepted: 02/27/2016] [Indexed: 12/17/2022]
Abstract
The prediction of protein side chain conformations from backbone coordinates is an important task in structural biology, with applications in structure prediction and protein design. It is a difficult problem due to its combinatorial nature. We study the performance of an "MMGBSA" energy function, implemented in our protein design program Proteus, which combines molecular mechanics terms, a Generalized Born and Surface Area (GBSA) solvent model, with approximations that make the model pairwise additive. Proteus is not a competitor to specialized side chain prediction programs due to its cost, but it allows protein design applications, where side chain prediction is an important step and MMGBSA an effective energy model. We predict the side chain conformations for 18 proteins. The side chains are first predicted individually, with the rest of the protein in its crystallographic conformation. Next, all side chains are predicted together. The contributions of individual energy terms are evaluated and various parameterizations are compared. We find that the GB and SA terms, with an appropriate choice of the dielectric constant and surface energy coefficients, are beneficial for single side chain predictions. For the prediction of all side chains, however, errors due to the pairwise additive approximation overcome the improvement brought by these terms. We also show the crucial contribution of side chain minimization to alleviate the rigid rotamer approximation. Even without GB and SA terms, we obtain accuracies comparable to SCWRL4, a specialized side chain prediction program. In particular, we obtain a better RMSD than SCWRL4 for core residues (at a higher cost), despite our simpler rotamer library. Proteins 2016; 84:803-819. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Thomas Gaillard
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
- Nicolas Panel
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
- Thomas Simonson
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
|
27
|
Khan FI, Wei DQ, Gu KR, Hassan MI, Tabrez S. Current updates on computer aided protein modeling and designing. Int J Biol Macromol 2016; 85:48-62. [DOI: 10.1016/j.ijbiomac.2015.12.072] [Citation(s) in RCA: 72] [Impact Index Per Article: 9.0] [Received: 08/21/2015] [Revised: 12/17/2015] [Accepted: 12/21/2015] [Indexed: 12/15/2022]
|
28
|
Taghizadeh M, Goliaei B, Madadkar-Sobhani A. SDRL: a sequence-dependent protein side-chain rotamer library. MOLECULAR BIOSYSTEMS 2016; 11:2000-7. [PMID: 25953624 DOI: 10.1039/c5mb00057b] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022]
Abstract
Since the introduction of the first protein side-chain rotamer library (RL) almost half a century ago, RLs have been components of many programs and algorithms in structural bioinformatics. Based on the dependence of side-chain dihedral angles on the local backbone, three types of RLs have been identified: backbone-independent, secondary-structure-dependent and backbone-dependent. In all previous studies, the effect of sequence specificity on side-chain conformational preferences was neglected. In the effort to develop a new class of RLs, we considered that the side-chain conformation of the central residue in each triplet on a protein backbone depends on the sequence of the triplet; therefore, we developed a sequence-dependent rotamer library (SDRL). To accomplish this, 400 possible triplet sequences for 18 natural amino acids as the central residue, which corresponds to 7200 triplet sequences in total, were considered. Searching the set of 11 546 selected PDB entries for the 7200 triplet sequences resulted in 2 364 541 instances occurring for 18 amino acids. Our results show that Leu and Val experience minimal impact from the adjacent residues in adopting side-chain conformations. Cys, Ile, Trp, His, Asp, Met, Glu, Gln, Arg and Lys, on the other hand, adopt their side-chain conformations mostly based on the adjacent residues on the backbone. The remaining residue types were moderately dependent on the adjacent residues. Using the new library, side-chain repacking algorithms can find preferred conformations of each residue more easily than with other backbone-independent RLs.
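The triplet count quoted in the abstract follows from simple enumeration: 20 × 20 flanking pairs around each of 18 central residue types. The sketch below reproduces that arithmetic. Which two residue types are excluded as central residues is an assumption made here (Gly and Ala, which have no side-chain dihedral angles); the abstract does not say.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residue types

# Assumption: Gly and Ala are the two types left out as central residues,
# because they have no side-chain dihedral angle to predict.
CENTRAL_RESIDUES = [aa for aa in AMINO_ACIDS if aa not in "GA"]

# Every ordered (left neighbor, right neighbor) pair: 20 * 20 = 400.
flanking_pairs = list(product(AMINO_ACIDS, repeat=2))

# Every triplet with an allowed central residue: 400 * 18 = 7200.
triplets = [left + center + right
            for center in CENTRAL_RESIDUES
            for left, right in flanking_pairs]

# Mean number of observed instances per triplet, from the counts in the abstract.
mean_instances = 2_364_541 / len(triplets)
```

With roughly 330 instances per triplet on average, some rare triplets are likely to be sparsely populated, which is worth keeping in mind when using such a library.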
Affiliation(s)
- Mohammad Taghizadeh
- Laboratory of Biophysics and Molecular Biology, Institute of Biochemistry and Biophysics (IBB), Tehran University, P.O. Box 13145-1384, Tehran, Iran
|
29
|
Simonson T, Ye-Lehmann S, Palmai Z, Amara N, Wydau-Dematteis S, Bigan E, Druart K, Moch C, Plateau P. Redesigning the stereospecificity of tyrosyl-tRNA synthetase. Proteins 2016; 84:240-53. [PMID: 26676967 DOI: 10.1002/prot.24972] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 07/28/2015] [Revised: 09/30/2015] [Accepted: 11/26/2015] [Indexed: 12/14/2022]
Abstract
D-Amino acids are largely excluded from protein synthesis, yet they are of great interest in biotechnology. Unnatural amino acids have been introduced into proteins using engineered aminoacyl-tRNA synthetases (aaRSs), and this strategy might be applicable to D-amino acids. Several aaRSs can aminoacylate their tRNA with a D-amino acid; of these, tyrosyl-tRNA synthetase (TyrRS) has the weakest stereospecificity. We use computational protein design to suggest active site mutations in Escherichia coli TyrRS that could increase its D-Tyr binding further, relative to L-Tyr. The mutations selected all modify one or more sidechain charges in the Tyr binding pocket. We test their effect by probing the aminoacyl-adenylation reaction through pyrophosphate exchange experiments. We also perform extensive alchemical free energy simulations to obtain L-Tyr/D-Tyr binding free energy differences. Agreement with experiment is good, validating the structural models and detailed thermodynamic predictions the simulations provide. The TyrRS stereospecificity proves hard to engineer through charge-altering mutations in the first and second coordination shells of the Tyr ammonium group. Of six mutants tested, two are active towards D-Tyr; one of these has an inverted stereospecificity, with a large preference for D-Tyr. However, its activity is low. Evidently, the TyrRS stereospecificity is robust towards charge rearrangements near the ligand. Future design may have to consider more distant and/or electrically neutral target mutations, and possibly design for binding of the transition state, whose structure however can only be modeled.
Affiliation(s)
- Thomas Simonson
- Department of Biology, Laboratoire De Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
- Zoltan Palmai
- Department of Biology, Laboratoire De Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
- Najette Amara
- Department of Biology, Laboratoire De Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
- Sandra Wydau-Dematteis
- Department of Biology, Laboratoire De Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
- Erwan Bigan
- Department of Biology, Laboratoire De Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
- Karen Druart
- Department of Biology, Laboratoire De Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
- Clara Moch
- Department of Biology, Laboratoire De Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
- Pierre Plateau
- Department of Biology, Laboratoire De Biochimie (CNRS UMR7654), Ecole Polytechnique, Palaiseau, 91128, France
|
30
|
Polydorides S, Michael E, Mignon D, Druart K, Archontis G, Simonson T. Proteus and the Design of Ligand Binding Sites. Methods Mol Biol 2016; 1414:77-97. [PMID: 27094287 DOI: 10.1007/978-1-4939-3569-7_6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022]
Abstract
This chapter describes the organization and use of Proteus, a multitool computational suite for the optimization of protein and ligand conformations and sequences, and the calculation of pKa shifts and relative binding affinities. The software offers the use of several molecular mechanics force fields and solvent models, including two generalized Born variants, and a large range of scoring functions, which can combine protein stability, ligand affinity, and ligand specificity terms, for positive and negative design. We present in detail the steps for structure preparation, system setup, construction of the interaction energy matrix, protein sequence and structure optimizations, pKa calculations, and ligand titration calculations. We discuss illustrative examples, including the chemical/structural optimization of a complex between the MHC class II protein HLA-DQ8 and the vinculin epitope, and the chemical optimization of the compstatin analog Ac-Val4Trp/His9Ala, which regulates the function of protein C3 of the complement system.
Affiliation(s)
- Savvas Polydorides
- Theoretical and Computational Biophysics Group, Department of Physics, University of Cyprus, 1678, Nicosia, Cyprus
- Eleni Michael
- Theoretical and Computational Biophysics Group, Department of Physics, University of Cyprus, 1678, Nicosia, Cyprus
- David Mignon
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, 91128, Palaiseau, France
- Karen Druart
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, 91128, Palaiseau, France
- Georgios Archontis
- Theoretical and Computational Biophysics Group, Department of Physics, University of Cyprus, 1678, Nicosia, Cyprus
- Thomas Simonson
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, 91128, Palaiseau, France
|
31
|
Druart K, Palmai Z, Omarjee E, Simonson T. Protein:Ligand binding free energies: A stringent test for computational protein design. J Comput Chem 2015; 37:404-15. [PMID: 26503829 DOI: 10.1002/jcc.24230] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 07/01/2015] [Revised: 10/01/2015] [Accepted: 10/02/2015] [Indexed: 01/29/2023]
Abstract
A computational protein design method is extended to allow Monte Carlo simulations where two ligands are titrated into a protein binding pocket, yielding binding free energy differences. These provide a stringent test of the physical model, including the energy surface and sidechain rotamer definition. As a test, we consider tyrosyl-tRNA synthetase (TyrRS), which has been extensively redesigned experimentally. We consider its specificity for its substrate l-tyrosine (l-Tyr), compared to the analogs d-Tyr, p-acetyl-, and p-azido-phenylalanine (ac-Phe, az-Phe). We simulate l- and d-Tyr binding to TyrRS and six mutants, and compare the structures and binding free energies to a more rigorous "MD/GBSA" procedure: molecular dynamics with explicit solvent for structures and a Generalized Born + Surface Area model for binding free energies. Next, we consider l-Tyr, ac- and az-Phe binding to six other TyrRS variants. The titration results are sensitive to the precise rotamer definition, which involves a short energy minimization for each sidechain pair to help relax bad contacts induced by the discrete rotamer set. However, when designed mutant structures are rescored with a standard GBSA energy model, results agree well with the more rigorous MD/GBSA. As a third test, we redesign three amino acid positions in the substrate coordination sphere, with either l-Tyr or d-Tyr as the ligand. For two, we obtain good agreement with experiment, recovering the wildtype residue when l-Tyr is the ligand and a d-Tyr specific mutant when d-Tyr is the ligand. For the third, we recover His with either ligand, instead of wildtype Gln.
Affiliation(s)
- Karen Druart
- Laboratoire De Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
- Zoltan Palmai
- Laboratoire De Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
- Eyaz Omarjee
- Laboratoire De Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
- Thomas Simonson
- Laboratoire De Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, Palaiseau, France
|
32
|
Three-dimensional protein structure prediction: Methods and computational strategies. Comput Biol Chem 2014; 53PB:251-276. [DOI: 10.1016/j.compbiolchem.2014.10.001] [Citation(s) in RCA: 121] [Impact Index Per Article: 12.1] [Received: 06/12/2014] [Revised: 10/03/2014] [Accepted: 10/07/2014] [Indexed: 01/01/2023]
|
33
|
Smadbeck J, Chan KH, Khoury GA, Xue B, Robinson RC, Hauser CAE, Floudas CA. De novo design and experimental characterization of ultrashort self-associating peptides. PLoS Comput Biol 2014; 10:e1003718. [PMID: 25010703 PMCID: PMC4091692 DOI: 10.1371/journal.pcbi.1003718] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 01/07/2014] [Accepted: 05/31/2014] [Indexed: 12/19/2022] Open
Abstract
Self-association is a common phenomenon in biology and one that can have positive and negative impacts, from the construction of the architectural cytoskeleton of cells to the formation of fibrils in amyloid diseases. Understanding the nature and mechanisms of self-association is important for modulating these systems and in creating biologically-inspired materials. Here, we present a two-stage de novo peptide design framework that can generate novel self-associating peptide systems. The first stage uses a simulated multimeric template structure as input into the optimization-based Sequence Selection to generate low potential energy sequences. The second stage is a computational validation procedure that calculates Fold Specificity and/or Approximate Association Affinity (K*association) based on metrics that we have devised for multimeric systems. This framework was applied to the design of self-associating tripeptides using the known self-associating tripeptide, Ac-IVD, as a structural template. Six computationally predicted tripeptides (Ac-LVE, Ac-YYD, Ac-LLE, Ac-YLD, Ac-MYD, Ac-VIE) were chosen for experimental validation in order to illustrate the self-association outcomes predicted by the three metrics. Self-association and electron microscopy studies revealed that Ac-LLE formed bead-like microstructures, Ac-LVE and Ac-YYD formed fibrillar aggregates, Ac-VIE and Ac-MYD formed hydrogels, and Ac-YLD crystallized under ambient conditions. An X-ray crystallographic study was carried out on a single crystal of Ac-YLD, which revealed that each molecule adopts a β-strand conformation and that the molecules stack together to form parallel β-sheets. As an additional validation of the approach, the hydrogel-forming sequences of Ac-MYD and Ac-VIE were shuffled. The shuffled sequences were computationally predicted to have lower K*association values and were experimentally verified to not form hydrogels. This illustrates the robustness of the framework in predicting self-associating tripeptides. We expect that this enhanced multimeric de novo peptide design framework will find future application in creating novel self-associating peptides based on unnatural amino acids, and inhibitor peptides of detrimental self-aggregating biological proteins.
Affiliation(s)
- James Smadbeck
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey, United States of America
- Kiat Hwa Chan
- Institute of Bioengineering and Nanotechnology, Singapore, Singapore
- George A. Khoury
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey, United States of America
- Bo Xue
- Institute of Molecular and Cell Biology, A*STAR (Agency of Science, Technology and Research), Biopolis, Singapore, Singapore
- Robert C. Robinson
- Institute of Molecular and Cell Biology, A*STAR (Agency of Science, Technology and Research), Biopolis, Singapore, Singapore
- Christodoulos A. Floudas
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey, United States of America
|
34
|
Francis-Lyon P, Koehl P. Protein side-chain modeling with a protein-dependent optimized rotamer library. Proteins 2014; 82:2000-17. [PMID: 24623614 DOI: 10.1002/prot.24555] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 09/29/2013] [Revised: 02/28/2014] [Accepted: 03/07/2014] [Indexed: 12/16/2022]
Abstract
Despite years of effort, the problem of predicting the conformations of protein side chains remains a subject of inquiry. This problem has three major issues, namely defining the conformations that a side chain may adopt within a protein, developing a sampling procedure for generating possible side-chain packings, and defining a scoring function that can rank these possible packings. To solve the first of these issues, most procedures rely on a rotamer library derived from databases of known protein structures. We introduce an alternative method that is free of statistics. We begin with a rotamer library that is based only on stereochemical considerations; this rotamer library is then optimized independently for each protein under study. We show that this optimization step restores the diversity of conformations observed in native proteins. We combine this protein-dependent rotamer library (PDRL) method with the self-consistent mean field (SCMF) sampling approach and a physics-based scoring function into a new side-chain prediction method, SCMF-PDRL. Using two large test sets of 831 and 378 proteins, respectively, we show that this new method compares favorably with competing methods such as SCAP, OPUS-Rota, and SCWRL4 for energy-minimized structures.
Affiliation(s)
- Patricia Francis-Lyon
- Department of Computer Science, University of San Francisco, San Francisco, California, 94117
|
35
|
Abstract
Modeling of side-chain conformations on a fixed protein backbone, also called side-chain packing, plays an important role in protein structure prediction, protein design, molecular docking, and functional analysis. RASP, or RApid Side-chain Predictor, is a recently developed program that can model protein side-chain conformations with both high accuracy and high speed. Moreover, it can generate structures with few atomic clashes. This chapter first provides a brief introduction to the principles and performance of the RASP package. It then details how to use the RASP programs to predict protein side-chain conformations. Finally, it describes case studies for structure refinement in homology modeling and residue substitution.
|
36
|
Webb B, Eswar N, Fan H, Khuri N, Pieper U, Dong G, Sali A. Comparative Modeling of Drug Target Proteins. REFERENCE MODULE IN CHEMISTRY, MOLECULAR SCIENCES AND CHEMICAL ENGINEERING 2014. [PMCID: PMC7157477 DOI: 10.1016/b978-0-12-409547-2.11133-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
In this perspective, we begin by describing the comparative protein structure modeling technique and the accuracy of the corresponding models. We then discuss the significant role that comparative prediction plays in drug discovery. We focus on virtual ligand screening against comparative models and illustrate the state-of-the-art by a number of specific examples.
|
37
|
Polydorides S, Simonson T. Monte Carlo simulations of proteins at constant pH with generalized Born solvent, flexible sidechains, and an effective dielectric boundary. J Comput Chem 2013; 34:2742-56. [PMID: 24122878 DOI: 10.1002/jcc.23450] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 05/14/2013] [Revised: 09/04/2013] [Accepted: 09/08/2013] [Indexed: 12/11/2022]
Abstract
Titratable residues determine the acid/base behavior of proteins, strongly influencing their function; in addition, proton binding is a valuable reporter on electrostatic interactions. We describe a method for pKa calculations, using constant-pH Monte Carlo (MC) simulations to explore the space of sidechain conformations and protonation states, with an efficient and accurate generalized Born model (GB) for the solvent effects. To overcome the many-body dependency of the GB model, we use a "Native Environment" approximation, whose accuracy is shown to be good. It allows the precalculation and storage of interactions between all sidechain pairs, a strategy borrowed from computational protein design, which makes the MC simulations themselves very fast. The method is tested for 12 proteins and 167 titratable sidechains. It gives an rms error of 1.1 pH units, similar to the trivial "Null" model. The only adjustable parameter is the protein dielectric constant. The best accuracy is achieved for values between 4 and 8, a range that is physically plausible for a protein interior. For sidechains with large pKa shifts, ≥2, the rms error is 1.6, compared to 2.5 with the Null model and 1.5 with the empirical PROPKA method.
Affiliation(s)
- Savvas Polydorides
- Department of Biology, Laboratoire de Biochimie (CNRS UMR7654), Ecole Polytechnique, 91128, Palaiseau, France
|
38
|
Simonson T. What Is the Dielectric Constant of a Protein When Its Backbone Is Fixed? J Chem Theory Comput 2013; 9:4603-8. [DOI: 10.1021/ct400398e] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
Affiliation(s)
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
|
39
|
Simonson T, Gaillard T, Mignon D, Schmidt am Busch M, Lopes A, Amara N, Polydorides S, Sedano A, Druart K, Archontis G. Computational protein design: the Proteus software and selected applications. J Comput Chem 2013; 34:2472-84. [PMID: 24037756 DOI: 10.1002/jcc.23418] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 05/13/2013] [Revised: 07/08/2013] [Accepted: 07/28/2013] [Indexed: 12/13/2022]
Abstract
We describe an automated procedure for protein design, implemented in a flexible software package, called Proteus. System setup and calculation of an energy matrix are done with the XPLOR modeling program and its sophisticated command language, supporting several force fields and solvent models. A second program provides algorithms to search sequence space. It allows a decomposition of the system into groups, which can be combined in different ways in the energy function, for both positive and negative design. The whole procedure can be controlled by editing 2-4 scripts. Two applications consider the tyrosyl-tRNA synthetase enzyme and its successful redesign to bind both O-methyl-tyrosine and D-tyrosine. For the latter, we present Monte Carlo simulations where the D-tyrosine concentration is gradually increased, displacing L-tyrosine from the binding pocket and yielding the binding free energy difference, in good agreement with experiment. Complete redesign of the Crk SH3 domain is presented. The top 10000 sequences are all assigned to the correct fold by the SUPERFAMILY library of Hidden Markov Models. Finally, we report the acid/base behavior of the SNase protein. Sidechain protonation is treated as a form of mutation; it is then straightforward to perform constant-pH Monte Carlo simulations, which yield good agreement with experiment. Overall, the software can be used for a wide range of applications, producing not only native-like sequences but also thermodynamic properties with errors that appear comparable to other current software packages.
Affiliation(s)
- Thomas Simonson
- Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, Palaiseau, 91128, France
|
40
|
MOIRAE: A computational strategy to extract and represent structural information from experimental protein templates. Soft comput 2013. [DOI: 10.1007/s00500-013-1087-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/30/2022]
|
41
|
Venkatesan A, Gopal J, Candavelou M, Gollapalli S, Karthikeyan K. Computational approach for protein structure prediction. Healthc Inform Res 2013; 19:137-47. [PMID: 23882419 PMCID: PMC3717437 DOI: 10.4258/hir.2013.19.2.137] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 11/21/2012] [Revised: 03/30/2013] [Accepted: 04/01/2013] [Indexed: 11/23/2022] Open
Abstract
Objectives: To predict the structure of a protein, which dictates the function it performs, a new algorithm is developed that blends the concept of self-organization with the genetic algorithm. Methods: Among many other approaches, the genetic algorithm is found to be a promising cooperative computational method for solving protein structure prediction in a reasonable time. To automate the choice of parameter values, self-organization is used to design a new genetic operator that optimizes the prediction process. Torsion angles, the local structural parameters that define the protein backbone, are used to encode the chromosome, which enhances the quality of the conformation. Newly designed self-configured genetic operators are used to develop a self-organizing genetic algorithm that facilitates accurate structure prediction. Results: Peptides are used to gauge the validity of the proposed algorithm. The predicted structure shows clear improvements in the root mean square deviation when superimposed on the native structure, which indicates the overall performance of the algorithm. In addition, the Ramachandran plot results imply that the conformations of the phi-psi angles in the predicted structure are better than those of the native structure and free from steric hindrance. Conclusions: The proposed algorithm is promising and contributes to the prediction of a native-like structure by eliminating the time constraint and the demand for effort. In addition, the energy of the predicted structure is minimized to a greater extent, which proves the stability of the protein.
Affiliation(s)
- Amouda Venkatesan
- Centre for Bioinformatics, Pondicherry University, Kalapet, Pondicherry, India
|
42
|
Mach P, Koehl P. Capturing protein sequence-structure specificity using computational sequence design. Proteins 2013; 81:1556-70. [DOI: 10.1002/prot.24307] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/04/2012] [Revised: 03/28/2013] [Accepted: 04/11/2013] [Indexed: 02/05/2023]
Affiliation(s)
- Paul Mach
- Department of Applied Mathematics; Genome Center; University of California; Davis 95616 California
- Patrice Koehl
- Department of Computer Science; Genome Center; University of California; Davis 95616 California
|
43
|
Kirys T, Ruvinsky AM, Tuzikov AV, Vakser IA. Correlation analysis of the side-chains conformational distribution in bound and unbound proteins. BMC Bioinformatics 2012; 13:236. [PMID: 22984947 PMCID: PMC3479416 DOI: 10.1186/1471-2105-13-236] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 01/23/2012] [Accepted: 09/11/2012] [Indexed: 01/30/2023] Open
Abstract
BACKGROUND: Protein interactions play a key role in life processes. Characterization of conformational properties of protein-protein interactions is important for understanding the mechanisms of protein association. The rapidly increasing amount of experimentally determined structures of proteins and protein-protein complexes provides a foundation for research on protein interactions and complex formation. The knowledge of the conformations of the surface side chains is essential for modeling protein complexes. The purpose of this study was to analyze and compare dihedral angle distribution functions of the side chains at the interface and non-interface areas in bound and unbound proteins.
RESULTS: To calculate the dihedral angle distribution functions, the configuration space was divided into grid cells. Statistical analysis showed that the similarity between bound and unbound interface and non-interface surface depends on the amino acid type and the grid resolution. The correlation coefficients between the distribution functions increased with increasing grid spacing for all amino acid types. The Manhattan distance, which measures the degree of dissimilarity between the distribution functions, decreased accordingly. Short residues with one or two dihedral angles had higher correlations and smaller Manhattan distances than the longer residues. Met and Arg had the slowest growth of the correlation coefficient with increasing grid spacing. The correlations between the interface and non-interface distribution functions had a similar dependence on the grid resolution in both bound and unbound states. The interface and non-interface differences between bound and unbound distribution functions, caused by biological protein-protein interactions or crystal contacts, disappeared at the 70° grid spacing for interfaces and 30° for the non-interface surface, which agrees with an average span of the side-chain rotamers.
CONCLUSIONS: The two-fold difference in the critical grid spacing indicates larger conformational changes upon binding at the interface than at the rest of the surface. At the same time, transitions between rotamers induced by interactions across the interface or the crystal packing are rare, with most side chains having local readjustments that do not change the rotameric state. The analysis is important for better understanding of protein interactions and development of flexible docking approaches.
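The grid-based comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single dihedral angle per residue (the study works in the full multi-dimensional dihedral space), grid spacings that divide 360° evenly, and synthetic angle samples. The function names `grid_distribution`, `pearson`, and `manhattan` are hypothetical.

```python
import math
import random

def grid_distribution(angles, spacing):
    """Normalized histogram of dihedral angles (degrees) on a regular grid.

    Assumption: `spacing` divides 360 evenly; angles are wrapped into [-180, 180).
    """
    n_bins = int(round(360 / spacing))
    counts = [0] * n_bins
    for a in angles:
        idx = int(((a + 180.0) % 360.0) // spacing)
        counts[min(idx, n_bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]

def pearson(p, q):
    """Pearson correlation coefficient between two distributions on the same grid."""
    n = len(p)
    mp, mq = sum(p) / n, sum(q) / n
    cov = sum((x - mp) * (y - mq) for x, y in zip(p, q))
    sp = math.sqrt(sum((x - mp) ** 2 for x in p))
    sq = math.sqrt(sum((y - mq) ** 2 for y in q))
    if sp == 0 or sq == 0:
        raise ValueError("correlation is undefined for a flat distribution")
    return cov / (sp * sq)

def manhattan(p, q):
    """Manhattan (L1) distance between two distributions on the same grid."""
    return sum(abs(x - y) for x, y in zip(p, q))

# Synthetic stand-ins for unbound and bound chi angles (illustrative data only).
random.seed(0)
unbound = [random.gauss(random.choice([-60, 60, 180]), 15) for _ in range(2000)]
bound = [random.gauss(random.choice([-60, 60, 180]), 20) for _ in range(2000)]

for spacing in (10, 30, 90):
    p = grid_distribution(unbound, spacing)
    q = grid_distribution(bound, spacing)
    print(spacing, round(pearson(p, q), 3), round(manhattan(p, q), 3))
```

Because each coarse grid here is obtained by merging cells of the finer one, the Manhattan distance cannot increase as the spacing grows, which is consistent with the trend the abstract reports.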
Affiliation(s)
- Tatsiana Kirys
- Center for Bioinformatics, The University of Kansas, Lawrence, KS 66047, USA
|
44
|
Kirys T, Ruvinsky AM, Tuzikov AV, Vakser IA. Rotamer libraries and probabilities of transition between rotamers for the side chains in protein-protein binding. Proteins 2012; 80:2089-98. [PMID: 22544766 DOI: 10.1002/prot.24103] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 01/21/2012] [Revised: 04/12/2012] [Accepted: 04/17/2012] [Indexed: 01/26/2023]
Abstract
Conformational changes in the side chains are essential for protein-protein binding. Rotameric states and unbound-to-bound conformational changes in the surface residues were systematically studied on a representative set of protein complexes. The side-chain conformations were mapped onto dihedral angle space. The variable threshold algorithm was developed to cluster the dihedral angle distributions and to derive rotamers, defined as the most probable conformation in a cluster. Six rotamer libraries were generated: full surface, surface noninterface, and surface interface, each for bound and unbound states. The libraries were used to calculate the probabilities of the rotamer transitions upon binding. The stability of amino acids was quantified based on the transition maps. The noninterface residues' stability was higher than that of the interface. Long side chains with three or four dihedral angles were less stable than the shorter ones. The transitions between the rotamers at the interface occurred more frequently than on the noninterface surface. Most side chains changed conformation within the same rotamer or moved to an adjacent rotamer. The highest percentage of the transitions was observed primarily between the two most occupied rotamers. The probability of the transition between rotamers increased with the decrease of the rotamer stability. The analysis revealed characteristics of the surface side-chain conformational transitions that can be utilized in flexible docking protocols.
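As a rough illustration of how rotamer transition probabilities might be tabulated from paired unbound and bound structures, the sketch below assigns each angle to its nearest rotamer and row-normalizes the transition counts. It is a simplification under stated assumptions, not the paper's variable threshold algorithm: a single chi angle per residue, a fixed three-rotamer library, and hypothetical function names (`circular_diff`, `nearest_rotamer`, `transition_probabilities`).

```python
from collections import Counter, defaultdict

def circular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def nearest_rotamer(chi, rotamers):
    """Index of the library rotamer closest to the angle `chi`."""
    return min(range(len(rotamers)), key=lambda i: circular_diff(chi, rotamers[i]))

def transition_probabilities(pairs, rotamers):
    """Row-normalized transition table {unbound rotamer: {bound rotamer: probability}}.

    `pairs` holds (unbound_chi, bound_chi) angles for the same residue.
    """
    counts = defaultdict(Counter)
    for unbound_chi, bound_chi in pairs:
        i = nearest_rotamer(unbound_chi, rotamers)
        j = nearest_rotamer(bound_chi, rotamers)
        counts[i][j] += 1
    probs = {}
    for i, row in counts.items():
        total = sum(row.values())
        probs[i] = {j: c / total for j, c in row.items()}
    return probs

# Illustrative library and angle pairs (made-up values, not data from the study).
rotamers = [-60.0, 60.0, 180.0]
pairs = [(-65.0, -55.0), (-70.0, 62.0), (58.0, 61.0), (175.0, -178.0)]
table = transition_probabilities(pairs, rotamers)
print(table)
```

In this sketch the diagonal entry `table[i][i]` is the fraction of residues that keep rotamer `i` on binding; using it as a stand-in for rotamer stability is an assumption, since the abstract does not give the exact definition.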
Affiliation(s)
- Tatsiana Kirys
- Center for Bioinformatics, The University of Kansas, Lawrence, Kansas 66047, USA
|
45
|
Gront D, Kmiecik S, Blaszczyk M, Ekonomiuk D, Koliński A. Optimization of protein models. Wiley Interdisciplinary Reviews: Computational Molecular Science 2012. [DOI: 10.1002/wcms.1090] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 12/25/2022]
Affiliation(s)
- Dominik Gront
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Sebastian Kmiecik
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Maciej Blaszczyk
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Dariusz Ekonomiuk
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Andrzej Koliński
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
|
46
|
Protein-water interactions in MD simulations: POPS/POPSCOMP solvent accessibility analysis, solvation forces and hydration sites. Methods Mol Biol 2012; 819:375-92. [PMID: 22183548 DOI: 10.1007/978-1-61779-465-0_23] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 12/12/2022]
Abstract
The effects of solvation on molecular recognition are investigated from different perspectives, ranging from methods to analyse explicit solvent dynamical behaviour at the protein surface to methods for the implicit treatment of solvent effects associated with the conformational behaviour of biomolecules. The implicit solvation method presented here is based on an analytical approximation of the Solvent Accessible Surface Area (SASA) of solute molecules, which is computationally efficient and easy to parametrise. The parametrised SASA solvation method is discussed in the light of protein design and ligand binding studies. The POPS program for the SASA computation on single molecules and complex interfaces is described in detail. Explicit solvent behaviour is described here in the form of solvent density maps at the protein surface. We highlight the usefulness of that approach in defining the organisation of specific water molecules at functional sites and in determining hydrophobicity scores for the identification of potential interaction patches.
|
47
|
Li SC, Bu D, Li M. Residues with similar hexagon neighborhoods share similar side-chain conformations. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2012; 9:240-248. [PMID: 21519113 DOI: 10.1109/tcbb.2011.74] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/30/2023]
Abstract
We present in this study a new approach to code protein side-chain conformations into hexagon substructures. Classical side-chain packing methods consist of two steps: first, side-chain conformations, known as rotamers, are extracted from known protein structures as candidates for each residue; second, a searching method along with an energy function is used to resolve conflicts among residues and to optimize the combinations of side-chain conformations for all residues. These methods benefit from the fact that the number of possible side-chain conformations is limited, and the rotamer candidates are readily extracted; however, these methods also suffer from the inaccuracy of energy functions. Inspired by threading and ab initio approaches to protein structure prediction, we propose to use hexagon substructures to implicitly capture subtle issues of energy functions. Our initial results indicate that even without guidance from an energy function, hexagon structures alone can capture side-chain conformations at an accuracy of 83.8 percent, higher than the 82.6 percent achieved by state-of-the-art side-chain packing methods.
|
48
|
Veljkovic N, Glisic S, Perovic V, Veljkovic V. The role of long-range intermolecular interactions in discovery of new drugs. Expert Opin Drug Discov 2011; 6:1263-70. [PMID: 22647065 DOI: 10.1517/17460441.2012.638280] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Indexed: 11/05/2022]
Abstract
INTRODUCTION: Long-range intermolecular interactions (interactions at distances between 100 and 1000 Å) play an important role in the interaction between drugs and therapeutic targets, and design techniques based on this concept could significantly improve and accelerate new drug discovery. Understanding these long-range intermolecular interactions will also help further our understanding of the molecular mechanisms and the underlying basic biological processes.
AREAS COVERED: This article looks at the physical bases of long-range intermolecular interactions in biological systems with a brief review of the literature data to support this concept. The article also gives some examples of techniques used in drug discovery that were based on the long-range intermolecular interaction concept.
EXPERT OPINION: The electron-ion interaction potential (EIIP) and average quasivalence number (AQVN) concepts shed new light on the role of long-range intermolecular interactions in biological systems. Further research into the physicochemical mechanisms underlying long-range interactions between biological molecules is necessary for a better understanding of the basic biological processes. The addition of computer-aided design techniques based on the EIIP/AQVN concept to research and development will lead not only to a significant reduction in cost but also to an acceleration in the development of new drugs.
Affiliation(s)
- Nevena Veljkovic
- University of Belgrade, Institute of Nuclear Sciences Vinca, Center for Multidisciplinary Research, P.O. Box 522, 11001 Belgrade, Serbia
|
49
|
Zeng J, Roberts KE, Zhou P, Donald BR. A Bayesian approach for determining protein side-chain rotamer conformations using unassigned NOE data. J Comput Biol 2011; 18:1661-79. [PMID: 21970619 DOI: 10.1089/cmb.2011.0172] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022]
Abstract
A major bottleneck in protein structure determination via nuclear magnetic resonance (NMR) is the lengthy and laborious process of assigning resonances and nuclear Overhauser effect (NOE) cross peaks. Recent studies have shown that accurate backbone folds can be determined using sparse NMR data, such as residual dipolar couplings (RDCs) or backbone chemical shifts. This raises the question of whether we can also determine accurate protein side-chain conformations using sparse or unassigned NMR data. We attack this question by using unassigned nuclear Overhauser effect spectroscopy (NOESY) data, which record the through-space dipolar interactions between protons nearby in three-dimensional (3D) space. We propose a Bayesian approach with a Markov random field (MRF) model to integrate the likelihood function derived from observed experimental data with prior information (i.e., empirical molecular mechanics energies) about the protein structures. We unify the side-chain structure prediction problem with the side-chain structure determination problem using unassigned NMR data, and apply the deterministic dead-end elimination (DEE) and A* search algorithms to provably find the global optimum solution that maximizes the posterior probability. We employ a Hausdorff-based measure to derive the likelihood of a rotamer or a pairwise rotamer interaction from unassigned NOESY data. In addition, we apply a systematic and rigorous approach to estimate the experimental noise in NMR data, which also determines the weighting factor of the data term in the scoring function derived from the Bayesian framework. We tested our approach on real NMR data of three proteins: the FF Domain 2 of human transcription elongation factor CA150 (FF2), the B1 domain of Protein G (GB1), and human ubiquitin. The promising results indicate that our algorithm can be applied in high-resolution protein structure determination. Since our approach does not require any NOE assignment, it can accelerate the NMR structure determination process.
Affiliation(s)
- Jianyang Zeng
- Department of Computer Science, Duke University, Durham, NC 27708, USA
|
50
|
Abstract
MOTIVATION: Modeling of side chain conformations constitutes an indispensable effort in protein structure modeling, protein-protein docking and protein design. Thanks to intensive attention to this field, many of the existing programs can achieve reasonably good and comparable prediction accuracy. Moreover, in our previous work on CIS-RR, we argued that prediction with few atomic clashes can complement the existing methods for subsequent analysis and refinement of protein structures. However, these recent efforts to enhance the quality of predicted side chains have been accompanied by a significant increase in computational cost.
RESULTS: In this study, focusing mainly on improving the speed of side chain conformation prediction, we present a RApid Side-chain Predictor, called RASP. To achieve a much faster speed with accuracy comparable to the best existing methods, we not only employ the clash elimination strategy of CIS-RR, but also carefully optimize energy terms and integrate different search algorithms. In comprehensive benchmark tests, RASP is over one order of magnitude faster (~40 times over CIS-RR) than the recently developed methods, while achieving comparable or even better accuracy.
Affiliation(s)
- Zhichao Miao
- National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
|