1
Chennakesavalu S, Rotskoff GM. Data-Efficient Generation of Protein Conformational Ensembles with Backbone-to-Side-Chain Transformers. J Phys Chem B 2024; 128:2114-2123. [PMID: 38394363 DOI: 10.1021/acs.jpcb.3c08195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2024]
Abstract
Excitement at the prospect of using data-driven generative models to sample configurational ensembles of biomolecular systems stems from the extraordinary success of these models on a diverse set of high-dimensional sampling tasks. Unlike image generation or even the closely related problem of protein structure prediction, there are currently no data sources with sufficient breadth to parametrize generative models for conformational ensembles. To enable discovery, a fundamentally different approach to building generative models is required: models should be able to propose rare, albeit physical, conformations that may not arise in even the largest data sets. Here we introduce a modular strategy to generate conformations based on "backmapping" from a fixed protein backbone that (1) maintains conformational diversity of the side chains and (2) couples the side-chain fluctuations using global information about the protein conformation. Our model combines simple statistical models of side-chain conformations based on rotamer libraries with the now ubiquitous transformer architecture to sample with atomistic accuracy. Together, these ingredients provide a strategy for rapid data acquisition and hence a crucial ingredient for scalable physical simulation with generative neural networks.
Affiliation(s)
- Grant M Rotskoff
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Institute for Computational and Mathematical Engineering, Stanford University, Stanford, California 94305, United States
2
Dicks L, Wales DJ. Exploiting Sequence-Dependent Rotamer Information in Global Optimization of Proteins. J Phys Chem B 2022; 126:8381-8390. [PMID: 36257022 PMCID: PMC9623586 DOI: 10.1021/acs.jpcb.2c04647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2023]
Abstract
Rotamers, namely amino acid side chain conformations common to many different peptides, can be compiled into libraries. These rotamer libraries are used in protein modeling, where the limited conformational space occupied by amino acid side chains is exploited. Here, we construct a sequence-dependent rotamer library from simulations of all possible tripeptides, which provides rotameric states dependent on adjacent amino acids. We observe significant sensitivity of rotamer populations to sequence and find that the library is successful in locating side chain conformations present in crystal structures. The library is designed for applications with basin-hopping global optimization, where we use it to propose moves in conformational space. The addition of rotamer moves significantly increases the efficiency of protein structure prediction within this framework, and we determine parameters to optimize efficiency.
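To illustrate how library-based rotamer moves can enter a basin-hopping loop, here is a minimal Python sketch. It is not the authors' implementation: the toy dihedral energy, the coordinate-descent minimizer, the three-state library, and the names `local_minimize`, `basin_hop`, and `toy_energy` are all assumptions made for this example.

```python
import math
import random


def local_minimize(energy, x, step=10.0, tol=1e-3, max_iter=10_000):
    """Crude coordinate descent; a stand-in for a real local minimizer."""
    x = list(x)
    e = energy(x)
    for _ in range(max_iter):
        if step <= tol:
            break
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                trial[i] += delta
                e_trial = energy(trial)
                if e_trial < e:
                    x, e, improved = trial, e_trial, True
        if not improved:
            step /= 2.0  # refine once no move of this size lowers the energy
    return x, e


def basin_hop(energy, x0, library, n_steps=200, kT=1.0, seed=0):
    """Basin hopping over dihedrals (degrees) with library-based rotamer moves."""
    rng = random.Random(seed)
    x, e = local_minimize(energy, x0)
    best_x, best_e = x, e
    for _ in range(n_steps):
        trial = x[:]
        i = rng.randrange(len(trial))
        trial[i] = float(rng.choice(library[i]))  # rotamer move: jump to a library state
        trial, e_trial = local_minimize(energy, trial)
        # Metropolis test applied to the locally minimized energies
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / kT):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


def toy_energy(chis):
    """Hypothetical energy: three wells per dihedral, the one at 180 degrees lowest."""
    return sum(
        1.0 + math.cos(math.radians(3.0 * c))
        + 0.5 * (1.0 - math.cos(math.radians(c - 180.0)))
        for c in chis
    )


library = [[60, 180, 300], [60, 180, 300]]  # assumed rotamer states per dihedral
best_x, best_e = basin_hop(toy_energy, [60.0, 60.0], library)
```

The local minimizer alone stays in the well near 60 degrees. The rotamer move is what carries the search across the barrier into the well at 180 degrees.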
Affiliation(s)
- L. Dicks
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom; IBM Research, The Hartree Centre STFC Laboratory, Sci-Tech Daresbury, Warrington WA4 4AD, United Kingdom
- D. J. Wales
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
3
Misiura M, Shroff R, Thyer R, Kolomeisky AB. DLPacker: Deep learning for prediction of amino acid side chain conformations in proteins. Proteins 2022; 90:1278-1290. [PMID: 35122328 DOI: 10.1002/prot.26311] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 08/04/2021] [Revised: 12/03/2021] [Accepted: 12/07/2021] [Indexed: 12/20/2022]
Abstract
Prediction of side chain conformations of amino acids in proteins (also termed "packing") is an important and challenging part of protein structure prediction with many interesting applications in protein design. A variety of methods for packing have been developed but more accurate ones are still needed. Machine learning (ML) methods have recently become a powerful tool for solving various problems in diverse areas of science, including structural biology. In this study, we evaluate the potential of deep neural networks (DNNs) for prediction of amino acid side chain conformations. We formulate the problem as image-to-image transformation and train a U-net style DNN to solve the problem. We show that our method outperforms other physics-based methods by a significant margin: reconstruction RMSDs for most amino acids are about 20% smaller compared to SCWRL4 and Rosetta Packer with RMSDs for bulky hydrophobic amino acids Phe, Tyr, and Trp being up to 50% smaller.
Affiliation(s)
- Mikita Misiura
- Department of Chemistry, Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA
- Ross Thyer
- Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas, USA
- Anatoly B Kolomeisky
- Department of Chemistry, Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA; Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas, USA; Department of Physics and Astronomy, Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA
4
Kotelnikov S, Alekseenko A, Liu C, Ignatov M, Padhorny D, Brini E, Lukin M, Coutsias E, Dill KA, Kozakov D. Sampling and refinement protocols for template-based macrocycle docking: 2018 D3R Grand Challenge 4. J Comput Aided Mol Des 2019; 34:179-189. [PMID: 31879831 DOI: 10.1007/s10822-019-00257-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/12/2019] [Accepted: 11/19/2019] [Indexed: 12/25/2022]
Abstract
We describe a new template-based method for docking flexible ligands such as macrocycles to proteins. It combines Monte-Carlo energy minimization on the manifold, a fast manifold search method, with BRIKARD for complex flexible ligand searching, and with the MELD accelerator of Replica-Exchange Molecular Dynamics simulations for atomistic degrees of freedom. Here we test the method in the Drug Design Data Resource blind Grand Challenge competition. This method was among the best performers in the competition, giving sub-angstrom prediction quality for the majority of the targets.
Affiliation(s)
- Sergei Kotelnikov
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA; Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA; Innopolis University, Innopolis, Russia
- Andrey Alekseenko
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA; Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA
- Cong Liu
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA; Department of Chemistry, Stony Brook University, Stony Brook, NY, USA
- Mikhail Ignatov
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA; Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA; Institute for Advanced Computational Sciences, Stony Brook University, Stony Brook, NY, USA
- Dzmitry Padhorny
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA; Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA
- Emiliano Brini
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA
- Mark Lukin
- Department of Pharmacological Sciences, Stony Brook University, Stony Brook, NY, USA
- Evangelos Coutsias
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA; Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA
- Ken A Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA; Department of Chemistry, Stony Brook University, Stony Brook, NY, USA; Department of Physics and Astronomy, Stony Brook University, Stony Brook, NY, USA
- Dima Kozakov
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA; Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA; Institute for Advanced Computational Sciences, Stony Brook University, Stony Brook, NY, USA
5
Alekseenko A, Kotelnikov S, Ignatov M, Egbert M, Kholodov Y, Vajda S, Kozakov D. ClusPro LigTBM: Automated Template-based Small Molecule Docking. J Mol Biol 2019; 432:3404-3410. [PMID: 31863748 DOI: 10.1016/j.jmb.2019.12.011] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 10/14/2019] [Revised: 12/03/2019] [Accepted: 12/04/2019] [Indexed: 12/31/2022]
Abstract
The template-based approach has been essential for achieving high-quality models in the recent rounds of blind protein-protein docking competition CAPRI (Critical Assessment of Predicted Interactions). However, few such automated methods exist for protein-small molecule docking. In this paper, we present an algorithm for template-based docking of small molecules. It searches for known complexes with ligands that have partial coverage of the target ligand, performs conformational sampling and template-guided energy refinement to produce a variety of possible poses, and then scores the refined poses. The algorithm is available as the automated ClusPro LigTBM server. It allows the user to specify the target protein as a PDB file and the ligand as a SMILES string. The server then searches for templates and uses them for docking, presenting the user with top-scoring poses and their confidence scores. The method is tested on the Astex Diverse benchmark, as well as on the targets from the last round of the D3R (Drug Design Data Resource) Grand Challenge. The server is publicly available as part of the ClusPro docking server suite at https://ligtbm.cluspro.org/.
Affiliation(s)
- Andrey Alekseenko
- Department of Applied Mathematics and Statistics, Stony Brook University, 11794 Stony Brook, NY, USA; Laufer Center for Physical and Quantitative Biology, Stony Brook University, 11794 Stony Brook, NY, USA
- Sergei Kotelnikov
- Department of Applied Mathematics and Statistics, Stony Brook University, 11794 Stony Brook, NY, USA; Laufer Center for Physical and Quantitative Biology, Stony Brook University, 11794 Stony Brook, NY, USA; Innopolis University, 420500, Innopolis, Russia
- Mikhail Ignatov
- Department of Applied Mathematics and Statistics, Stony Brook University, 11794 Stony Brook, NY, USA; Laufer Center for Physical and Quantitative Biology, Stony Brook University, 11794 Stony Brook, NY, USA; Institute for Advanced Computational Sciences, Stony Brook University, 11794, Stony Brook, NY, USA
- Megan Egbert
- Department of Biomedical Engineering, Boston University, 02215, Boston, MA, USA
- Sandor Vajda
- Department of Biomedical Engineering, Boston University, 02215, Boston, MA, USA; Department of Chemistry, Boston University, 02215, Boston, MA, USA
- Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, 11794 Stony Brook, NY, USA; Laufer Center for Physical and Quantitative Biology, Stony Brook University, 11794 Stony Brook, NY, USA; Institute for Advanced Computational Sciences, Stony Brook University, 11794, Stony Brook, NY, USA
6
Dauzhenka T, Kundrotas PJ, Vakser IA. Computational Feasibility of an Exhaustive Search of Side-Chain Conformations in Protein-Protein Docking. J Comput Chem 2018; 39:2012-2021. [PMID: 30226647 DOI: 10.1002/jcc.25381] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 09/29/2017] [Revised: 03/24/2018] [Accepted: 05/26/2018] [Indexed: 11/07/2022]
Abstract
Protein-protein docking procedures typically perform the global scan of the proteins' relative positions, followed by the local refinement of the putative matches. Because of the size of the search space, the global scan is usually implemented as rigid-body search, using computationally inexpensive intermolecular energy approximations. An adequate refinement has to take into account structural flexibility. Since the refinement performs conformational search of the interacting proteins, it is extremely computationally challenging, given the enormous number of internal degrees of freedom. Different approaches limit the search space by restricting the search to the side chains, rotameric states, coarse-grained structure representation, principal normal modes, and so on. Still, even with the approximations, the refinement presents an extreme computational challenge due to the very large number of the remaining degrees of freedom. Given the complexity of the search space, the advantage of the exhaustive search is obvious. The obstacle to such search is computational feasibility. However, the growing computational power of modern computers, especially due to the increasing utilization of Graphics Processing Units (GPUs) with a large number of specialized computing cores, extends the range of applicability of brute-force search methods. This proof-of-concept study demonstrates computational feasibility of an exhaustive search of side-chain conformations in protein docking. The procedure, implemented on the GPU architecture, was used to generate the optimal conformations in a large representative set of protein-protein complexes. © 2018 Wiley Periodicals, Inc.
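The idea of an exhaustive side-chain search can be sketched in a few lines of Python. This is a serial CPU illustration only, not the GPU procedure of the paper: the function name `exhaustive_pack`, the toy self and pair energies, and the representation of each rotamer as a single angle are assumptions made for the example.

```python
from itertools import product


def exhaustive_pack(rotamers, self_energy, pair_energy):
    """Enumerate every rotamer combination and return the lowest-energy one.

    rotamers[i] is the list of candidate states for residue i. The cost grows
    as the product of the list lengths, which is why brute force is only
    feasible for small interfaces or with massive parallelism.
    """
    best_combo, best_e = None, float("inf")
    n = len(rotamers)
    for combo in product(*(range(len(r)) for r in rotamers)):
        states = [rotamers[i][k] for i, k in enumerate(combo)]
        e = sum(self_energy(i, s) for i, s in enumerate(states))
        e += sum(pair_energy(states[i], states[j])
                 for i in range(n) for j in range(i + 1, n))
        if e < best_e:  # strict comparison keeps the first optimum found
            best_combo, best_e = combo, e
    return best_combo, best_e


# Hypothetical two-residue example: each state is a single chi angle in degrees.
rotamers = [[60, 180], [60, 180, 300]]


def self_e(i, chi):
    return 0.0 if chi == 180 else 1.0  # assumed preference for 180 degrees


def pair_e(a, b):
    return 5.0 if a == b else 0.0  # assumed clash when both residues share a state


combo, energy = exhaustive_pack(rotamers, self_e, pair_e)
```

Each combination is scored independently of the others, which is the property that makes this kind of search easy to distribute over many GPU cores.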
Affiliation(s)
- Taras Dauzhenka
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66047
- Petras J Kundrotas
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66047
- Ilya A Vakser
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66047; Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, 66047
7
Hogues H, Gaudreault F, Corbeil CR, Deprez C, Sulea T, Purisima EO. ProPOSE: Direct Exhaustive Protein-Protein Docking with Side Chain Flexibility. J Chem Theory Comput 2018; 14:4938-4947. [PMID: 30107730 DOI: 10.1021/acs.jctc.8b00225] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 12/22/2022]
Abstract
Despite decades of development, protein-protein docking remains a largely unsolved problem. The main difficulties are the immense space spanned by the translational and rotational degrees of freedom and the prediction of the conformational changes of proteins upon binding. FFT is generally the preferred method to exhaustively explore the translation-rotation space at a fine grid resolution, albeit with the trade-off of approximating force fields with correlation functions. This work presents a direct search alternative that samples the states in Cartesian space at the same resolution and computational cost as standard FFT methods. Operating in real space allows the use of standard force field functional forms used in typical non-FFT methods as well as the implementation of strategies for focused exploration of conformational flexibility. Currently, a few misplaced side chains can cause docking programs to fail. This work specifically addresses the problem of side chain rearrangements upon complex formation. Based on the observation that most side chains retain their unbound conformation upon binding, each rigidly docked pose is initially scored ignoring up to a limited number of side chain overlaps which are resolved in subsequent repacking and minimization steps. On test systems where side chains are altered and backbones held in their bound state, this implementation provides significantly better native pose recovery and higher quality (lower RMSD) predictions when compared with five of the most popular docking programs. The method is implemented in the software program ProPOSE (Protein Pose Optimization by Systematic Enumeration).
Affiliation(s)
- Hervé Hogues
- Human Health Therapeutics, National Research Council Canada, 6100 Royalmount Avenue, Montreal, Quebec H4P 2R2, Canada
- Francis Gaudreault
- Human Health Therapeutics, National Research Council Canada, 6100 Royalmount Avenue, Montreal, Quebec H4P 2R2, Canada
- Christopher R Corbeil
- Human Health Therapeutics, National Research Council Canada, 6100 Royalmount Avenue, Montreal, Quebec H4P 2R2, Canada
- Christophe Deprez
- Human Health Therapeutics, National Research Council Canada, 6100 Royalmount Avenue, Montreal, Quebec H4P 2R2, Canada
- Traian Sulea
- Human Health Therapeutics, National Research Council Canada, 6100 Royalmount Avenue, Montreal, Quebec H4P 2R2, Canada
- Enrico O Purisima
- Human Health Therapeutics, National Research Council Canada, 6100 Royalmount Avenue, Montreal, Quebec H4P 2R2, Canada
8
Zarbafian S, Moghadasi M, Roshandelpoor A, Nan F, Li K, Vakili P, Vajda S, Kozakov D, Paschalidis IC. Protein docking refinement by convex underestimation in the low-dimensional subspace of encounter complexes. Sci Rep 2018; 8:5896. [PMID: 29650980 PMCID: PMC5955889 DOI: 10.1038/s41598-018-23982-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 09/18/2017] [Accepted: 03/21/2018] [Indexed: 01/18/2023]
Abstract
We propose a novel stochastic global optimization algorithm with applications to the refinement stage of protein docking prediction methods. Our approach can process conformations sampled from multiple clusters, each roughly corresponding to a different binding energy funnel. These clusters are obtained using a density-based clustering method. In each cluster, we identify a smooth “permissive” subspace which avoids high-energy barriers and then underestimate the binding energy function using general convex polynomials in this subspace. We use the underestimator to bias sampling towards its global minimum. Sampling and subspace underestimation are repeated several times and the conformations sampled at the last iteration form a refined ensemble. We report computational results on a comprehensive benchmark of 224 protein complexes, establishing that our refined ensemble significantly improves the quality of the conformations of the original set given to the algorithm. We also devise a method to enhance the ensemble from which near-native models are selected.
Affiliation(s)
- Shahrooz Zarbafian
- Department of Mechanical Engineering, Boston University, Boston, Massachusetts, United States of America
- Mohammad Moghadasi
- Division of Systems Engineering, Boston University, Boston, Massachusetts, United States of America
- Athar Roshandelpoor
- Division of Systems Engineering, Boston University, Boston, Massachusetts, United States of America
- Feng Nan
- Division of Systems Engineering, Boston University, Boston, Massachusetts, United States of America
- Keyong Li
- Division of Systems Engineering, Boston University, Boston, Massachusetts, United States of America
- Pirooz Vakili
- Division of Systems Engineering, Boston University, Boston, Massachusetts, United States of America; Department of Mechanical Engineering, Boston University, Boston, Massachusetts, United States of America
- Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
- Dima Kozakov
- Department of Applied Mathematics and Statistics and Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, United States of America
- Ioannis Ch Paschalidis
- Division of Systems Engineering, Boston University, Boston, Massachusetts, United States of America; Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America; Department of Electrical and Computer Engineering, Boston University, Boston, Massachusetts, United States of America; 8 Saint Mary's St., Boston, MA, 02215, United States of America
9
Padhorny D, Hall DR, Mirzaei H, Mamonov AB, Moghadasi M, Alekseenko A, Beglov D, Kozakov D. Protein-ligand docking using FFT based sampling: D3R case study. J Comput Aided Mol Des 2018; 32:225-230. [PMID: 29101520 PMCID: PMC5767528 DOI: 10.1007/s10822-017-0069-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 07/01/2017] [Accepted: 09/16/2017] [Indexed: 12/15/2022]
Abstract
Fast Fourier transform (FFT) based approaches have been successful in application to modeling of relatively rigid protein-protein complexes. Recently, we have been able to adapt the FFT methodology to treatment of flexible protein-peptide interactions. Here, we report our latest attempt to expand the capabilities of the FFT approach to treatment of flexible protein-ligand interactions in application to the D3R PL-2016-1 challenge. Based on the D3R assessment, our FFT approach in conjunction with Monte Carlo minimization off-grid refinement was among the top performing methods in the challenge. The potential advantage of our method is its ability to globally sample the protein-ligand interaction landscape, which will be explored in further applications.
Affiliation(s)
- Dzmitry Padhorny
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, 11794, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, 11794, USA
- Hanieh Mirzaei
- Department of Biomedical Engineering, Boston University, Boston, MA, 02215, USA
- Artem B Mamonov
- Department of Biomedical Engineering, Boston University, Boston, MA, 02215, USA
- Mohammad Moghadasi
- Department of Biomedical Engineering, Boston University, Boston, MA, 02215, USA
- Andrey Alekseenko
- Moscow Institute of Physics and Technology (State University), Institutskii per. 9, Dolgoprudny, Moscow Oblast, Russia, 141700
- Institute of Computer Aided Design of the Russian Academy of Sciences, 19/18, 2-nd Brestskaya St, Moscow, Russia, 123056
- Dmitri Beglov
- Department of Biomedical Engineering, Boston University, Boston, MA, 02215, USA
- Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, 11794, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, 11794, USA
10
Kozakov D, Hall DR, Xia B, Porter KA, Padhorny D, Yueh C, Beglov D, Vajda S. The ClusPro web server for protein-protein docking. Nat Protoc 2017. [PMID: 28079879 DOI: 10.1038/nprot.2016.169] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/12/2023]
Abstract
The ClusPro server (https://cluspro.org) is a widely used tool for protein-protein docking. The server provides a simple home page for basic use, requiring only two files in Protein Data Bank (PDB) format. However, ClusPro also offers a number of advanced options to modify the search; these include the removal of unstructured protein regions, application of attraction or repulsion, accounting for pairwise distance restraints, construction of homo-multimers, consideration of small-angle X-ray scattering (SAXS) data, and location of heparin-binding sites. Six different energy functions can be used, depending on the type of protein. Docking with each energy parameter set results in ten models defined by centers of highly populated clusters of low-energy docked structures. This protocol describes the use of the various options, the construction of auxiliary restraints files, the selection of the energy parameters, and the analysis of the results. Although the server is heavily used, runs are generally completed in <4 h.
Affiliation(s)
- Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, New York, USA
- Bing Xia
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
- Kathryn A Porter
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
- Dzmitry Padhorny
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
- Christine Yueh
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
- Dmitri Beglov
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
- Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
12
Vajda S, Yueh C, Beglov D, Bohnuud T, Mottarella SE, Xia B, Hall DR, Kozakov D. New additions to the ClusPro server motivated by CAPRI. Proteins 2017; 85:435-444. [PMID: 27936493 DOI: 10.1002/prot.25219] [Citation(s) in RCA: 347] [Impact Index Per Article: 49.6] [Received: 08/27/2016] [Revised: 11/28/2016] [Accepted: 11/29/2016] [Indexed: 12/12/2022]
Abstract
The heavily used protein-protein docking server ClusPro performs three computational steps as follows: (1) rigid body docking, (2) RMSD based clustering of the 1000 lowest energy structures, and (3) the removal of steric clashes by energy minimization. In response to challenges encountered in recent CAPRI targets, we added three new options to ClusPro. These are (1) accounting for small angle X-ray scattering data in docking; (2) considering pairwise interaction data as restraints; and (3) enabling discrimination between biological and crystallographic dimers. In addition, we have developed an extremely fast docking algorithm based on 5D rotational manifold FFT, and an algorithm for docking flexible peptides that include known sequence motifs. We feel that these developments will further improve the utility of ClusPro. However, CAPRI emphasized several shortcomings of the current server, including the problem of selecting the right energy parameters among the five options provided, and the problem of selecting the best models among the 10 generated for each parameter set. In addition, results convinced us that further development is needed for docking homology models. Finally, we discuss the difficulties we have encountered when attempting to develop a refinement algorithm that would be computationally efficient enough for inclusion in a heavily used server. Proteins 2017; 85:435-444. © 2016 Wiley Periodicals, Inc.
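Step (2) of the pipeline, RMSD-based clustering of the lowest-energy structures, can be illustrated with a small greedy scheme in Python. This is a sketch under stated assumptions, not the server's code: the neighbor-count rule for choosing cluster centers, the default `radius` value, the absence of superposition before the RMSD is computed, and the names `rmsd` and `greedy_cluster` are all choices made for this example. Only the figure of 1000 retained structures comes from the abstract.

```python
import math


def rmsd(a, b):
    """RMSD between two poses given as equal-length lists of (x, y, z) tuples.

    No superposition is done; poses are assumed to share one reference frame.
    """
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))


def greedy_cluster(poses, energies, keep=1000, radius=9.0):
    """Cluster the `keep` lowest-energy poses by greedy neighbor counting.

    Repeatedly pick the pose with the most neighbors within `radius`,
    report it as a cluster center, and remove its members from the pool.
    Returns a list of (center_index, member_indices), largest cluster first.
    """
    lowest = sorted(range(len(poses)), key=lambda i: energies[i])[:keep]
    remaining = set(lowest)
    clusters = []
    while remaining:
        def n_neighbors(i):
            return sum(1 for j in remaining if rmsd(poses[i], poses[j]) <= radius)

        # sorted() makes ties deterministic: the lowest index wins
        center = max(sorted(remaining), key=n_neighbors)
        members = sorted(j for j in remaining
                         if rmsd(poses[center], poses[j]) <= radius)
        clusters.append((center, members))
        remaining -= set(members)
    return clusters


# Hypothetical one-atom "poses" laid out along the x axis.
poses = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(0.5, 0.0, 0.0)],
         [(20.0, 0.0, 0.0)], [(21.0, 0.0, 0.0)]]
energies = [1.0, 2.0, 3.0, 4.0, 5.0]
clusters = greedy_cluster(poses, energies, keep=1000, radius=2.0)
```

In this scheme models are ranked by cluster population rather than by the energy of a single structure. That is in line with the description of models as centers of highly populated clusters.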
Affiliation(s)
- Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, 02215; Department of Chemistry, Boston University, Boston, Massachusetts, 02215
- Christine Yueh
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, 02215
- Dmitri Beglov
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, 02215
- Tanggis Bohnuud
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, 02215; Program in Bioinformatics, Boston University, Boston, Massachusetts, 02215
- Scott E Mottarella
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, 02215; Program in Bioinformatics, Boston University, Boston, Massachusetts, 02215
- Bing Xia
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, 02215
- Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, New York; Laufer Center for Physical and Quantitative Biology, Stony Brook University, New York
13
Anishchenko I, Kundrotas PJ, Vakser IA. Structural quality of unrefined models in protein docking. Proteins 2017; 85:39-45. [PMID: 27756103 PMCID: PMC5167671 DOI: 10.1002/prot.25188] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 07/18/2016] [Revised: 09/29/2016] [Accepted: 10/11/2016] [Indexed: 11/11/2022]
Abstract
Structural characterization of protein-protein interactions is essential for understanding life processes at the molecular level. However, only a fraction of protein interactions have experimentally resolved structures. Thus, reliable computational methods for structural modeling of protein interactions (protein docking) are important for generating such structures and understanding the principles of protein recognition. Template-based docking techniques that utilize structural similarity between target protein-protein interaction and cocrystallized protein-protein complexes (templates) are gaining popularity due to generally higher reliability than that of the template-free docking. However, the template-based approach lacks explicit penalties for intermolecular penetration, as opposed to the typical free docking where such penalty is inherent due to the shape complementarity paradigm. Thus, template-based docking models are commonly assumed to require special treatment to remove large structural penetrations. In this study, we compared clashes in the template-based and free docking of the same proteins, with crystallographically determined and modeled structures. The results show that for the less accurate protein models, free docking produces fewer clashes than the template-based approach. However, contrary to the common expectation, in acceptable and better quality docking models of unbound crystallographically determined proteins, the clashes in the template-based docking are comparable to those in the free docking, due to the overall higher quality of the template-based docking predictions. This suggests that the free docking refinement protocols can in principle be applied to the template-based docking predictions as well. Proteins 2016; 85:39-45. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Ivan Anishchenko
  - Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66047, USA
- Petras J. Kundrotas
  - Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66047, USA
- Ilya A. Vakser
  - Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66047, USA
14
Mamonov AB, Moghadasi M, Mirzaei H, Zarbafian S, Grove LE, Bohnuud T, Vakili P, Paschalidis IC, Vajda S, Kozakov D. Focused grid-based resampling for protein docking and mapping. J Comput Chem 2016; 37:961-70. [PMID: 26837000 PMCID: PMC4814242 DOI: 10.1002/jcc.24273] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/29/2015] [Revised: 08/31/2015] [Accepted: 09/26/2015] [Indexed: 12/27/2022]
Abstract
The fast Fourier transform (FFT) sampling algorithm has been used with success in application to protein-protein docking and for protein mapping, the latter docking a variety of small organic molecules for the identification of binding hot spots on the target protein. Here we explore the local rather than global usage of the FFT sampling approach in docking applications. If the global FFT based search yields a near-native cluster of docked structures for a protein complex, then focused resampling of the cluster generally leads to a substantial increase in the number of conformations close to the native structure. In protein mapping, focused resampling of the selected hot spot regions generally reveals further hot spots that, while not as strong as the primary hot spots, also contribute to ligand binding. The detection of additional ligand binding regions is shown by the improved overlap between hot spots and bound ligands.
Affiliation(s)
- Artem B. Mamonov
  - Department of Biomedical Engineering, Boston University, Boston, MA 02215
- Mohammad Moghadasi
  - Center for Information and Systems Engineering, Boston University, Boston, MA 02215
- Hanieh Mirzaei
  - Center for Information and Systems Engineering, Boston University, Boston, MA 02215
- Shahrooz Zarbafian
  - Department of Mechanical Engineering, Boston University, Boston, MA 02215
- Laurie E. Grove
  - Department of Sciences, Wentworth Institute of Technology, Boston, MA 02115, USA
- Tanggis Bohnuud
  - Department of Biomedical Engineering, Boston University, Boston, MA 02215
- Pirooz Vakili
  - Center for Information and Systems Engineering, Boston University, Boston, MA 02215
  - Department of Mechanical Engineering, Boston University, Boston, MA 02215
- Ioannis Ch. Paschalidis
  - Center for Information and Systems Engineering, Boston University, Boston, MA 02215
  - Department of Electrical and Computer Engineering, Boston University, Boston, MA 02215
- Sandor Vajda
  - Department of Biomedical Engineering, Boston University, Boston, MA 02215
  - Center for Information and Systems Engineering, Boston University, Boston, MA 02215
  - Department of Chemistry, Boston University, Boston, MA 02215
- Dima Kozakov
  - Department of Biomedical Engineering, Boston University, Boston, MA 02215
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11790
15
Pottel J, Moitessier N. Single-Point Mutation with a Rotamer Library Toolkit: Toward Protein Engineering. J Chem Inf Model 2015; 55:2657-71. [PMID: 26623941 DOI: 10.1021/acs.jcim.5b00525] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/25/2022]
Abstract
Protein engineers have long been hard at work to harness biocatalysts as a natural source of regio-, stereo-, and chemoselectivity in order to carry out chemistry (reactions and/or substrates) not previously achieved with these enzymes. The extreme labor demands and exponential number of mutation combinations have induced computational advances in this domain. The first step in our virtual approach is to predict the correct conformations upon mutation of residues (i.e., rebuilding side chains). For this purpose, we opted for a combination of molecular mechanics and statistical data. In this work, we have developed automated computational tools to extract protein structural information and created conformational libraries for each amino acid dependent on a variable number of parameters (e.g., resolution, flexibility, secondary structure). We have also developed the necessary tool to apply the mutation and optimize the conformation accordingly. For side-chain conformation prediction, we obtained overall average root-mean-square deviations (RMSDs) of 0.91 and 1.01 Å for the 18 flexible natural amino acids within two distinct sets of over 3000 and 1500 side-chain residues, respectively. The commonly used dihedral angle differences were also evaluated and performed worse than the state of the art. These two metrics are also compared. Furthermore, we generated a family-specific library for kinases that produced an average 2% lower RMSD upon side-chain reconstruction and a residue-specific library that yielded a 17% improvement. Ultimately, since our protein engineering outlook involves using our docking software, Fitted/Impacts, we applied our mutation protocol to a benchmarked data set for self- and cross-docking. Our side-chain reconstruction does not hinder our docking software, demonstrating differences in pose prediction accuracy of approximately 2% (RMSD cutoff metric) for a set of over 200 protein/ligand structures. Similarly, when docking to a set of over 100 kinases, side-chain reconstruction (using both general and biased conformation libraries) had minimal detriment to the docking accuracy.
Affiliation(s)
- Joshua Pottel
  - Department of Chemistry, McGill University, 801 Sherbrooke Street West, Montreal, QC, Canada H3A 0B8
- Nicolas Moitessier
  - Department of Chemistry, McGill University, 801 Sherbrooke Street West, Montreal, QC, Canada H3A 0B8