1. Ge Y, Pande V, Seierstad MJ, Damm-Ganamet KL. Exploring the Application of SiteMap and Site Finder for Focused Cryptic Pocket Identification. J Phys Chem B 2024; 128:6233-6245. [PMID: 38904218 DOI: 10.1021/acs.jpcb.4c00664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/22/2024]
Abstract
The characterization of cryptic pockets has been elusive, despite substantial efforts. Computational modeling approaches, such as molecular dynamics (MD) simulations, can provide atomic-level details of binding site motions and binding pathways. However, the time scale that MD can achieve at a reasonable cost often limits its application for cryptic pocket identification. Enhanced sampling techniques can improve the efficiency of MD simulations by focused sampling of important regions of the protein, but prior knowledge of the simulated system is required to define the appropriate coordinates. In the case of a novel, unknown cryptic pocket, such information is not available, limiting the application of enhanced sampling techniques for cryptic pocket identification. In this work, we explore the ability of SiteMap and Site Finder, widely used commercial packages for pocket identification, to detect focus points on the protein and further apply other advanced computational methods. The information gained from this analysis enables the use of computational modeling, including enhanced MD sampling techniques, to explore potential cryptic binding pockets suggested by SiteMap and Site Finder. Here, we examined SiteMap and Site Finder results on 136 known cryptic pockets from a combination of the PocketMiner dataset (a recently curated set of cryptic pockets), the Cryptosite Set (a classic set of cryptic pockets), and Natural killer group 2D (NKG2D, a protein target where a cryptic pocket is confirmed). Our findings demonstrate the application of existing, well-studied tools in efficiently mapping potential regions harboring cryptic pockets.
Affiliation(s)
- Yunhui Ge
  - Computer-Aided Drug Design, Therapeutics Discovery, Janssen Research & Development, 3210 Merryfield Row, San Diego, California 92121, United States
- Vineet Pande
  - Computer-Aided Drug Design, Therapeutics Discovery, Janssen Research & Development, Turnhoutseweg 30, 2340 Beerse, Belgium
- Mark J Seierstad
  - Computer-Aided Drug Design, Therapeutics Discovery, Janssen Research & Development, 3210 Merryfield Row, San Diego, California 92121, United States
- Kelly L Damm-Ganamet
  - Computer-Aided Drug Design, Therapeutics Discovery, Janssen Research & Development, 3210 Merryfield Row, San Diego, California 92121, United States
2. Banayan NE, Hsu A, Hunt JF, Palmer AG, Friesner RA. Parsing Dynamics of Protein Backbone NH and Side-Chain Methyl Groups using Molecular Dynamics Simulations. J Chem Theory Comput 2024. [PMID: 38957960 DOI: 10.1021/acs.jctc.4c00378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2024]
Abstract
Experimental NMR spectroscopy and theoretical molecular dynamics (MD) simulations provide complementary insights into protein conformational dynamics and hence into biological function. The present work describes an extensive set of backbone NH and side-chain methyl group generalized order parameters for the Escherichia coli ribonuclease HI (RNH) enzyme derived from 2-μs MD simulations using the OPLS4 and AMBER-FF19SB force fields. The simulated generalized order parameters are compared with values derived from NMR ¹⁵N and ¹³CH₂D spin relaxation measurements. The squares of the generalized order parameters, S² for the N-H bond vector and S²_axis for the methyl group symmetry axis, characterize the equilibrium distribution of vector orientations in a molecular frame of reference. Optimal agreement between simulated and experimental results was obtained by averaging S² or S²_axis calculated by dividing the simulated trajectories into 50 ns blocks (∼five times the rotational diffusion correlation time for RNH). With this procedure, the median absolute deviations (MAD) between experimental and simulated values of S² and S²_axis are 0.030 (NH) and 0.061 (CH₃) for OPLS4 and 0.041 (NH) and 0.078 (CH₃) for AMBER-FF19SB. The MAD between OPLS4 and AMBER-FF19SB are 0.021 (NH) and 0.072 (CH₃). The generalized order parameters for the methyl group symmetry axis can be decomposed into contributions from backbone fluctuations, between-rotamer dihedral angle transitions, and within-rotamer dihedral angle fluctuations. Analysis of the simulation trajectories shows that (i) backbone and side chain conformational fluctuations exhibit little correlation and that (ii) fluctuations within rotamers are limited and highly uniform with values that depend on the number of dihedral angles considered. Low values of S²_axis, indicative of enhanced side-chain flexibility, result from between-rotamer transitions that can be enhanced by increased local backbone flexibility.
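The block-averaging idea in this abstract can be illustrated with a minimal sketch. It assumes bond-vector orientations that are already expressed in a molecular frame (overall tumbling removed) and uses the standard expression S² = (3/2) Σ_ij ⟨μ_i μ_j⟩² − 1/2. The function names and the block length in frames are hypothetical choices for illustration; this is not the authors' analysis code.

```python
import math


def order_parameter_s2(vectors):
    """Squared generalized order parameter for one bond vector.

    `vectors` is a sequence of (x, y, z) orientations, one per frame,
    assumed to be given in the molecular frame of reference.
    """
    n = len(vectors)
    if n == 0:
        raise ValueError("need at least one frame")
    # Accumulate the second-moment tensor <mu_i mu_j> of the unit vectors.
    m = [[0.0] * 3 for _ in range(3)]
    for x, y, z in vectors:
        norm = math.sqrt(x * x + y * y + z * z)
        u = (x / norm, y / norm, z / norm)
        for i in range(3):
            for j in range(3):
                m[i][j] += u[i] * u[j] / n
    return 1.5 * sum(m[i][j] ** 2 for i in range(3) for j in range(3)) - 0.5


def block_averaged_s2(vectors, frames_per_block):
    """Average S^2 over consecutive, non-overlapping trajectory blocks."""
    n_blocks = len(vectors) // frames_per_block
    if n_blocks == 0:
        raise ValueError("trajectory is shorter than one block")
    values = [
        order_parameter_s2(vectors[b * frames_per_block:(b + 1) * frames_per_block])
        for b in range(n_blocks)
    ]
    return sum(values) / n_blocks


# Toy trajectory: the vector points along z for 5 frames, then along x.
two_state = [(0.0, 0.0, 1.0)] * 5 + [(1.0, 0.0, 0.0)] * 5
whole = order_parameter_s2(two_state)        # includes the slow jump
blocked = block_averaged_s2(two_state, 5)    # each block is rigid
```

In this toy case the single slow jump lowers S² computed over the whole trajectory but leaves the block-averaged value untouched, which is the reason for restricting the averaging window to a few rotational correlation times.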
Affiliation(s)
- Nooriel E Banayan
  - Department of Biological Sciences, Columbia University, 3000 Broadway, New York, New York 10027, United States
- Andrew Hsu
  - Department of Chemistry, Columbia University, 3000 Broadway, New York, New York 10027, United States
- John F Hunt
  - Department of Biological Sciences, Columbia University, 3000 Broadway, New York, New York 10027, United States
- Arthur G Palmer
  - Department of Biochemistry and Molecular Biophysics, Columbia University, 701 West 168th Street, New York, New York 10032, United States
- Richard A Friesner
  - Department of Chemistry, Columbia University, 3000 Broadway, New York, New York 10027, United States
3. Suruzhon M, Abdel-Maksoud K, Bodnarchuk MS, Ciancetta A, Wall ID, Essex JW. Enhancing torsional sampling using fully adaptive simulated tempering. J Chem Phys 2024; 160:154110. [PMID: 38639317 DOI: 10.1063/5.0190659] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 03/23/2024] [Indexed: 04/20/2024]
Abstract
Enhanced sampling algorithms are indispensable when working with highly disconnected multimodal distributions. An important application of these is the conformational exploration of particular internal degrees of freedom of molecular systems. However, despite the existence of many commonly used enhanced sampling algorithms to explore these internal motions, they often rely on system-dependent parameters, which negatively impact efficiency and reproducibility. Here, we present fully adaptive simulated tempering (FAST), a variation of the irreversible simulated tempering algorithm, which continuously optimizes the number, parameters, and weights of intermediate distributions to achieve maximally fast traversal over a space defined by the change in a predefined thermodynamic control variable such as temperature or an alchemical smoothing parameter. This work builds on a number of previously published methods, such as sequential Monte Carlo, and introduces a novel parameter optimization procedure that can, in principle, be used in any expanded ensemble algorithms. This method is validated by being applied on a number of different molecular systems with high torsional kinetic barriers. We also consider two different soft-core potentials during the interpolation procedure and compare their performance. We conclude that FAST is a highly efficient algorithm, which improves simulation reproducibility and can be successfully used in a variety of settings with the same initial hyperparameters.
Affiliation(s)
- Miroslav Suruzhon
  - School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
- Khaled Abdel-Maksoud
  - School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
- Michael S Bodnarchuk
  - Computational Chemistry, R&D Oncology, AstraZeneca, Cambridge CB4 0WG, United Kingdom
- Ian D Wall
  - GSK Medicines Research Centre, Gunnels Wood Road, Stevenage SG1 2NY, United Kingdom
- Jonathan W Essex
  - School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
4. Amezcua M, Setiadi J, Mobley DL. The SAMPL9 host-guest blind challenge: an overview of binding free energy predictive accuracy. Phys Chem Chem Phys 2024; 26:9207-9225. [PMID: 38444308 PMCID: PMC10954238 DOI: 10.1039/d3cp05111k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 02/03/2024] [Indexed: 03/07/2024]
Abstract
We report the results of the SAMPL9 host-guest blind challenge for predicting binding free energies. The challenge focused on macrocycles from the pillar[n]arene and cyclodextrin host families, including WP6, bCD, and HbCD. A variety of methods were used by participants to submit binding free energy predictions. A machine learning approach based on molecular descriptors achieved the highest accuracy (RMSE of 2.04 kcal mol⁻¹) among the ranked methods in the WP6 dataset. Interestingly, predictions for WP6 obtained via docking tended to outperform all methods (RMSE of 1.70 kcal mol⁻¹), most of which are MD based and computationally more expensive. In general, methods applying force fields achieved better correlation with experiments for WP6 than the machine learning and docking models. In the cyclodextrin-phenothiazine challenge, the ATM approach emerged as the top performing method with RMSE less than 1.86 kcal mol⁻¹. Correlation metrics of ranked methods in this dataset were relatively poor compared to WP6. We also highlight several lessons learned to guide future work and help improve studies on the systems discussed. For example, WP6 may be present in microstates other than its -12 state in the presence of certain guests. Machine learning approaches can be used to fine-tune or help train force fields for certain chemistry (i.e., WP6-G4). Certain phenothiazines occupy distinct primary and secondary orientations, some of which were considered individually for accurate binding free energies. The accuracy of predictions from certain methods while starting from a single binding pose/orientation demonstrates the sensitivity of calculated binding free energies to the orientation, and in some cases the likely dominant orientation for the system. Computational and experimental results suggest that the guest phenothiazine core traverses both the secondary and primary faces of the cyclodextrin hosts, a bulky cationic side chain will primarily occupy the primary face, and the phenothiazine core substituent resides at the larger secondary face.
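For readers unfamiliar with the accuracy metric quoted above, the following is a minimal sketch of how a root-mean-square error between predicted and experimental binding free energies can be computed. The function name and the numerical values are hypothetical placeholders, not SAMPL9 data.

```python
import math


def rmse(predicted, experimental):
    """Root-mean-square error between two equal-length sequences (kcal/mol)."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical binding free energies in kcal/mol for two host-guest pairs.
predicted_dg = [-5.0, -7.0]
experimental_dg = [-8.0, -11.0]
error = rmse(predicted_dg, experimental_dg)
```

RMSE measures absolute agreement only; a method can have a low RMSE and still rank guests poorly, which is why the abstract reports correlation separately.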
Affiliation(s)
- Martin Amezcua
  - Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, USA
- Jeffry Setiadi
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, USA
- David L Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, USA
  - Department of Chemistry, University of California, Irvine, Irvine, California 92697, USA
5. Melling O, Samways ML, Ge Y, Mobley DL, Essex JW. Enhanced Grand Canonical Sampling of Occluded Water Sites Using Nonequilibrium Candidate Monte Carlo. J Chem Theory Comput 2023; 19:1050-1062. [PMID: 36692215 PMCID: PMC9933432 DOI: 10.1021/acs.jctc.2c00823] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/10/2022] [Indexed: 01/25/2023]
Abstract
Water molecules play a key role in many biomolecular systems, particularly when bound at protein-ligand interfaces. However, molecular simulation studies on such systems are hampered by the relatively long time scales over which water exchange between a protein and solvent takes place. Grand canonical Monte Carlo (GCMC) is a simulation technique that avoids this issue by attempting the insertion and deletion of water molecules within a given structure. The approach is constrained by low acceptance probabilities for insertions in congested systems, however. To address this issue, here, we combine GCMC with nonequilibrium candidate Monte Carlo (NCMC) to yield a method that we refer to as grand canonical nonequilibrium candidate Monte Carlo (GCNCMC), in which the water insertions and deletions are carried out in a gradual, nonequilibrium fashion. We validate this new approach by comparing GCNCMC and GCMC simulations of bulk water and three protein binding sites. We find that not only is the efficiency of the water sampling improved by GCNCMC but that it also results in increased sampling of ligand conformations in a protein binding site, revealing new water-mediated ligand-binding geometries that are not observed using alternative enhanced sampling techniques.
Affiliation(s)
- Oliver J. Melling
  - School of Chemistry, University of Southampton, Southampton SO17 1BJ, U.K.
- Marley L. Samways
  - School of Chemistry, University of Southampton, Southampton SO17 1BJ, U.K.
- Yunhui Ge
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
- David L. Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Jonathan W. Essex
  - School of Chemistry, University of Southampton, Southampton SO17 1BJ, U.K.
6. Amezcua M, Setiadi J, Ge Y, Mobley DL. An overview of the SAMPL8 host-guest binding challenge. J Comput Aided Mol Des 2022; 36:707-734. [PMID: 36229622 PMCID: PMC9596595 DOI: 10.1007/s10822-022-00462-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 04/09/2022] [Accepted: 06/21/2022] [Indexed: 11/23/2022]
Abstract
The SAMPL series of challenges aim to focus the community on specific modeling challenges, while testing and hopefully driving progress of computational methods to help guide pharmaceutical drug discovery. In this study, we report on the results of the SAMPL8 host–guest blind challenge for predicting absolute binding affinities. SAMPL8 focused on two host–guest datasets, one involving the cucurbituril CB8 (with a series of common drugs of abuse) and another involving two different Gibb deep-cavity cavitands. The latter dataset involved a previously featured deep cavity cavitand (TEMOA) as well as a new variant (TEETOA), both binding to a series of relatively rigid fragment-like guests. Challenge participants employed a reasonably wide variety of methods, though many of these were based on molecular simulations, and predictive accuracy was mixed. As in some previous SAMPL iterations (SAMPL6 and SAMPL7), we found that one approach to achieve greater accuracy was to apply empirical corrections to the binding free energy predictions, taking advantage of prior data on binding to these hosts. Another approach which performed well was a hybrid MD-based approach with reweighting to a force matched QM potential. In the cavitand challenge, an alchemical method using the AMOEBA-polarizable force field achieved the best success with RMSE less than 1 kcal/mol, while another alchemical approach (ATM/GAFF2-AM1BCC/TIP3P/HREM) had RMSE less than 1.75 kcal/mol. The work discussed here also highlights several important lessons; for example, retrospective studies of reference calculations demonstrate the sensitivity of predicted binding free energies to ethyl group sampling and/or guest starting pose, providing guidance to help improve future studies on these systems.
Affiliation(s)
- Martin Amezcua
  - Department of Pharmaceutical Sciences, University of California, Irvine, CA, 92697, USA
- Jeffry Setiadi
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, 92093, USA
- Yunhui Ge
  - Department of Pharmaceutical Sciences, University of California, Irvine, CA, 92697, USA
- David L Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, CA, 92697, USA
  - Department of Chemistry, University of California, Irvine, CA, 92697, USA
7. When machine learning meets molecular synthesis. Trends in Chemistry 2022. [DOI: 10.1016/j.trechm.2022.07.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
8. Suruzhon M, Bodnarchuk MS, Ciancetta A, Wall ID, Essex JW. Enhancing Ligand and Protein Sampling Using Sequential Monte Carlo. J Chem Theory Comput 2022; 18:3894-3910. [PMID: 35588256 PMCID: PMC9202307 DOI: 10.1021/acs.jctc.1c01198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
The sampling problem is one of the most widely studied topics in computational chemistry. While various methods exist for sampling along a set of reaction coordinates, many require system-dependent hyperparameters to achieve maximum efficiency. In this work, we present an alchemical variation of adaptive sequential Monte Carlo (SMC), an irreversible importance resampling method that is part of a well-studied class of methods that have been used in various applications but have been underexplored in computational biophysics. Afterward, we apply alchemical SMC on a variety of test cases, including torsional rotations of solvated ligands (butene and a terphenyl derivative), translational and rotational movements of protein-bound ligands, and protein side chain rotation coupled to the ligand degrees of freedom (T4-lysozyme, protein tyrosine phosphatase 1B, and transforming growth factor β). We find that alchemical SMC is an efficient way to explore targeted degrees of freedom and can be applied to a variety of systems using the same hyperparameters to achieve a similar performance. Alchemical SMC is a promising tool for preparatory exploration of systems where long-timescale sampling of the entire system can be traded off against short-timescale sampling of a particular set of degrees of freedom over a population of conformers.
Affiliation(s)
- Miroslav Suruzhon
  - School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, U.K.
- Antonella Ciancetta
  - Sygnature Discovery, Bio City, Pennyfoot Street, Nottingham NG1 1GR, U.K.
  - Department of Chemical, Pharmaceutical and Agricultural Sciences (DOCPAS), University of Ferrara, Via Fossato di Mortara 17/19, 44121 Ferrara, Italy
- Ian D Wall
  - GSK Medicines Research Centre, Gunnels Wood Road, Stevenage SG1 2NY, U.K.
- Jonathan W Essex
  - School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, U.K.
9. Ge Y, Wych DC, Samways ML, Wall ME, Essex JW, Mobley DL. Enhancing Sampling of Water Rehydration on Ligand Binding: A Comparison of Techniques. J Chem Theory Comput 2022; 18:1359-1381. [PMID: 35148093 PMCID: PMC9241631 DOI: 10.1021/acs.jctc.1c00590] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Indexed: 01/18/2023]
Abstract
Water often plays a key role in protein structure, molecular recognition, and the mediation of protein-ligand interactions. Free energy calculations must therefore adequately sample water motions, which often proves challenging on typical molecular dynamics (MD) simulation time scales. As a result, the accuracy of methods relying on MD simulations is limited by slow water sampling. In particular, as a ligand is removed or modified, bulk water may not have time to fill or rearrange in the binding site. In this work, we focus on several MD simulation-based methods for rehydrating buried water sites: BLUES, using nonequilibrium candidate Monte Carlo (NCMC); grand, using grand canonical Monte Carlo (GCMC); and normal MD. We assess the accuracy and efficiency of these methods in rehydrating target water sites. We selected a range of systems with varying numbers of waters in the binding site, as well as those where water occupancy is coupled to the identity or binding mode of the ligand. We analyzed the rehydration of buried water sites in binding pockets using both clustering of trajectories and direct analysis of electron density maps. Our results suggest that both BLUES and grand enhance water sampling relative to normal MD and that grand is more robust than BLUES, but also that water sampling remains a major challenge for all of the methods tested. The lessons we learned for these methods and systems are discussed.
Affiliation(s)
- Yunhui Ge
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
- David C Wych
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
  - Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Marley L Samways
  - School of Chemistry, University of Southampton, Southampton SO17 1BJ, United Kingdom
- Michael E Wall
  - Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Jonathan W Essex
  - School of Chemistry, University of Southampton, Southampton SO17 1BJ, United Kingdom
- David L Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
  - Department of Chemistry, University of California, Irvine, California 92697, United States
10. Bradford SYC, El Khoury L, Ge Y, Osato M, Mobley DL, Fischer M. Temperature artifacts in protein structures bias ligand-binding predictions. Chem Sci 2021; 12:11275-11293. [PMID: 34667539 PMCID: PMC8447925 DOI: 10.1039/d1sc02751d] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 05/19/2021] [Accepted: 07/09/2021] [Indexed: 12/14/2022]
Abstract
X-ray crystallography is the gold standard to resolve conformational ensembles that are significant for protein function, ligand discovery, and computational methods development. However, relevant conformational states may be missed at common cryogenic (cryo) data-collection temperatures but can be populated at room temperature. To assess the impact of temperature on making structural and computational discoveries, we systematically investigated protein conformational changes in response to temperature and ligand binding in a structural and computational workhorse, the T4 lysozyme L99A cavity. Despite decades of work on this protein, shifting to RT reveals new global and local structural changes. These include uncovering an apo helix conformation that is hidden at cryo but relevant for ligand binding, and altered side chain and ligand conformations. To evaluate the impact of temperature-induced protein and ligand changes on the utility of structural information in computation, we evaluated how temperature can mislead computational methods that employ cryo structures for validation. We find that when comparing simulated structures just to experimental cryo structures, hidden successes and failures often go unnoticed. When using structural information in ligand binding predictions, both coarse docking and rigorous binding free energy calculations are influenced by temperature effects. The trend that cryo artifacts limit the utility of structures for computation holds across five distinct protein classes. Our results suggest caution when consulting cryogenic structural data alone, as temperature artifacts can conceal errors and prevent successful computational predictions, which can mislead the development and application of computational methods in discovering bioactive molecules.
Affiliation(s)
- Shanshan Y C Bradford
  - Department of Chemical Biology & Therapeutics, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Léa El Khoury
  - Department of Pharmaceutical Sciences, University of California, Irvine, CA 92697, USA
- Yunhui Ge
  - Department of Pharmaceutical Sciences, University of California, Irvine, CA 92697, USA
- Meghan Osato
  - Department of Pharmaceutical Sciences, University of California, Irvine, CA 92697, USA
- David L Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, CA 92697, USA
  - Department of Chemistry, University of California, Irvine, CA 92697, USA
- Marcus Fischer
  - Department of Chemical Biology & Therapeutics, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
  - Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
11. Baumann HM, Gapsys V, de Groot BL, Mobley DL. Challenges Encountered Applying Equilibrium and Nonequilibrium Binding Free Energy Calculations. J Phys Chem B 2021; 125:4241-4261. [PMID: 33905257 PMCID: PMC8240641 DOI: 10.1021/acs.jpcb.0c10263] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Indexed: 01/08/2023]
Abstract
Binding free energy calculations have become increasingly valuable to drive decision making in drug discovery projects. However, among other issues, inadequate sampling can reduce accuracy, limiting the value of the technique. In this paper, we apply absolute binding free energy calculations to ligands binding to T4 lysozyme L99A and HSP90 using equilibrium and nonequilibrium approaches. We highlight sampling problems encountered in these systems, such as slow side chain rearrangements and slow changes of water placement upon ligand binding. These same types of challenges are also likely to show up in other protein-ligand systems, and we propose some strategies to diagnose and test for such problems in alchemical free energy calculations. We also explore similarities and differences in how the equilibrium and the nonequilibrium approaches handle these problems. Our results show the large amount of work still to be done to make free energy calculations robust and reliable and provide insight for future research in this area.
Affiliation(s)
- Hannah M Baumann
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92617, United States
- Vytautas Gapsys
  - Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Biophysical Chemistry, D-37077 Göttingen, Germany
- Bert L de Groot
  - Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Biophysical Chemistry, D-37077 Göttingen, Germany
- David L Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92617, United States
  - Department of Chemistry, University of California, Irvine, California 92617, United States
12. Bergazin TD, Ben-Shalom IY, Lim NM, Gill SC, Gilson MK, Mobley DL. Enhancing water sampling of buried binding sites using nonequilibrium candidate Monte Carlo. J Comput Aided Mol Des 2021; 35:167-177. [PMID: 32968887 PMCID: PMC7904576 DOI: 10.1007/s10822-020-00344-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/04/2020] [Accepted: 09/16/2020] [Indexed: 11/26/2022]
Abstract
Water molecules can be found interacting with the surface and within cavities in proteins. However, water exchange between bulk and buried hydration sites can be slow compared to simulation timescales, thus leading to the inefficient sampling of the locations of water. This can pose problems for free energy calculations for computer-aided drug design. Here, we apply a hybrid method that combines nonequilibrium candidate Monte Carlo (NCMC) simulations and molecular dynamics (MD) to enhance sampling of water in specific areas of a system, such as the binding site of a protein. Our approach uses NCMC to gradually remove interactions between a selected water molecule and its environment, then translates the water to a new region, before turning the interactions back on. This approach of gradual removal of interactions, followed by a move and then reintroduction of interactions, allows the environment to relax in response to the proposed water translation, improving acceptance of moves and thereby accelerating water exchange and sampling. We validate this approach on several test systems including the ligand-bound MUP-1 and HSP90 proteins with buried crystallographic waters removed. We show that our BLUES (NCMC/MD) method enhances water sampling relative to normal MD when applied to these systems. Thus, this approach provides a strategy to improve water sampling in molecular simulations which may be useful in practical applications in drug discovery and biomolecular design.
Affiliation(s)
- Ido Y Ben-Shalom
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, 92093, USA
- Nathan M Lim
  - Department of Pharmaceutical Sciences, University of California, Irvine, CA, 92697, USA
- Sam C Gill
  - Department of Chemistry, University of California, Irvine, Irvine, CA, 92697, USA
- Michael K Gilson
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, 92093, USA
- David L Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, CA, 92697, USA
  - Department of Chemistry, University of California, Irvine, Irvine, CA, 92697, USA
13. Gill SC, Mobley DL. Reversibly Sampling Conformations and Binding Modes Using Molecular Darting. J Chem Theory Comput 2021; 17:302-314. [PMID: 33289558 DOI: 10.1021/acs.jctc.0c00752] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 01/13/2023]
Abstract
Sampling multiple binding modes of a ligand in a single molecular dynamics simulation is difficult. A given ligand may have many internal degrees of freedom, along with many different ways it might orient itself in a binding site or across several binding sites, all of which might be separated by large energy barriers. We have developed a novel Monte Carlo move called molecular darting (MolDarting) to reversibly sample between predefined binding modes of a ligand. Here, we couple this with nonequilibrium candidate Monte Carlo (NCMC) to improve acceptance of moves. We apply this technique to a simple dipeptide system, a ligand binding to T4 lysozyme L99A, and ligand binding to HIV integrase to test this new method. We observe significant increases in acceptance compared to uniformly sampling the internal and rotational/translational degrees of freedom in these systems.
Affiliation(s)
- Samuel C Gill
- Department of Chemistry, University of California, Irvine, California 92617, United States
- David L Mobley
- Department of Chemistry, University of California, Irvine, California 92617, United States; Department of Pharmaceutical Sciences, University of California, Irvine, California 92617, United States
|
14
|
Sasmal S, Gill SC, Lim NM, Mobley DL. Sampling Conformational Changes of Bound Ligands Using Nonequilibrium Candidate Monte Carlo and Molecular Dynamics. J Chem Theory Comput 2020; 16:1854-1865. [PMID: 32058713 DOI: 10.1021/acs.jctc.9b01066] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 01/01/2023]
Abstract
Flexible ligands often have multiple binding modes or bound conformations that differ by rotation of a portion of the molecule around internal rotatable bonds. Knowledge of these binding modes is important for understanding the interactions stabilizing the ligand in the binding pocket, and other studies indicate it is important for calculating accurate binding affinities. In this work, we use a hybrid molecular dynamics (MD)/nonequilibrium candidate Monte Carlo (NCMC) method to sample the different binding modes of several flexible ligands and also to estimate the population distribution of the modes. The NCMC move proposal is divided into three parts. The flexible part of the ligand is alchemically turned off by gradually decreasing the electrostatic and steric interactions, followed by rotating the rotatable bond by a random angle and then slowly turning the ligand back on to its fully interacting state. The alchemical steps prior to and after the move proposal help the surrounding protein and water atoms in the binding pocket relax around the proposed ligand conformation and increase move acceptance rates. The protein-ligand system is propagated using classical MD in between the NCMC proposals. Using this MD/NCMC method, we were able to correctly reproduce the different binding modes of inhibitors binding to two kinase targets, c-Jun N-terminal kinase-1 and cyclin-dependent kinase 2, at a much lower computational cost compared to conventional MD and umbrella sampling. This method is available as a part of the BLUES software package.
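The three-part move proposal described in this abstract can be illustrated with a toy model. The sketch below is not the BLUES implementation and does not use its API: the one-dimensional energy function, the constants `A_TORSION` and `K_COUPLE`, and all function names are hypothetical, and a single Metropolis step on one "environment" coordinate stands in for the MD relaxation used in the paper. It only shows the general pattern of accumulating protocol work while interactions are switched off, the bond is rotated, and interactions are switched back on, then accepting with probability min(1, exp(-beta * work)).

```python
import math
import random

A_TORSION = 2.0  # hypothetical torsional barrier scale (units of kT)
K_COUPLE = 3.0   # hypothetical ligand-environment coupling strength (units of kT)


def energy(theta, env, lam):
    """Toy energy: internal torsion term plus a lambda-scaled coupling to the environment."""
    internal = A_TORSION * (1.0 - math.cos(2.0 * theta))  # wells at 0 and pi
    coupling = K_COUPLE * (1.0 - math.cos(theta - env))   # environment prefers env == theta
    return internal + lam * coupling


def relax_env(theta, env, lam, beta, rng):
    """One Metropolis step on the environment at fixed lambda (contributes no protocol work)."""
    trial = env + rng.gauss(0.0, 0.5)
    delta = energy(theta, trial, lam) - energy(theta, env, lam)
    if delta <= 0.0 or rng.random() < math.exp(-beta * delta):
        return trial
    return env


def acceptance_probability(work, beta):
    """Acceptance criterion min(1, exp(-beta * work)) for the whole nonequilibrium proposal."""
    return 1.0 if work <= 0.0 else math.exp(-beta * work)


def wrap(theta):
    """Map an angle back into the interval [-pi, pi]."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi


def ncmc_rotation_move(theta, env, beta=1.0, n_steps=50, rng=random):
    """Propose one switch-off / rotate / switch-on move; return (theta, env, accepted, work)."""
    old_theta, old_env = theta, env
    work, lam = 0.0, 1.0
    # Part 1: gradually turn the coupling off, relaxing the environment after each change.
    for i in range(1, n_steps + 1):
        new_lam = 1.0 - i / n_steps
        work += energy(theta, env, new_lam) - energy(theta, env, lam)
        lam = new_lam
        env = relax_env(theta, env, lam, beta, rng)
    # Part 2: rotate by a random angle while the coupling is off.
    new_theta = wrap(theta + rng.uniform(-math.pi, math.pi))
    work += energy(new_theta, env, lam) - energy(theta, env, lam)
    theta = new_theta
    # Part 3: turn the coupling back on, mirroring part 1 so the protocol is symmetric.
    for i in range(1, n_steps + 1):
        env = relax_env(theta, env, lam, beta, rng)
        new_lam = i / n_steps
        work += energy(theta, env, new_lam) - energy(theta, env, lam)
        lam = new_lam
    if rng.random() < acceptance_probability(work, beta):
        return theta, env, True, work
    return old_theta, old_env, False, work  # rejected: restore the starting state
```

In this toy, increasing `n_steps` gives the environment more opportunity to follow the rotated angle, which is the mechanism the abstract credits for higher acceptance rates; passing `rng=random.Random(seed)` makes a run reproducible.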
Affiliation(s)
- Sukanya Sasmal
- Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
- Samuel C Gill
- Department of Chemistry, University of California, Irvine, California 92697, United States
- Nathan M Lim
- Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
- David L Mobley
- Department of Chemistry, University of California, Irvine, California 92697, United States; Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
|
15
|
de Ruiter A, Oostenbrink C. Advances in the calculation of binding free energies. Curr Opin Struct Biol 2020; 61:207-212. [PMID: 32088376 DOI: 10.1016/j.sbi.2020.01.016] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 09/24/2019] [Revised: 01/21/2020] [Accepted: 01/24/2020] [Indexed: 01/19/2023]
Abstract
In recent years, calculations of binding affinities from molecular simulations seem to have matured significantly. While the number of applications of such methods in drug design and biotechnology increases, the number of truly new methodological developments decreases. This review provides an overview of the current status of the field as reflected in recent publications. The focus is on the challenges that remain when using endstate, alchemical and pathway methods. For endstate methods this is the calculation of entropic contributions. For alchemical methods there are unsolved problems associated with the solvation of the active site, sampling slow degrees of freedom and when modifying the net charge. For pathway methods achieving sufficient sampling remains challenging. New trends are also highlighted, including the use of pathway methods for the quantification of protein-protein interactions.
Affiliation(s)
- Anita de Ruiter
- Institute for Molecular Modeling and Simulation, University of Natural Resources and Life Sciences (BOKU), Vienna, Austria
- Chris Oostenbrink
- Institute for Molecular Modeling and Simulation, University of Natural Resources and Life Sciences (BOKU), Vienna, Austria
|
16
|
Badaczewska-Dawid AE, Kolinski A, Kmiecik S. Computational reconstruction of atomistic protein structures from coarse-grained models. Comput Struct Biotechnol J 2019; 18:162-176. [PMID: 31969975 PMCID: PMC6961067 DOI: 10.1016/j.csbj.2019.12.007] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 12/10/2019] [Accepted: 12/10/2019] [Indexed: 01/02/2023] Open
Abstract
Three-dimensional protein structures, whether determined experimentally or theoretically, are often too low resolution. In this mini-review, we outline the computational methods for protein structure reconstruction from incomplete coarse-grained to all atomistic models. Typical reconstruction schemes can be divided into four major steps. Usually, the first step is reconstruction of the protein backbone chain starting from the C-alpha trace. This is followed by side-chains rebuilding based on protein backbone geometry. Subsequently, hydrogen atoms can be reconstructed. Finally, the resulting all-atom models may require structure optimization. Many methods are available to perform each of these tasks. We discuss the available tools and their potential applications in integrative modeling pipelines that can transfer coarse-grained information from computational predictions, or experiment, to all atomistic structures.
Affiliation(s)
- Sebastian Kmiecik
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
|
17
|
Xu G, Ma T, Du J, Wang Q, Ma J. OPUS-Rota2: An Improved Fast and Accurate Side-Chain Modeling Method. J Chem Theory Comput 2019; 15:5154-5160. [PMID: 31412199 DOI: 10.1021/acs.jctc.9b00309] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 12/20/2022]
Abstract
Side-chain modeling plays a critical role in protein structure prediction. However, in many current methods, balancing the speed and accuracy is still challenging. In this paper, on the basis of our previous work OPUS-Rota (Protein Sci. 2008, 17, 1576-1585), we introduce a new side-chain modeling method, OPUS-Rota2, which is tested on both a 65-protein test set (DB65) in the OPUS-Rota paper and a 379-protein test set (DB379) in the SCWRL4 paper. If the main chain is native, OPUS-Rota2 is more accurate than OPUS-Rota, SCWRL4, and OSCAR-star but slightly less accurate than OSCAR-o. Also, if the main chain is non-native, OPUS-Rota2 is more accurate than any other method. Moreover, OPUS-Rota2 is significantly faster than any other method, in particular, 2 orders of magnitude faster than OSCAR-o. Thus, the combination of higher accuracy and speed of OPUS-Rota2 in modeling side chains on both the native and non-native main chains makes OPUS-Rota2 a very useful tool in protein structure modeling.
Affiliation(s)
- Gang Xu
- Multiscale Research Institute of Complex Systems, Fudan University, Shanghai 200433, China; School of Life Sciences, Tsinghua University, Beijing 100084, China
- Junqing Du
- Verna and Marrs Mclean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, BCM-125, Houston, Texas 77030, United States
- Qinghua Wang
- Verna and Marrs Mclean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, BCM-125, Houston, Texas 77030, United States
- Jianpeng Ma
- Multiscale Research Institute of Complex Systems, Fudan University, Shanghai 200433, China; School of Life Sciences, Tsinghua University, Beijing 100084, China; Verna and Marrs Mclean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, BCM-125, Houston, Texas 77030, United States; School of Life Sciences, Fudan University, Shanghai 200433, China
|
18
|
Harder-Viddal C, McDougall M, Roshko RM, Stetefeld J. Energetics of Storage and Diffusion of Water and Cyclo-Octasulfur for a Nonpolar Cavity of RHCC Tetrabrachion by Molecular Dynamics Simulations. Comput Struct Biotechnol J 2019; 17:675-683. [PMID: 31198494 PMCID: PMC6555900 DOI: 10.1016/j.csbj.2019.05.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 02/27/2019] [Revised: 05/11/2019] [Accepted: 05/13/2019] [Indexed: 02/07/2023] Open
Abstract
Tetrabrachion forms the key component of the S-layer of Staphylothermus marinus. Molecular dynamics simulations have been used to study the energetics of occupancy of cavity 3 of the right-handed coiled-coil stalk of tetrabrachion by both water molecules and cyclooctasulfur S8 crowns, as well as to determine possible pathways and free energy barriers for the diffusion of both water and cyclooctasulfur through the peptide walls of RHCC tetrabrachion between cavity 3 and bulk solvent. Calculations of the transfer free energy from solvent to cavity show that clusters of six, seven and eight water molecules are marginally stable in cavity 3, but that occupancy of the cavity by a cyclooctasulfur ring is favoured significantly over water clusters of all sizes. Thermal activation simulations at T = 400 K revealed that water molecules diffusing through the wall pass through a sequence of metastable configurations where they are temporarily immobilized by forming networks of hydrogen bonds with specific wall residues. Calculations of the free energy of these metastable configurations using multi-configurational thermodynamic integration yielded a free energy profile with a principal free energy maximum ΔG ≈ 50 kJ/mol and a slight activation asymmetry in favour of the direction from cavity to solvent. Potential exit pathways for cyclooctasulfur were investigated with the methods of steered molecular dynamics and umbrella sampling. The cyclooctasulfur was steered through a gap in the tetrabrachion wall along a linear path from cavity 3 into the solvent and the resulting trajectory was subdivided into 22 sampling windows. The free energy profile created for the trajectory by umbrella sampling showed a sharp principal maximum as a function of the reaction coordinate with asymmetric free energy barriers ΔG_exit ≈ 220 kJ/mol and ΔG_entrance ≈ 100 kJ/mol for cavity exit and entrance, respectively.
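As a reading aid for the asymmetric barriers quoted at the end of this abstract, the short sketch below works out what they imply for the relative free energy of cyclooctasulfur in the cavity versus in solvent. It assumes a single principal maximum separating the two end states, so that the difference between the entrance and exit barriers equals the free energy difference between them. The barrier values are the approximate figures from the abstract; the temperature of 300 K and all names are illustrative assumptions, not values taken from the paper.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)

DG_EXIT = 220.0      # approximate barrier for cavity -> solvent, kJ/mol (from the abstract)
DG_ENTRANCE = 100.0  # approximate barrier for solvent -> cavity, kJ/mol (from the abstract)

# Both barriers are measured up to the same maximum, so their difference is the
# free energy of the cavity state relative to the solvent state.
dg_cavity_minus_solvent = DG_ENTRANCE - DG_EXIT  # about -120 kJ/mol


def boltzmann_ratio(delta_g_kj_per_mol, temperature_k):
    """Population ratio exp(-dG / RT) for a state lying dG above (or below) a reference."""
    return math.exp(-delta_g_kj_per_mol / (R_KJ_PER_MOL_K * temperature_k))


ASSUMED_TEMPERATURE_K = 300.0  # hypothetical temperature, chosen only for illustration
cavity_to_solvent_population = boltzmann_ratio(dg_cavity_minus_solvent, ASSUMED_TEMPERATURE_K)

print(f"dG(cavity) - dG(solvent) = {dg_cavity_minus_solvent:.0f} kJ/mol")
print(f"cavity/solvent population ratio at {ASSUMED_TEMPERATURE_K:.0f} K: "
      f"{cavity_to_solvent_population:.2e}")
```

Under these assumptions the cavity state lies roughly 120 kJ/mol below the solvent state, in line with the abstract's statement that cavity occupancy by cyclooctasulfur is strongly favoured.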
Affiliation(s)
- C Harder-Viddal
- Department of Chemistry and Physics, Canadian Mennonite University, 500 Shaftesbury Blvd, Winnipeg, Manitoba, Canada
- M McDougall
- Department of Chemistry, University of Manitoba, 144 Dysart Rd, Winnipeg, Manitoba, Canada; Center for Oil and Gas Research and Development (COGRAD), Canada
- R M Roshko
- Department of Physics and Astronomy, University of Manitoba, 30A Sifton Rd, Winnipeg, Manitoba, Canada
- J Stetefeld
- Department of Chemistry, University of Manitoba, 144 Dysart Rd, Winnipeg, Manitoba, Canada; Center for Oil and Gas Research and Development (COGRAD), Canada; Department of Biochemistry and Medical Genetics, University of Manitoba, Canada; Department of Human Anatomy and Cell Science, University of Manitoba, Canada
|
19
|
Spiriti J, Subramanian SR, Palli R, Wu M, Zuckerman DM. Middle-way flexible docking: Pose prediction using mixed-resolution Monte Carlo in estrogen receptor α. PLoS One 2019; 14:e0215694. [PMID: 31013302 PMCID: PMC6478315 DOI: 10.1371/journal.pone.0215694] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 01/31/2019] [Accepted: 04/06/2019] [Indexed: 12/17/2022] Open
Abstract
There is a vast gulf between the two primary strategies for simulating protein-ligand interactions. Docking methods significantly limit or eliminate protein flexibility to gain great speed at the price of uncontrolled inaccuracy, whereas fully flexible atomistic molecular dynamics simulations are expensive and often suffer from limited sampling. We have developed a flexible docking approach geared especially for highly flexible or poorly resolved targets based on mixed-resolution Monte Carlo (MRMC), which is intended to offer a balance among speed, protein flexibility, and sampling power. The binding region of the protein is treated with a standard atomistic force field, while the remainder of the protein is modeled at the residue level with a Gō model that permits protein flexibility while saving computational cost. Implicit solvation is used. Here we assess three facets of the MRMC approach with implications for other docking studies: (i) the role of receptor flexibility in cross-docking pose prediction; (ii) the use of non-equilibrium candidate Monte Carlo (NCMC) and (iii) the use of pose-clustering in scoring. We examine 61 co-crystallized ligands of estrogen receptor α, an important cancer target known for its flexibility. We also compare the performance of the MRMC approach with Autodock smina. Adding protein flexibility, not surprisingly, leads to significantly lower total energies and stronger interactions between protein and ligand, but notably we document the important role of backbone flexibility in the improvement. The improved backbone flexibility also leads to improved performance relative to smina. Somewhat unexpectedly, our implementation of NCMC leads to only modestly improved sampling of ligand poses. Overall, the addition of protein flexibility improves the performance of docking, as measured by energy-ranked poses, but we do not find significant improvements based on cluster information or the use of NCMC. 
We discuss possible improvements for the model including alternative coarse-grained force fields, improvements to the treatment of solvation, and adding additional types of NCMC moves.
Affiliation(s)
- Justin Spiriti
- Department of Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, United States of America
- Sundar Raman Subramanian
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15260, United States of America
- Rohith Palli
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15260, United States of America
- Maria Wu
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15260, United States of America
- Daniel M. Zuckerman
- Department of Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, United States of America
|