1
Amezcua M, Setiadi J, Mobley DL. The SAMPL9 host-guest blind challenge: an overview of binding free energy predictive accuracy. Phys Chem Chem Phys 2024; 26:9207-9225. [PMID: 38444308 PMCID: PMC10954238 DOI: 10.1039/d3cp05111k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 02/03/2024] [Indexed: 03/07/2024]
Abstract
We report the results of the SAMPL9 host-guest blind challenge for predicting binding free energies. The challenge focused on macrocycles from the pillar[n]arene and cyclodextrin host families, including WP6 as well as bCD and HbCD. Participants used a variety of methods to submit binding free energy predictions. A machine learning approach based on molecular descriptors achieved the highest accuracy (RMSE of 2.04 kcal mol⁻¹) among the ranked methods in the WP6 dataset. Interestingly, predictions for WP6 obtained via docking tended to outperform all methods (RMSE of 1.70 kcal mol⁻¹), most of which are MD based and computationally more expensive. In general, methods applying force fields achieved better correlation with experiment for WP6 than the machine learning and docking models. In the cyclodextrin-phenothiazine challenge, the ATM approach emerged as the top-performing method, with an RMSE of less than 1.86 kcal mol⁻¹. Correlation metrics of ranked methods in this dataset were relatively poor compared to WP6. We also highlight several lessons learned to guide future work and help improve studies on the systems discussed. For example, WP6 may be present in microstates other than its -12 state in the presence of certain guests. Machine learning approaches can be used to fine-tune or help train force fields for certain chemistry (i.e. WP6-G4). Certain phenothiazines occupy distinct primary and secondary orientations, some of which were considered individually for accurate binding free energies. The accuracy of predictions from certain methods that start from a single binding pose/orientation demonstrates the sensitivity of calculated binding free energies to the orientation and, in some cases, indicates the likely dominant orientation for the system. Computational and experimental results suggest that the guest phenothiazine core traverses both the secondary and primary faces of the cyclodextrin hosts, a bulky cationic side chain will primarily occupy the primary face, and the phenothiazine core substituent resides at the larger secondary face.
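The abstract above compares methods by RMSE and by correlation with experiment. As a minimal sketch of how such metrics are typically computed, the Python below implements RMSE, Pearson r, and a tie-free Kendall tau (tau-a) using only the standard library. The function names and the four example free energies are illustrative assumptions, not SAMPL9 data, and the challenge's own analysis scripts may differ (for instance in tie handling or bootstrapped uncertainties).

```python
import math

def rmse(pred, expt):
    """Root-mean-square error between predicted and experimental values."""
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(pred, expt)) / len(pred))

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs / total pairs.
    Tied pairs count as zero; no tie correction is applied."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)

# Hypothetical binding free energies in kcal/mol (illustration only).
expt = [-6.0, -7.5, -5.0, -9.0]
pred = [-5.0, -8.0, -6.5, -8.5]

print("RMSE:", rmse(pred, expt))
print("Pearson r:", pearson_r(pred, expt))
print("Kendall tau-a:", kendall_tau_a(pred, expt))
```

RMSE measures absolute accuracy while the two correlation metrics measure ranking and linear agreement, which is why a method can lead on one and trail on the other, as the abstract reports for docking versus force-field methods.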
Affiliation(s)
- Martin Amezcua
  - Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, USA
- Jeffry Setiadi
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, USA
- David L Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, USA
  - Department of Chemistry, University of California, Irvine, Irvine, California 92697, USA
2
Liu X, Zheng L, Qin C, Li Y, Zhang JZH, Sun Z. Screening Power of End-Point Free-Energy Calculations in Cucurbituril Host-Guest Systems. J Chem Inf Model 2023; 63:6938-6946. [PMID: 37908066 DOI: 10.1021/acs.jcim.3c01356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
End-point free-energy methods, as an indispensable component of virtual screening, are commonly recognized as a tool with a certain level of screening power in pharmaceutical research. While a huge number of records can be found for end-point applications in protein-ligand, protein-protein, and protein-DNA complexes in academic and industrial reports, up to now there has been no large-scale benchmark in host-guest complexes supporting the screening power of end-point free-energy techniques. A good benchmark requires a data set with sufficient coverage of pharmaceutically relevant chemical space, a long sampling length supporting the trajectory approximation of the ensemble average, and a sufficient sample size of receptor-acceptor pairs to stabilize the performance statistics. In this work, selecting a popular family of macrocyclic hosts named cucurbiturils, we construct a large data set containing 154 host-guest pairs, perform extensive end-point sampling of several hundred nanoseconds for each system, and extract the free-energy estimates with a variety of end-point free-energy techniques, including the advanced three-trajectory dielectric-constant-variable regime proposed in our recent work. The best-performing end-point protocol employs GAFF2 for solute descriptions, the three-trajectory end-point sampling regime, and the MM/GBSA Hamiltonian in free-energy extraction, achieving high ranking metrics of Kendall τ > 0.6 and a Pearlman predictive index of ∼0.8, and a high scoring power of Pearson r > 0.8. The current project, as the first large-scale systematic benchmark of end-point methods in host-guest complexes in academic publications, provides solid evidence of the applicability of end-point techniques and direct guidance on computational setups in practical host-guest systems.
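The abstract above reports a Pearlman predictive index alongside Kendall τ. The sketch below follows the commonly cited definition of that index (a weighted pairwise ranking score in which each pair is weighted by its experimental difference); this definition is an assumption supplied here for illustration and is not spelled out in the abstract, and the function name and example values are hypothetical.

```python
def predictive_index(pred, expt):
    """Weighted pairwise ranking score in [-1, 1].

    Each pair (i, j) gets weight |expt[j] - expt[i]| and a score of
    +1 if prediction and experiment order the pair the same way,
    -1 if they order it oppositely, and 0 if the predictions tie.
    """
    numerator = 0.0
    denominator = 0.0
    n = len(expt)
    for i in range(n):
        for j in range(i + 1, n):
            d_expt = expt[j] - expt[i]
            d_pred = pred[j] - pred[i]
            weight = abs(d_expt)
            if d_pred == 0 or d_expt == 0:
                score = 0
            elif d_expt * d_pred > 0:
                score = 1
            else:
                score = -1
            numerator += weight * score
            denominator += weight
    # With no experimental spread the index is undefined; return 0.0 here.
    return numerator / denominator if denominator else 0.0

# Hypothetical binding free energies in kcal/mol (illustration only).
expt = [-6.0, -7.5, -5.0, -9.0]
pred = [-5.0, -8.0, -6.5, -8.5]
print("Predictive index:", predictive_index(pred, expt))
```

Unlike Kendall τ, this index penalizes a mis-ordered pair in proportion to how far apart the two experimental values are, so swapping two nearly equal binders costs little.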
Affiliation(s)
- Xiao Liu
  - School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
- Lei Zheng
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
  - Department of Chemistry, New York University, New York, New York 10003, United States
- Chu Qin
  - School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
- Yang Li
  - College of Information Science and Engineering, Shandong Agricultural University, Tai'an 271018, China
- John Z H Zhang
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
  - Department of Chemistry, New York University, New York, New York 10003, United States
  - School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
  - Faculty of Synthetic Biology and Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China
- Zhaoxi Sun
  - Changping Laboratory, Beijing 102206, China
3
Liu X, Zheng L, Qin C, Cong Y, Zhang JZH, Sun Z. Comprehensive Evaluation of End-Point Free Energy Techniques in Carboxylated-Pillar[6]arene Host–Guest Binding: III. Force-Field Comparison, Three-Trajectory Realization and Further Dielectric Augmentation. Molecules 2023; 28:2767. [PMID: 36985739 PMCID: PMC10059726 DOI: 10.3390/molecules28062767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2023] [Revised: 03/13/2023] [Accepted: 03/17/2023] [Indexed: 03/22/2023]
Abstract
Host–guest binding, despite the relatively simple structural and chemical features of individual components, still poses a challenge in computational modelling. The extreme underperformance of standard end-point methods in host–guest binding makes them practically useless. In the current work, we explore a potentially promising modification, the three-trajectory realization. The alteration couples the binding-induced structural reorganization into free energy estimation and suffers from dramatic fluctuations in internal energies in protein–ligand situations. Fortunately, the relatively small size of host–guest systems minimizes the magnitude of internal fluctuations and makes the three-trajectory realization practically suitable. Because intra-molecular interactions are incorporated in the free energy estimation, a strong dependence on the force field parameters could arise. Thus, a term-specific investigation of transferable GAFF derivatives is presented, and noticeable differences in many aspects are identified between the commonly applied GAFF and GAFF2. These force-field differences lead to different dynamic behaviors of the macrocyclic host, which ultimately influence the end-point sampling and binding thermodynamics. Therefore, the three-trajectory end-point free energy calculations are performed with both GAFF versions. Additionally, due to the noticeable differences between host dynamics under GAFF and GAFF2, we add further benchmarks of the single-trajectory end-point calculations. When only the ranks of binding affinities are pursued, the three-trajectory realization performs very well, comparable to and even better than the regressed PBSA_E scoring function and the dielectric constant-variable regime. With the GAFF parameter set, TIP3P water in explicit-solvent sampling, and either the PB or GB implicit-solvent model in free energy estimation, the predictive power of the three-trajectory realization in ranking calculations surpasses all existing end-point methods on this dataset. We further combine the three-trajectory realization with another promising modified end-point regime, that of varying the interior dielectric constant. The combined regime does not yield sizable improvements in ranking, and deviations from experiment vary non-monotonically.
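To make the distinction between the single-trajectory and three-trajectory realizations in the abstract above concrete, the following sketch contrasts the two averaging schemes. It assumes per-frame effective energies (for example, MM plus implicit-solvent terms) have already been computed elsewhere; the function names and all numbers are hypothetical, and entropy contributions are omitted.

```python
from statistics import mean

def single_trajectory_dg(g_complex, g_host_in_complex, g_guest_in_complex):
    """Single-trajectory estimate: host and guest energies are evaluated on
    frames extracted from the complex simulation, so intramolecular terms
    cancel frame by frame."""
    return mean(c - h - g for c, h, g in
                zip(g_complex, g_host_in_complex, g_guest_in_complex))

def three_trajectory_dg(g_complex, g_host_free, g_guest_free):
    """Three-trajectory estimate: complex, free host, and free guest are
    each sampled in their own simulation, so binding-induced structural
    reorganization enters the estimate (at the cost of larger noise)."""
    return mean(g_complex) - mean(g_host_free) - mean(g_guest_free)

# Hypothetical per-frame effective energies in kcal/mol (illustration only).
g_complex = [-110.0, -112.0, -111.0]
g_host_in_complex = [-80.0, -81.0, -82.0]
g_guest_in_complex = [-20.0, -21.0, -19.0]
g_host_free = [-84.0, -85.0, -83.0]
g_guest_free = [-21.0, -22.0, -20.0]

dg_single = single_trajectory_dg(g_complex, g_host_in_complex, g_guest_in_complex)
dg_three = three_trajectory_dg(g_complex, g_host_free, g_guest_free)
print("single-trajectory dG:", dg_single)
print("three-trajectory dG:", dg_three)
```

In this toy data the free host and guest relax to lower energies than their bound conformations, so the three-trajectory estimate is less favorable; the difference is the reorganization penalty that the single-trajectory scheme cannot see.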
Affiliation(s)
- Xiao Liu
  - School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
- Lei Zheng
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
  - Department of Chemistry, New York University, New York, NY 10003, USA
- Chu Qin
  - School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
- Yalong Cong
  - School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
- John Z. H. Zhang
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
  - Department of Chemistry, New York University, New York, NY 10003, USA
  - School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Zhaoxi Sun
  - College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
Correspondence: X.L., Y.C., Z.S.
4
Amezcua M, Setiadi J, Ge Y, Mobley DL. An overview of the SAMPL8 host-guest binding challenge. J Comput Aided Mol Des 2022; 36:707-734. [PMID: 36229622 PMCID: PMC9596595 DOI: 10.1007/s10822-022-00462-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 04/09/2022] [Accepted: 06/21/2022] [Indexed: 11/23/2022]
Abstract
The SAMPL series of challenges aims to focus the community on specific modeling challenges, while testing and hopefully driving progress of computational methods to help guide pharmaceutical drug discovery. In this study, we report on the results of the SAMPL8 host–guest blind challenge for predicting absolute binding affinities. SAMPL8 focused on two host–guest datasets, one involving the cucurbituril CB8 (with a series of common drugs of abuse) and another involving two different Gibb deep-cavity cavitands. The latter dataset involved a previously featured deep-cavity cavitand (TEMOA) as well as a new variant (TEETOA), both binding to a series of relatively rigid fragment-like guests. Challenge participants employed a reasonably wide variety of methods, though many of these were based on molecular simulations, and predictive accuracy was mixed. As in some previous SAMPL iterations (SAMPL6 and SAMPL7), we found that one approach to achieve greater accuracy was to apply empirical corrections to the binding free energy predictions, taking advantage of prior data on binding to these hosts. Another approach which performed well was a hybrid MD-based approach with reweighting to a force-matched QM potential. In the cavitand challenge, an alchemical method using the AMOEBA polarizable force field performed best, with an RMSE of less than 1 kcal/mol, while another alchemical approach (ATM/GAFF2-AM1BCC/TIP3P/HREM) had an RMSE of less than 1.75 kcal/mol. The work discussed here also highlights several important lessons; for example, retrospective studies of reference calculations demonstrate the sensitivity of predicted binding free energies to ethyl group sampling and/or guest starting pose, providing guidance to help improve future studies on these systems.
Affiliation(s)
- Martin Amezcua
  - Department of Pharmaceutical Sciences, University of California, Irvine, CA, 92697, USA
- Jeffry Setiadi
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, 92093, USA
- Yunhui Ge
  - Department of Pharmaceutical Sciences, University of California, Irvine, CA, 92697, USA
- David L Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, CA, 92697, USA
  - Department of Chemistry, University of California, Irvine, CA, 92697, USA
5
Liu X, Zheng L, Qin C, Zhang JZH, Sun Z. Comprehensive evaluation of end-point free energy techniques in carboxylated-pillar[6]arene host-guest binding: I. Standard procedure. J Comput Aided Mol Des 2022; 36:735-752. [PMID: 36136209 DOI: 10.1007/s10822-022-00475-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2022] [Accepted: 09/06/2022] [Indexed: 10/14/2022]
Abstract
Despite the massive application of end-point free energy methods to protein-ligand and protein-protein interactions, computational understanding of their performance in relatively simple and prototypical host-guest systems is limited. In this work, we present a comprehensive benchmark calculation with standard end-point free energy techniques on a recent host-guest dataset containing 13 host-guest pairs involving the carboxylated-pillar[6]arene host. We first assess the charge schemes for solutes by comparing the charge-produced electrostatics with many ab initio references, in order to obtain a preliminary albeit detailed view of the charge quality. Then, we focus on four modelling details of end-point free energy calculations: the docking procedure for generating the initial conditions, the charge scheme for host and guest molecules, the water model used in explicit-solvent sampling, and the end-point methods for free energy estimation. The binding thermodynamics obtained with different modelling schemes are compared with experimental references, and some practical guidelines on maximizing the performance of end-point methods in practical host-guest systems are summarized. Further, we compare our simulation outcome with predictions in the grand challenge and discuss further developments to improve the prediction quality of end-point free energy methods. Overall, unlike their widely acknowledged applicability in protein-ligand binding, the standard end-point calculations cannot produce useful outcomes in host-guest binding and thus are not recommended unless alterations are performed.
Affiliation(s)
- Xiao Liu
  - School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai, 201620, China
- Lei Zheng
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
- Chu Qin
  - School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai, 201620, China
- John Z H Zhang
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
  - School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
  - Department of Chemistry, New York University, New York, NY, 10003, USA
- Zhaoxi Sun
  - College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
6
Li J, Li H, Pei S, Kang N, Zhang G, Zhang C, Shuang S. Sensitive Detection of Sulfur Dioxide by Constructing a Protein Supramolecular Complex: a New Fluorescence Sensing Strategy. FOOD ANAL METHOD 2022. [DOI: 10.1007/s12161-022-02365-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
7
Bucinsky L, Bortňák D, Gall M, Matúška J, Milata V, Pitoňák M, Štekláč M, Végh D, Zajaček D. Machine learning prediction of 3CLpro SARS-CoV-2 docking scores. Comput Biol Chem 2022; 98:107656. [PMID: 35288359 PMCID: PMC8881816 DOI: 10.1016/j.compbiolchem.2022.107656] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/22/2021] [Revised: 02/23/2022] [Accepted: 02/24/2022] [Indexed: 12/14/2022]
Abstract
Molecular docking results of two training sets containing 866 and 8,696 compounds were used to train three different machine learning (ML) approaches. Neural network approaches based on the Keras and TensorFlow libraries and the gradient-boosted decision trees approach of XGBoost were used with DScribe's Smooth Overlap of Atomic Positions molecular descriptors. In addition, neural networks using the SchNetPack library and descriptors were used. The ML performance was tested on three different sets, including compounds for future organic synthesis. The final evaluation of the ML-predicted docking scores was based on the ZINC in vivo set, from which 1,200 compounds were randomly selected with respect to their size. The results showed a consistent ML prediction capability for docking scores, and even though the scores of compounds with more than 60 atoms were found to be slightly overestimated, they remain valid for a subsequent evaluation of their drug repurposing suitability.
8
Castro LHE, Sant'Anna CMR. Molecular Modeling Techniques Applied to the Design of Multitarget Drugs: Methods and Applications. Curr Top Med Chem 2021; 22:333-346. [PMID: 34844540 DOI: 10.2174/1568026621666211129140958] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/01/2021] [Revised: 10/23/2021] [Accepted: 10/28/2021] [Indexed: 11/22/2022]
Abstract
Multifactorial diseases, such as cancer and diabetes, present a challenge for the traditional "one-target, one disease" paradigm due to their complex pathogenic mechanisms. Although a combination of drugs can be used, a multitarget drug may be a better choice in view of its efficacy, fewer adverse effects, and lower chance of resistance development. The computer-based design of these multitarget drugs can draw on the same techniques used for single-target drug design, but the difficulties associated with obtaining drugs that are capable of modulating two or more targets with similar efficacy impose new challenges, whose solutions involve the adaptation of known techniques and also the development of new ones, including machine-learning approaches. In this review, some SBDD and LBDD techniques for multitarget drug design are discussed, together with some cases where the application of such techniques led to effective multitarget ligands.
Affiliation(s)
- Carlos Mauricio R Sant'Anna
  - Programa de Pós-Graduação em Química, Instituto de Química, Universidade Federal Rural do Rio de Janeiro, Seropédica, Brazil