1. Hahn DF, Gapsys V, de Groot BL, Mobley DL, Tresadern G. Current State of Open Source Force Fields in Protein-Ligand Binding Affinity Predictions. J Chem Inf Model 2024; 64:5063-5076. [PMID: 38895959 PMCID: PMC11234369 DOI: 10.1021/acs.jcim.4c00417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2024] [Revised: 04/23/2024] [Accepted: 04/25/2024] [Indexed: 06/21/2024]
Abstract
In drug discovery, the in silico prediction of binding affinity is one of the major means to prioritize compounds for synthesis. Alchemical relative binding free energy (RBFE) calculations based on molecular dynamics (MD) simulations are nowadays a popular approach for the accurate affinity ranking of compounds. MD simulations rely on empirical force field parameters, which strongly influence the accuracy of the predicted affinities. Here, we evaluate the ability of six different small-molecule force fields to predict experimental protein-ligand binding affinities in RBFE calculations on a set of 598 ligands and 22 protein targets. The public force fields OpenFF Parsley and Sage, GAFF, and CGenFF show comparable accuracy, while OPLS3e is significantly more accurate. However, a consensus approach using Sage, GAFF, and CGenFF leads to accuracy comparable to OPLS3e. While Parsley and Sage are performing comparably based on aggregated statistics across the whole dataset, there are differences in terms of outliers. Analysis of the force field reveals that improved parameters lead to significant improvement in the accuracy of affinity predictions on subsets of the dataset involving those parameters. Lower accuracy can not only be attributed to the force field parameters but is also dependent on input preparation and sampling convergence of the calculations. Especially large perturbations and nonconverged simulations lead to less accurate predictions. The input structures, Gromacs force field files, as well as the analysis Python notebooks are available on GitHub.
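The consensus idea mentioned in this abstract can be illustrated as an average of per-force-field predictions. The following is a minimal sketch, assuming an unweighted mean (the abstract does not state the exact consensus scheme); the ΔΔG values and the helper names `consensus` and `rmse` are hypothetical and not taken from the paper.

```python
from math import sqrt
from statistics import mean

def consensus(predictions):
    """Average the predicted ddG of each ligand pair across force fields.

    predictions maps a force-field name to a list of ddG values (kcal/mol),
    all lists in the same ligand-pair order. Unweighted mean is an assumption.
    """
    return [mean(values) for values in zip(*predictions.values())]

def rmse(predicted, experimental):
    """Root-mean-square error between predicted and experimental values."""
    return sqrt(mean((p - e) ** 2 for p, e in zip(predicted, experimental)))

# Hypothetical numbers, for illustration only
predictions = {
    "Sage": [0.5, -1.2, 2.0],
    "GAFF": [1.1, -0.6, 1.4],
    "CGenFF": [0.2, -1.5, 2.6],
}
experimental = [0.6, -1.0, 2.0]

combined = consensus(predictions)
for name, values in predictions.items():
    print(name, round(rmse(values, experimental), 3))
print("consensus", round(rmse(combined, experimental), 3))
```

In this toy data the individual errors partly cancel, so the averaged prediction has a lower RMSE than any single force field; real data need not behave this way.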
Affiliation(s)
- David F. Hahn
  - Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse 2340, Belgium
- Vytautas Gapsys
  - Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse 2340, Belgium
  - Computational Biomolecular Dynamics Group, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, Göttingen 37077, Germany
- Bert L. de Groot
  - Computational Biomolecular Dynamics Group, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, Göttingen 37077, Germany
- David L. Mobley
  - Department of Chemistry, University of California, Irvine, California 92697, United States
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
- Gary Tresadern
  - Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse 2340, Belgium

2. Khuttan S, Gallicchio E. What to Make of Zero: Resolving the Statistical Noise from Conformational Reorganization in Alchemical Binding Free Energy Estimates with Metadynamics Sampling. J Chem Theory Comput 2024; 20:1489-1501. [PMID: 38252868 PMCID: PMC10867849 DOI: 10.1021/acs.jctc.3c01250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Revised: 01/03/2024] [Accepted: 01/03/2024] [Indexed: 01/24/2024]
Abstract
We introduce the self-relative binding free energy (self-RBFE) approach to evaluate the intrinsic statistical variance of dual-topology alchemical binding free energy estimators. The self-RBFE is the relative binding free energy between a ligand and a copy of the same ligand, and its true value is zero. Nevertheless, because the two copies of the ligand move independently, the self-RBFE value produced by a finite-length simulation fluctuates and can be used to measure the variance of the model. The results of this validation provide evidence that a significant fraction of the errors observed in benchmark studies reflect the statistical fluctuations of unconverged estimates rather than the models' accuracy. Furthermore, we find that ligand reorganization is a significant contributing factor to the statistical variance of binding free energy estimates and that metadynamics-accelerated conformational sampling of the torsional degrees of freedom of the ligand can drastically reduce the time to convergence.
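Because the exact self-RBFE is zero, any nonzero estimate is pure error, so a set of repeated estimates can be summarized directly. The following is a minimal sketch of that bookkeeping, assuming independent repeated runs; the numbers and the function name `self_rbfe_noise` are hypothetical and do not reproduce the authors' analysis.

```python
from math import sqrt
from statistics import mean, stdev

def self_rbfe_noise(estimates):
    """Summarize repeated self-RBFE estimates, whose exact value is zero."""
    bias = mean(estimates)                             # systematic offset from 0
    spread = stdev(estimates)                          # sample standard deviation across runs
    rms_error = sqrt(mean(x * x for x in estimates))   # RMS deviation from the true value 0
    return bias, spread, rms_error

# Hypothetical self-RBFE estimates (kcal/mol) from independent runs
estimates = [0.4, -0.3, 0.1, -0.2]
bias, spread, rms_error = self_rbfe_noise(estimates)
print(f"bias={bias:.2f} spread={spread:.2f} rms_error={rms_error:.2f}")
```

A spread of this kind gives a floor for the error that finite sampling alone contributes, independent of force-field accuracy.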
Affiliation(s)
- Sheenam Khuttan
  - Department of Chemistry and Biochemistry, Brooklyn College of the City University of New York, New York, New York 11210, United States
  - Ph.D. Program in Biochemistry, The Graduate Center of the City University of New York, New York, New York 10016, United States
- Emilio Gallicchio
  - Department of Chemistry and Biochemistry, Brooklyn College of the City University of New York, New York, New York 11210, United States
  - Ph.D. Program in Biochemistry, The Graduate Center of the City University of New York, New York, New York 10016, United States
  - Ph.D. Program in Chemistry, The Graduate Center of the City University of New York, New York, New York 10016, United States

3. Papadourakis M, Sinenka H, Matricon P, Hénin J, Brannigan G, Pérez-Benito L, Pande V, van Vlijmen H, de Graaf C, Deflorian F, Tresadern G, Cecchini M, Cournia Z. Alchemical Free Energy Calculations on Membrane-Associated Proteins. J Chem Theory Comput 2023; 19:7437-7458. [PMID: 37902715 PMCID: PMC11017255 DOI: 10.1021/acs.jctc.3c00365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Indexed: 10/31/2023]
Abstract
Membrane proteins have diverse functions within cells and are well-established drug targets. The advances in membrane protein structural biology have revealed drug and lipid binding sites on membrane proteins, while computational methods such as molecular simulations can resolve the thermodynamic basis of these interactions. Particularly, alchemical free energy calculations have shown promise in the calculation of reliable and reproducible binding free energies of protein-ligand and protein-lipid complexes in membrane-associated systems. In this review, we present an overview of representative alchemical free energy studies on G-protein-coupled receptors, ion channels, transporters as well as protein-lipid interactions, with emphasis on best practices and critical aspects of running these simulations. Additionally, we analyze challenges and successes when running alchemical free energy calculations on membrane-associated proteins. Finally, we highlight the value of alchemical free energy calculations in drug discovery and their applicability in the pharmaceutical industry.
Affiliation(s)
- Michail Papadourakis
  - Biomedical Research Foundation, Academy of Athens, 4 Soranou Ephessiou, 11527 Athens, Greece
- Hryhory Sinenka
  - Institut de Chimie de Strasbourg, UMR7177, CNRS, Université de Strasbourg, F-67083 Strasbourg Cedex, France
- Pierre Matricon
  - Sosei Heptares, Steinmetz Building, Granta Park, Great Abington, Cambridge CB21 6DG, United Kingdom
- Jérôme Hénin
  - Laboratoire de Biochimie Théorique UPR 9080, CNRS and Université Paris Cité, 75005 Paris, France
- Grace Brannigan
  - Center for Computational and Integrative Biology, Rutgers University−Camden, Camden, New Jersey 08103, United States of America
  - Department of Physics, Rutgers University−Camden, Camden, New Jersey 08102, United States of America
- Laura Pérez-Benito
  - CADD, In Silico Discovery, Janssen Research & Development, Turnhoutseweg 30, 2340 Beerse, Belgium
- Vineet Pande
  - CADD, In Silico Discovery, Janssen Research & Development, Turnhoutseweg 30, 2340 Beerse, Belgium
- Herman van Vlijmen
  - CADD, In Silico Discovery, Janssen Research & Development, Turnhoutseweg 30, 2340 Beerse, Belgium
- Chris de Graaf
  - Sosei Heptares, Steinmetz Building, Granta Park, Great Abington, Cambridge CB21 6DG, United Kingdom
- Francesca Deflorian
  - Sosei Heptares, Steinmetz Building, Granta Park, Great Abington, Cambridge CB21 6DG, United Kingdom
- Gary Tresadern
  - CADD, In Silico Discovery, Janssen Research & Development, Turnhoutseweg 30, 2340 Beerse, Belgium
- Marco Cecchini
  - Institut de Chimie de Strasbourg, UMR7177, CNRS, Université de Strasbourg, F-67083 Strasbourg Cedex, France
- Zoe Cournia
  - Biomedical Research Foundation, Academy of Athens, 4 Soranou Ephessiou, 11527 Athens, Greece

4. Khalak Y, Tresadern G, Hahn DF, de Groot BL, Gapsys V. Chemical Space Exploration with Active Learning and Alchemical Free Energies. J Chem Theory Comput 2022; 18:6259-6270. [PMID: 36148968 PMCID: PMC9558370 DOI: 10.1021/acs.jctc.2c00752] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Indexed: 11/30/2022]
Abstract
Drug discovery can be thought of as a search for a needle in a haystack: searching through a large chemical space for the most active compounds. Computational techniques can narrow the search space for experimental follow up, but even they become unaffordable when evaluating large numbers of molecules. Therefore, machine learning (ML) strategies are being developed as computationally cheaper complementary techniques for navigating and triaging large chemical libraries. Here, we explore how an active learning protocol can be combined with first-principles based alchemical free energy calculations to identify high affinity phosphodiesterase 2 (PDE2) inhibitors. We first calibrate the procedure using a set of experimentally characterized PDE2 binders. The optimized protocol is then used prospectively on a large chemical library to navigate toward potent inhibitors. In the active learning cycle, at every iteration a small fraction of compounds is probed by alchemical calculations and the obtained affinities are used to train ML models. With successive rounds, high affinity binders are identified by explicitly evaluating only a small subset of compounds in a large chemical library, thus providing an efficient protocol that robustly identifies a large fraction of true positives.
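The active learning cycle described in this abstract (probe a small batch, retrain, select the next batch) can be sketched as a short loop. This is a toy illustration under loud assumptions: compounds are reduced to a single hypothetical numeric descriptor, `toy_oracle` stands in for the expensive alchemical calculation, a nearest-neighbour lookup stands in for the ML models, and selection is purely greedy. None of these choices come from the paper.

```python
import random

def active_learning(library, oracle, n_rounds=5, batch_size=3, seed=0):
    """Label a small batch per round, refit a surrogate, pick the next batch greedily."""
    rng = random.Random(seed)
    labeled = {}                               # descriptor -> computed score
    batch = rng.sample(library, batch_size)    # first round: random picks
    for _ in range(n_rounds):
        for x in batch:
            labeled[x] = oracle(x)             # expensive step (stand-in for an alchemical calculation)
        pool = [x for x in library if x not in labeled]
        if not pool:
            break

        def predict(x):
            # Surrogate: score of the nearest labeled compound (1-NN stand-in for an ML model)
            nearest = min(labeled, key=lambda y: abs(y - x))
            return labeled[nearest]

        # Greedy selection: lowest predicted score, i.e. most favourable predicted affinity
        batch = sorted(pool, key=predict)[:batch_size]
    return labeled

def toy_oracle(x):
    """Hypothetical binding free energy (kcal/mol) as a function of one descriptor."""
    return (x - 0.7) ** 2 - 10.0

library = [i / 50 for i in range(50)]          # 50 hypothetical compounds
results = active_learning(library, toy_oracle)
best = min(results, key=results.get)
print(f"evaluated {len(results)} of {len(library)} compounds; best descriptor {best}")
```

Only `n_rounds * batch_size` compounds are ever passed to the oracle, which is the point of the protocol: the surrogate decides where the expensive calculations are spent.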
Affiliation(s)
- Yuriy Khalak
  - Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, D-37077 Göttingen, Germany
- Gary Tresadern
  - Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., Turnhoutseweg 30, 2340 Beerse, Belgium
- David F Hahn
  - Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., Turnhoutseweg 30, 2340 Beerse, Belgium
- Bert L de Groot
  - Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, D-37077 Göttingen, Germany
- Vytautas Gapsys
  - Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, D-37077 Göttingen, Germany

5. Bhati A, Coveney PV. Large Scale Study of Ligand-Protein Relative Binding Free Energy Calculations: Actionable Predictions from Statistically Robust Protocols. J Chem Theory Comput 2022; 18:2687-2702. [PMID: 35293737 PMCID: PMC9009079 DOI: 10.1021/acs.jctc.1c01288] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/22/2021] [Indexed: 12/28/2022]
Abstract
The accurate and reliable prediction of protein-ligand binding affinities can play a central role in the drug discovery process as well as in personalized medicine. Of considerable importance during lead optimization are the alchemical free energy methods that furnish an estimation of relative binding free energies (RBFE) of similar molecules. Recent advances in these methods have increased their speed, accuracy, and precision. This is evident from the increasing number of retrospective as well as prospective studies employing them. However, such methods still have limited applicability in real-world scenarios due to a number of important yet unresolved issues. Here, we report the findings from a large data set comprising over 500 ligand transformations spanning over 300 ligands binding to a diverse set of 14 different protein targets which furnish statistically robust results on the accuracy, precision, and reproducibility of RBFE calculations. We use ensemble-based methods which are the only way to provide reliable uncertainty quantification given that the underlying molecular dynamics is chaotic. These are implemented using TIES (Thermodynamic Integration with Enhanced Sampling). Results achieve chemical accuracy in all cases. Ensemble simulations also furnish information on the statistical distributions of the free energy calculations which exhibit non-normal behavior. We find that the "enhanced sampling" method known as replica exchange with solute tempering degrades RBFE predictions. We also report definitively on numerous associated alchemical factors including the choice of ligand charge method, flexibility in ligand structure, and the size of the alchemical region including the number of atoms involved in transforming one ligand into another. Our findings provide a key set of recommendations that should be adopted for the reliable application of RBFE methods.
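The ensemble-based uncertainty quantification mentioned in this abstract amounts to running several independent replicas and reporting the spread of their results. The following is a minimal sketch, assuming the uncertainty is summarized as the standard error of the ensemble mean; the replica values and the function name `ensemble_estimate` are hypothetical, and the actual TIES analysis may differ, particularly since the abstract reports non-normal distributions.

```python
from math import sqrt
from statistics import mean, stdev

def ensemble_estimate(replica_values):
    """Return the ensemble mean and its standard error from independent replicas."""
    n = len(replica_values)
    if n < 2:
        raise ValueError("need at least two replicas to estimate an uncertainty")
    return mean(replica_values), stdev(replica_values) / sqrt(n)

# Hypothetical ddG values (kcal/mol) from five independent replicas of one transformation
replicas = [1.0, 1.2, 0.8, 1.1, 0.9]
ddg, sem = ensemble_estimate(replicas)
print(f"ddG = {ddg:.2f} +/- {sem:.2f} kcal/mol")
```

A single replica gives no handle on this spread, which is why the function refuses fewer than two values.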
Affiliation(s)
- Agastya P. Bhati
  - Centre for Computational Science, Department of Chemistry, University College London, London WC1H 0AJ, United Kingdom
- Peter V. Coveney
  - Centre for Computational Science, Department of Chemistry, University College London, London WC1H 0AJ, United Kingdom
  - Informatics Institute, University of Amsterdam, P.O. Box 94323, 1090 GH Amsterdam, Netherlands