1
Distler K, Maschauer S, Neu E, Hübner H, Einsiedel J, Prante O, Gmeiner P. Structure-guided discovery of orexin receptor-binding PET ligands. Bioorg Med Chem 2024; 110:117823. [PMID: 38964170 DOI: 10.1016/j.bmc.2024.117823] [Received: 05/03/2024] [Revised: 06/25/2024] [Accepted: 06/25/2024] [Indexed: 07/06/2024]
Abstract
Molecular imaging using positron emission tomography (PET) can serve as a promising tool for visualizing biological targets in the brain. Insights into the expression pattern and the in vivo imaging of the G protein-coupled orexin receptors OX1R and OX2R will further our understanding of the orexin system and its role in various physiological and pathophysiological processes. Guided by crystal structures of our lead compound JH112 and the approved hypnotic drug suvorexant bound to OX1R and OX2R, respectively, we herein describe the design and synthesis of two novel radioligands, [18F]KD23 and [18F]KD10. Key to the success of our structural modifications was a bioisosteric replacement of the triazole moiety with a fluorophenyl group. The 19F-substituted analog KD23 showed high affinity for the OX1R and selectivity over OX2R, while the high affinity ligand KD10 displayed similar Ki values for both subtypes. Radiolabeling starting from the respective pinacol ester precursors resulted in excellent radiochemical yields of 93% and 88% for [18F]KD23 and [18F]KD10, respectively, within 20 min. The new compounds will be useful in PET studies aimed at subtype-selective imaging of orexin receptors in brain tissue.
Affiliation(s)
- Katharina Distler
  - Department of Chemistry and Pharmacy, Medicinal Chemistry, Friedrich-Alexander-Universität Erlangen-Nürnberg, Nikolaus-Fiebiger-Str. 10, 91058 Erlangen, Germany; FAU NeW - Research Center New Bioactive Compounds, Friedrich-Alexander-Universität Erlangen-Nürnberg, Nikolaus-Fiebiger-Str. 10, 91058 Erlangen, Germany
- Simone Maschauer
  - Department of Nuclear Medicine, Molecular Imaging and Radiochemistry, Friedrich-Alexander-Universität Erlangen-Nürnberg, Kussmaulallee 12, 91054 Erlangen, Germany
- Eduard Neu
  - Department of Chemistry and Pharmacy, Medicinal Chemistry, Friedrich-Alexander-Universität Erlangen-Nürnberg, Nikolaus-Fiebiger-Str. 10, 91058 Erlangen, Germany; FAU NeW - Research Center New Bioactive Compounds, Friedrich-Alexander-Universität Erlangen-Nürnberg, Nikolaus-Fiebiger-Str. 10, 91058 Erlangen, Germany
- Harald Hübner
  - Department of Chemistry and Pharmacy, Medicinal Chemistry, Friedrich-Alexander-Universität Erlangen-Nürnberg, Nikolaus-Fiebiger-Str. 10, 91058 Erlangen, Germany; FAU NeW - Research Center New Bioactive Compounds, Friedrich-Alexander-Universität Erlangen-Nürnberg, Nikolaus-Fiebiger-Str. 10, 91058 Erlangen, Germany
- Jürgen Einsiedel
  - Department of Chemistry and Pharmacy, Medicinal Chemistry, Friedrich-Alexander-Universität Erlangen-Nürnberg, Nikolaus-Fiebiger-Str. 10, 91058 Erlangen, Germany; FAU NeW - Research Center New Bioactive Compounds, Friedrich-Alexander-Universität Erlangen-Nürnberg, Nikolaus-Fiebiger-Str. 10, 91058 Erlangen, Germany
- Olaf Prante
  - FAU NeW - Research Center New Bioactive Compounds, Friedrich-Alexander-Universität Erlangen-Nürnberg, Nikolaus-Fiebiger-Str. 10, 91058 Erlangen, Germany; Department of Nuclear Medicine, Molecular Imaging and Radiochemistry, Friedrich-Alexander-Universität Erlangen-Nürnberg, Kussmaulallee 12, 91054 Erlangen, Germany
- Peter Gmeiner
  - Department of Chemistry and Pharmacy, Medicinal Chemistry, Friedrich-Alexander-Universität Erlangen-Nürnberg, Nikolaus-Fiebiger-Str. 10, 91058 Erlangen, Germany; FAU NeW - Research Center New Bioactive Compounds, Friedrich-Alexander-Universität Erlangen-Nürnberg, Nikolaus-Fiebiger-Str. 10, 91058 Erlangen, Germany

2
Behara PK, Jang H, Horton JT, Gokey T, Dotson DL, Boothroyd S, Bayly CI, Cole DJ, Wang LP, Mobley DL. Benchmarking Quantum Mechanical Levels of Theory for Valence Parametrization in Force Fields. J Phys Chem B 2024. [PMID: 39087913 DOI: 10.1021/acs.jpcb.4c03167] [Indexed: 08/02/2024]
Abstract
A wide range of density functional methods and basis sets are available to derive the electronic structure and properties of molecules. Quantum mechanical calculations are too computationally intensive for routine simulation of molecules in the condensed phase, prompting the development of computationally efficient force fields based on quantum mechanical data. Parametrizing general force fields, which cover a vast chemical space, necessitates the generation of sizable quantum mechanical data sets with optimized geometries and torsion scans. To achieve this efficiently, choosing a quantum mechanical method that balances computational cost and accuracy is crucial. In this study, we seek to assess the accuracy of quantum mechanical theory for specific properties such as conformer energies and torsion energetics. To comprehensively evaluate various methods, we focus on a representative set of 59 diverse small molecules, comparing approximately 25 combinations of functional and basis sets against the reference level coupled cluster calculations at the complete basis set limit.
Affiliation(s)
- Pavan Kumar Behara
  - Center for Neurotherapeutics, University of California, Irvine, California 92697, United States
- Hyesu Jang
  - Chemistry Department, University of California at Davis, Davis, California 95616, United States
  - OpenEye Scientific Software, Santa Fe, New Mexico 87508, United States
- Joshua T Horton
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, U.K.
- Trevor Gokey
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- David L Dotson
  - The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
  - Datryllic LLC, Phoenix, Arizona 85003, United States
- Daniel J Cole
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, U.K.
- Lee-Ping Wang
  - Chemistry Department, University of California at Davis, Davis, California 95616, United States
- David L Mobley
  - Center for Neurotherapeutics, University of California, Irvine, California 92697, United States
  - Department of Chemistry, University of California, Irvine, California 92697, United States

3
Wang L, Behara PK, Thompson MW, Gokey T, Wang Y, Wagner JR, Cole DJ, Gilson MK, Shirts MR, Mobley DL. The Open Force Field Initiative: Open Software and Open Science for Molecular Modeling. J Phys Chem B 2024; 128:7043-7067. [PMID: 38989715 DOI: 10.1021/acs.jpcb.4c01558] [Indexed: 07/12/2024]
Abstract
Force fields are a key component of physics-based molecular modeling, describing the energies and forces in a molecular system as a function of the positions of the atoms and molecules involved. Here, we provide a review and scientific status report on the work of the Open Force Field (OpenFF) Initiative, which focuses on the science, infrastructure and data required to build the next generation of biomolecular force fields. We introduce the OpenFF Initiative and the related OpenFF Consortium, describe its approach to force field development and software, and discuss accomplishments to date as well as future plans. OpenFF releases both software and data under open and permissive licensing agreements to enable rapid application, validation, extension, and modification of its force fields and software tools. We discuss lessons learned to date in this new approach to force field development. We also highlight ways that other force field researchers can get involved, as well as some recent successes of outside researchers taking advantage of OpenFF tools and data.
Affiliation(s)
- Lily Wang
  - Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
- Pavan Kumar Behara
  - Center for Neurotherapeutics, University of California, Irvine, California 92697, United States
- Matthew W Thompson
  - Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
- Trevor Gokey
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Yuanqing Wang
  - Simons Center for Computational Physical Chemistry and Center for Data Science, New York, New York 10004, United States
- Jeffrey R Wagner
  - Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
- Daniel J Cole
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom
- Michael K Gilson
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, The University of California at San Diego, La Jolla, California 92093, United States
- Michael R Shirts
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80305, United States
- David L Mobley
  - Department of Chemistry, University of California, Irvine, California 92697, United States
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States

4
Sun Z, Procacci P. Methodological and force field effects in the molecular dynamics-based prediction of binding free energies of host-guest systems. Phys Chem Chem Phys 2024; 26:19887-19899. [PMID: 38990073 DOI: 10.1039/d4cp01804d] [Indexed: 07/12/2024]
Abstract
As a contribution to the understanding and rationalization of methodological and modeling effects in recent host-guest SAMPL challenges, using an alchemical molecular dynamics technique we have examined the impact of force field parameterization and ionic strength in connection with guest charge neutralization on computed dissociation free energies in two typical SAMPL heavily charged macrocyclic hosts encapsulating small protonated amines with disparate binding affinities. We have shown that the methodological treatment for host neutralization, with explicit ions or with the background neutralizing plasma in the context of alchemical calculations under periodic boundary conditions, has a moderate effect on the calculated affinities. On the other hand, we have shown that seemingly small differences in the force field parameterization in highly symmetric hosts can produce systematic effects on the structural features that can have a significant impact on the predicted binding affinities.
Affiliation(s)
- Zhaoxi Sun
  - Changping Laboratory, Beijing 102206, China
- Piero Procacci
  - Dipartimento di Chimica "Ugo Schiff", Università degli Studi di Firenze, Via della Lastruccia 3, 50019 Sesto Fiorentino, Italy

5
Gilson MK, Stewart LE, Potter MJ, Webb SP. Rapid, Accurate, Ranking of Protein-Ligand Binding Affinities with VM2, the Second-Generation Mining Minima Method. J Chem Theory Comput 2024; 20:6328-6340. [PMID: 38989926 DOI: 10.1021/acs.jctc.4c00407] [Indexed: 07/12/2024]
Abstract
The structure-based technologies most widely used to rank the affinities of candidate small molecule drugs for proteins range from faster but less reliable docking methods to slower but more accurate explicit solvent free energy methods. In recent years, we have advanced another technology, which is called mining minima because it "mines" out the main contributions to the chemical potentials of the free and bound molecular species by identifying and characterizing their main local energy minima. The present study provides systematic benchmarks of the accuracy and computational speed of mining minima, as implemented in the VeraChem Mining Minima Generation 2 (VM2) code, across two well-regarded protein-ligand benchmark data sets, for which there are already benchmark data for docking, free energy, and other computational methods. A core result is that VM2's accuracy approaches that of explicit solvent free energy methods at a far lower computational cost. In finer-grained analyses, we also examine the influence of various run settings, such as the treatment of crystallographic water molecules, on the accuracy, and define the costs in time and dollars of representative runs on Amazon Web Services (AWS) compute instances with various CPU and GPU combinations. We also use the benchmark data to determine the importance of VM2's correction from generalized Born to finite-difference Poisson-Boltzmann results for each energy well and find that this correction affords a remarkably consistent improvement in accuracy at a modest computational cost. The present results establish VM2 as a distinctive technology for early-stage drug discovery, which provides a strong combination of efficiency and predictivity.
Affiliation(s)
- Michael K Gilson
  - VeraChem LLC, 12850 Middlebrook Rd, Ste 205, Germantown, Maryland 20874, United States
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9255 Pharmacy Lane, La Jolla, California 92093, United States
- Lawrence E Stewart
  - VeraChem LLC, 12850 Middlebrook Rd, Ste 205, Germantown, Maryland 20874, United States
- Michael J Potter
  - VeraChem LLC, 12850 Middlebrook Rd, Ste 205, Germantown, Maryland 20874, United States
- Simon P Webb
  - VeraChem LLC, 12850 Middlebrook Rd, Ste 205, Germantown, Maryland 20874, United States

6
Karwounopoulos J, Wu Z, Tkaczyk S, Wang S, Baskerville A, Ranasinghe K, Langer T, Wood GPF, Wieder M, Boresch S. Insights and Challenges in Correcting Force Field Based Solvation Free Energies Using a Neural Network Potential. J Phys Chem B 2024; 128:6693-6703. [PMID: 38976601 PMCID: PMC11264272 DOI: 10.1021/acs.jpcb.4c01417] [Received: 03/04/2024] [Revised: 05/31/2024] [Accepted: 06/14/2024] [Indexed: 07/10/2024]
Abstract
We present a comprehensive study investigating the potential gain in accuracy for calculating absolute solvation free energies (ASFE) using a neural network potential to describe the intramolecular energy of the solute. We calculated the ASFE for most compounds from the FreeSolv database using the Open Force Field (OpenFF) and compared them to earlier results obtained with the CHARMM General Force Field (CGenFF). By applying a nonequilibrium (NEQ) switching approach between the molecular mechanics (MM) description (either OpenFF or CGenFF) and the neural net potential (NNP)/MM level of theory (using ANI-2x as the NNP potential), we attempted to improve the accuracy of the calculated ASFEs. The predictive performance of the results did not change when this approach was applied to all 589 small molecules in the FreeSolv database that ANI-2x can describe. When selecting a subset of 156 molecules, focusing on compounds where the force fields performed poorly, we saw a slight improvement in the root-mean-square error (RMSE) and mean absolute error (MAE). The majority of our calculations utilized unidirectional NEQ protocols based on Jarzynski's equation. Additionally, we conducted bidirectional NEQ switching for a subset of 156 solutes. Notably, only a small fraction (10 out of 156) exhibited statistically significant discrepancies between unidirectional and bidirectional NEQ switching free energy estimates.
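The unidirectional estimates mentioned in this abstract rest on Jarzynski's equation, ΔG = -kT ln⟨exp(-W/kT)⟩, where W is the work accumulated in each nonequilibrium switch. The following is a minimal sketch of that exponential average, not the authors' implementation. It assumes work values in kcal/mol and a temperature of 300 K; the function name `jarzynski_free_energy` and the sample work values are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def jarzynski_free_energy(work_values, temperature=300.0):
    """Estimate a free energy difference from nonequilibrium work values.

    Evaluates dG = -kT * ln(mean(exp(-W / kT))), shifting by the largest
    exponent first (log-sum-exp) so large work values do not underflow.
    """
    if not work_values:
        raise ValueError("at least one work value is required")
    kt = K_B * temperature
    reduced = [-w / kt for w in work_values]
    shift = max(reduced)
    log_mean = shift + math.log(
        sum(math.exp(r - shift) for r in reduced) / len(reduced)
    )
    return -kt * log_mean


# Hypothetical work values (kcal/mol) from five switching trajectories.
works = [1.2, 0.8, 1.5, 0.9, 1.1]
estimate = jarzynski_free_energy(works)
```

Because the exponential average is dominated by the lowest work values, the estimate never exceeds the mean work and can converge slowly when the work distribution is broad, which is one reason a comparison against bidirectional switching is informative.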
Affiliation(s)
- Johannes Karwounopoulos
  - Faculty of Chemistry, Institute of Computational Biological Chemistry, University Vienna, Währingerstr. 17, 1090 Vienna, Austria
  - Vienna Doctoral School of Chemistry (DoSChem), University of Vienna, Währingerstr. 42, 1090 Vienna, Austria
- Zhiyi Wu
  - Exscientia plc, Schroedinger Building, Oxford OX4 4GE, United Kingdom
- Sara Tkaczyk
  - Department of Pharmaceutical Sciences, Pharmaceutical Chemistry Division, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
  - Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Shuzhe Wang
  - Exscientia plc, Schroedinger Building, Oxford OX4 4GE, United Kingdom
- Adam Baskerville
  - Exscientia plc, Schroedinger Building, Oxford OX4 4GE, United Kingdom
- Thierry Langer
  - Department of Pharmaceutical Sciences, Pharmaceutical Chemistry Division, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Marcus Wieder
  - Exscientia plc, Schroedinger Building, Oxford OX4 4GE, United Kingdom
  - Open Molecular Software Foundation, Davis, California 95616, United States
- Stefan Boresch
  - Faculty of Chemistry, Institute of Computational Biological Chemistry, University Vienna, Währingerstr. 17, 1090 Vienna, Austria

7
Katzberger P, Riniker S. A general graph neural network based implicit solvation model for organic molecules in water. Chem Sci 2024; 15:10794-10802. [PMID: 39027274 PMCID: PMC11253111 DOI: 10.1039/d4sc02432j] [Received: 04/12/2024] [Accepted: 05/24/2024] [Indexed: 07/20/2024]
Abstract
The dynamical behavior of small molecules in their environment can be studied with classical molecular dynamics (MD) simulations to gain deeper insight on an atomic level and thus complement and rationalize the interpretation of experimental findings. Such approaches are of great value in various areas of research, e.g., in the development of new therapeutics. The accurate description of solvation effects in such simulations is thereby key and has in consequence been an active field of research since the introduction of MD. So far, the most accurate approaches involve computationally expensive explicit solvent simulations, while widely applied models using an implicit solvent description suffer from reduced accuracy. Recently, machine learning (ML) approaches that provide a probabilistic representation of solvation effects have been proposed as potential alternatives. However, the associated computational costs and minimal or absent transferability render them unusable in practice. Here, we report the first example of a transferable ML-based implicit solvent model trained on a diverse set of 3 000 000 molecular structures that can be applied to organic small molecules for simulations in water. Extensive testing against reference calculations demonstrated that the model delivers accuracy on par with explicit solvent simulations while providing an up to 18-fold increase in sampling rate.
Affiliation(s)
- Paul Katzberger
  - Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Sereina Riniker
  - Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland

8
Panei FP, Gkeka P, Bonomi M. Identifying small-molecules binding sites in RNA conformational ensembles with SHAMAN. Nat Commun 2024; 15:5725. [PMID: 38977675 PMCID: PMC11231146 DOI: 10.1038/s41467-024-49638-7] [Received: 08/12/2023] [Accepted: 06/05/2024] [Indexed: 07/10/2024]
Abstract
The rational targeting of RNA with small molecules is hampered by our still limited understanding of RNA structural and dynamic properties. Most in silico tools for binding site identification rely on static structures and therefore cannot face the challenges posed by the dynamic nature of RNA molecules. Here, we present SHAMAN, a computational technique to identify potential small-molecule binding sites in RNA structural ensembles. SHAMAN enables exploring the conformational landscape of RNA with atomistic molecular dynamics simulations and at the same time identifying RNA pockets in an efficient way with the aid of probes and enhanced-sampling techniques. In our benchmark composed of large, structured riboswitches as well as small, flexible viral RNAs, SHAMAN successfully identifies all the experimentally resolved pockets and ranks them among the most favorite probe hotspots. Overall, SHAMAN sets a solid foundation for future drug design efforts targeting RNA with small molecules, effectively addressing the long-standing challenges in the field.
Affiliation(s)
- F P Panei
  - Integrated Drug Discovery, Molecular Design Sciences, Sanofi, Vitry-sur-Seine, France
  - Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Computational Structural Biology Unit, Paris, France
  - Sorbonne Université, Ecole Doctorale Complexité du Vivant, Paris, France
- P Gkeka
  - Integrated Drug Discovery, Molecular Design Sciences, Sanofi, Vitry-sur-Seine, France
- M Bonomi
  - Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Computational Structural Biology Unit, Paris, France

9
Hahn DF, Gapsys V, de Groot BL, Mobley DL, Tresadern G. Current State of Open Source Force Fields in Protein-Ligand Binding Affinity Predictions. J Chem Inf Model 2024; 64:5063-5076. [PMID: 38895959 PMCID: PMC11234369 DOI: 10.1021/acs.jcim.4c00417] [Received: 03/10/2024] [Revised: 04/23/2024] [Accepted: 04/25/2024] [Indexed: 06/21/2024]
Abstract
In drug discovery, the in silico prediction of binding affinity is one of the major means to prioritize compounds for synthesis. Alchemical relative binding free energy (RBFE) calculations based on molecular dynamics (MD) simulations are nowadays a popular approach for the accurate affinity ranking of compounds. MD simulations rely on empirical force field parameters, which strongly influence the accuracy of the predicted affinities. Here, we evaluate the ability of six different small-molecule force fields to predict experimental protein-ligand binding affinities in RBFE calculations on a set of 598 ligands and 22 protein targets. The public force fields OpenFF Parsley and Sage, GAFF, and CGenFF show comparable accuracy, while OPLS3e is significantly more accurate. However, a consensus approach using Sage, GAFF, and CGenFF leads to accuracy comparable to OPLS3e. While Parsley and Sage are performing comparably based on aggregated statistics across the whole dataset, there are differences in terms of outliers. Analysis of the force field reveals that improved parameters lead to significant improvement in the accuracy of affinity predictions on subsets of the dataset involving those parameters. Lower accuracy can not only be attributed to the force field parameters but is also dependent on input preparation and sampling convergence of the calculations. Especially large perturbations and nonconverged simulations lead to less accurate predictions. The input structures, Gromacs force field files, as well as the analysis Python notebooks are available on GitHub.
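The consensus idea in this abstract can be illustrated with a small sketch. The abstract does not say how the Sage, GAFF, and CGenFF predictions are combined, so the version below assumes a plain arithmetic mean per ligand and scores it by root-mean-square error against experiment. All function names and numbers are hypothetical and are not taken from the paper or its GitHub repository.

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def consensus(predictions_by_force_field):
    """Average per-ligand predictions across force fields (assumed: plain mean)."""
    series = list(predictions_by_force_field.values())
    if not series or any(len(s) != len(series[0]) for s in series):
        raise ValueError("every force field needs the same number of ligands")
    return [sum(column) / len(column) for column in zip(*series)]


# Hypothetical binding free energies (kcal/mol) for three ligands.
predictions = {
    "Sage": [-8.1, -9.4, -7.0],
    "GAFF": [-7.5, -9.9, -6.2],
    "CGenFF": [-8.6, -8.8, -7.4],
}
experiment = [-8.0, -9.5, -6.8]

combined = consensus(predictions)
combined_error = rmse(combined, experiment)
```

Averaging tends to cancel errors that are uncorrelated between force fields, which is one plausible reading of why a consensus can approach the accuracy of a single stronger parameter set.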
Affiliation(s)
- David F. Hahn
  - Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse 2340, Belgium
- Vytautas Gapsys
  - Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse 2340, Belgium
  - Computational Biomolecular Dynamics Group, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, Göttingen 37077, Germany
- Bert L. de Groot
  - Computational Biomolecular Dynamics Group, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, Göttingen 37077, Germany
- David L. Mobley
  - Department of Chemistry, University of California, Irvine, California 92697, United States
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
- Gary Tresadern
  - Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse 2340, Belgium

10
Wehrhan L, Keller BG. Fluorinated Protein-Ligand Complexes: A Computational Perspective. J Phys Chem B 2024; 128:5925-5934. [PMID: 38886167 PMCID: PMC11215785 DOI: 10.1021/acs.jpcb.4c01493] [Received: 03/06/2024] [Revised: 05/28/2024] [Accepted: 05/30/2024] [Indexed: 06/20/2024]
Abstract
Fluorine is an element renowned for its unique properties. Its powerful capability to modulate molecular properties makes it an attractive substituent for protein binding ligands; however, the rational design of fluorination can be challenging with effects on interactions and binding energies being difficult to predict. In this Perspective, we highlight how computational methods help us to understand the role of fluorine in protein-ligand binding with a focus on molecular simulation. We underline the importance of an accurate force field, present fluoride channels as a showcase for biomolecular interactions with fluorine, and discuss fluorine specific interactions like the ability to form hydrogen bonds and interactions with aryl groups. We put special emphasis on the disruption of water networks and entropic effects.
Affiliation(s)
- Leon Wehrhan
  - Department of Chemistry, Biology and Pharmacy, Freie Universität Berlin, Arnimallee 22, 14195 Berlin, Germany
- Bettina G. Keller
  - Department of Chemistry, Biology and Pharmacy, Freie Universität Berlin, Arnimallee 22, 14195 Berlin, Germany

11
Newstead S, Parker J, Deme J, Lichtinger S, Kuteyi G, Biggin P, Lea S. Structural basis for antibiotic transport and inhibition in PepT2, the mammalian proton-coupled peptide transporter. Research Square 2024:rs.3.rs-4435259. [PMID: 38903084 PMCID: PMC11188089 DOI: 10.21203/rs.3.rs-4435259/v1] [Indexed: 06/22/2024]
Abstract
The uptake and elimination of beta-lactam antibiotics in the human body are facilitated by the proton-coupled peptide transporters PepT1 (SLC15A1) and PepT2 (SLC15A2). The mechanism by which SLC15 family transporters recognize and discriminate between different drug classes and dietary peptides remains unclear, hampering efforts to improve antibiotic pharmacokinetics through targeted drug design and delivery. Here, we present cryo-EM structures of the mammalian proton-coupled peptide transporter, PepT2, in complex with the widely used beta-lactam antibiotics cefadroxil, amoxicillin and cloxacillin. Our structures, combined with pharmacophore mapping, molecular dynamics simulations and biochemical assays, establish the mechanism of antibiotic recognition and the important role of protonation in drug binding and transport.
Affiliation(s)
- Justin Deme
  - National Cancer Institute, National Institutes of Health
- Susan Lea
  - Center for Structural Biology, Center for Cancer Research, National Cancer Institute

12
Zhu J, Gu Z, Pei J, Lai L. DiffBindFR: an SE(3) equivariant network for flexible protein-ligand docking. Chem Sci 2024; 15:7926-7942. [PMID: 38817560 PMCID: PMC11134415 DOI: 10.1039/d3sc06803j] [Received: 12/18/2023] [Accepted: 04/07/2024] [Indexed: 06/01/2024]
Abstract
Molecular docking, a key technique in structure-based drug design, plays pivotal roles in protein-ligand interaction modeling, hit identification and optimization, in which accurate prediction of the protein-ligand binding mode is essential. Conventional docking approaches perform well in redocking tasks with a known protein binding pocket conformation in the complex state. However, in real-world docking scenarios where the protein binding conformation for a new ligand is unknown, accurately modeling the binding complex structure remains challenging as flexible docking is computationally expensive and inaccurate. Typical deep learning-based docking methods do not explicitly consider protein side chain conformations and fail to ensure physical plausibility and detailed atomic interactions. In this study, we present DiffBindFR, a full-atom diffusion-based flexible docking model that operates over the product space of ligand overall movements and flexibility and pocket side chain torsion changes. We show that DiffBindFR has higher accuracy in producing native-like binding structures with physically plausible and detailed interactions than available docking methods. Furthermore, on Apo and AlphaFold2 modeled structures, DiffBindFR shows clear advantages in predicting accurate ligand binding poses and protein binding conformations, making it suitable for Apo and AlphaFold2 structure-based drug design. DiffBindFR provides a powerful flexible docking tool for modeling accurate protein-ligand binding structures.
Affiliation(s)
- Jintao Zhu
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Zhonghui Gu
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Jianfeng Pei
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Luhua Lai
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Peking University Chengdu Academy for Advanced Interdisciplinary Biotechnologies, Chengdu, Sichuan, China

13
Champion C, Hünenberger PH, Riniker S. Multistate Method to Efficiently Account for Tautomerism and Protonation in Alchemical Free-Energy Calculations. J Chem Theory Comput 2024; 20:4350-4362. [PMID: 38742760 PMCID: PMC11137823 DOI: 10.1021/acs.jctc.4c00370] [Received: 03/22/2024] [Revised: 04/25/2024] [Accepted: 04/29/2024] [Indexed: 05/16/2024]
Abstract
The majority of drug-like molecules contain at least one ionizable group, and many common drug scaffolds are subject to tautomeric equilibria. Thus, these compounds are found in a mixture of protonation and/or tautomeric states at physiological pH. Intrinsically, standard classical molecular dynamics (MD) simulations cannot describe such equilibria between states, which negatively impacts the prediction of key molecular properties in silico. Following the formalism described by de Oliveira and co-workers (J. Chem. Theory Comput. 2019, 15, 424-435) to consider the influence of all states on the binding process based on alchemical free-energy calculations, we demonstrate in this work that the multistate method replica-exchange enveloping distribution sampling (RE-EDS) is well suited to describe molecules with multiple protonation and/or tautomeric states in a single simulation. We apply our methodology to a series of eight inhibitors of factor Xa with two protonation states and a series of eight inhibitors of glycogen synthase kinase 3β (GSK3β) with two tautomeric states. In particular, we show that given a sufficient phase-space overlap between the states, RE-EDS is computationally more efficient than standard pairwise free-energy methods.
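To make concrete what "considering the influence of all states on the binding process" can mean, the sketch below combines per-state binding free energies into a single effective value using the standard population-weighted Boltzmann average, ΔG_eff = -kT ln Σ_i p_i exp(-ΔG_i/kT), where p_i is the fraction of unbound ligand in state i. This is a generic thermodynamic relation offered as an illustration; it is an assumption that it matches the formalism cited in the abstract, and the function name and numbers are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def effective_binding_free_energy(state_dg, populations, temperature=298.15):
    """Combine per-state binding free energies into one effective value.

    state_dg:    binding free energy of each protonation/tautomeric state (kcal/mol)
    populations: fraction of unbound ligand in each state; must sum to 1
    """
    if len(state_dg) != len(populations) or not state_dg:
        raise ValueError("need one population per state")
    if abs(sum(populations) - 1.0) > 1e-6:
        raise ValueError("populations must sum to 1")
    kt = K_B * temperature
    # Log-sum-exp over states with nonzero population, for numerical stability.
    terms = [math.log(p) - dg / kt for dg, p in zip(state_dg, populations) if p > 0]
    shift = max(terms)
    return -kt * (shift + math.log(sum(math.exp(t - shift) for t in terms)))


# Hypothetical two-state ligand: the minor state (10 %) binds more tightly.
dg_eff = effective_binding_free_energy([-8.0, -10.5], [0.9, 0.1])
```

In this picture a sparsely populated state can still dominate the observed affinity if it binds much more strongly, which is why treating only the most populated solution state can mislead.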
Affiliation(s)
- Candide Champion
- Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| | - Philippe H. Hünenberger
- Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| | - Sereina Riniker
- Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| |
|
14
|
Chakravorty A, Hussain A, Cervantes LF, Lai TT, Brooks CL. Exploring the Limits of the Generalized CHARMM and AMBER Force Fields through Predictions of Hydration Free Energy of Small Molecules. J Chem Inf Model 2024; 64:4089-4101. [PMID: 38717640 PMCID: PMC11275216 DOI: 10.1021/acs.jcim.4c00126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/28/2024]
Abstract
Accurate force field parameters, potential energy functions, and receptor-ligand models are essential for modeling the solvation and binding of drug-like molecules to a receptor. A large and ever-growing chemical space of medicinally relevant scaffolds has also required these factors, especially force field parameters, to be highly transferable. Generalized force fields such as the CHARMM General Force Field (CGenFF) and the generalized AMBER force field (GAFF) have accomplished this feat along with other contemporaneous ones like OPLS. Here, we analyze the limits in the parametrization of drug-like small molecules by CGenFF and GAFF in terms of the various functional groups represented within them. Specifically, we link the presence of specific functional groups to the error in the absolute hydration free energy of over 600 small molecules, predicted by alchemical free energy methods implemented in the CHARMM program. Our investigation reveals that molecules with (i) a nitro group in CGenFF and GAFF are, respectively, over- or undersolubilized in aqueous medium, (ii) amine groups are undersolubilized more so in CGenFF than in GAFF, and (iii) carboxyl groups are more oversolubilized in GAFF than in CGenFF. We present our analyses of the potential factors underlying these trends. We also showcase the use of a machine-learning-based approach combined with the SHapley Additive exPlanations framework to attribute these trends to specific functional groups, which can be easily adopted to explore the limits of other general force fields.
Affiliation(s)
- Arghya Chakravorty
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Azam Hussain
- Department of Macromolecular Science and Engineering, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Luis F Cervantes
- Department of Medicinal Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Thanh T Lai
- Biophysics Program, Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Charles L Brooks
- Department of Medicinal Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Biophysics Program, Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
| |
|
15
|
Li J, Guan X, Zhang O, Sun K, Wang Y, Bagni D, Head-Gordon T. Leak Proof PDBBind: A Reorganized Dataset of Protein-Ligand Complexes for More Generalizable Binding Affinity Prediction. ARXIV 2024:arXiv:2308.09639v2. [PMID: 37645037 PMCID: PMC10462179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]
Abstract
Many physics-based and machine-learned scoring functions (SFs) used to predict protein-ligand binding free energies have been trained on the PDBBind dataset. However, it is controversial as to whether new SFs are actually improving since the general, refined, and core datasets of PDBBind are cross-contaminated with proteins and ligands with high similarity, and hence they may not perform comparably well in binding prediction of new protein-ligand complexes. In this work we have carefully prepared a cleaned PDBBind data set of non-covalent binders that are split into training, validation, and test datasets to control for data leakage, defined as proteins and ligands with high sequence and structural similarity. The resulting leak-proof (LP)-PDBBind data is used to retrain four popular SFs: AutoDock Vina, Random Forest (RF)-Score, InteractionGraphNet (IGN), and DeepDTA, to better test their capabilities when applied to new protein-ligand complexes. In particular we have formulated a new independent data set, BDB2020+, by matching high quality binding free energies from BindingDB with co-crystalized ligand-protein complexes from the PDB that have been deposited since 2020. Based on all the benchmark results, the retrained models using LP-PDBBind consistently perform better, with IGN especially being recommended for scoring and ranking applications for new protein-ligand systems.
|
16
|
Ries B, Alibay I, Swenson DWH, Baumann HM, Henry MM, Eastwood JRB, Gowers RJ. Kartograf: A Geometrically Accurate Atom Mapper for Hybrid-Topology Relative Free Energy Calculations. J Chem Theory Comput 2024; 20:1862-1877. [PMID: 38330251 PMCID: PMC10941767 DOI: 10.1021/acs.jctc.3c01206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 01/17/2024] [Accepted: 01/18/2024] [Indexed: 02/10/2024]
Abstract
Relative binding free energy (RBFE) calculations have emerged as a powerful tool that supports ligand optimization in drug discovery. Despite many successes, the use of RBFEs can often be limited by automation problems, in particular, the setup of such calculations. Atom mapping algorithms are an essential component in setting up automatic large-scale hybrid-topology RBFE calculation campaigns. Traditional algorithms typically employ a 2D subgraph isomorphism solver (SIS) in order to estimate the maximum common substructure. SIS-based approaches can be limited by time-intensive operations and issues with capturing geometry-linked chemical properties, potentially leading to suboptimal solutions. To overcome these limitations, we have developed Kartograf, a geometric-graph-based algorithm that uses primarily the 3D coordinates of atoms to find a mapping between two ligands. In free energy approaches, the ligand conformations are usually derived from docking or other previous modeling approaches, giving the coordinates a certain importance. By considering the spatial relationships between atoms related to the molecule coordinates, our algorithm bypasses the computationally complex subgraph matching of SIS-based approaches and reduces the problem to a much simpler bipartite graph matching problem. Moreover, Kartograf effectively circumvents typical mapping issues induced by molecule symmetry and stereoisomerism, making it a more robust approach for atom mapping from a geometric perspective. To validate our method, we calculated mappings with our novel approach using a diverse set of small molecules and used the mappings in relative hydration and binding free energy calculations. The comparison with two SIS-based algorithms showed that Kartograf offers a fast alternative approach. The code for Kartograf is freely available on GitHub (https://github.com/OpenFreeEnergy/kartograf). 
While developed for the OpenFE ecosystem, Kartograf can also be utilized as a standalone Python package.
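The idea of reducing atom mapping to a bipartite matching over 3D coordinates can be illustrated with a minimal Python sketch. This is not Kartograf's implementation or API: the function name `geometric_atom_mapping`, the `max_dist` cutoff value, and the toy coordinates are hypothetical, and the optimal assignment is found by exhaustive search, which is only practical for a handful of atoms.

```python
import itertools
import math


def geometric_atom_mapping(coords_a, coords_b, max_dist=0.95):
    """Map atom indices of ligand A to ligand B by minimizing summed 3D distance.

    Illustrative only: exhaustive search over one-to-one assignments.
    max_dist (in Å) is an assumed cutoff; pairs farther apart are left unmapped.
    """
    # Iterate over the smaller ligand so every one of its atoms gets a distinct partner
    swapped = len(coords_a) > len(coords_b)
    small, large = (coords_b, coords_a) if swapped else (coords_a, coords_b)

    best_cost, best_perm = math.inf, ()
    for perm in itertools.permutations(range(len(large)), len(small)):
        cost = sum(math.dist(small[i], large[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm

    # Keep only pairs that are spatially close, and report them as A -> B
    mapping = {}
    for i, j in enumerate(best_perm):
        if math.dist(small[i], large[j]) <= max_dist:
            a, b = (j, i) if swapped else (i, j)
            mapping[a] = b
    return mapping


# Toy coordinates (Å), invented for illustration; ligand B has one extra atom
ligand_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
ligand_b = [(3.1, 0.1, 0.0), (0.1, 0.0, 0.0), (1.4, 0.1, 0.0), (9.0, 9.0, 9.0)]
mapping = geometric_atom_mapping(ligand_a, ligand_b)
print(mapping)
```

Because only coordinates enter the cost, atom order and 2D symmetry play no role in this sketch; a production mapper would replace the exhaustive search with a polynomial-time assignment solver and add chemistry-aware filters.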
Affiliation(s)
- Benjamin Ries
- Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co KG, Birkendorfer Str 65, 88397 Biberach an der Riss, Germany
- Open Free Energy, Open Molecular Software Foundation, Davis, 95616 California, United States
| | - Irfan Alibay
- Open Free Energy, Open Molecular Software Foundation, Davis, 95616 California, United States
| | - David W. H. Swenson
- Open Free Energy, Open Molecular Software Foundation, Davis, 95616 California, United States
| | - Hannah M. Baumann
- Open Free Energy, Open Molecular Software Foundation, Davis, 95616 California, United States
| | - Michael M. Henry
- Open Free Energy, Open Molecular Software Foundation, Davis, 95616 California, United States
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, 1275 New York, United States
| | - James R. B. Eastwood
- Open Free Energy, Open Molecular Software Foundation, Davis, 95616 California, United States
| | - Richard J. Gowers
- Open Free Energy, Open Molecular Software Foundation, Davis, 95616 California, United States
| |
|
17
|
Draper MR, Waterman A, Dannatt JE, Patel P. Integrating multiscale and machine learning approaches towards the SAMPL9 log P challenge. Phys Chem Chem Phys 2024; 26:7907-7919. [PMID: 38376855 PMCID: PMC10938873 DOI: 10.1039/d3cp04140a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2024]
Abstract
The partition coefficient (log P) is an important physicochemical property that provides information regarding a molecule's pharmacokinetics, toxicity, and bioavailability. Methods to accurately predict the partition coefficient have the potential to accelerate drug design. In an effort to test current methods and explore new computational techniques, the statistical assessment of the modeling of proteins and ligands (SAMPL) has established a blind prediction challenge. The ninth iteration challenge was to predict the toluene-water partition coefficient (log Ptol/w) of sixteen drug molecules. Herein, three approaches are reported broadly under the categories of quantum mechanics (QM), molecular mechanics (MM), and data-driven machine learning (ML). The three blind submissions yield mean unsigned errors (MUE) ranging from 1.53-2.93 log Ptol/w units. The MUEs were reduced to 1.00 log Ptol/w for the QM methods. While MM and ML methods outperformed DFT approaches for challenge molecules with fewer rotational degrees of freedom, they suffered for the larger molecules in this dataset. Overall, DFT functionals paired with a triple-ζ basis set were the simplest and most effective tool to obtain quantitatively accurate partition coefficients.
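For physics-based approaches of the kind mentioned above, a partition coefficient is commonly obtained from the difference between solvation free energies in the two phases, and predictions are scored with a mean unsigned error. The sketch below shows that standard thermodynamic relation in Python; the function names and all numerical inputs are hypothetical and are not taken from the paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def log_p_tol_w(dg_solv_water, dg_solv_toluene, temperature=298.15):
    """Toluene-water log P from solvation free energies given in kcal/mol.

    log P = (dG_solv,water - dG_solv,toluene) / (R * T * ln 10):
    a solute that is better solvated in toluene gets a positive log P.
    """
    return (dg_solv_water - dg_solv_toluene) / (R_KCAL * temperature * math.log(10))


def mean_unsigned_error(predicted, reference):
    """Average absolute deviation between predicted and reference values."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Invented solvation free energies (kcal/mol) for a single solute
example_log_p = log_p_tol_w(dg_solv_water=-5.0, dg_solv_toluene=-7.0)

# Invented predicted vs. reference log P values for two solutes
example_mue = mean_unsigned_error([1.0, 2.0], [2.0, 0.5])

print(round(example_log_p, 2), example_mue)
```

At 298.15 K the conversion factor R·T·ln 10 is about 1.36 kcal/mol, so an error of roughly 1.4 kcal/mol in the transfer free energy corresponds to one log P unit.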
Affiliation(s)
- Michael R Draper
- Chemistry Department, University of Dallas, Irving, Texas, 75062, USA.
| | - Asa Waterman
- Chemistry Department, University of Dallas, Irving, Texas, 75062, USA.
| | | | - Prajay Patel
- Chemistry Department, University of Dallas, Irving, Texas, 75062, USA.
| |
|
18
|
Davel CM, Bernat T, Wagner JR, Shirts MR. Parameterization of General Organic Polymers within the Open Force Field Framework. J Chem Inf Model 2024; 64:1290-1305. [PMID: 38303159 PMCID: PMC11090695 DOI: 10.1021/acs.jcim.3c01691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2024]
Abstract
Polymer and chemically modified biopolymer systems present unique challenges to traditional molecular simulation preparation workflows. First, typical polymer and biomolecular input formats, such as Protein Data Bank (PDB) files, lack adequate chemical information needed for the parameterization of new chemistries. Second, polymers are typically too large for accurate partial charge generation methods. In this work, we employ direct chemical perception through the Open Force Field toolkit to create a flexible polymer simulation workflow for organic polymers, encompassing everything from biopolymers to soft materials. We propose and test a new input specification for monomer information that can, along with a 3D conformational geometry, parametrize and simulate most soft-material systems within the same workflow used for smaller ligands. The monomer format encompasses a subset of the SMIRKS substructure query language to uniquely identify chemical information and repeating charges in underspecified systems through matching atomic connectivity. This workflow is combined with several different approaches for automatic partial-charge generation for larger systems. As an initial proof of concept, a variety of diverse polymeric systems were parametrized with the Open Force Field toolkit, including functionalized proteins, DNA, homopolymers, cross-linked systems, and sugars. Additionally, shape properties and radial distribution functions were computed from molecular dynamics simulations of poly(ethylene glycol), polyacrylamide, and poly(N-isopropylacrylamide) homopolymers in aqueous solution and compared to previous simulation results in order to demonstrate a start-to-finish workflow for simulation and property prediction. We expect that these tools will greatly expedite the day-to-day computational research of soft-matter simulations and create a robust atomic-scale polymer specification in conjunction with existing polymer structural notations.
Affiliation(s)
- Connor M Davel
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80309, United States
| | - Timotej Bernat
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80309, United States
| | - Jeffrey R Wagner
- The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
| | - Michael R Shirts
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80309, United States
| |
|
19
|
Wang L, Schauperl M, Mobley DL, Bayly C, Gilson MK. A Fast, Convenient, Polarizable Electrostatic Model for Molecular Dynamics. J Chem Theory Comput 2024; 20:1293-1305. [PMID: 38240687 PMCID: PMC10867846 DOI: 10.1021/acs.jctc.3c01171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2024]
Abstract
We present an efficient polarizable electrostatic model, utilizing typed, atom-centered polarizabilities and the fast direct approximation, designed for efficient use in molecular dynamics (MD) simulations. The model provides two convenient approaches for assigning partial charges in the context of atomic polarizabilities. One is a generalization of RESP, called RESP-dPol, and the other, AM1-BCC-dPol, is an adaptation of the widely used AM1-BCC method. Both are designed to accurately replicate gas-phase quantum mechanical electrostatic potentials. Benchmarks of this polarizable electrostatic model against gas-phase dipole moments, molecular polarizabilities, bulk liquid densities, and static dielectric constants of organic liquids show good agreement with the reference values. Of note, the model yields markedly more accurate dielectric constants of organic liquids, relative to a matched nonpolarizable force field. MD simulations with this method, which is currently parametrized for molecules containing elements C, N, O, and H, run only about 3.6-fold slower than fixed charge force fields, while simulations with the self-consistent mutual polarization average 4.5-fold slower. Our results suggest that RESP-dPol and AM1-BCC-dPol afford improved accuracy relative to fixed charge force fields and are good starting points for developing general, affordable, and transferable polarizable force fields. The software implementing these approaches has been designed to utilize the force field fitting frameworks developed and maintained by the Open Force Field Initiative, setting the stage for further exploration of this approach to polarizable force field development.
Affiliation(s)
- Liangyue Wang
- Department of Chemistry and Biochemistry, University of California, San Diego, California 92093, United States
| | - Michael Schauperl
- HotSpot Therapeutics, Inc., Boston, Massachusetts 02210, United States
| | - David L. Mobley
- Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
| | - Christopher Bayly
- OpenEye Scientific, Cadence Molecular Sciences, Santa Fe, New Mexico 87508, United States
| | - Michael K. Gilson
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, California 92093, United States
| |
|
20
|
Xue B, Yang Q, Zhang Q, Wan X, Fang D, Lin X, Sun G, Gobbo G, Cao F, Mathiowetz AM, Burke BJ, Kumpf RA, Rai BK, Wood GPF, Pickard FC, Wang J, Zhang P, Ma J, Jiang YA, Wen S, Hou X, Zou J, Yang M. Development and Comprehensive Benchmark of a High-Quality AMBER-Consistent Small Molecule Force Field with Broad Chemical Space Coverage for Molecular Modeling and Free Energy Calculation. J Chem Theory Comput 2024; 20:799-818. [PMID: 38157475 DOI: 10.1021/acs.jctc.3c00920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2024]
Abstract
Biomolecular simulations have become an essential tool in contemporary drug discovery, and molecular mechanics force fields (FFs) constitute its cornerstone. Developing a high quality and broad coverage general FF is a significant undertaking that requires substantial expert knowledge and computing resources, which is beyond the scope of general practitioners. Existing FFs originate from only a limited number of groups and organizations, and they either suffer from limited numbers of training sets, lower than desired quality because of oversimplified representations, or are costly for the molecular modeling community to access. To address these issues, in this work, we developed an AMBER-consistent small molecule FF with extensive chemical space coverage, and we provide Open Access parameters for the entire modeling community. To validate our FF, we carried out benchmarks of quantum mechanics (QM)/molecular mechanics conformer comparison and free energy perturbation calculations on several benchmark data sets. Our FF achieves a higher level of performance at reproducing QM energies and geometries than two popular open-source FFs, OpenFF2 and GAFF2. In relative binding free energy calculations for 31 protein-ligand data sets, comprising 1079 pairs of ligands, the new FF achieves an overall root-mean-square error of 1.19 kcal/mol for ΔΔG and 0.92 kcal/mol for ΔG on a subset of 463 ligands without bespoke fitting to the data sets. The results are on par with those of the leading commercial series of OPLS FFs.
Affiliation(s)
- Bai Xue
- Shenzhen Jingtai Technology Co., Ltd. (XtalPi), Floor 3, Sf Industrial Plant, No. 2 Hongliu Road, Fubao Community, Fubao Street, Futian District, Shenzhen 518045, China
| | - Qingyi Yang
- Medicine Design, Pfizer Inc., 1 Portland Street, Cambridge, Massachusetts 02139, United States
| | - Qiaochu Zhang
- Shenzhen Jingtai Technology Co., Ltd. (XtalPi), Floor 3, Sf Industrial Plant, No. 2 Hongliu Road, Fubao Community, Fubao Street, Futian District, Shenzhen 518045, China
| | - Xiao Wan
- Shenzhen Jingtai Technology Co., Ltd. (XtalPi), Floor 3, Sf Industrial Plant, No. 2 Hongliu Road, Fubao Community, Fubao Street, Futian District, Shenzhen 518045, China
| | - Dong Fang
- Shenzhen Jingtai Technology Co., Ltd. (XtalPi), Floor 3, Sf Industrial Plant, No. 2 Hongliu Road, Fubao Community, Fubao Street, Futian District, Shenzhen 518045, China
| | - Xiaolu Lin
- Shenzhen Jingtai Technology Co., Ltd. (XtalPi), Floor 3, Sf Industrial Plant, No. 2 Hongliu Road, Fubao Community, Fubao Street, Futian District, Shenzhen 518045, China
| | - Guangxu Sun
- Shenzhen Jingtai Technology Co., Ltd. (XtalPi), Floor 3, Sf Industrial Plant, No. 2 Hongliu Road, Fubao Community, Fubao Street, Futian District, Shenzhen 518045, China
| | - Gianpaolo Gobbo
- XtalPi Inc., 245 Main Street, Cambridge, Massachusetts 02142, United States
| | - Fenglei Cao
- Shenzhen Jingtai Technology Co., Ltd. (XtalPi), Floor 3, Sf Industrial Plant, No. 2 Hongliu Road, Fubao Community, Fubao Street, Futian District, Shenzhen 518045, China
| | - Alan M Mathiowetz
- Medicine Design, Pfizer Inc., 1 Portland Street, Cambridge, Massachusetts 02139, United States
| | - Benjamin J Burke
- Medicine Design, Pfizer Inc., 10777 Science Center Drive, San Diego, California 92121, United States
| | - Robert A Kumpf
- Medicine Design, Pfizer Inc., 10777 Science Center Drive, San Diego, California 92121, United States
| | - Brajesh K Rai
- Machine Learning and Computational Sciences, Pfizer Inc., 610 Main Street, Cambridge, Massachusetts 02139, United States
| | - Geoffrey P F Wood
- Pharmaceutical Science Small Molecule, Pfizer Inc., Eastern Point Road, Groton, Connecticut 06340, United States
| | - Frank C Pickard
- Pharmaceutical Science Small Molecule, Pfizer Inc., Eastern Point Road, Groton, Connecticut 06340, United States
| | - Junmei Wang
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
| | - Peiyu Zhang
- Shenzhen Jingtai Technology Co., Ltd. (XtalPi), Floor 3, Sf Industrial Plant, No. 2 Hongliu Road, Fubao Community, Fubao Street, Futian District, Shenzhen 518045, China
| | - Jian Ma
- Shenzhen Jingtai Technology Co., Ltd. (XtalPi), Floor 3, Sf Industrial Plant, No. 2 Hongliu Road, Fubao Community, Fubao Street, Futian District, Shenzhen 518045, China
| | - Yide Alan Jiang
- XtalPi Inc., 245 Main Street, Cambridge, Massachusetts 02142, United States
| | - Shuhao Wen
- Shenzhen Jingtai Technology Co., Ltd. (XtalPi), Floor 3, Sf Industrial Plant, No. 2 Hongliu Road, Fubao Community, Fubao Street, Futian District, Shenzhen 518045, China
| | - Xinjun Hou
- Medicine Design, Pfizer Inc., 1 Portland Street, Cambridge, Massachusetts 02139, United States
| | - Junjie Zou
- Shenzhen Jingtai Technology Co., Ltd. (XtalPi), Floor 3, Sf Industrial Plant, No. 2 Hongliu Road, Fubao Community, Fubao Street, Futian District, Shenzhen 518045, China
| | - Mingjun Yang
- Shenzhen Jingtai Technology Co., Ltd. (XtalPi), Floor 3, Sf Industrial Plant, No. 2 Hongliu Road, Fubao Community, Fubao Street, Futian District, Shenzhen 518045, China
| |
|
21
|
Gelžinytė E, Öeren M, Segall MD, Csányi G. Transferable Machine Learning Interatomic Potential for Bond Dissociation Energy Prediction of Drug-like Molecules. J Chem Theory Comput 2024; 20:164-177. [PMID: 38108269 PMCID: PMC10782450 DOI: 10.1021/acs.jctc.3c00710] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/28/2023] [Revised: 11/30/2023] [Accepted: 11/30/2023] [Indexed: 12/19/2023]
Abstract
We present a transferable MACE interatomic potential that is applicable to open- and closed-shell drug-like molecules containing hydrogen, carbon, and oxygen atoms. Including an accurate description of radical species extends the scope of possible applications to bond dissociation energy (BDE) prediction, for example, in the context of cytochrome P450 (CYP) metabolism. The transferability of the MACE potential was validated on the COMP6 data set, containing only closed-shell molecules, where it reaches better accuracy than the readily available general ANI-2x potential. MACE achieves similar accuracy on two CYP metabolism-specific data sets, which include open- and closed-shell structures. This model enables us to calculate the aliphatic C-H BDE, which allows us to compare reaction energies of hydrogen abstraction, which is the rate-limiting step of the aliphatic hydroxylation reaction catalyzed by CYPs. On the "CYP 3A4" data set, MACE achieves a BDE RMSE of 1.37 kcal/mol and better prediction of BDE ranks than alternatives: the semiempirical AM1 and GFN2-xTB methods and the ALFABET model that directly predicts bond dissociation enthalpies. Finally, we highlight the smoothness of the MACE potential over paths of sp3C-H bond elongation and show that a minimal extension is enough for the MACE model to start finding reasonable minimum energy paths of methoxy radical-mediated hydrogen abstraction. Altogether, this work lays the ground for further extensions of scope in terms of chemical elements, (CYP-mediated) reaction classes and modeling the full reaction paths, not only BDEs.
Affiliation(s)
- Elena Gelžinytė
- Engineering Laboratory, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, U.K.
| | - Mario Öeren
- Optibrium Limited, Cambridge Innovation Park, Denny End Road, Cambridge CB25 9GL, U.K.
| | - Matthew D. Segall
- Optibrium Limited, Cambridge Innovation Park, Denny End Road, Cambridge CB25 9GL, U.K.
| | - Gábor Csányi
- Engineering Laboratory, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, U.K.
| |
|
22
|
Setiadi J, Boothroyd S, Slochower DR, Dotson DL, Thompson MW, Wagner JR, Wang LP, Gilson MK. Tuning Potential Functions to Host-Guest Binding Data. J Chem Theory Comput 2024; 20:239-252. [PMID: 38147689 PMCID: PMC10838530 DOI: 10.1021/acs.jctc.3c01050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023]
Abstract
Software to more rapidly and accurately predict protein-ligand binding affinities is of high interest for early-stage drug discovery, and physics-based methods are among the most widely used technologies for this purpose. The accuracy of these methods depends critically on the accuracy of the potential functions that they use. Potential functions are typically trained against a combination of quantum chemical and experimental data. However, although binding affinities are among the most important quantities to predict, experimental binding affinities have not to date been integrated into the experimental data set used to train potential functions. In recent years, the use of host-guest complexes as simple and tractable models of binding thermodynamics has gained popularity due to their small size and simplicity, relative to protein-ligand systems. Host-guest complexes can also avoid ambiguities that arise in protein-ligand systems such as uncertain protonation states. Thus, experimental host-guest binding data are an appealing additional data type to integrate into the experimental data set used to optimize potential functions. Here, we report the extension of the Open Force Field Evaluator framework to enable the systematic calculation of host-guest binding free energies and their gradients with respect to force field parameters, coupled with the curation of 126 host-guest complexes with available experimental binding free energies. As an initial application of this novel infrastructure, we optimized generalized Born (GB) cavity radii for the OBC2 GB implicit solvent model against experimental data for 36 host-guest systems. This refitting led to a dramatic improvement in accuracy for both the training set and a separate test set with 90 additional host-guest systems. The optimized radii also showed encouraging transferability from host-guest systems to 59 protein-ligand systems. 
However, the new radii are significantly smaller than the baseline radii and lead to excessively favorable hydration free energies (HFEs). Thus, users of the OBC2 GB model currently may choose between GB cavity radii that yield more accurate binding affinities and GB cavity radii that yield more accurate HFEs. We suspect that achieving good accuracy on both will require more far-reaching adjustments to the GB model. We note that binding free-energy calculations using the OBC2 model in OpenMM gain about a 10× speedup relative to corresponding explicit solvent calculations, suggesting a future role for implicit solvent absolute binding free-energy (ABFE) calculations in virtual compound screening. This study proves the principle of using host-guest systems to train potential functions that are transferrable to protein-ligand systems and provides an infrastructure that enables a range of applications.
Affiliation(s)
- Jeffry Setiadi
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9255 Pharmacy Lane, La Jolla, California 92093, United States
| | - Simon Boothroyd
- Boothroyd Scientific Consulting Ltd., London WC2H 9JQ, U.K.
- Psivant Therapeutics, Boston, Massachusetts 02210, United States
| | | | - David L Dotson
- Datryllic LLC, Phoenix, Arizona 85003, United States
- The Open Force Field Consortium, Open Molecular Software Foundation, Davis, California 95616, United States
| | - Matthew W Thompson
- The Open Force Field Consortium, Open Molecular Software Foundation, Davis, California 95616, United States
| | - Jeffrey R Wagner
- The Open Force Field Consortium, Open Molecular Software Foundation, Davis, California 95616, United States
| | - Lee-Ping Wang
- Chemistry Department, University of California Davis, Davis, California 95616, United States
| | - Michael K Gilson
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9255 Pharmacy Lane, La Jolla, California 92093, United States
| |
|
23
|
Bogetti X, Saxena S. Integrating Electron Paramagnetic Resonance Spectroscopy and Computational Modeling to Measure Protein Structure and Dynamics. Chempluschem 2024; 89:e202300506. [PMID: 37801003 DOI: 10.1002/cplu.202300506] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/07/2023] [Revised: 10/05/2023] [Accepted: 10/06/2023] [Indexed: 10/07/2023]
Abstract
Electron paramagnetic resonance (EPR) has become a powerful probe of conformational heterogeneity and dynamics of biomolecules. In this Review, we discuss different computational modeling techniques that enrich the interpretation of EPR measurements of dynamics or distance restraints. A variety of spin labels are surveyed to provide a background for the discussion of modeling tools. Molecular dynamics (MD) simulations of models containing spin labels provide dynamical properties of biomolecules and their labels. These simulations can be used to predict EPR spectra, sample stable conformations and sample rotameric preferences of label sidechains. For molecular motions longer than milliseconds, enhanced sampling strategies and de novo prediction software incorporating or validated by EPR measurements are able to efficiently refine or predict protein conformations, respectively. To sample large-amplitude conformational transition, a coarse-grained or an atomistic weighted ensemble (WE) strategy can be guided with EPR insights. Looking forward, we anticipate an integrative strategy for efficient sampling of alternate conformations by de novo predictions, followed by validations by systematic EPR measurements and MD simulations. Continuous pathways between alternate states can be further sampled by WE-MD including all intermediate states.
Affiliation(s)
- Xiaowei Bogetti
- Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, PA, 15260, USA
| | - Sunil Saxena
- Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, PA, 15260, USA
| |
|
24
|
Baillif B, Cole J, Giangreco I, McCabe P, Bender A. Applying atomistic neural networks to bias conformer ensembles towards bioactive-like conformations. J Cheminform 2023; 15:124. [PMID: 38129933 PMCID: PMC10740246 DOI: 10.1186/s13321-023-00794-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 12/10/2023] [Indexed: 12/23/2023] Open
Abstract
Identifying bioactive conformations of small molecules is an essential process for virtual screening applications relying on three-dimensional structure such as molecular docking. For most small molecules, conformer generators retrieve at least one bioactive-like conformation, with an atomic root-mean-square deviation (ARMSD) lower than 1 Å, among the set of low-energy conformers generated. However, there is currently no general method to prioritise these likely target-bound conformations in the ensemble. In this work, we trained atomistic neural networks (AtNNs) on 3D information of generated conformers of a curated subset of PDBbind ligands to predict the ARMSD to their closest bioactive conformation, and evaluated the early enrichment of bioactive-like conformations when ranking conformers by AtNN prediction. AtNN ranking was compared with bioactivity-unaware baselines such as ascending Sage force field energy ranking, and a slower bioactivity-based baseline ranking by ascending Torsion Fingerprint Deviation to the Maximum Common Substructure to the most similar molecule in the training set (TFD2SimRefMCS). On test sets from random ligand splits of PDBbind, ranking conformers using ComENet, the AtNN encoding the most 3D information, leads to early enrichment of bioactive-like conformations with a median BEDROC of 0.29 ± 0.02, outperforming the best bioactivity-unaware Sage energy ranking baseline (median BEDROC of 0.18 ± 0.02), and performing on a par with the bioactivity-based TFD2SimRefMCS baseline (median BEDROC of 0.31 ± 0.02). The improved performance of the AtNN and TFD2SimRefMCS baseline is mostly observed on test set ligands that bind proteins similar to proteins observed in the training set. On a more challenging subset of flexible molecules, the bioactivity-unaware baselines showed median BEDROCs up to 0.02, while AtNNs and TFD2SimRefMCS showed median BEDROCs between 0.09 and 0.13. 
When performing rigid ligand re-docking of PDBbind ligands with GOLD using the 1% top-ranked conformers, ComENet-ranked conformers showed a higher successful docking rate than bioactivity-unaware baselines, with a rate of 0.48 ± 0.02 compared to the CSD probability baseline with a rate of 0.39 ± 0.02. Similarly, in a pharmacophore searching experiment, selecting the 20% top-ranked conformers ranked by ComENet showed a higher hit rate compared to baselines. Hence, the approach presented here uses AtNNs successfully to focus conformer ensembles towards bioactive-like conformations, representing an opportunity to reduce computational expense in virtual screening applications on known targets that require input conformations.
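The early-enrichment idea in this abstract can be illustrated with a minimal sketch. It assumes the standard definition of BEDROC as a min-max normalised sum of exponentially decaying rank weights; the value `alpha=20.0`, the helper names, and the toy ARMSD numbers are illustrative assumptions and are not taken from the paper. Conformers are ranked by ascending predicted ARMSD, and a conformer counts as bioactive-like when its true ARMSD is below 1 Å, as stated in the abstract.

```python
import math


def bedroc(ranked_is_active, alpha=20.0):
    """Early-enrichment score in [0, 1] for a ranked list of booleans.

    Uses the min-max normalised form: (S - S_worst) / (S_best - S_worst),
    where S sums exp(-alpha * rank / N) over the active items.
    alpha is an assumed, illustrative value.
    """
    n_total = len(ranked_is_active)
    n_active = sum(ranked_is_active)
    if n_active == 0 or n_active == n_total:
        raise ValueError("need at least one active and one inactive item")

    def weight(rank):  # rank is 1-based
        return math.exp(-alpha * rank / n_total)

    observed = sum(weight(i + 1) for i, a in enumerate(ranked_is_active) if a)
    best = sum(weight(r) for r in range(1, n_active + 1))
    worst = sum(weight(r) for r in range(n_total - n_active + 1, n_total + 1))
    return (observed - worst) / (best - worst)


def rank_by_prediction(predicted_armsd, true_armsd, threshold=1.0):
    """Sort conformers by ascending predicted ARMSD and flag bioactive-like ones."""
    order = sorted(range(len(predicted_armsd)), key=lambda i: predicted_armsd[i])
    return [true_armsd[i] < threshold for i in order]


# Toy, made-up values for five conformers of one hypothetical ligand (in Å).
predicted = [0.8, 2.5, 1.9, 3.0, 0.6]
true = [0.7, 2.2, 0.9, 3.1, 1.4]
flags = rank_by_prediction(predicted, true)
print(flags, round(bedroc(flags), 3))
```

A ranking that places every bioactive-like conformer first scores 1, and one that places them all last scores 0, so values such as the medians quoted above can be read on that scale.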
Affiliation(s)
- Benoit Baillif
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Rd, Cambridge, CB2 1EW, UK
| | - Jason Cole
- Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge, CB2 1EZ, UK
| | - Ilenia Giangreco
- Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge, CB2 1EZ, UK
- Exscientia plc, The Schrödinger Building, Oxford Science Park, Oxford, OX4 4GE, UK
| | - Patrick McCabe
- Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge, CB2 1EZ, UK
| | - Andreas Bender
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Rd, Cambridge, CB2 1EW, UK.
| |
|
25
|
Champion C, Gall R, Ries B, Rieder SR, Barros EP, Riniker S. Accelerating Alchemical Free Energy Prediction Using a Multistate Method: Application to Multiple Kinases. J Chem Inf Model 2023; 63:7133-7147. [PMID: 37948537 PMCID: PMC10685456 DOI: 10.1021/acs.jcim.3c01469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2023] [Revised: 10/23/2023] [Accepted: 10/23/2023] [Indexed: 11/12/2023]
Abstract
Alchemical free-energy methods based on molecular dynamics (MD) simulations have become important tools to identify modifications of small organic molecules that improve their protein binding affinity during lead optimization. The routine application of pairwise free-energy methods to rank potential binders from best to worst is impacted by the combinatorial increase in calculations to perform when the number of molecules to assess grows. To address this fundamental limitation, our group has developed replica-exchange enveloping distribution sampling (RE-EDS), a pathway-independent multistate method, enabling the calculation of alchemical free-energy differences between multiple ligands (N > 2) from a single MD simulation. In this work, we apply the method to a set of four kinases with diverse binding pockets and their corresponding inhibitors (42 in total), chosen to showcase the general applicability of RE-EDS in prospective drug design campaigns. We show that for the targets studied, RE-EDS is able to model up to 13 ligands simultaneously with high sampling efficiency, leading to a substantial decrease in computational cost when compared to pairwise methods.
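The combinatorial growth mentioned in this abstract can be made concrete with a short arithmetic sketch. It assumes the worst case in which every ligand pair is computed separately; practical pairwise campaigns often use sparser perturbation maps, so the counts below are an upper bound rather than a description of any specific workflow. The function names are hypothetical.

```python
import math


def all_pairs(n_ligands):
    """Number of distinct ligand pairs, N * (N - 1) / 2."""
    return math.comb(n_ligands, 2)


def minimal_connected_map(n_ligands):
    """Fewest pairwise calculations that still connect all ligands (a tree)."""
    return n_ligands - 1


for n in (2, 5, 13, 42):
    print(f"N={n:2d}  all pairs={all_pairs(n):3d}  "
          f"minimal map={minimal_connected_map(n):2d}")
```

For the 13 ligands that the abstract reports handling in one simulation, the all-pairs count is 78, which is the number of relative free-energy differences a multistate run would cover at once.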
Affiliation(s)
- Candide Champion
- Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| | - René Gall
- Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| | | | - Salomé R. Rieder
- Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| | - Emilia P. Barros
- Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| | - Sereina Riniker
- Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| |
|
26
|
Thürlemann M, Riniker S. Hybrid classical/machine-learning force fields for the accurate description of molecular condensed-phase systems. Chem Sci 2023; 14:12661-12675. [PMID: 38020395 PMCID: PMC10646964 DOI: 10.1039/d3sc04317g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2023] [Accepted: 10/24/2023] [Indexed: 12/01/2023] Open
Abstract
Electronic structure methods offer, in principle, accurate predictions of molecular properties; however, their applicability is limited by computational costs. Empirical methods are cheaper, but come with inherent approximations and are dependent on the quality and quantity of training data. The rise of machine learning (ML) force fields (FFs) exacerbates limitations related to training data even further, especially for condensed-phase systems for which the generation of large and high-quality training datasets is difficult. Here, we propose a hybrid ML/classical FF model that is parametrized exclusively on high-quality ab initio data of dimers and monomers in vacuum but is transferable to condensed-phase systems. The proposed hybrid model combines our previous ML-parametrized classical model with ML corrections for situations where classical approximations break down, thus combining the robustness and efficiency of classical FFs with the flexibility of ML. Extensive validation on benchmarking datasets and experimental condensed-phase data, including organic liquids and small-molecule crystal structures, showcases how the proposed approach may promote FF development and unlock the full potential of classical FFs.
Affiliation(s)
- Moritz Thürlemann
- Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| | - Sereina Riniker
- Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| |
|
27
|
Lehtola S. A call to arms: Making the case for more reusable libraries. J Chem Phys 2023; 159:180901. [PMID: 37947507 DOI: 10.1063/5.0175165] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2023] [Accepted: 10/23/2023] [Indexed: 11/12/2023] Open
Abstract
The traditional foundation of science lies on the cornerstones of theory and experiment. Theory is used to explain experiment, which in turn guides the development of theory. Since the advent of computers and the development of computational algorithms, computation has risen as the third cornerstone of science, joining theory and experiment on an equal footing. Computation has become an essential part of modern science, amending experiment by enabling accurate comparison of complicated theories to sophisticated experiments, as well as guiding by triage both the design and targets of experiments and the development of novel theories and computational methods. Like experiment, computation relies on continued investment in infrastructure: it requires both hardware (the physical computer on which the calculation is run) and software (the source code of the programs that perform the wanted simulations). In this Perspective, I discuss present-day challenges on the software side in computational chemistry, which arise from the fast-paced development of algorithms, programming models, and hardware. I argue that many of these challenges could be solved with reusable open source libraries, which are a public good, enhance the reproducibility of science, and accelerate the development and availability of state-of-the-art methods and improved software.
Affiliation(s)
- Susi Lehtola
- Department of Chemistry, University of Helsinki, P.O. Box 55, FI-00014 Helsinki, Finland
| |
|
28
|
Han F, Hao D, He X, Wang L, Niu T, Wang J. Distribution of Bound Conformations in Conformational Ensembles for X-ray Ligands Predicted by the ANI-2X Machine Learning Potential. J Chem Inf Model 2023; 63:6608-6618. [PMID: 37899502 PMCID: PMC10647024 DOI: 10.1021/acs.jcim.3c01350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2023] [Revised: 10/11/2023] [Accepted: 10/13/2023] [Indexed: 10/31/2023]
Abstract
In this study, we systematically studied the energy distribution of bioactive conformations of small molecular ligands in their conformational ensembles using ANI-2X, a machine learning potential, in conjunction with one of our recently developed geometry optimization algorithms, known as a conjugate gradient with backtracking line search (CG-BS). We first evaluated the combination of these methods (ANI-2X/CG-BS) using two molecule sets. For the 231-molecule set, ab initio calculations were performed at both the ωB97X/6-31G(d) and B3LYP-D3BJ/DZVP levels for accuracy comparison, while for the 8,992-molecule set, ab initio calculations were carried out at the B3LYP-D3BJ/DZVP level. For each molecule in the two molecular sets, up to 10 conformations were generated, which diminishes the influence of individual outliers on the performance evaluation. Encouraged by the performance of ANI-2x/CG-BS in these evaluations, we calculated the energy distributions using ANI-2x/CG-BS for more than 27,000 ligands in the Protein Data Bank (PDB). Each ligand has at least one conformation bound to a biological molecule, and this ligand conformation is labeled as a bound conformation. Besides the bound conformations, up to 200 conformations were generated using OpenEye's Omega2 software (https://docs.eyesopen.com/applications/omega/) for each conformation. We performed a statistical analysis of how the bound conformation energies are distributed in the ensembles for 17,197 PDB ligands that have their bound conformation energies within the energy ranges of the Omega2-generated conformation ensembles. We found that half of the ligands have their relative conformation energy lower than 2.91 kcal/mol for the bound conformations in comparison with the global conformations, and about 90% of the bound conformations are within 10 kcal/mol above the global conformation energies.
This information is useful to guide the construction of libraries for shape-based virtual screening and to improve the docking algorithm to efficiently sample bound conformations.
Affiliation(s)
- Fengyang Han
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
| | - Dongxiao Hao
- School of Electronics and Information Engineering, Ankang University, Ankang 725000, China
| | - Xibing He
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
| | - Luxuan Wang
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
| | - Taoyu Niu
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
| | - Junmei Wang
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
| |
|
29
|
Ross GA, Lu C, Scarabelli G, Albanese SK, Houang E, Abel R, Harder ED, Wang L. The maximal and current accuracy of rigorous protein-ligand binding free energy calculations. Commun Chem 2023; 6:222. [PMID: 37838760 PMCID: PMC10576784 DOI: 10.1038/s42004-023-01019-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2022] [Accepted: 10/02/2023] [Indexed: 10/16/2023] Open
Abstract
Computational techniques can speed up the identification of hits and accelerate the development of candidate molecules for drug discovery. Among techniques for predicting relative binding affinities, the most consistently accurate is free energy perturbation (FEP), a class of rigorous physics-based methods. However, uncertainty remains about how accurate FEP is and can ever be. Here, we present what we believe to be the largest publicly available dataset of proteins and congeneric series of small molecules, and assess the accuracy of the leading FEP workflow. To ascertain the limit of achievable accuracy, we also survey the reproducibility of experimental relative affinity measurements. We find a wide variability in experimental accuracy and a correspondence between binding and functional assays. When careful preparation of protein and ligand structures is undertaken, FEP can achieve accuracy comparable to experimental reproducibility. Throughout, we highlight reliable protocols that can help maximize the accuracy of FEP in prospective studies.
Affiliation(s)
- Gregory A Ross
- Schrödinger Inc, New York, NY, USA.
- Isomorphic Labs, London, UK.
| | - Chao Lu
- Schrödinger Inc, New York, NY, USA
| | | | | | | | | | | | | |
|
30
|
Lehner MT, Katzberger P, Maeder N, Schiebroek CC, Teetz J, Landrum GA, Riniker S. DASH: Dynamic Attention-Based Substructure Hierarchy for Partial Charge Assignment. J Chem Inf Model 2023; 63:6014-6028. [PMID: 37738206 PMCID: PMC10565818 DOI: 10.1021/acs.jcim.3c00800] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2023] [Indexed: 09/24/2023]
Abstract
We present a robust and computationally efficient approach for assigning partial charges of atoms in molecules. The method is based on a hierarchical tree constructed from attention values extracted from a graph neural network (GNN), which was trained to predict atomic partial charges from accurate quantum-mechanical (QM) calculations. The resulting dynamic attention-based substructure hierarchy (DASH) approach provides fast assignment of partial charges with the same accuracy as the GNN itself, is software-independent, and can easily be integrated in existing parametrization pipelines, as shown for the Open Force Field (OpenFF). The implementation of the DASH workflow, the final DASH tree, and the training set are available as open source/open data from public repositories.
Affiliation(s)
| | | | - Niels Maeder
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Carl C.G. Schiebroek
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Jakob Teetz
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Gregory A. Landrum
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Sereina Riniker
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| |
|
31
|
Horton JT, Boothroyd S, Behara PK, Mobley DL, Cole DJ. A transferable double exponential potential for condensed phase simulations of small molecules. DIGITAL DISCOVERY 2023; 2:1178-1187. [PMID: 38013814 PMCID: PMC10408570 DOI: 10.1039/d3dd00070b] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 04/20/2023] [Accepted: 07/07/2023] [Indexed: 11/29/2023]
Abstract
The Lennard-Jones potential is the most widely used function for the description of non-bonded interactions in transferable force fields for the condensed phase. This is not because it has an optimal functional form, but rather it is a legacy resulting from when computational expense was a major consideration and this potential was particularly convenient numerically. At present, it persists because the effort that would be required to re-write molecular modelling software and train new force fields has, until now, been prohibitive. Here, we present Smirnoff-plugins as a flexible framework to extend the Open Force Field software stack to allow custom force field functional forms. We deploy Smirnoff-plugins with the automated Open Force Field infrastructure to train a transferable, small molecule force field based on the recently proposed double exponential functional form, on over 1000 experimental condensed phase properties. Extensive testing of the resulting force field shows improvements in transfer free energies, with acceptable conformational energetics, run times and convergence properties compared to state-of-the-art Lennard-Jones based force fields.
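The contrast between the two functional forms can be sketched in a few lines. The abstract does not state the double exponential expression, so the form below is an assumption: a repulsive and an attractive exponential scaled so that the minimum is `-epsilon` at `r_min`. The values `alpha=16.0` and `beta=4.0` and the test parameters are illustrative only and are not the fitted parameters from the paper.

```python
import math


def lennard_jones(r, epsilon, r_min):
    """12-6 Lennard-Jones energy with well depth epsilon at distance r_min."""
    x = (r_min / r) ** 6
    return epsilon * (x * x - 2.0 * x)


def double_exponential(r, epsilon, r_min, alpha=16.0, beta=4.0):
    """Assumed double exponential form with minimum -epsilon at r_min.

    alpha controls the steepness of the repulsive wall and beta the decay of
    the attractive tail; both defaults are illustrative assumptions.
    """
    x = r / r_min
    repulsion = beta / (alpha - beta) * math.exp(alpha * (1.0 - x))
    attraction = alpha / (alpha - beta) * math.exp(beta * (1.0 - x))
    return epsilon * (repulsion - attraction)


# Hypothetical parameters: epsilon in kcal/mol, distances in Å.
eps, rm = 0.2, 3.5
for r in (3.0, 3.5, 4.0, 6.0):
    print(f"r={r:.1f}  LJ={lennard_jones(r, eps, rm):+.4f}  "
          f"DE={double_exponential(r, eps, rm):+.4f}")
```

Both functions share the same well depth and position, so the comparison isolates the shape of the wall and tail. Unlike the 12-6 form, the exponential form stays finite as `r` approaches zero.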
Affiliation(s)
- Joshua T Horton
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
| | | | - Pavan Kumar Behara
- Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, USA
| | - David L Mobley
- Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, USA
- Department of Chemistry, University of California, Irvine, California 92697, USA
| | - Daniel J Cole
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
| |
|
32
|
Madin OC, Shirts MR. Using physical property surrogate models to perform accelerated multi-fidelity optimization of force field parameters. DIGITAL DISCOVERY 2023; 2:828-847. [PMCID: PMC10259372 DOI: 10.1039/d2dd00138a] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/10/2022] [Accepted: 04/28/2023] [Indexed: 06/14/2023]
Abstract
Accurate representations of van der Waals dispersion–repulsion interactions play an important role in high-quality molecular dynamics simulations. Training the force field parameters of the Lennard-Jones (LJ) potential, which is typically used to represent these interactions, is challenging and generally requires adjustment based on simulations of macroscopic physical properties. The large computational expense of these simulations, especially when many parameters must be trained simultaneously, limits the size of the training data set and the number of optimization steps that can be taken, often requiring modelers to perform optimizations within a local parameter region. To allow for more global LJ parameter optimization against large training sets, we introduce a multi-fidelity optimization technique which uses Gaussian process surrogate modeling to build inexpensive models of physical properties as a function of LJ parameters. This approach allows for fast evaluation of approximate objective functions, greatly accelerating searches over parameter space and enabling the use of optimization algorithms capable of searching more globally. In this study, we use an iterative framework which performs global optimization with differential evolution at the surrogate level, followed by validation at the simulation level and surrogate refinement. Using this technique on two previously studied training sets, containing up to 195 physical property targets, we refit a subset of the LJ parameters for the OpenFF 1.0.0 (Parsley) force field. We demonstrate that this multi-fidelity technique can find improved parameter sets compared to a purely simulation-based optimization by searching more broadly and escaping local minima. Additionally, this technique often finds significantly different parameter minima that have comparably accurate performance. In most cases, these parameter sets are transferable to other similar molecules in a test set.
Our multi-fidelity technique provides a platform for rapid, more global optimization of molecular models against physical properties, as well as a number of opportunities for further refinement of the technique. We present a multi-fidelity method for optimizing nonbonded force field parameters against physical property data. Leveraging fast surrogate models, we accelerate the parameter search and find novel solutions that improve force field performance.
Affiliation(s)
- Owen C. Madin
- Department of Chemical & Biological Engineering, University of Colorado Boulder, Boulder, CO 80309, USA
| | - Michael R. Shirts
- Department of Chemical & Biological Engineering, University of Colorado Boulder, Boulder, CO 80309, USA
| |
|