1. Zheng T, Wang A, Han X, Xia Y, Xu X, Zhan J, Liu Y, Chen Y, Wang Z, Wu X, Gong S, Yan W. Data-driven parametrization of molecular mechanics force fields for expansive chemical space coverage. Chem Sci 2025; 16:2730-2740. [PMID: 39802691 PMCID: PMC11721737 DOI: 10.1039/d4sc06640e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2024] [Accepted: 12/25/2024] [Indexed: 01/16/2025] Open
Abstract
A force field is a critical component in molecular dynamics simulations for computational drug discovery. It must achieve high accuracy within the constraints of the limited functional forms of molecular mechanics (MM), which offer high computational efficiency. With the rapid expansion of synthetically accessible chemical space, traditional look-up table approaches face significant challenges. In this study, we address this issue using a modern data-driven approach, developing ByteFF, an Amber-compatible force field for drug-like molecules. To create ByteFF, we generated an expansive and highly diverse molecular dataset at the B3LYP-D3(BJ)/DZVP level of theory. This dataset includes 2.4 million optimized molecular fragment geometries with analytical Hessian matrices, along with 3.2 million torsion profiles. We then trained an edge-augmented, symmetry-preserving molecular graph neural network (GNN) on this dataset, employing a carefully optimized training strategy. Our model predicts all bonded and non-bonded MM force field parameters for drug-like molecules simultaneously across a broad chemical space. ByteFF demonstrates state-of-the-art performance on various benchmark datasets, excelling in predicting relaxed geometries, torsional energy profiles, and conformational energies and forces. Its exceptional accuracy and expansive chemical space coverage make ByteFF a valuable tool for multiple stages of computational drug discovery.
Affiliation(s)
- Tianze Zheng
  - ByteDance Research, Beijing Beijing 100098 China
- Ailun Wang
  - ByteDance Research Bellevue Washington 98004 USA
- Xu Han
  - ByteDance Research, Beijing Beijing 100098 China
- Yu Xia
  - ByteDance Research, Beijing Beijing 100098 China
- Xingyuan Xu
  - ByteDance Research, Beijing Beijing 100098 China
- Jiawei Zhan
  - ByteDance Research Bellevue Washington 98004 USA
- Yu Liu
  - ByteDance Research Bellevue Washington 98004 USA
- Yang Chen
  - ByteDance Research, Beijing Beijing 100098 China
- Zhi Wang
  - ByteDance Research Bellevue Washington 98004 USA
- Xiaojie Wu
  - ByteDance Research Bellevue Washington 98004 USA
- Sheng Gong
  - ByteDance Research Bellevue Washington 98004 USA
- Wen Yan
  - ByteDance Research Bellevue Washington 98004 USA
2. Zurek C, Mallaev RA, Paul AC, van Staalduinen N, Pracht P, Ellerbrock R, Bannwarth C. Tensor Train Optimization for Conformational Sampling of Organic Molecules. J Chem Theory Comput 2025. [PMID: 39841125 DOI: 10.1021/acs.jctc.4c01275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/23/2025]
Abstract
Exploring the conformational space of molecules remains a challenge of fundamental importance to quantum chemistry: identification of relevant conformers at ambient conditions enables predictive simulations of almost arbitrary properties. Here, we propose a novel approach, called TTConf, to enable conformational sampling of large organic molecules where the combinatorial explosion of possible conformers prevents the use of a brute-force systematic conformer search. We employ tensor trains as a highly efficient dimensionality reduction algorithm, effectively reducing the scaling from exponential to polynomial. In our approach, the conformational search is expressed as a global energy minimization task in a high-dimensional grid of dihedral angles. Dimensionality reduction is achieved through a tensor train representation of the high-dimensional torsion space. The performance of the approach is assessed on a variety of drug-like molecules in direct comparison to the state-of-the-art metadynamics-based conformer search as implemented in CREST. The comparison shows significant acceleration of up to an order of magnitude, while maintaining comparable accuracy. More importantly, the presented approach allows treatment of larger molecules than typically accessible with metadynamics.
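The combinatorial explosion mentioned in this abstract can be illustrated with a minimal sketch of the brute-force baseline (not the TTConf tensor train algorithm itself). The toy energy function, the 30-degree grid spacing, and all function names below are illustrative assumptions and do not come from the paper.

```python
import itertools
import math


def toy_energy(dihedrals_deg):
    """Hypothetical torsional energy: a sum of threefold cosine terms."""
    return sum(1.0 + math.cos(math.radians(3.0 * phi)) for phi in dihedrals_deg)


def grid_size(n_points, n_dihedrals):
    """Number of grid points; grows exponentially with the number of dihedrals."""
    return n_points ** n_dihedrals


def brute_force_minimum(n_dihedrals, step_deg=30):
    """Scan every grid point exhaustively; feasible only for a few dihedrals."""
    angles = range(-180, 180, step_deg)
    best = min(itertools.product(angles, repeat=n_dihedrals), key=toy_energy)
    return best, toy_energy(best)


if __name__ == "__main__":
    # Three rotatable bonds are easy to enumerate; ten already are not.
    print(grid_size(12, 3), grid_size(12, 10))
    print(brute_force_minimum(3))
```

With 12 grid points per dihedral, each additional rotatable bond multiplies the number of energy evaluations by 12, which is the scaling that a low-rank representation of the grid is meant to avoid.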
Affiliation(s)
- Christopher Zurek
  - Institute of Physical Chemistry, RWTH Aachen University, Aachen 52074, Germany
- Philipp Pracht
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Christoph Bannwarth
  - Institute of Physical Chemistry, RWTH Aachen University, Aachen 52074, Germany
3. Friedman AJ, Hsu WT, Shirts MR. Multiple Topology Replica Exchange of Expanded Ensembles for Multidimensional Alchemical Calculations. J Chem Theory Comput 2025; 21:230-240. [PMID: 39743749 PMCID: PMC11732712 DOI: 10.1021/acs.jctc.4c01268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2025]
Abstract
Relative free energy (RFE) calculations are now widely used in academia and industry, but their accuracy is often limited by poor sampling of the complexes' conformational ensemble. To help address conformational sampling problems when simulating many relative binding free energies, we developed a novel method termed multiple topology replica exchange of expanded ensembles (MT-REXEE). This method enables parallel expanded ensemble calculations, facilitating iterative RFE computations while allowing conformational exchange between parallel transformations. These iterative transformations can be adapted to any set of systems with a common backbone or central substructure. We demonstrate that the MT-REXEE method maintains thermodynamic cycle closure to the same extent as standard expanded ensemble calculations for both solvation free energy and relative binding free energy calculations. The transformations tested involve systems that incorporate diverse heavy atoms and multisite perturbations of a small molecule core resembling multisite λ dynamics, without necessitating modifications to the MD code. Our initial implementation is in GROMACS. We outline a systematic approach for the topology setup and provide instructions on how to perform inter-replica coordinate modifications. This work shows that MT-REXEE can be used to perform accurate and reproducible free energy estimates and prompts expansion to more complex test systems and other molecular dynamics simulation infrastructures.
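Thermodynamic cycle closure, as used in this abstract, means that the signed free energy changes around any closed cycle of transformations should sum to zero. A minimal sketch of such a check follows; the function names, the two-sigma tolerance, and the assumption that edge uncertainties are independent and add in quadrature are illustrative choices, not the paper's protocol.

```python
import math


def cycle_closure_error(delta_g_edges):
    """Sum of signed free energy changes around a closed cycle (ideal value: 0)."""
    return sum(delta_g_edges)


def closes_within_uncertainty(delta_g_edges, uncertainties, n_sigma=2.0):
    """True if the closure error lies within n_sigma of the propagated uncertainty.

    Assumes independent edge uncertainties, combined in quadrature.
    """
    error = cycle_closure_error(delta_g_edges)
    sigma = math.sqrt(sum(u * u for u in uncertainties))
    return abs(error) <= n_sigma * sigma


if __name__ == "__main__":
    # Made-up values in kcal/mol for a three-edge cycle A -> B -> C -> A.
    edges = [1.2, -0.5, -0.6]
    print(cycle_closure_error(edges))
    print(closes_within_uncertainty(edges, [0.1, 0.1, 0.1]))
```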
Affiliation(s)
- Anika J Friedman
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80309, United States
- Wei-Tse Hsu
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80309, United States
- Michael R Shirts
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80309, United States
4. Takaba K, Friedman AJ, Cavender CE, Behara PK, Pulido I, Henry MM, MacDermott-Opeskin H, Iacovella CR, Nagle AM, Payne AM, Shirts MR, Mobley DL, Chodera JD, Wang Y. Machine-learned molecular mechanics force fields from large-scale quantum chemical data. Chem Sci 2024; 15:12861-12878. [PMID: 39148808 PMCID: PMC11322960 DOI: 10.1039/d4sc00690a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Accepted: 06/17/2024] [Indexed: 08/17/2024] Open
Abstract
The development of reliable and extensible molecular mechanics (MM) force fields (fast, empirical models characterizing the potential energy surface of molecular systems) is indispensable for biomolecular simulation and computer-aided drug design. Here, we introduce a generalized and extensible machine-learned MM force field, espaloma-0.3, and an end-to-end differentiable framework using graph neural networks to overcome the limitations of traditional rule-based methods. Trained in a single GPU-day to fit a large and diverse quantum chemical dataset of over 1.1 M energy and force calculations, espaloma-0.3 reproduces quantum chemical energetic properties of chemical domains highly relevant to drug discovery, including small molecules, peptides, and nucleic acids. Moreover, this force field maintains the quantum chemical energy-minimized geometries of small molecules and preserves the condensed phase properties of peptides and folded proteins, self-consistently parametrizing proteins and ligands to produce stable simulations leading to highly accurate predictions of binding free energies. This methodology demonstrates significant promise as a path forward for systematically building more accurate force fields that are easily extensible to new chemical domains of interest.
Affiliation(s)
- Kenichiro Takaba
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
  - Pharmaceuticals Research Center, Advanced Drug Discovery, Asahi Kasei Pharma Corporation Shizuoka 410-2321 Japan
- Anika J Friedman
  - Department of Chemical and Biological Engineering, University of Colorado Boulder Boulder CO 80309 USA
- Chapin E Cavender
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego 9500 Gilman Drive La Jolla CA 92093 USA
- Pavan Kumar Behara
  - Center for Neurotherapeutics, Department of Pathology and Laboratory Medicine, University of California Irvine CA 92697 USA
- Iván Pulido
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
- Michael M Henry
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
- Christopher R Iacovella
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
- Arnav M Nagle
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
  - Department of Bioengineering, University of California, Berkeley Berkeley CA 94720 USA
- Alexander Matthew Payne
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
  - Tri-Institutional PhD Program in Chemical Biology, Memorial Sloan Kettering Cancer Center New York 10065 USA
- Michael R Shirts
  - Department of Chemical and Biological Engineering, University of Colorado Boulder Boulder CO 80309 USA
- David L Mobley
  - Department of Pharmaceutical Sciences, University of California Irvine California 92697 USA
- John D Chodera
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
- Yuanqing Wang
  - Simons Center for Computational Physical Chemistry and Center for Data Science, New York University New York NY 10004 USA
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
5. Brueckner AC, Shields B, Kirubakaran P, Suponya A, Panda M, Posy SL, Johnson S, Lakkaraju SK. MDFit: automated molecular simulations workflow enables high throughput assessment of ligands-protein dynamics. J Comput Aided Mol Des 2024; 38:24. [PMID: 39014286 DOI: 10.1007/s10822-024-00564-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Accepted: 06/28/2024] [Indexed: 07/18/2024]
Abstract
Molecular dynamics (MD) simulation is a powerful tool for characterizing ligand-protein conformational dynamics and offers significant advantages over docking and other rigid structure-based computational methods. However, setting up, running, and analyzing MD simulations continues to be a multi-step process making it cumbersome to assess a library of ligands in a protein binding pocket using MD. We present an automated workflow that streamlines setting up, running, and analyzing Desmond MD simulations for protein-ligand complexes using machine learning (ML) models. The workflow takes a library of pre-docked ligands and a prepared protein structure as input, sets up and runs MD with each protein-ligand complex, and generates simulation fingerprints for each ligand. Simulation fingerprints (SimFP) capture protein-ligand compatibility, including stability of different ligand-pocket interactions and other useful metrics that enable easy rank-ordering of the ligand library for pocket optimization. SimFPs from a ligand library are used to build & deploy ML models that predict binding assay outcomes and automatically infer important interactions. Unlike relative free-energy methods that are constrained to assess ligands with high chemical similarity, ML models based on SimFPs can accommodate diverse ligand sets. We present two case studies on how SimFP helps delineate structure-activity relationship (SAR) trends and explain potency differences across matched-molecular pairs of (1) cyclic peptides targeting PD-L1 and (2) small molecule inhibitors targeting CDK9.
Affiliation(s)
- Benjamin Shields
  - Molecular Structure & Design, Bristol Myers Squibb, Princeton, NJ, 08540, USA
- Palani Kirubakaran
  - Biocon Bristol Myers Squibb R&D Centre, Bangalore, 560099, Karnataka, India
- Alexander Suponya
  - Molecular Structure & Design, Bristol Myers Squibb, Princeton, NJ, 08540, USA
- Manoranjan Panda
  - Molecular Structure & Design, Bristol Myers Squibb, Princeton, NJ, 08540, USA
- Shana L Posy
  - Molecular Structure & Design, Bristol Myers Squibb, Princeton, NJ, 08540, USA
- Stephen Johnson
  - Molecular Structure & Design, Bristol Myers Squibb, Princeton, NJ, 08540, USA
6. Amezcua M, Setiadi J, Mobley DL. The SAMPL9 host-guest blind challenge: an overview of binding free energy predictive accuracy. Phys Chem Chem Phys 2024; 26:9207-9225. [PMID: 38444308 PMCID: PMC10954238 DOI: 10.1039/d3cp05111k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 02/03/2024] [Indexed: 03/07/2024]
Abstract
We report the results of the SAMPL9 host-guest blind challenge for predicting binding free energies. The challenge focused on macrocycles from the pillar[n]-arene and cyclodextrin host families, including WP6, bCD, and HbCD. A variety of methods were used by participants to submit binding free energy predictions. A machine learning approach based on molecular descriptors achieved the highest accuracy (RMSE of 2.04 kcal mol⁻¹) among the ranked methods in the WP6 dataset. Interestingly, predictions for WP6 obtained via docking tended to outperform all methods (RMSE of 1.70 kcal mol⁻¹), most of which are MD based and computationally more expensive. In general, methods applying force fields achieved better correlation with experiments for WP6 as opposed to the machine learning and docking models. In the cyclodextrin-phenothiazine challenge, the ATM approach emerged as the top performing method with RMSE less than 1.86 kcal mol⁻¹. Correlation metrics of ranked methods in this dataset were relatively poor compared to WP6. We also highlight several lessons learned to guide future work and help improve studies on the systems discussed. For example, WP6 may be present in microstates other than its -12 state in the presence of certain guests. Machine learning approaches can be used to fine tune or help train force fields for certain chemistry (i.e. WP6-G4). Certain phenothiazines occupy distinct primary and secondary orientations, some of which were considered individually for accurate binding free energies. The accuracy of predictions from certain methods while starting from a single binding pose/orientation demonstrates the sensitivity of calculated binding free energies to the orientation, and in some cases the likely dominant orientation for the system. Computational and experimental results suggest that the guest phenothiazine core traverses both the secondary and primary faces of the cyclodextrin hosts, a bulky cationic side chain will primarily occupy the primary face, and the phenothiazine core substituent resides at the larger secondary face.
Affiliation(s)
- Martin Amezcua
  - Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, USA.
- Jeffry Setiadi
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, USA
- David L Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, USA.
  - Department of Chemistry, University of California, Irvine, Irvine, California 92697, USA
7. Hosseini AN, van der Spoel D. Martini on the Rocks: Can a Coarse-Grained Force Field Model Crystals? J Phys Chem Lett 2024; 15:1079-1088. [PMID: 38261634 PMCID: PMC10839907 DOI: 10.1021/acs.jpclett.4c00012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Revised: 01/18/2024] [Accepted: 01/19/2024] [Indexed: 01/25/2024]
Abstract
Computational chemistry is an important tool in numerous scientific disciplines, including drug discovery and structural biology. Coarse-grained models offer simple representations of molecular systems that enable simulations of large-scale systems. Because there has been an increase in the adoption of such models for simulations of biomolecular systems, critical evaluation is warranted. Here, the stability of the amyloid peptide and organic crystals is evaluated using the Martini 3 coarse-grained force field. The crystals change shape drastically during the simulations. Radial distribution functions show that the distance between backbone beads in β-sheets increases by ∼1 Å, breaking the crystals. The melting points of organic compounds are much too low in the Martini force field. This suggests that Martini 3 lacks the specific interactions needed to accurately simulate peptides or organic crystals without imposing artificial restraints. The problems may be exacerbated by the use of the 12-6 potential, suggesting that a softer potential could improve this model for crystal simulations.
Affiliation(s)
- A. Najla Hosseini
  - Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
- David van der Spoel
  - Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
8. Folmsbee D, Koes DR, Hutchison GR. Systematic Comparison of Experimental Crystallographic Geometries and Gas-Phase Computed Conformers for Torsion Preferences. J Chem Inf Model 2023; 63:7401-7411. [PMID: 38000780 PMCID: PMC10716907 DOI: 10.1021/acs.jcim.3c01278] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/11/2023] [Revised: 11/07/2023] [Accepted: 11/13/2023] [Indexed: 11/26/2023]
Abstract
We performed exhaustive torsion sampling on more than 3 million compounds using the GFN2-xTB method and performed a comparison of experimental crystallographic and gas-phase conformers. Many conformer sampling methods derive torsional angle distributions from experimental crystallographic data, limiting the torsion preferences to molecules that must be stable, synthetically accessible, and able to be crystallized. In this work, we evaluate the differences in torsional preferences of experimental crystallographic geometries and gas-phase computed conformers from a broad selection of compounds to determine whether torsional angle distributions obtained from semiempirical methods are suitable priors for conformer sampling. We find that differences in torsion preferences can be mostly attributed to a lack of available experimental crystallographic data with small deviations derived from gas-phase geometry differences. GFN2 demonstrates the ability to provide accurate and reliable torsional preferences that can provide a basis for new methods free from the limitations of experimental data collection. We provide Gaussian-based fits and sampling distributions suitable for torsion sampling and propose an alternative to the widely used "experimental torsion and knowledge distance geometry" (ETKDG) method using quantum torsion-derived distance geometry (QTDG) methods.
Affiliation(s)
- Dakota L. Folmsbee
  - Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
  - Department of Anesthesiology & Perioperative Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
- David R. Koes
  - Department of Computational & Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Geoffrey R. Hutchison
  - Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
  - Department of Chemical & Petroleum Engineering, University of Pittsburgh, 3700 O’Hara Street, Pittsburgh, Pennsylvania 15261, United States
9. Hurley MFD, Raddi RM, Pattis JG, Voelz VA. Expanded ensemble predictions of absolute binding free energies in the SAMPL9 host-guest challenge. Phys Chem Chem Phys 2023; 25:32393-32406. [PMID: 38009066 PMCID: PMC10760931 DOI: 10.1039/d3cp02197a] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/28/2023]
Abstract
As part of the SAMPL9 community-wide blind host-guest challenge, we implemented an expanded ensemble workflow to predict absolute binding free energies for 13 small molecules against pillar[6]arene. Notable features of our protocol include consideration of a variety of protonation and enantiomeric states for both host and guests, optimization of alchemical intermediates, and analysis of free energy estimates and their uncertainty using large numbers of simulation replicates performed using distributed computing. Our predictions of absolute binding free energies resulted in a mean absolute error of 2.29 kcal mol⁻¹ and an R² of 0.54. Overall, results show that expanded ensemble calculations using all-atom molecular dynamics simulations are a valuable and efficient computational tool in predicting absolute binding free energies.
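For readers comparing the error metrics quoted in these challenge abstracts, a minimal sketch of how a mean absolute error and a correlation-based R² are commonly computed is given below. It assumes R² denotes the squared Pearson correlation coefficient, which the abstract does not state; the function names and the sample numbers are hypothetical.

```python
def mean_absolute_error(predicted, experimental):
    """Average of |prediction - experiment| over all systems."""
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


def r_squared(predicted, experimental):
    """Squared Pearson correlation coefficient between the two series."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    return cov * cov / (var_p * var_e)


if __name__ == "__main__":
    # Made-up binding free energies in kcal/mol, not data from the challenge.
    predicted = [-6.1, -8.4, -5.0, -9.3]
    experimental = [-7.0, -8.0, -6.2, -10.1]
    print(mean_absolute_error(predicted, experimental))
    print(r_squared(predicted, experimental))
```

A squared correlation can be high even when every prediction is shifted by a constant offset, which is why the two metrics are usually reported together.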
Affiliation(s)
- Robert M Raddi
  - Department of Chemistry, Temple University, Philadelphia, PA, USA.
- Jason G Pattis
  - Department of Chemistry, Temple University, Philadelphia, PA, USA.
- Vincent A Voelz
  - Department of Chemistry, Temple University, Philadelphia, PA, USA.
10. Pattanaik L, Menon A, Settels V, Spiekermann KA, Tan Z, Vermeire FH, Sandfort F, Eiden P, Green WH. ConfSolv: Prediction of Solute Conformer-Free Energies across a Range of Solvents. J Phys Chem B 2023; 127:10151-10170. [PMID: 37966798 DOI: 10.1021/acs.jpcb.3c05904] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/16/2023]
Abstract
Predicting Gibbs free energy of solution is key to understanding the solvent effects on thermodynamics and reaction rates for kinetic modeling. Accurately computing solution free energies requires the enumeration and evaluation of relevant solute conformers in solution. However, even after generation of relevant conformers, determining their free energy of solution requires an expensive workflow consisting of several ab initio computational chemistry calculations. To help address this challenge, we generate a large data set of solution free energies for nearly 44,000 solutes with almost 9 million conformers calculated in 41 different solvents using density functional theory and COSMO-RS and quantify the impact of solute conformers on the solution free energy. We then train a message passing neural network to predict the relative solution free energies of a set of solute conformers, enabling the identification of a small subset of thermodynamically relevant conformers. The model offers substantial computational time savings with predictions usually substantially within 1 kcal/mol of the free energy of the solution calculated by using computational chemical methods.
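Identifying a "small subset of thermodynamically relevant conformers" from relative free energies is typically done with Boltzmann weights. The sketch below shows that standard calculation; the 1% population cutoff, the default temperature, and the function names are assumptions for illustration and are not taken from ConfSolv.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def boltzmann_populations(free_energies_kcal, temperature_k=298.15):
    """Normalized Boltzmann populations from conformer free energies (kcal/mol)."""
    g_min = min(free_energies_kcal)  # shift by the minimum for numerical stability
    rt = R_KCAL * temperature_k
    weights = [math.exp(-(g - g_min) / rt) for g in free_energies_kcal]
    total = sum(weights)
    return [w / total for w in weights]


def relevant_conformers(free_energies_kcal, min_population=0.01, temperature_k=298.15):
    """Indices of conformers whose population meets the (assumed) cutoff."""
    populations = boltzmann_populations(free_energies_kcal, temperature_k)
    return [i for i, p in enumerate(populations) if p >= min_population]


if __name__ == "__main__":
    # Made-up relative free energies in kcal/mol for three conformers.
    energies = [0.0, 1.0, 5.0]
    print(boltzmann_populations(energies))
    print(relevant_conformers(energies))
```

Because populations depend exponentially on the free energy difference, an error of about 1 kcal/mol at room temperature changes a relative population by roughly a factor of five, which gives context for the accuracy quoted in the abstract.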
Affiliation(s)
- Lagnajit Pattanaik
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Angiras Menon
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Volker Settels
  - BASF SE, Scientific Modeling, Group Research, Ludwigshafen am Rhein 67056, Germany
- Kevin A Spiekermann
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Zipei Tan
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Florence H Vermeire
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
  - Department of Chemical Engineering, KU Leuven, Celestijnenlaan 200F, Leuven 3001, Belgium
- Frederik Sandfort
  - BASF SE, Scientific Modeling, Group Research, Ludwigshafen am Rhein 67056, Germany
- Philipp Eiden
  - BASF SE, Scientific Modeling, Group Research, Ludwigshafen am Rhein 67056, Germany
- William H Green
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
11. Ballav S, Bhosale M, Lokhande KB, Paul MK, Padhye S, Swamy KV, Ranjan A, Basu S. Design, Synthesis, and Biological Evaluation of Novel Quercetin Derivatives as PPAR-γ Partial Agonists by Modulating Epithelial-Mesenchymal Transition in Lung Cancer Metastasis. Adv Biol (Weinh) 2023; 7:e2300036. [PMID: 37017501 DOI: 10.1002/adbi.202300036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Revised: 03/09/2023] [Indexed: 04/06/2023]
Abstract
Epithelial-to-mesenchymal transition (EMT) is responsible for driving metastasis of multiple cancer types including lung cancer. Peroxisome proliferator-activated receptor (PPAR)-γ, a ligand-activated transcription factor, controls expression of a variety of genes involved in EMT. Although several synthetic compounds act as potent full agonists for PPAR-γ, their long-term application is restricted due to serious adverse effects. Therefore, partial agonists involving reduced and balanced PPAR-γ activity are more effective and valued. A previous study discerned the efficacy of quercetin and its derivatives to attain favorable stabilization with PPAR-γ. Here this work is extended by synthesizing five novel quercetin derivatives (QDs), namely a thiosemicarbazone (QUETSC) and hydrazones (quercetin isonicotinic acid hydrazone (QUEINH), quercetin nicotinic acid hydrazone (QUENH), quercetin 2-furoic hydrazone (QUE2FH), and quercetin salicyl hydrazone (QUESH)), and their effects are analyzed in modulating EMT in lung cancer cell lines via PPAR-γ partial activation. QDs-treated A549 cells diminish cell proliferation strongly at nanomolar concentration compared to NCI-H460 cells. Of the five screened derivatives, QUETSC, QUE2FH, and QUESH exhibit the property of partial activation as compared to the overexpressive level of rosiglitazone. Consistently, these QDs also suppress EMT process by markedly downregulating the levels of mesenchymal markers (Snail, Slug, and zinc finger E-box binding homeobox 1) and concomitant upregulation of epithelial marker (E-cadherin).
Affiliation(s)
- Sangeeta Ballav
  - Cancer and Translational Research Centre, Dr. D.Y. Patil Biotechnology and Bioinformatics Institute, Dr. D.Y. Patil Vidyapeeth, Tathawade, Pune, Maharashtra, 411 033, India
- Mrinalini Bhosale
  - Department of Chemistry, Interdisciplinary Science and Technology Research Academy, Abeda Inamdar Senior College, University of Pune, Maharashtra, 411001, India
- Kiran Bharat Lokhande
  - Bioinformatics Research Laboratory, Dr. D.Y. Patil Biotechnology and Bioinformatics Institute, Dr. D.Y. Patil Vidyapeeth, Tathawade, Pune, Maharashtra, 411 033, India
- Manash K Paul
  - Department of Pulmonary and Critical Care Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Subhash Padhye
  - Department of Chemistry, Interdisciplinary Science and Technology Research Academy, Abeda Inamdar Senior College, University of Pune, Maharashtra, 411001, India
- K Venkateswara Swamy
  - Bioinformatics Research Laboratory, Dr. D.Y. Patil Biotechnology and Bioinformatics Institute, Dr. D.Y. Patil Vidyapeeth, Tathawade, Pune, Maharashtra, 411 033, India
  - MIT School of Bioengineering Science and Research, MIT - Art, Design and Technology University, Pune, Maharashtra, 412201, India
- Amit Ranjan
  - Cancer and Translational Research Centre, Dr. D.Y. Patil Biotechnology and Bioinformatics Institute, Dr. D.Y. Patil Vidyapeeth, Tathawade, Pune, Maharashtra, 411 033, India
- Soumya Basu
  - Cancer and Translational Research Centre, Dr. D.Y. Patil Biotechnology and Bioinformatics Institute, Dr. D.Y. Patil Vidyapeeth, Tathawade, Pune, Maharashtra, 411 033, India
12. Horton JT, Boothroyd S, Behara PK, Mobley DL, Cole DJ. A transferable double exponential potential for condensed phase simulations of small molecules. Digital Discovery 2023; 2:1178-1187. [PMID: 38013814 PMCID: PMC10408570 DOI: 10.1039/d3dd00070b] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/20/2023] [Accepted: 07/07/2023] [Indexed: 11/29/2023]
Abstract
The Lennard-Jones potential is the most widely-used function for the description of non-bonded interactions in transferable force fields for the condensed phase. This is not because it has an optimal functional form, but rather it is a legacy resulting from when computational expense was a major consideration and this potential was particularly convenient numerically. At present, it persists because the effort that would be required to re-write molecular modelling software and train new force fields has, until now, been prohibitive. Here, we present Smirnoff-plugins as a flexible framework to extend the Open Force Field software stack to allow custom force field functional forms. We deploy Smirnoff-plugins with the automated Open Force Field infrastructure to train a transferable, small molecule force field based on the recently-proposed double exponential functional form, on over 1000 experimental condensed phase properties. Extensive testing of the resulting force field shows improvements in transfer free energies, with acceptable conformational energetics, run times and convergence properties compared to state-of-the-art Lennard-Jones based force fields.
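The contrast between the two functional forms can be sketched numerically. The code below compares the 12-6 Lennard-Jones potential with one commonly cited double exponential form, both written in terms of the well depth and the minimum-energy distance. The abstract does not give the exact expression or parameter values used in the paper, so the formula for the double exponential and the values of `alpha` and `beta` should be read as assumptions chosen for illustration.

```python
import math


def lennard_jones(r, epsilon, r_min):
    """12-6 Lennard-Jones potential with minimum -epsilon at r = r_min."""
    x = (r_min / r) ** 6
    return epsilon * (x * x - 2.0 * x)


def double_exponential(r, epsilon, r_min, alpha=16.0, beta=4.0):
    """Double exponential potential with minimum -epsilon at r = r_min.

    alpha sets the steepness of the repulsive wall and beta the decay of the
    attractive tail; both defaults here are illustrative, not fitted values.
    """
    x = r / r_min
    repulsion = beta / (alpha - beta) * math.exp(alpha * (1.0 - x))
    attraction = alpha / (alpha - beta) * math.exp(beta * (1.0 - x))
    return epsilon * (repulsion - attraction)


if __name__ == "__main__":
    eps, rm = 0.2, 3.5  # made-up well depth (kcal/mol) and distance (angstrom)
    for r in (2.0, 3.0, 3.5, 4.5):
        print(r, lennard_jones(r, eps, rm), double_exponential(r, eps, rm))
```

Both functions share the same well depth and position, so the difference lies in the wall and tail shapes: the exponential repulsion stays finite as the distance approaches zero, whereas the r⁻¹² term diverges.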
Affiliation(s)
- Joshua T Horton
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Pavan Kumar Behara
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, USA
- David L Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, USA
  - Department of Chemistry, University of California, Irvine, California 92697, USA
- Daniel J Cole
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
|
13
|
Boothroyd S, Behara PK, Madin OC, Hahn DF, Jang H, Gapsys V, Wagner JR, Horton JT, Dotson DL, Thompson MW, Maat J, Gokey T, Wang LP, Cole DJ, Gilson MK, Chodera JD, Bayly CI, Shirts MR, Mobley DL. Development and Benchmarking of Open Force Field 2.0.0: The Sage Small Molecule Force Field. J Chem Theory Comput 2023; 19:3251-3275. [PMID: 37167319 PMCID: PMC10269353 DOI: 10.1021/acs.jctc.3c00039] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 01/09/2023] [Indexed: 05/13/2023]
Abstract
We introduce the Open Force Field (OpenFF) 2.0.0 small molecule force field for drug-like molecules, code-named Sage, which builds upon our previous iteration, Parsley. OpenFF force fields are based on direct chemical perception, which generalizes easily to highly diverse sets of chemistries based on substructure queries. Like the previous OpenFF iterations, the Sage generation of OpenFF force fields was validated in protein-ligand simulations to be compatible with AMBER biopolymer force fields. In this work, we detail the methodology used to develop this force field, as well as the innovations and improvements introduced since the release of Parsley 1.0.0. One particularly significant feature of Sage is a set of improved Lennard-Jones (LJ) parameters retrained against condensed phase mixture data, the first refit of LJ parameters in the OpenFF small molecule force field line. Sage also includes valence parameters refit to a larger database of quantum chemical calculations than previous versions, as well as improvements in how this fitting is performed. Force field benchmarks show improvements in general metrics of performance against quantum chemistry reference data such as root-mean-square deviations (RMSD) of optimized conformer geometries, torsion fingerprint deviations (TFD), and improved relative conformer energetics (ΔΔE). We present a variety of benchmarks for these metrics against our previous force fields as well as in some cases other small molecule force fields. Sage also demonstrates improved performance in estimating physical properties, including comparison against experimental data from various thermodynamic databases for small molecule properties such as ΔHmix, ρ(x), ΔGsolv, and ΔGtrans. Additionally, we benchmarked against protein-ligand binding free energies (ΔGbind), where Sage yields results statistically similar to previous force fields. 
All the data is made publicly available along with complete details on how to reproduce the training results at https://github.com/openforcefield/openff-sage.
Affiliation(s)
- Pavan Kumar Behara
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
- Owen C. Madin
  - Chemical & Biological Engineering Department, University of Colorado Boulder, Boulder, Colorado 80309, United States
- David F. Hahn
  - Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse B-2340, Belgium
- Hyesu Jang
  - Chemistry Department, The University of California at Davis, Davis, California 95616, United States
  - OpenEye Scientific Software, Santa Fe, New Mexico 87508, United States
- Vytautas Gapsys
  - Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse B-2340, Belgium
  - Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, D-37077, Göttingen, Germany
- Jeffrey R. Wagner
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
  - The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
- Joshua T. Horton
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, U.K.
- David L. Dotson
  - The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
  - Datryllic LLC, Phoenix, Arizona 85003, United States
- Matthew W. Thompson
  - Chemical & Biological Engineering Department, University of Colorado Boulder, Boulder, Colorado 80309, United States
  - The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
- Jessica Maat
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Trevor Gokey
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Lee-Ping Wang
  - Chemistry Department, The University of California at Davis, Davis, California 95616, United States
- Daniel J. Cole
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, U.K.
- Michael K. Gilson
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, The University of California at San Diego, La Jolla, California 92093, United States
- John D. Chodera
  - Computational & Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Michael R. Shirts
  - Chemical & Biological Engineering Department, University of Colorado Boulder, Boulder, Colorado 80309, United States
- David L. Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
  - Department of Chemistry, University of California, Irvine, California 92697, United States
|
14
|
Morado J, Mortenson PN, Nissink JWM, Essex JW, Skylaris CK. Does a Machine-Learned Potential Perform Better Than an Optimally Tuned Traditional Force Field? A Case Study on Fluorohydrins. J Chem Inf Model 2023; 63:2810-2827. [PMID: 37071825 PMCID: PMC10170518 DOI: 10.1021/acs.jcim.2c01510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/20/2023]
Abstract
We present a comparative study that evaluates the performance of a machine learning potential (ANI-2x), a conventional force field (GAFF), and an optimally tuned GAFF-like force field in the modeling of a set of 10 γ-fluorohydrins that exhibit a complex interplay between intra- and intermolecular interactions in determining conformer stability. To benchmark the performance of each molecular model, we evaluated their energetic, geometric, and sampling accuracies relative to quantum-mechanical data. This benchmark involved conformational analysis both in the gas phase and chloroform solution. We also assessed the performance of the aforementioned molecular models in estimating nuclear spin-spin coupling constants by comparing their predictions to experimental data available in chloroform. The results and discussion presented in this study demonstrate that ANI-2x tends to predict stronger-than-expected hydrogen bonding and overstabilize global minima and shows problems related to inadequate description of dispersion interactions. Furthermore, while ANI-2x is a viable model for modeling in the gas phase, conventional force fields still play an important role, especially for condensed-phase simulations. Overall, this study highlights the strengths and weaknesses of each model, providing guidelines for the use and future development of force fields and machine learning potentials.
Affiliation(s)
- João Morado
  - School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
- Paul N Mortenson
  - Astex Pharmaceuticals, 436 Cambridge Science Park, Milton Road, Cambridge CB4 0QA, United Kingdom
- J Willem M Nissink
  - Computational Chemistry, Oncology R&D, AstraZeneca, Cambridge CB4 0WG, United Kingdom
- Jonathan W Essex
  - School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
- Chris-Kriton Skylaris
  - School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
|
15
|
Seo B, Savoie BM. Evidence That Less Can Be More for Transferable Force Fields. J Chem Inf Model 2023; 63:1188-1195. [PMID: 36744744 DOI: 10.1021/acs.jcim.2c01163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2023]
Abstract
Graph-based parameter assignment has been the basis for developing transferable force fields for molecular dynamics simulations for decades. Nevertheless, transferable force fields vary in how specifically terms are defined with respect to the molecular graph and the procedures for generating parametrization data. More-specific force-field terms increase the complexity of the force field, theoretically increasing accuracy but also increasing training data requirements. In contrast, less-specific force fields can be reused across larger regions of chemical space, theoretically reducing accuracy but also reducing the number of parameters and training data requirements. Here, the tradeoffs between force-field specificity and accuracy are quantified by parametrizing three new sets of force fields with varying levels of graph specificity, using a shared procedure for generating training data. These force fields are benchmarked for their ability to reproduce the structural features and liquid properties of 87 organic molecules at 146 distinct state points. The overall accuracy for properties that were directly trained on rapidly saturates as the graph specificity of the force-field increases. From this, we conclude there is at best a marginal benefit of using less transferable and more complex force fields with common sources of quantum-chemically derived training data. When looking at properties unseen during training, there is some evidence that the more-complex force fields even perform slightly worse. These results are rationalized by the fortuitous regularization of force fields based on less-specific and more-transferable atom types. 
Both the saturation in the accuracy of training properties and the marginally worse performance on off-target properties fundamentally contradict the expectation that bespoke force fields are generally more accurate, given their larger number of parameters, and suggest that increasing force-field complexity should be carefully justified against performance gains and balanced against available training data.
Affiliation(s)
- Bumjoon Seo
  - Davidson School of Chemical Engineering, Purdue University, West Lafayette, Indiana 47906, United States
- Brett M Savoie
  - Davidson School of Chemical Engineering, Purdue University, West Lafayette, Indiana 47906, United States
|
16
|
Kříž K, Schmidt L, Andersson AT, Walz MM, van der Spoel D. An Imbalance in the Force: The Need for Standardized Benchmarks for Molecular Simulation. J Chem Inf Model 2023; 63:412-431. [PMID: 36630710 PMCID: PMC9875315 DOI: 10.1021/acs.jcim.2c01127] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/06/2022] [Indexed: 01/12/2023]
Abstract
Force fields (FFs) for molecular simulation have been under development for more than half a century. As with any predictive model, rigorous testing and comparison of models depend critically on the availability of standardized data sets and benchmarks. While such benchmarks are rather common in the field of quantum chemistry, this is not the case for empirical FFs. That is, few benchmarks are reused to evaluate FFs, and development teams rather use their own training and test sets. Here we present an overview of currently available tests and benchmarks for computational chemistry, focusing on organic compounds, including halogens and common ions, as FFs for these are the most common ones. We argue that many of the benchmark data sets from quantum chemistry can in fact be reused for evaluating FFs, but new gas phase data is still needed for compounds containing phosphorus and sulfur in different valence states. In addition, more nonequilibrium interaction energies and forces, as well as molecular properties such as electrostatic potentials around compounds, would be beneficial. For the condensed phases there is a large body of experimental data available, and tools to utilize these data in an automated fashion are under development. If FF developers, as well as researchers in artificial intelligence, would adopt a number of these data sets, it would become easier to compare the relative strengths and weaknesses of different models and to, eventually, restore the balance in the force.
Affiliation(s)
- Kristian Kříž
  - Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
- Lisa Schmidt
  - Faculty of Biosciences, University of Heidelberg, Heidelberg 69117, Germany
- Alfred T. Andersson
  - Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
- Marie-Madeleine Walz
  - Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
- David van der Spoel
  - Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
|
17
|
D’Amore L, Hahn DF, Dotson DL, Horton JT, Anwar J, Craig I, Fox T, Gobbi A, Lakkaraju SK, Lucas X, Meier K, Mobley DL, Narayanan A, Schindler CE, Swope WC, in ’t Veld PJ, Wagner J, Xue B, Tresadern G. Collaborative Assessment of Molecular Geometries and Energies from the Open Force Field. J Chem Inf Model 2022; 62:6094-6104. [PMID: 36433835 PMCID: PMC9873353 DOI: 10.1021/acs.jcim.2c01185] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/27/2022]
Abstract
Force fields form the basis for classical molecular simulations, and their accuracy is crucial for the quality of, for instance, protein-ligand binding simulations in drug discovery. The huge diversity of small-molecule chemistry makes it a challenge to build and parameterize a suitable force field. The Open Force Field Initiative is a combined industry and academic consortium developing a state-of-the-art small-molecule force field. In this report, industry members of the consortium worked together to objectively evaluate the performance of the force fields (referred to here as OpenFF) produced by the initiative on a combined public and proprietary dataset of 19,653 relevant molecules selected from their internal research and compound collections. This evaluation was important because it was completely blind; at most partners, none of the molecules or data were used in force field development or testing prior to this work. We compare the Open Force Field "Sage" version 2.0.0 and "Parsley" version 1.3.0 with GAFF-2.11-AM1BCC, OPLS4, and SMIRNOFF99Frosst. We analyzed force-field-optimized geometries and conformer energies compared to reference quantum mechanical data. We show that OPLS4 performs best, and the latest Open Force Field release shows a clear improvement compared to its predecessors. The performance of established force fields such as GAFF-2.11 was generally worse. While OpenFF researchers were involved in building the benchmarking infrastructure used in this work, benchmarking was done entirely in-house within industrial organizations and the resulting assessment is reported here. This work assesses the force field performance using separate benchmarking steps, external datasets, and involving external research groups. This effort may also be unique in terms of the number of different industrial partners involved, with 10 different companies participating in the benchmark efforts.
Affiliation(s)
- Lorenzo D’Amore
  - Computational Chemistry, Janssen R&D, C/ Jarama 75A, 45007 Toledo, Spain
- David F. Hahn
  - Computational Chemistry, Janssen R&D, Turnhoutseweg 30, Beerse B-2340, Belgium
- David L. Dotson
  - The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, USA
- Joshua T. Horton
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Jamshed Anwar
  - Department of Chemistry, Lancaster University, Lancaster LA1 4YW, UK
- Ian Craig
  - Molecular Modeling & Drug Discovery, BASF SE, 67056 Ludwigshafen, Germany
- Thomas Fox
  - Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co KG, 88397 Biberach/Riss, Germany
- Alberto Gobbi
  - Genentech, Inc., 1 DNA Way, South San Francisco, California 94080, USA
- Xavier Lucas
  - Roche Pharma Research and Early Development, Therapeutic Modalities, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Grenzacherstrasse 124, 4070 Basel, Switzerland
- Katharina Meier
  - Computational Life Science Technology Functions, Crop Science, R&D, Bayer AG, 40789 Monheim, Germany
- David L. Mobley
  - Departments of Pharmaceutical Sciences and Chemistry, University of California, Irvine, California 92617, USA
- Arjun Narayanan
  - Data and Computational Sciences, Vertex Pharmaceuticals, 50 Northern Ave, Boston, MA 02210, USA
- William C. Swope
  - Genentech, Inc., 1 DNA Way, South San Francisco, California 94080, USA
- Jeffrey Wagner
  - The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, USA
  - Chemistry Department, The University of California at Irvine, Irvine, California 92617, USA
- Bai Xue
  - XtalPi Inc., Floor 3, International Biomedical Innovation Park II, No. 2 Hongliu Road, Fubao Community, Fubao Street, Futian District, Shenzhen, Guangdong 518040, China
- Gary Tresadern
  - Computational Chemistry, Janssen R&D, Turnhoutseweg 30, Beerse B-2340, Belgium
|
18
|
Horton J, Boothroyd S, Wagner J, Mitchell JA, Gokey T, Dotson DL, Behara PK, Ramaswamy VK, Mackey M, Chodera JD, Anwar J, Mobley DL, Cole DJ. Open Force Field BespokeFit: Automating Bespoke Torsion Parametrization at Scale. J Chem Inf Model 2022; 62:5622-5633. [PMID: 36351167 PMCID: PMC9709916 DOI: 10.1021/acs.jcim.2c01153] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 11/11/2022]
Abstract
The development of accurate transferable force fields is key to realizing the full potential of atomistic modeling in the study of biological processes such as protein-ligand binding for drug discovery. State-of-the-art transferable force fields, such as those produced by the Open Force Field Initiative, use modern software engineering and automation techniques to yield accuracy improvements. However, force field torsion parameters, which must account for many stereoelectronic and steric effects, are considered to be less transferable than other force field parameters and are therefore often targets for bespoke parametrization. Here, we present the Open Force Field QCSubmit and BespokeFit software packages that, when combined, facilitate the fitting of torsion parameters to quantum mechanical reference data at scale. We demonstrate the use of QCSubmit for simplifying the process of creating and archiving large numbers of quantum chemical calculations, by generating a dataset of 671 torsion scans for druglike fragments. We use BespokeFit to derive individual torsion parameters for each of these molecules, thereby reducing the root-mean-square error in the potential energy surface from 1.1 kcal/mol, using the original transferable force field, to 0.4 kcal/mol using the bespoke version. Furthermore, we employ the bespoke force fields to compute the relative binding free energies of a congeneric series of inhibitors of the TYK2 protein, and demonstrate further improvements in accuracy, compared to the base force field (MUE reduced from $0.56_{0.39}^{0.77}$ to $0.42_{0.28}^{0.59}$ kcal/mol and $R^2$ correlation improved from $0.72_{0.35}^{0.87}$ to $0.93_{0.84}^{0.97}$).
Affiliation(s)
- Joshua T. Horton
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom
- Simon Boothroyd
  - Boothroyd Scientific Consulting Ltd., 71-75 Shelton Street, London WC2H 9JQ, Greater London, United Kingdom
- Jeffrey Wagner
  - The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
- Joshua A. Mitchell
  - The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
- Trevor Gokey
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- David L. Dotson
  - The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
- Pavan Kumar Behara
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
- Mark Mackey
  - Cresset, New Cambridge House, Bassingbourn Road, Litlington SG8 0SS, Cambridgeshire, United Kingdom
- John D. Chodera
  - Computational & Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Jamshed Anwar
  - Department of Chemistry, Lancaster University, Lancaster LA1 4YW, United Kingdom
- David L. Mobley
  - Department of Chemistry, University of California, Irvine, California 92697, United States
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
- Daniel J. Cole
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom
|
19
|
In Silico Study: Combination of α-Mangostin and Chitosan Conjugated with Trastuzumab against Human Epidermal Growth Factor Receptor 2. Polymers (Basel) 2022; 14:polym14132747. [PMID: 35808792 PMCID: PMC9268814 DOI: 10.3390/polym14132747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2022] [Revised: 06/26/2022] [Accepted: 06/29/2022] [Indexed: 12/10/2022] Open
Abstract
Breast cancer is the cancer type with the highest prevalence worldwide. About 10–30% of breast cancer cases are diagnosed as positive for HER2 (human epidermal growth factor receptor 2). The currently available treatment methods still exhibit many shortcomings, such as a high incidence of side effects and treatment failure due to resistance. This in silico study aims to simulate a formulation of an α-mangostin and chitosan combination conjugated to trastuzumab against HER2, in an effort to improve breast cancer therapy. The molecular docking simulation was performed using the PatchDock server. The materials used included the two-dimensional structures of α-mangostin, chitosan, and sodium tripolyphosphate from the PubChem database; the trastuzumab FASTA sequence from the DrugBank database; and the HER2 structure obtained from a crystal complex with PDB ID: 1N8Z. The results indicated that the particles of the α-mangostin and chitosan combination interacted mostly with the crystallizable fragment (Fc region) of trastuzumab in the conjugation process. The conjugation of trastuzumab to the particle of a combination of α-mangostin and chitosan resulted in the greatest increase in the binding score for the smallest-sized particles (50 Å), with an increase in the score of 3828, and also gave the mode of interaction most similar to that of trastuzumab. However, the conjugation of trastuzumab eliminated the similarity of the mode of interaction and increased the value of atomic contact energy. Thus, a combination of α-mangostin and chitosan conjugated to trastuzumab is predicted to increase the effectiveness of breast cancer therapy at a relatively small particle size, but with the consequence of decreasing atomic contact energy.
|
20
|
Facile Synthesis of Functionalized Phenoxy Quinolines: Antibacterial Activities against ESBL Producing Escherichia coli and MRSA, Docking Studies, and Structural Features Determination through Computational Approach. Molecules (Basel, Switzerland) 2022; 27:molecules27123732. [PMID: 35744858 PMCID: PMC9230019 DOI: 10.3390/molecules27123732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2022] [Revised: 05/30/2022] [Accepted: 05/31/2022] [Indexed: 11/23/2022]
Abstract
The synthesis of new 6-bromoquinolin-4-ol derivatives (3a–3h) by Chan–Lam coupling using different types of solvents (protic, aprotic, and mixed) and bases was studied in the present manuscript. Furthermore, their potential against ESBL-producing Escherichia coli (ESBL E. coli) and methicillin-resistant Staphylococcus aureus (MRSA) was investigated. Commercially available 6-bromoquinolin-4-ol (3a) was reacted with different types of aryl boronic acids along with Cu(OAc)2 via Chan–Lam coupling methodology using protic, aprotic, and mixed solvents. The molecules (3a–3h) were obtained in very good yields with methanol, moderate yields with DMF, and low yields with ethanol, while the mixed solvent CH3OH/H2O (8:1) gave better results than the other solvents. The in vitro antiseptic values against ESBL E. coli and MRSA were calculated at five different concentrations (10, 20, 30, 40, 50 mg/well) by the agar well diffusion method. Molecule 3e showed the highest antibacterial activity, while compounds 3b and 3d showed low antibacterial activity. Additionally, MIC and MBC values were calculated against the established bacteria by the broth dilution method. Furthermore, a molecular docking investigation of the derivatives (3a–3h) was performed. Compound 3e was highly active and showed the lowest binding energy of −5.4. Moreover, to investigate the essential structural and physical properties, density functional theory (DFT) calculations on the synthesized molecules were performed at the PBE0-D3BJ/def2-TZVP/SMD(water) level of theory. The synthesized compounds showed an energy gap from 4.93 to 5.07 eV.
|
21
|
Yellapu NK, Ly T, Sardiu ME, Pei D, Welch DR, Thompson JA, Koestler DC. Synergistic anti-proliferative activity of JQ1 and GSK2801 in triple-negative breast cancer. BMC Cancer 2022; 22:627. [PMID: 35672711 PMCID: PMC9173973 DOI: 10.1186/s12885-022-09690-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 01/14/2022] [Accepted: 05/23/2022] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND Triple-negative breast cancer (TNBC) constitutes 10-20% of breast cancers and is challenging to treat due to a lack of effective targeted therapies. Previous studies in TNBC cell lines showed in vitro growth inhibition when JQ1 or GSK2801 were administered alone, and enhanced activity when co-administered. Given their respective mechanisms of actions, we hypothesized the combinatorial effect could be due to the target genes affected. Hence the target genes were characterized for their expression in the TNBC cell lines to prove the combinatorial effect of JQ1 and GSK2801. METHODS RNASeq data sets of TNBC cell lines (MDA-MB-231, HCC-1806 and SUM-159) were analyzed to identify the differentially expressed genes in single and combined treatments. The topmost downregulated genes were characterized for their downregulated expression in the TNBC cell lines treated with JQ1 and GSK2801 under different dose concentrations and combinations. The optimal lethal doses were determined by cytotoxicity assays. The inhibitory activity of the drugs was further characterized by molecular modelling studies. RESULTS Global expression profiling of TNBC cell lines using RNASeq revealed different expression patterns when JQ1 and GSK2801 were co-administered. Functional enrichment analyses identified several metabolic pathways (i.e., systemic lupus erythematosus, PI3K-Akt, TNF, JAK-STAT, IL-17, MAPK, Rap1 and signaling pathways) enriched with upregulated and downregulated genes when combined JQ1 and GSK2801 treatment was administered. RNASeq identified downregulation of PTPRC, MUC19, RNA5-8S5, KCNB1, RMRP, KISS1 and TAGLN (validated by RT-qPCR) and upregulation of GPR146, SCARA5, HIST2H4A, CDRT4, AQP3, MSH5-SAPCD1, SENP3-EIF4A1, CTAGE4 and RNASEK-C17orf49 when cells received both drugs. 
In addition to differential gene regulation, molecular modelling predicted binding of JQ1 and GSK2801 with PTPRC, MUC19, KCNB1, TAGLN and KISS1 proteins, adding another mechanism by which JQ1 and GSK2801 could elicit changes in metabolism and proliferation. CONCLUSION JQ1-GSK2801 synergistically inhibits proliferation and results in selective gene regulation. Besides suggesting that combinatorial use could be useful therapeutics for the treatment of TNBC, the findings provide a glimpse into potential mechanisms of action for this combination therapy approach.
Affiliation(s)
- Nanda Kumar Yellapu
  - Department of Biostatistics & Data Science, University of Kansas Medical Center, Kansas City, KS, USA
  - The University of Kansas Cancer Center, Kansas City, KS, USA
- Thuc Ly
  - The University of Kansas Cancer Center, Kansas City, KS, USA
  - Department of Cancer Biology, University of Kansas Medical Center, Kansas City, KS, USA
- Mihaela E Sardiu
  - Department of Biostatistics & Data Science, University of Kansas Medical Center, Kansas City, KS, USA
  - The University of Kansas Cancer Center, Kansas City, KS, USA
- Dong Pei
  - Department of Biostatistics & Data Science, University of Kansas Medical Center, Kansas City, KS, USA
  - The University of Kansas Cancer Center, Kansas City, KS, USA
- Danny R Welch
  - The University of Kansas Cancer Center, Kansas City, KS, USA
  - Department of Cancer Biology, University of Kansas Medical Center, Kansas City, KS, USA
  - Departments of Molecular & Integrative Physiology and Internal Medicine, University of Kansas Medical Center, Kansas City, KS, USA
- Jeffery A Thompson
  - Department of Biostatistics & Data Science, University of Kansas Medical Center, Kansas City, KS, USA
  - The University of Kansas Cancer Center, Kansas City, KS, USA
- Devin C Koestler
  - Department of Biostatistics & Data Science, University of Kansas Medical Center, Kansas City, KS, USA
  - The University of Kansas Cancer Center, Kansas City, KS, USA
|
22
|
Quinn TR, Patel HN, Koh KH, Haines BE, Norrby PO, Helquist P, Wiest O. Automated fitting of transition state force fields for biomolecular simulations. PLoS One 2022; 17:e0264960. [PMID: 35271647 PMCID: PMC8912266 DOI: 10.1371/journal.pone.0264960] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/23/2021] [Accepted: 02/22/2022] [Indexed: 12/29/2022] Open
Abstract
The generation of surrogate potential energy functions (PEF) that are orders of magnitude faster to compute but as accurate as the underlying training data from high-level electronic structure methods is one of the most promising applications of fitting procedures in chemistry. In previous work, we have shown that transition state force fields (TSFFs), fitted to the functional form of MM3* force fields using the quantum guided molecular mechanics (Q2MM) method, provide an accurate description of transition states that can be used for stereoselectivity predictions of small molecule reactions. Here, we demonstrate the applicability of the method to fitting TSFFs to the well-established Amber force field, which could be used for molecular dynamics studies of enzyme reactions. As a case study, the fitting of a TSFF to the second hydride transfer in Pseudomonas mevalonii 3-hydroxy-3-methylglutaryl coenzyme A reductase (PmHMGR) is used. The differences and similarities to the fitting of small molecule TSFFs are discussed.
Affiliation(s)
- Taylor R. Quinn
- Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana, United States of America
- Early TDE Discovery, Early Oncology, Oncology R&D, AstraZeneca, Boston, Massachusetts, United States of America
| | - Himani N. Patel
- Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana, United States of America
| | - Kevin H. Koh
- Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana, United States of America
| | - Brandon E. Haines
- Department of Chemistry, Westmont College, Santa Barbara, California, United States of America
| | - Per-Ola Norrby
- Data Science and Modelling, Pharmaceutical Sciences, R&D, AstraZeneca Gothenburg, Mölndal, Sweden
| | - Paul Helquist
- Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana, United States of America
| | - Olaf Wiest
- Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana, United States of America
- Lab of Computational Chemistry and Drug Design, School of Chemical Biology and Biotechnology, Peking University, Shenzhen Graduate School, Shenzhen, China
| |
|
23
|
Crawford JM, Gensch T, Sigman MS, Elward JM, Steves JE. Impact of Phosphine Featurization Methods in Process Development. Org Process Res Dev 2022. [DOI: 10.1021/acs.oprd.1c00357] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/20/2022]
Affiliation(s)
- Jennifer M. Crawford
- Department of Chemistry, University of Utah, 315 South 1400 East, Salt Lake City, Utah 84112, United States
| | - Tobias Gensch
- Department of Chemistry, University of Utah, 315 South 1400 East, Salt Lake City, Utah 84112, United States
| | - Matthew S. Sigman
- Department of Chemistry, University of Utah, 315 South 1400 East, Salt Lake City, Utah 84112, United States
| | - Jennifer M. Elward
- Molecular Design, GlaxoSmithKline, 1250 S. Collegeville Road, Collegeville, Pennsylvania 19426, United States
| | - Janelle E. Steves
- Chemical Development, GlaxoSmithKline, 1250 S. Collegeville Road, Collegeville, Pennsylvania 19426, United States
| |
|
24
|
Gervasoni S, Spencer J, Hinchliffe P, Pedretti A, Vairoletti F, Mahler G, Mulholland AJ. A multiscale approach to predict the binding mode of metallo beta-lactamase inhibitors. Proteins 2022; 90:372-384. [PMID: 34455628 PMCID: PMC8944931 DOI: 10.1002/prot.26227] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/23/2021] [Revised: 06/09/2021] [Accepted: 08/18/2021] [Indexed: 02/03/2023]
Abstract
Antibiotic resistance is a major threat to global public health. β-lactamases, which catalyze breakdown of β-lactam antibiotics, are a principal cause. Metallo β-lactamases (MBLs) represent a particular challenge because they hydrolyze almost all β-lactams and to date no MBL inhibitor has been approved for clinical use. Molecular simulations can aid drug discovery, for example, predicting inhibitor complexes, but empirical molecular mechanics (MM) methods often perform poorly for metalloproteins. Here we present a multiscale approach to model thiol inhibitor binding to IMP-1, a clinically important MBL containing two catalytic zinc ions, and predict the binding mode of a 2-mercaptomethyl thiazolidine (MMTZ) inhibitor. Inhibitors were first docked into the IMP-1 active site, testing different docking programs and scoring functions on multiple crystal structures. Complexes were then subjected to molecular dynamics (MD) simulations and subsequently refined through QM/MM optimization with a density functional theory (DFT) method, B3LYP/6-31G(d), increasing the accuracy of the method with successive steps. This workflow was tested on two IMP-1:MMTZ complexes, for which it reproduced crystallographically observed binding, and applied to predict the binding mode of a third MMTZ inhibitor for which a complex structure was crystallographically intractable. We also tested a 12-6-4 nonbonded interaction model in MD simulations and optimization with a SCC-DFTB QM/MM approach. The results show the limitations of empirical models for treating these systems and indicate the need for higher level calculations, for example, DFT/MM, for reliable structural predictions. This study demonstrates a reliable computational pipeline that can be applied to inhibitor design for MBLs and other zinc-metalloenzyme systems.
Affiliation(s)
- Silvia Gervasoni
- Department of Pharmaceutical Sciences, University of Milan, Milan, Italy
| | - James Spencer
- School of Cellular and Molecular Medicine, University of Bristol, Bristol, UK
| | - Philip Hinchliffe
- School of Cellular and Molecular Medicine, University of Bristol, Bristol, UK
| | | | - Franco Vairoletti
- Laboratorio de Química Farmacéutica, Departamento de Química Orgánica, Facultad de Química, Universidad de la República (UdelaR), Avda. General Flores 2124, Montevideo, Uruguay
| | - Graciela Mahler
- Laboratorio de Química Farmacéutica, Departamento de Química Orgánica, Facultad de Química, Universidad de la República (UdelaR), Avda. General Flores 2124, Montevideo, Uruguay
| | | |
|
25
|
Kovács DP, Oord CVD, Kucera J, Allen AEA, Cole DJ, Ortner C, Csányi G. Linear Atomic Cluster Expansion Force Fields for Organic Molecules: Beyond RMSE. J Chem Theory Comput 2021; 17:7696-7711. [PMID: 34735161 PMCID: PMC8675139 DOI: 10.1021/acs.jctc.1c00647] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 06/29/2021] [Indexed: 01/25/2023]
Abstract
We demonstrate that fast and accurate linear force fields can be built for molecules using the atomic cluster expansion (ACE) framework. The ACE models parametrize the potential energy surface in terms of body-ordered symmetric polynomials making the functional form reminiscent of traditional molecular mechanics force fields. We show that the four- or five-body ACE force fields improve on the accuracy of the empirical force fields by up to a factor of 10, reaching the accuracy typical of recently proposed machine-learning-based approaches. We not only show state of the art accuracy and speed on the widely used MD17 and ISO17 benchmark data sets, but we also go beyond RMSE by comparing a number of ML and empirical force fields to ACE on more important tasks such as normal-mode prediction, high-temperature molecular dynamics, dihedral torsional profile prediction, and even bond breaking. We also demonstrate the smoothness, transferability, and extrapolation capabilities of ACE on a new challenging benchmark data set comprised of a potential energy surface of a flexible druglike molecule.
Affiliation(s)
- Dávid Péter Kovács
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom
| | - Cas van der Oord
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom
| | - Jiri Kucera
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom
| | - Alice E. A. Allen
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Daniel J. Cole
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, NE1 7RU, United Kingdom
| | - Christoph Ortner
- Department of Mathematics, University of British Columbia, Vancouver, BC V6T 1Z2, Canada
| | - Gábor Csányi
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom
| |
|
26
|
Morado J, Mortenson PN, Nissink JWM, Verdonk ML, Ward RA, Essex JW, Skylaris CK. Generation of Quantum Configurational Ensembles Using Approximate Potentials. J Chem Theory Comput 2021; 17:7021-7042. [PMID: 34644088 DOI: 10.1021/acs.jctc.1c00532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Conformational analysis is of paramount importance in drug design: it is crucial to determine pharmacological properties, understand molecular recognition processes, and characterize the conformations of ligands when unbound. Molecular Mechanics (MM) simulation methods, such as Monte Carlo (MC) and molecular dynamics (MD), are usually employed to generate ensembles of structures due to their ability to extensively sample the conformational space of molecules. The accuracy of these MM-based schemes strongly depends on the functional form of the force field (FF) and its parametrization, components that often hinder their performance. High-level methods, such as ab initio MD, provide reliable structural information but are still too computationally expensive to allow for extensive sampling. Therefore, to overcome these limitations, we present a multilevel MC method that is capable of generating quantum configurational ensembles while keeping the computational cost at a minimum. We show that FF reparametrization is an efficient route to generate FFs that reproduce QM results more closely, which, in turn, can be used as low-cost models to achieve the gold standard QM accuracy. We demonstrate that the MC acceptance rate is strongly correlated with various phase space overlap measurements and that it constitutes a robust metric to evaluate the similarity between the MM and QM levels of theory. As a more advanced application, we present a self-parametrizing version of the algorithm, which combines sampling and FF parametrization in one scheme, and apply the methodology to generate the QM/MM distribution of a ligand in aqueous solution.
Affiliation(s)
- João Morado
- School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
| | - Paul N Mortenson
- Astex Pharmaceuticals, 436 Cambridge Science Park, Milton Road, Cambridge CB4 0QA, United Kingdom
| | - J Willem M Nissink
- Medicinal Chemistry, Oncology R&D, AstraZeneca, Cambridge CB4 0WG, United Kingdom
| | - Marcel L Verdonk
- Astex Pharmaceuticals, 436 Cambridge Science Park, Milton Road, Cambridge CB4 0QA, United Kingdom
| | - Richard A Ward
- Medicinal Chemistry, Oncology R&D, AstraZeneca, Cambridge CB4 0WG, United Kingdom
| | - Jonathan W Essex
- School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
| | - Chris-Kriton Skylaris
- School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
| |
|
27
|
Qiu Y, Smith DGA, Boothroyd S, Jang H, Hahn DF, Wagner J, Bannan CC, Gokey T, Lim VT, Stern CD, Rizzi A, Tjanaka B, Tresadern G, Lucas X, Shirts MR, Gilson MK, Chodera JD, Bayly CI, Mobley DL, Wang LP. Development and Benchmarking of Open Force Field v1.0.0-the Parsley Small-Molecule Force Field. J Chem Theory Comput 2021; 17:6262-6280. [PMID: 34551262 PMCID: PMC8511297 DOI: 10.1021/acs.jctc.1c00571] [Citation(s) in RCA: 85] [Impact Index Per Article: 21.3] [Indexed: 01/09/2023]
Abstract
We present a methodology for defining and optimizing a general force field for classical molecular simulations, and we describe its use to derive the Open Force Field 1.0.0 small-molecule force field, codenamed Parsley. Rather than using traditional atom typing, our approach is built on the SMIRKS-native Open Force Field (SMIRNOFF) parameter assignment formalism, which handles increases in the diversity and specificity of the force field definition without needlessly increasing the complexity of the specification. Parameters are optimized with the ForceBalance tool, based on reference quantum chemical data that include torsion potential energy profiles, optimized gas-phase structures, and vibrational frequencies. These quantum reference data are computed and are maintained with QCArchive, an open-source and freely available distributed computing and database software ecosystem. In this initial application of the method, we present essentially a full optimization of all valence parameters and report tests of the resulting force field against compounds and data types outside the training set. These tests show improvements in optimized geometries and conformational energetics and demonstrate that Parsley's accuracy for liquid properties is similar to that of other general force fields, as is accuracy on binding free energies. We find that this initial Parsley force field affords accuracy similar to that of other general force fields when used to calculate relative binding free energies spanning 199 protein-ligand systems. Additionally, the resulting infrastructure allows us to rapidly optimize an entirely new force field with minimal human intervention.
Affiliation(s)
- Yudong Qiu
- Chemistry Department, The University of California at Davis, Davis, California 95616, United States
| | - Daniel G A Smith
- The Molecular Sciences Software Institute (MolSSI), Blacksburg, Virginia 24060, United States
| | - Simon Boothroyd
- Computational & Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | - Hyesu Jang
- Chemistry Department, The University of California at Davis, Davis, California 95616, United States
| | - David F Hahn
- Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse B-2340, Belgium
| | - Jeffrey Wagner
- Chemistry Department, The University of California at Irvine, Irvine, California 92617, United States
| | - Caitlin C Bannan
- Chemistry Department, The University of California at Irvine, Irvine, California 92617, United States
- Skaggs School of Pharmacy and Pharmaceutical Sciences, The University of California at San Diego, La Jolla, California 92093, United States
| | - Trevor Gokey
- Chemistry Department, The University of California at Irvine, Irvine, California 92617, United States
| | - Victoria T Lim
- Chemistry Department, The University of California at Irvine, Irvine, California 92617, United States
| | - Chaya D Stern
- Computational & Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | - Andrea Rizzi
- Computational & Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Tri-Institutional Training Program in Computational Biology and Medicine, New York, New York 10065, United States
| | - Bryon Tjanaka
- Chemistry Department, The University of California at Irvine, Irvine, California 92617, United States
| | - Gary Tresadern
- Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse B-2340, Belgium
| | - Xavier Lucas
- F. Hoffmann-La Roche AG, Basel 4070, Switzerland
| | - Michael R Shirts
- Chemical & Biological Engineering Department, The University of Colorado at Boulder, Boulder, Colorado 80309, United States
| | - Michael K Gilson
- Skaggs School of Pharmacy and Pharmaceutical Sciences, The University of California at San Diego, La Jolla, California 92093, United States
| | - John D Chodera
- Computational & Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | | | - David L Mobley
- Chemistry Department, The University of California at Irvine, Irvine, California 92617, United States
| | - Lee-Ping Wang
- Chemistry Department, The University of California at Davis, Davis, California 95616, United States
| |
|
28
|
Vazquez-Salazar LI, Boittier ED, Unke OT, Meuwly M. Impact of the Characteristics of Quantum Chemical Databases on Machine Learning Prediction of Tautomerization Energies. J Chem Theory Comput 2021; 17:4769-4785. [PMID: 34288675 DOI: 10.1021/acs.jctc.1c00363] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/27/2022]
Abstract
An essential aspect for adequate predictions of chemical properties by machine learning models is the database used for training them. However, studies that analyze how the content and structure of the databases used for training impact the prediction quality are scarce. In this work, we analyze and quantify the relationships learned by a machine learning model (Neural Network) trained on five different reference databases (QM9, PC9, ANI-1E, ANI-1, and ANI-1x) to predict tautomerization energies from molecules in Tautobase. For this, the effects of characteristics such as the number of heavy atoms in a molecule, number of atoms of a given element, bond composition, or initial geometry on the quality of the predictions are considered. The results indicate that training on a chemically diverse database is crucial for obtaining good results and also that conformational sampling can partly compensate for limited coverage of chemical diversity. The overall best-performing reference database (ANI-1x) performs on average 1 kcal/mol better than PC9, which, however, contains about 2 orders of magnitude fewer reference structures. On the other hand, PC9 is chemically more diverse by a factor of ∼5 as quantified by the number of atom-in-molecule-based fragments (amons) it contains compared with the ANI family of databases. A quantitative measure for deficiencies is the Kullback-Leibler divergence between reference and target distributions. It is explicitly demonstrated that when certain types of bonds need to be covered in the target database (Tautobase) but are undersampled in the reference databases, the resulting predictions are poor. Examples of this include the poor performance of all databases analyzed to predict C(sp2)-C(sp2) double bonds close to heteroatoms and azoles containing N-N and N-O bonds. Analysis of the results with a Tree MAP algorithm provides deeper understanding of specific deficiencies in predicting tautomerization energies by the reference datasets due to inadequate coverage of chemical space. This information can be used to either improve existing databases or generate new databases of sufficient diversity for a range of machine learning (ML) applications in chemistry.
Affiliation(s)
| | - Eric D Boittier
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
| | - Oliver T Unke
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany; DFG Cluster of Excellence "Unifying Systems in Catalysis" (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
| | - Markus Meuwly
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland; Department of Chemistry, Brown University, Providence, Rhode Island 02912, United States
| |
|
29
|
Lu C, Wu C, Ghoreishi D, Chen W, Wang L, Damm W, Ross GA, Dahlgren MK, Russell E, Von Bargen CD, Abel R, Friesner RA, Harder ED. OPLS4: Improving Force Field Accuracy on Challenging Regimes of Chemical Space. J Chem Theory Comput 2021; 17:4291-4300. [PMID: 34096718 DOI: 10.1021/acs.jctc.1c00302] [Citation(s) in RCA: 735] [Impact Index Per Article: 183.8] [Indexed: 12/27/2022]
Affiliation(s)
- Chao Lu
- Schrodinger, Incorporated, 120 West 45th Street, New York, New York 10036, United States
| | - Chuanjie Wu
- Schrodinger, Incorporated, 120 West 45th Street, New York, New York 10036, United States
| | - Delaram Ghoreishi
- Schrodinger, Incorporated, 120 West 45th Street, New York, New York 10036, United States
| | - Wei Chen
- Schrodinger, Incorporated, 120 West 45th Street, New York, New York 10036, United States
| | - Lingle Wang
- Schrodinger, Incorporated, 120 West 45th Street, New York, New York 10036, United States
| | - Wolfgang Damm
- Schrodinger, Incorporated, 120 West 45th Street, New York, New York 10036, United States
| | - Gregory A. Ross
- Schrodinger, Incorporated, 120 West 45th Street, New York, New York 10036, United States
| | - Markus K. Dahlgren
- Schrodinger, Incorporated, 120 West 45th Street, New York, New York 10036, United States
| | - Ellery Russell
- Schrodinger, Incorporated, 120 West 45th Street, New York, New York 10036, United States
| | | | - Robert Abel
- Schrodinger, Incorporated, 120 West 45th Street, New York, New York 10036, United States
| | - Richard A. Friesner
- Department of Chemistry, Columbia University, 3000 Broadway, New York, New York 10027, United States
| | - Edward D. Harder
- Schrodinger, Incorporated, 120 West 45th Street, New York, New York 10036, United States
| |
|
30
|
Ehrman JN, Lim VT, Bannan CC, Thi N, Kyu DY, Mobley DL. Improving small molecule force fields by identifying and characterizing small molecules with inconsistent parameters. J Comput Aided Mol Des 2021; 35:271-284. [PMID: 33506360 PMCID: PMC8162916 DOI: 10.1007/s10822-020-00367-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/10/2020] [Accepted: 12/01/2020] [Indexed: 01/07/2023]
Abstract
Many molecular simulation methods use force fields to help model and simulate molecules and their behavior in various environments. Force fields are sets of functions and parameters used to calculate the potential energy of a chemical system as a function of the atomic coordinates. Despite the widespread use of force fields, their inadequacies are often thought to contribute to systematic errors in molecular simulations. Furthermore, different force fields tend to give varying results on the same systems with the same simulation settings. Here, we present a pipeline for comparing the geometries of small molecule conformers. We aimed to identify molecules or chemistries that are particularly informative for future force field development because they display inconsistencies between force fields. We applied our pipeline to a subset of the eMolecules database, and highlighted molecules that appear to be parameterized inconsistently across different force fields. We then identified over-represented functional groups in these molecule sets. The molecules and moieties identified by this pipeline may be particularly helpful for future force field parameterization.
Affiliation(s)
- Jordan N Ehrman
- Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, CA, 92697, USA
| | - Victoria T Lim
- Department of Chemistry, University of California, Irvine, Irvine, CA, 92697, USA
| | - Caitlin C Bannan
- Department of Chemistry, University of California, Irvine, Irvine, CA, 92697, USA
| | - Nam Thi
- Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, CA, 92697, USA
| | - Daisy Y Kyu
- Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, CA, 92697, USA
| | - David L Mobley
- Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, CA, 92697, USA
- Department of Chemistry, University of California, Irvine, Irvine, CA, 92697, USA
| |
|
31
|
Çınaroğlu SS, Biggin PC. Evaluating the Performance of Water Models with Host-Guest Force Fields in Binding Enthalpy Calculations for Cucurbit[7]uril-Guest Systems. J Phys Chem B 2021; 125:1558-1567. [PMID: 33538161 DOI: 10.1021/acs.jpcb.0c11383] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
Abstract
Computational prediction of thermodynamic components has become increasingly routine in computer-aided drug design. Although there has been significant recent effort and improvement in the calculation of free energy, the prediction of enthalpy (and entropy) remains underexplored. Furthermore, there has been relatively little work reported so far that attempts to comparatively assess how well different force fields and water models perform in conjunction with each other. Here, we report a comprehensive assessment of force fields and water models using host-guest systems that mimic many features of protein-ligand systems. These systems are computationally inexpensive, possibly because of their small size compared to protein-ligand systems. We present absolute enthalpy calculations using the multibox approach on a set of 25 cucurbit[7]uril-guest pairs. Eight water models were considered (TIP3P, TIP4P, TIP4P-Ew, SPC, SPC/E, OPC, TIP5P, Bind3P), along with five force fields commonly used in the literature (GAFFv1, GAFFv2, CGenFF, Parsley, and SwissParam). We observe that host-guest binding enthalpies are strongly sensitive to the selection of force field and water model. In terms of water models, we find that TIP3P and its derivative Bind3P are the best performing models for this particular host-guest system. The performance is generally better for aliphatic compounds than for aromatic ones, suggesting that aromaticity remains a difficult property to include accurately in these simple force fields.
Affiliation(s)
| | - Philip C Biggin
- Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU, U.K.
| |
|