1
Zheng T, Wang A, Han X, Xia Y, Xu X, Zhan J, Liu Y, Chen Y, Wang Z, Wu X, Gong S, Yan W. Data-driven parametrization of molecular mechanics force fields for expansive chemical space coverage. Chem Sci 2025; 16:2730-2740. [PMID: 39802691 PMCID: PMC11721737 DOI: 10.1039/d4sc06640e] [Received: 09/30/2024] [Accepted: 12/25/2024] [Indexed: 01/16/2025] Open
Abstract
A force field is a critical component in molecular dynamics simulations for computational drug discovery. It must achieve high accuracy within the constraints of molecular mechanics' (MM) limited functional forms, which offer high computational efficiency. With the rapid expansion of synthetically accessible chemical space, traditional look-up table approaches face significant challenges. In this study, we address this issue using a modern data-driven approach, developing ByteFF, an Amber-compatible force field for drug-like molecules. To create ByteFF, we generated an expansive and highly diverse molecular dataset at the B3LYP-D3(BJ)/DZVP level of theory. This dataset includes 2.4 million optimized molecular fragment geometries with analytical Hessian matrices, along with 3.2 million torsion profiles. We then trained an edge-augmented, symmetry-preserving molecular graph neural network (GNN) on this dataset, employing a carefully optimized training strategy. Our model predicts all bonded and non-bonded MM force field parameters for drug-like molecules simultaneously across a broad chemical space. ByteFF demonstrates state-of-the-art performance on various benchmark datasets, excelling in predicting relaxed geometries, torsional energy profiles, and conformational energies and forces. Its exceptional accuracy and expansive chemical space coverage make ByteFF a valuable tool for multiple stages of computational drug discovery.
Affiliation(s)
- Tianze Zheng
  - ByteDance Research, Beijing Beijing 100098 China
- Ailun Wang
  - ByteDance Research Bellevue Washington 98004 USA
- Xu Han
  - ByteDance Research, Beijing Beijing 100098 China
- Yu Xia
  - ByteDance Research, Beijing Beijing 100098 China
- Xingyuan Xu
  - ByteDance Research, Beijing Beijing 100098 China
- Jiawei Zhan
  - ByteDance Research Bellevue Washington 98004 USA
- Yu Liu
  - ByteDance Research Bellevue Washington 98004 USA
- Yang Chen
  - ByteDance Research, Beijing Beijing 100098 China
- Zhi Wang
  - ByteDance Research Bellevue Washington 98004 USA
- Xiaojie Wu
  - ByteDance Research Bellevue Washington 98004 USA
- Sheng Gong
  - ByteDance Research Bellevue Washington 98004 USA
- Wen Yan
  - ByteDance Research Bellevue Washington 98004 USA
2
Chen J, Gao Q, Huang M, Yu K. Application of modern artificial intelligence techniques in the development of organic molecular force fields. Phys Chem Chem Phys 2025; 27:2294-2319. [PMID: 39820957 DOI: 10.1039/d4cp02989e] [Indexed: 01/19/2025]
Abstract
The molecular force field (FF) determines the accuracy of molecular dynamics (MD) and is one of the major bottlenecks that limits the application of MD in molecular design. Recently, artificial intelligence (AI) techniques, such as machine-learning potentials (MLPs), have been rapidly reshaping the landscape of MD. Meanwhile, organic molecular systems feature unique characteristics and require more careful treatment in model construction, optimization, and validation. While an accurate and generic organic molecular force field is still missing, significant progress has been made with the facilitation of AI, warranting a promising future. In this review, we provide an overview of the various types of AI techniques used in molecular FF development and discuss both the advantages and weaknesses of these methodologies. We show how AI methods provide unprecedented capabilities in many tasks such as potential fitting, atom typification, and automatic optimization. Meanwhile, it is also worth noting that more efforts are needed to improve the transferability of the model, develop a more comprehensive database, and establish more standardized validation procedures. With these discussions, we hope to inspire more efforts to solve the existing problems, eventually leading to the birth of next-generation generic organic FFs.
Affiliation(s)
- Junmin Chen
  - Institute of Materials Research (IMR), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
  - Tsinghua-Berkeley Shenzhen Institute (TBSI), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Qian Gao
  - Institute of Materials Research (IMR), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Miaofei Huang
  - Institute of Materials Research (IMR), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Kuang Yu
  - Institute of Materials Research (IMR), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
  - Tsinghua-Berkeley Shenzhen Institute (TBSI), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
3
Karwounopoulos J, Bieniek M, Wu Z, Baskerville AL, König G, Cossins BP, Wood GPF. Evaluation of Machine Learning/Molecular Mechanics End-State Corrections with Mechanical Embedding to Calculate Relative Protein-Ligand Binding Free Energies. J Chem Theory Comput 2025; 21:967-977. [PMID: 39753520 DOI: 10.1021/acs.jctc.4c01427] [Indexed: 01/29/2025]
Abstract
The development of machine-learning (ML) potentials offers significant accuracy improvements compared to molecular mechanics (MM) because of the inclusion of quantum-mechanical effects in molecular interactions. However, ML simulations are several times more computationally demanding than MM simulations, so there is a trade-off between speed and accuracy. One possible compromise is to use hybrid machine learning/molecular mechanics (ML/MM) approaches with mechanical embedding that treat the intramolecular interactions of the ligand at the ML level and the protein-ligand interactions at the MM level. Recent studies have reported improved protein-ligand binding free energy results based on ML/MM using ANI-2x with mechanical embedding, arguing that intramolecular interactions like torsion potentials of the ligand are often the limiting factor for accuracy. This claim is evaluated based on 108 relative binding free energy calculations for four different benchmark systems. As an alternative strategy, we also tested a tool that fits the MM dihedral potentials to the ML level of theory. Fitting was performed with the ML potentials ANI-2x and AIMNet2, and, for the benchmark system TYK2, also with quantum-mechanical calculations using ωB97M-D3(BJ)/def2-TZVPPD. Overall, the relative binding free energy results from MM with Open Force Field 2.2.0, MM with ML-fitted torsion potentials, and the corresponding ML/MM end-state corrected simulations show no statistically significant differences in the mean absolute errors (between 0.8 and 0.9 kcal mol⁻¹). This can probably be explained by the usage of the same MM parameters to calculate the protein-ligand interactions. Therefore, a well-parametrized force field is on a par with simple mechanical embedding ML/MM simulations for protein-ligand binding.
In terms of computational costs, the reparametrization of poor torsional potentials is preferable over employing computationally intensive ML/MM simulations of protein-ligand complexes with mechanical embedding. Also, the refitting strategy leads to lower variances of the protein-ligand binding free energy results than the ML/MM end-state corrections. For free energy corrections with ML/MM, the results indicate that better convergence and more advanced ML/MM schemes will be required for applications in computer-guided drug discovery.
Affiliation(s)
- Mateusz Bieniek
  - Exscientia, Schrödinger Building, Oxford Science Park, Oxford OX4 4GE, U.K.
- Zhiyi Wu
  - Exscientia, Schrödinger Building, Oxford Science Park, Oxford OX4 4GE, U.K.
- Adam L Baskerville
  - Exscientia, Schrödinger Building, Oxford Science Park, Oxford OX4 4GE, U.K.
- Gerhard König
  - Exscientia, Schrödinger Building, Oxford Science Park, Oxford OX4 4GE, U.K.
- Benjamin P Cossins
  - Exscientia, Schrödinger Building, Oxford Science Park, Oxford OX4 4GE, U.K.
- Geoffrey P F Wood
  - Exscientia, Schrödinger Building, Oxford Science Park, Oxford OX4 4GE, U.K.
4
Wang X, Xiong D, Zhang Y, Zhai J, Gu YC, He X. The evolution of the Amber additive protein force field: History, current status, and future. J Chem Phys 2025; 162:030901. [PMID: 39817575 DOI: 10.1063/5.0227517] [Received: 07/09/2024] [Accepted: 12/30/2024] [Indexed: 01/18/2025] Open
Abstract
Molecular dynamics simulations are pivotal in elucidating the intricate properties of biological molecules. Nonetheless, the reliability of their outcomes hinges on the precision of the molecular force field utilized. In this perspective, we present a comprehensive review of the developmental trajectory of the Amber additive protein force field, delving into researchers' persistent quest for higher precision force fields and the prevailing challenges. We detail the parameterization process of the Amber protein force fields, emphasizing the specific improvements and retained features in each version compared to their predecessors. Furthermore, we discuss the challenges that current force fields encounter in balancing the interactions of protein-protein, protein-water, and water-water in molecular dynamics simulations, as well as potential solutions to overcome these issues.
Affiliation(s)
- Xianwei Wang
  - School of Physics, Zhejiang University of Technology, Hangzhou, Zhejiang 310023, China
- Danyang Xiong
  - Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
- Yueqing Zhang
  - Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
- Jihang Zhai
  - Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
- Yu-Cheng Gu
  - Syngenta Jealott's Hill International Research Centre Bracknell, Berkshire RG42 6EY, United Kingdom
- Xiao He
  - Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
  - Chongqing Key Laboratory of Precision Optics, Chongqing Institute of East China Normal University, Chongqing 401120, China
  - New York University-East China Normal University Center for Computational Chemistry, New York University Shanghai, Shanghai 200062, China
5
Rasouli A, Pickard FC, Sur S, Grossfield A, Işık Bennett M. Essential Considerations for Free Energy Calculations of RNA-Small Molecule Complexes: Lessons from the Theophylline-Binding RNA Aptamer. J Chem Inf Model 2025; 65:223-239. [PMID: 39699235 PMCID: PMC11734693 DOI: 10.1021/acs.jcim.4c01505] [Received: 08/21/2024] [Revised: 11/22/2024] [Accepted: 11/26/2024] [Indexed: 12/20/2024]
Abstract
Alchemical free energy calculations are widely used to predict the binding affinity of small molecule ligands to protein targets; however, the application of these methods to RNA targets has not been deeply explored. We systematically investigated how modeling decisions affect the performance of absolute binding free energy calculations for a relatively simple RNA model system: theophylline-binding RNA aptamer with theophylline and five analogs. The goal of this investigation was 2-fold: (1) understanding the performance levels we can expect from absolute free energy calculations for a simple RNA complex and (2) learning about practical modeling considerations that impact the success of RNA-binding predictions, which may be different from the best practices established for protein targets. We learned that magnesium ion (Mg²⁺) placement is a critical decision that impacts affinity predictions. When information regarding Mg²⁺ positions is lacking, implementing RNA backbone restraints is an alternative way of stabilizing the RNA structure that recapitulates prediction accuracy. Since mistakes in Mg²⁺ placement can be detrimental, omitting magnesium ions entirely and using RNA backbone restraints are attractive as a risk-mitigating approach. We found that predictions are sensitive to modeling experimental buffer conditions correctly, including salt type and ionic strength. We explored the effects of sampling in the alchemical protocol, choice of the ligand force field (GAFF2/OpenFF Sage), and water model (TIP3P/OPC) on predictions, which allowed us to give practical advice for the application of free energy methods to RNA targets. By capturing experimental buffer conditions and implementing RNA backbone restraints, we were able to compute binding affinities accurately (mean absolute error (MAE) = 2.2 kcal/mol, Pearson's correlation coefficient = 0.9, Kendall's τ = 0.7).
We believe there is much to learn about how to apply free energy calculations for RNA targets and how to enhance their performance in prospective predictions. This study is an important first step for learning best practices and special considerations for RNA-ligand free energy calculations. Future studies will consider increasingly complicated ligands and diverse RNA systems and help the development of general protocols for therapeutically relevant RNA targets.
Affiliation(s)
- Ali Rasouli
  - Moderna, Inc., 325 Binney Street, Cambridge, Massachusetts 02142, United States
  - Theoretical and Computational Biophysics Group, NIH Center for Macromolecular Modeling and Bioinformatics, Beckman Institute for Advanced Science and Technology, Department of Biochemistry, University of Illinois, Urbana, Illinois 61801, United States
  - Center for Biophysics and Quantitative Biology, University of Illinois, Urbana, Illinois 61801, United States
- Frank C. Pickard
  - Moderna, Inc., 325 Binney Street, Cambridge, Massachusetts 02142, United States
- Sreyoshi Sur
  - Moderna, Inc., 325 Binney Street, Cambridge, Massachusetts 02142, United States
- Alan Grossfield
  - University of Rochester Medical Center, Rochester, New York 14620, United States
- Mehtap Işık Bennett
  - Moderna, Inc., 325 Binney Street, Cambridge, Massachusetts 02142, United States
6
Fusti-Molnar L. Integrating Quantum Mechanics into Protein-Ligand Docking: Toward Higher Accuracy and Reliability. Research Square 2024:rs.3.rs-5433993. [PMID: 39678339 PMCID: PMC11643324 DOI: 10.21203/rs.3.rs-5433993/v1] [Indexed: 12/17/2024]
Abstract
I introduce two new methods, QFVina and QFVinardo, for protein-ligand docking that leverage precomputed high-quality conformational libraries with QM-optimized geometries and ab initio DFT-D4-based conformational rankings and strain energies. These methods provide greater accuracy in docking-based virtual screening by addressing the inaccuracies in intramolecular relative energies of conformations, a critical component often misrepresented in flexible ligand docking calculations. I demonstrate that numerous force field-based methods widely used today exhibit substantial errors in conformational relative energies, and that it is unrealistic to expect better accuracy from the faster scoring functions typically employed in docking. Consistent with these findings, I show that traditional flexible ligand docking often produces geometries with significant strain energies and large deviations, with magnitudes comparable to the protein-ligand binding energies themselves and much larger than the differences we aim to estimate in docking hitlists. By using physically realistic ligand conformations with accurate strain energies in the scoring function, QFVina and QFVinardo produce markedly different docking results, even with the same docking parameters and scoring functions for protein-ligand interaction energies. I analyzed these differences in docking hitlists and selected protein-ligand interactions using three protein targets from COVID-19 research.
7
Ojha AA, Votapka LW, Amaro RE. Advances and Challenges in Milestoning Simulations for Drug-Target Kinetics. J Chem Theory Comput 2024; 20:9759-9769. [PMID: 39508322 PMCID: PMC11603602 DOI: 10.1021/acs.jctc.4c01108] [Received: 09/06/2024] [Revised: 10/30/2024] [Accepted: 10/31/2024] [Indexed: 11/15/2024]
Abstract
Molecular dynamics simulations have become indispensable for exploring complex biological processes, yet their limitations in capturing rare events hinder our understanding of drug-target kinetics. In this Perspective, we investigate the domain of milestoning simulations to understand this challenge. The milestoning approach divides the phase space of the drug-target complex into discrete cells, offering extended time scale insights. This Perspective traces the history, applications, and future potential of milestoning simulations in the context of drug-target kinetics. It explores the fundamental principles of milestoning, highlighting the importance of probabilistic transitions and transition time independence. Markovian milestoning with Voronoi tessellations is revisited to address the traditional milestoning challenges. While observing the advancements in this field, this Perspective also addresses impending challenges in estimating drug-target unbinding rate constants through milestoning simulations, paving the way for more effective drug design strategies.
Affiliation(s)
- Anupam Anand Ojha
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
  - Center for Computational Biology and Center for Computational Mathematics, Flatiron Institute, New York, New York 10010, United States
- Lane W. Votapka
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
- Rommie E. Amaro
  - Department of Molecular Biology, University of California San Diego, La Jolla, California 92093, United States
8
Tu NTP, Williamson S, Johnson ER, Rowley CN. Modeling Intermolecular Interactions with Exchange-Hole Dipole Moment Dispersion Corrections to Neural Network Potentials. J Phys Chem B 2024; 128:8290-8302. [PMID: 39166778 DOI: 10.1021/acs.jpcb.4c02882] [Indexed: 08/23/2024]
Abstract
Neural network potentials (NNPs) are an innovative approach for calculating the potential energy and forces of a chemical system. In principle, these methods are capable of modeling large systems with an accuracy approaching that of a high-level ab initio calculation, but with a much smaller computational cost. Due to their training to density-functional theory (DFT) data and neglect of long-range interactions, some classes of NNPs require an additional term to include London dispersion physics. In this Perspective, we discuss the requirements for a dispersion model for use with an NNP, focusing on the MLXDM (Machine Learned eXchange-Hole Dipole Moment) model developed by our groups. This model is based on the DFT-based XDM dispersion correction, which calculates interatomic dispersion coefficients in terms of atomic moments and polarizabilities, both of which can be approximated effectively using neural networks.
Affiliation(s)
- Siri Williamson
  - Department of Chemistry, Carleton University, Ottawa, Ontario K1S 5B6, Canada
- Erin R Johnson
  - Department of Chemistry, Dalhousie University, Halifax, Nova Scotia B3H 4J3, Canada
9
Behara PK, Jang H, Horton JT, Gokey T, Dotson DL, Boothroyd S, Bayly CI, Cole DJ, Wang LP, Mobley DL. Benchmarking Quantum Mechanical Levels of Theory for Valence Parametrization in Force Fields. J Phys Chem B 2024; 128:7888-7902. [PMID: 39087913 PMCID: PMC11331531 DOI: 10.1021/acs.jpcb.4c03167] [Received: 05/13/2024] [Revised: 07/09/2024] [Accepted: 07/15/2024] [Indexed: 08/02/2024]
Abstract
A wide range of density functional methods and basis sets are available to derive the electronic structure and properties of molecules. Quantum mechanical calculations are too computationally intensive for routine simulation of molecules in the condensed phase, prompting the development of computationally efficient force fields based on quantum mechanical data. Parametrizing general force fields, which cover a vast chemical space, necessitates the generation of sizable quantum mechanical data sets with optimized geometries and torsion scans. To achieve this efficiently, choosing a quantum mechanical method that balances computational cost and accuracy is crucial. In this study, we seek to assess the accuracy of quantum mechanical theory for specific properties such as conformer energies and torsion energetics. To comprehensively evaluate various methods, we focus on a representative set of 59 diverse small molecules, comparing approximately 25 combinations of functional and basis sets against the reference level coupled cluster calculations at the complete basis set limit.
Affiliation(s)
- Pavan Kumar Behara
  - Center for Neurotherapeutics, University of California, Irvine, California 92697, United States
- Hyesu Jang
  - Chemistry Department, University of California at Davis, Davis, California 95616, United States
  - OpenEye Scientific Software, Santa Fe, New Mexico 87508, United States
- Joshua T. Horton
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, U.K.
- Trevor Gokey
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- David L. Dotson
  - The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
  - Datryllic LLC, Phoenix, Arizona 85003, United States
- Daniel J. Cole
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, U.K.
- Lee-Ping Wang
  - Chemistry Department, University of California at Davis, Davis, California 95616, United States
- David L. Mobley
  - Center for Neurotherapeutics, University of California, Irvine, California 92697, United States
  - Department of Chemistry, University of California, Irvine, California 92697, United States
10
Takaba K, Friedman AJ, Cavender CE, Behara PK, Pulido I, Henry MM, MacDermott-Opeskin H, Iacovella CR, Nagle AM, Payne AM, Shirts MR, Mobley DL, Chodera JD, Wang Y. Machine-learned molecular mechanics force fields from large-scale quantum chemical data. Chem Sci 2024; 15:12861-12878. [PMID: 39148808 PMCID: PMC11322960 DOI: 10.1039/d4sc00690a] [Received: 01/29/2024] [Accepted: 06/17/2024] [Indexed: 08/17/2024] Open
Abstract
The development of reliable and extensible molecular mechanics (MM) force fields (fast, empirical models characterizing the potential energy surface of molecular systems) is indispensable for biomolecular simulation and computer-aided drug design. Here, we introduce a generalized and extensible machine-learned MM force field, espaloma-0.3, and an end-to-end differentiable framework using graph neural networks to overcome the limitations of traditional rule-based methods. Trained in a single GPU-day to fit a large and diverse quantum chemical dataset of over 1.1 M energy and force calculations, espaloma-0.3 reproduces quantum chemical energetic properties of chemical domains highly relevant to drug discovery, including small molecules, peptides, and nucleic acids. Moreover, this force field maintains the quantum chemical energy-minimized geometries of small molecules and preserves the condensed phase properties of peptides and folded proteins, self-consistently parametrizing proteins and ligands to produce stable simulations leading to highly accurate predictions of binding free energies. This methodology demonstrates significant promise as a path forward for systematically building more accurate force fields that are easily extensible to new chemical domains of interest.
Affiliation(s)
- Kenichiro Takaba
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
  - Pharmaceuticals Research Center, Advanced Drug Discovery, Asahi Kasei Pharma Corporation Shizuoka 410-2321 Japan
- Anika J Friedman
  - Department of Chemical and Biological Engineering, University of Colorado Boulder Boulder CO 80309 USA
- Chapin E Cavender
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego 9500 Gilman Drive La Jolla CA 92093 USA
- Pavan Kumar Behara
  - Center for Neurotherapeutics, Department of Pathology and Laboratory Medicine, University of California Irvine CA 92697 USA
- Iván Pulido
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
- Michael M Henry
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
- Christopher R Iacovella
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
- Arnav M Nagle
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
  - Department of Bioengineering, University of California, Berkeley Berkeley CA 94720 USA
- Alexander Matthew Payne
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
  - Tri-Institutional PhD Program in Chemical Biology, Memorial Sloan Kettering Cancer Center New York 10065 USA
- Michael R Shirts
  - Department of Chemical and Biological Engineering, University of Colorado Boulder Boulder CO 80309 USA
- David L Mobley
  - Department of Pharmaceutical Sciences, University of California Irvine California 92697 USA
- John D Chodera
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
- Yuanqing Wang
  - Simons Center for Computational Physical Chemistry and Center for Data Science, New York University New York NY 10004 USA
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
11
Roy A, Ali T, Venkatraman V. The Area Law of Molecular Entropy: Moving beyond Harmonic Approximation. Entropy (Basel, Switzerland) 2024; 26:688. [PMID: 39202158 PMCID: PMC11353761 DOI: 10.3390/e26080688] [Received: 07/10/2024] [Revised: 08/03/2024] [Accepted: 08/13/2024] [Indexed: 09/03/2024]
Abstract
This article shows that the gas-phase entropy of molecules is proportional to the area of the molecules, with corrections for the different curvatures of the molecular surface. The ability to estimate gas-phase entropy by the area law also allows us to calculate molecular entropy faster and more accurately than currently popular methods of estimating molecular entropy with harmonic oscillator approximation. The speed and accuracy of our method will open up new possibilities for the explicit inclusion of entropy in various computational biology methods.
Affiliation(s)
- Amitava Roy
  - Department of Biomedical and Pharmaceutical Sciences, University of Montana, Missoula, MT 59812, USA
- Tibra Ali
  - Department of Mathematics and Natural Sciences, School of Data and Science, BRAC University, Dhaka 1212, Bangladesh
- Vishwesh Venkatraman
  - Department of Chemistry, Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway
12
Ding T, Guseinov AA, Milligan G, Plouffe B, Tikhonova IG. Exploring an Intracellular Allosteric Site of CC-Chemokine Receptor 4 from 3D Models, Probe Simulations, and Mutagenesis. ACS Pharmacol Transl Sci 2024; 7:2516-2526. [PMID: 39144548 PMCID: PMC11320731 DOI: 10.1021/acsptsci.4c00330] [Received: 06/01/2024] [Revised: 07/04/2024] [Accepted: 07/08/2024] [Indexed: 08/16/2024]
Abstract
We applied our previously developed probe confined dynamic mapping protocol, which combines enhanced sampling molecular dynamics (MD) simulations and fragment-based approaches, to identify the binding site of GSK2239633A (N-[[3-[[3-[(5-chlorothiophen-2-yl)sulfonylamino]-4-methoxyindazol-1-yl]methyl]phenyl]methyl]-2-hydroxy-2-methylpropanamide), a selective CC-chemokine receptor type 4 (CCR4) negative allosteric modulator, using CCR4 homology and AlphaFold models. By comparing the performance across five computational models, we identified conserved (K310^8.49 and Y304^7.53) and non-conserved (M243^6.36) residue hotspots for GSK2239633A binding, which were validated by mutagenesis and bioluminescence resonance energy transfer assay. Further analysis of 3D models and MD simulations highlighted the pair of residues 6.36 and 7.56 that might account for antagonist selectivity among chemokine receptors. Our in silico protocol provides a promising approach for characterizing ligand binding sites in membrane proteins, considering receptor dynamics and adaptability and guiding protein template selection for ligand design.
Affiliation(s)
- Tianyi Ding
- School of Pharmacy, Queen’s University Belfast, Belfast BT9 7BL, Northern Ireland, U.K.
| | - Abdul-Akim Guseinov
- School of Pharmacy, Queen’s University Belfast, Belfast BT9 7BL, Northern Ireland, U.K.
| | - Graeme Milligan
- Centre for Translational Pharmacology, School of Molecular Biosciences, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, Scotland G12 8QQ, U.K.
| | - Bianca Plouffe
- Wellcome-Wolfson Institute for Experimental Medicine, School of Medicine, Dentistry and Biomedical Sciences, Queen’s University Belfast, Belfast BT9 7BL, Northern Ireland, U.K.
| | - Irina G. Tikhonova
- School of Pharmacy, Queen’s University Belfast, Belfast BT9 7BL, Northern Ireland, U.K.
|
13
|
Wang L, Behara PK, Thompson MW, Gokey T, Wang Y, Wagner JR, Cole DJ, Gilson MK, Shirts MR, Mobley DL. The Open Force Field Initiative: Open Software and Open Science for Molecular Modeling. J Phys Chem B 2024; 128:7043-7067. [PMID: 38989715 DOI: 10.1021/acs.jpcb.4c01558] [Indexed: 07/12/2024]
Abstract
Force fields are a key component of physics-based molecular modeling, describing the energies and forces in a molecular system as a function of the positions of the atoms and molecules involved. Here, we provide a review and scientific status report on the work of the Open Force Field (OpenFF) Initiative, which focuses on the science, infrastructure and data required to build the next generation of biomolecular force fields. We introduce the OpenFF Initiative and the related OpenFF Consortium, describe its approach to force field development and software, and discuss accomplishments to date as well as future plans. OpenFF releases both software and data under open and permissive licensing agreements to enable rapid application, validation, extension, and modification of its force fields and software tools. We discuss lessons learned to date in this new approach to force field development. We also highlight ways that other force field researchers can get involved, as well as some recent successes of outside researchers taking advantage of OpenFF tools and data.
Affiliation(s)
- Lily Wang
- Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
| | - Pavan Kumar Behara
- Center for Neurotherapeutics, University of California, Irvine, California 92697, United States
| | - Matthew W Thompson
- Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
| | - Trevor Gokey
- Department of Chemistry, University of California, Irvine, California 92697, United States
| | - Yuanqing Wang
- Simons Center for Computational Physical Chemistry and Center for Data Science, New York, New York 10004, United States
| | - Jeffrey R Wagner
- Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
| | - Daniel J Cole
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom
| | - Michael K Gilson
- Skaggs School of Pharmacy and Pharmaceutical Sciences, The University of California at San Diego, La Jolla, California 92093, United States
| | - Michael R Shirts
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80305, United States
| | - David L Mobley
- Department of Chemistry, University of California, Irvine, California 92697, United States
- Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
|
14
|
Sun Z, Procacci P. Methodological and force field effects in the molecular dynamics-based prediction of binding free energies of host-guest systems. Phys Chem Chem Phys 2024; 26:19887-19899. [PMID: 38990073 DOI: 10.1039/d4cp01804d] [Indexed: 07/12/2024]
Abstract
As a contribution to the understanding and rationalization of methodological and modeling effects in recent host-guest SAMPL challenges, using an alchemical molecular dynamics technique we have examined the impact of force field parameterization and ionic strength in connection with guest charge neutralization on computed dissociation free energies in two typical SAMPL heavily charged macrocyclic hosts encapsulating small protonated amines with disparate binding affinities. We have shown that the methodological treatment for host neutralization, with explicit ions or with the background neutralizing plasma in the context of alchemical calculations under periodic boundary conditions, has a moderate effect on the calculated affinities. On the other hand, we have shown that seemingly small differences in the force field parameterization in highly symmetric hosts can produce systematic effects on the structural features that can have a significant impact on the predicted binding affinities.
Affiliation(s)
- Zhaoxi Sun
- Changping Laboratory, Beijing 102206, China
| | - Piero Procacci
- Dipartimento di Chimica "Ugo Schiff", Università degli Studi di Firenze, Via della Lastruccia 3, 50019 Sesto Fiorentino, Italy.
|
15
|
Hahn DF, Gapsys V, de Groot BL, Mobley DL, Tresadern G. Current State of Open Source Force Fields in Protein-Ligand Binding Affinity Predictions. J Chem Inf Model 2024; 64:5063-5076. [PMID: 38895959 PMCID: PMC11234369 DOI: 10.1021/acs.jcim.4c00417] [Received: 03/10/2024] [Revised: 04/23/2024] [Accepted: 04/25/2024] [Indexed: 06/21/2024]
Abstract
In drug discovery, the in silico prediction of binding affinity is one of the major means to prioritize compounds for synthesis. Alchemical relative binding free energy (RBFE) calculations based on molecular dynamics (MD) simulations are nowadays a popular approach for the accurate affinity ranking of compounds. MD simulations rely on empirical force field parameters, which strongly influence the accuracy of the predicted affinities. Here, we evaluate the ability of six different small-molecule force fields to predict experimental protein-ligand binding affinities in RBFE calculations on a set of 598 ligands and 22 protein targets. The public force fields OpenFF Parsley and Sage, GAFF, and CGenFF show comparable accuracy, while OPLS3e is significantly more accurate. However, a consensus approach using Sage, GAFF, and CGenFF leads to accuracy comparable to OPLS3e. While Parsley and Sage are performing comparably based on aggregated statistics across the whole dataset, there are differences in terms of outliers. Analysis of the force field reveals that improved parameters lead to significant improvement in the accuracy of affinity predictions on subsets of the dataset involving those parameters. Lower accuracy can not only be attributed to the force field parameters but is also dependent on input preparation and sampling convergence of the calculations. Especially large perturbations and nonconverged simulations lead to less accurate predictions. The input structures, Gromacs force field files, as well as the analysis Python notebooks are available on GitHub.
Affiliation(s)
- David F. Hahn
- Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse 2340, Belgium
| | - Vytautas Gapsys
- Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse 2340, Belgium
- Computational Biomolecular Dynamics Group, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, Göttingen 37077, Germany
| | - Bert L. de Groot
- Computational Biomolecular Dynamics Group, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, Göttingen 37077, Germany
| | - David L. Mobley
- Department of Chemistry, University of California, Irvine, California 92697, United States
- Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
| | - Gary Tresadern
- Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse 2340, Belgium
|
16
|
Wehrhan L, Keller BG. Fluorinated Protein-Ligand Complexes: A Computational Perspective. J Phys Chem B 2024; 128:5925-5934. [PMID: 38886167 PMCID: PMC11215785 DOI: 10.1021/acs.jpcb.4c01493] [Received: 03/06/2024] [Revised: 05/28/2024] [Accepted: 05/30/2024] [Indexed: 06/20/2024]
Abstract
Fluorine is an element renowned for its unique properties. Its powerful capability to modulate molecular properties makes it an attractive substituent for protein binding ligands; however, the rational design of fluorination can be challenging with effects on interactions and binding energies being difficult to predict. In this Perspective, we highlight how computational methods help us to understand the role of fluorine in protein-ligand binding with a focus on molecular simulation. We underline the importance of an accurate force field, present fluoride channels as a showcase for biomolecular interactions with fluorine, and discuss fluorine specific interactions like the ability to form hydrogen bonds and interactions with aryl groups. We put special emphasis on the disruption of water networks and entropic effects.
Affiliation(s)
- Leon Wehrhan
- Department of Chemistry, Biology and Pharmacy, Freie Universität Berlin, Arnimallee 22, 14195 Berlin, Germany
| | - Bettina G. Keller
- Department of Chemistry, Biology and Pharmacy, Freie Universität Berlin, Arnimallee 22, 14195 Berlin, Germany
|
17
|
Wang Y, Pulido I, Takaba K, Kaminow B, Scheen J, Wang L, Chodera JD. EspalomaCharge: Machine Learning-Enabled Ultrafast Partial Charge Assignment. J Phys Chem A 2024; 128:4160-4167. [PMID: 38717302 PMCID: PMC11129294 DOI: 10.1021/acs.jpca.4c01287] [Received: 02/27/2024] [Revised: 04/17/2024] [Accepted: 04/17/2024] [Indexed: 05/24/2024]
Abstract
Atomic partial charges are crucial parameters in molecular dynamics simulation, dictating the electrostatic contributions to intermolecular energies and thereby the potential energy landscape. Traditionally, the assignment of partial charges has relied on surrogates of ab initio semiempirical quantum chemical methods such as AM1-BCC and is expensive for large systems or large numbers of molecules. We propose a hybrid physical/graph neural network-based approximation to the widely popular AM1-BCC charge model that is orders of magnitude faster while maintaining accuracy comparable to differences in AM1-BCC implementations. Our hybrid approach couples a graph neural network to a streamlined charge equilibration approach in order to predict molecule-specific atomic electronegativity and hardness parameters, followed by analytical determination of optimal charge-equilibrated parameters that preserve total molecular charge. This hybrid approach scales linearly with the number of atoms, enabling for the first time the use of fully consistent charge models for small molecules and biopolymers for the construction of next-generation self-consistent biomolecular force fields. Implemented in the free and open source package EspalomaCharge, this approach provides drop-in replacements for both AmberTools antechamber and the Open Force Field Toolkit charging workflows, in addition to stand-alone charge generation interfaces. Source code is available at https://github.com/choderalab/espaloma-charge.
Affiliation(s)
- Yuanqing Wang
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Simons Center for Computational Chemistry and Center for Data Science, New York University, New York, New York 10004, United States
| | - Iván Pulido
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | - Kenichiro Takaba
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Pharmaceutical Research Center, Advanced Drug Discovery, Asahi Kasei Pharma Corporation, Shizuoka 410-2321, Japan
| | - Benjamin Kaminow
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Tri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medical College, Cornell University, New York, New York 10065, United States
| | - Jenke Scheen
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | - Lily Wang
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Open Molecular Sciences Foundation, Davis, California 95618, United States
| | - John D. Chodera
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
|
18
|
Li J, Guan X, Zhang O, Sun K, Wang Y, Bagni D, Head-Gordon T. Leak Proof PDBBind: A Reorganized Dataset of Protein-Ligand Complexes for More Generalizable Binding Affinity Prediction. ARXIV 2024:arXiv:2308.09639v2. [PMID: 37645037 PMCID: PMC10462179] [Indexed: 08/31/2023]
Abstract
Many physics-based and machine-learned scoring functions (SFs) used to predict protein-ligand binding free energies have been trained on the PDBBind dataset. However, it is controversial as to whether new SFs are actually improving since the general, refined, and core datasets of PDBBind are cross-contaminated with proteins and ligands with high similarity, and hence they may not perform comparably well in binding prediction of new protein-ligand complexes. In this work we have carefully prepared a cleaned PDBBind dataset of non-covalent binders that are split into training, validation, and test datasets to control for data leakage, defined as proteins and ligands with high sequence and structural similarity. The resulting leak-proof (LP)-PDBBind data is used to retrain four popular SFs: AutoDock Vina, Random Forest (RF)-Score, InteractionGraphNet (IGN), and DeepDTA, to better test their capabilities when applied to new protein-ligand complexes. In particular we have formulated a new independent dataset, BDB2020+, by matching high quality binding free energies from BindingDB with co-crystallized ligand-protein complexes from the PDB that have been deposited since 2020. Based on all the benchmark results, the retrained models using LP-PDBBind consistently perform better, with IGN especially being recommended for scoring and ranking applications for new protein-ligand systems.
|
19
|
Shahab M, Khan A, Khan SA, Zheng G. Unraveling the mechanisms of Sofosbuvir resistance in HCV NS3/4A protease: Structural and molecular simulation-based insights. Int J Biol Macromol 2024; 267:131629. [PMID: 38631585 DOI: 10.1016/j.ijbiomac.2024.131629] [Received: 08/30/2023] [Revised: 04/05/2024] [Accepted: 04/13/2024] [Indexed: 04/19/2024]
Abstract
Current management of HCV infection is based on Direct-Acting Antiviral Drugs (DAAs). However, resistance-associated mutations, especially in the NS3 and NS5B regions are gradually decreasing the efficacy of DAAs. Among the most effective HCV NS3/4A protease drugs, Sofosbuvir also develops resistance due to mutations in the NS3 and NS5B regions. Four mutations at positions A156Y, L36P, Q41H, and Q80K are classified as high-level resistance mutations. The resistance mechanism of HCV NS3/4A protease toward Sofosbuvir caused by these mutations is still unclear, as there is less information available regarding the structural and functional effects of the mutations against Sofosbuvir. In this work, we combined molecular dynamics simulation, molecular mechanics/Generalized-Born surface area calculation, principal component analysis, and free energy landscape analysis to explore the resistance mechanism of HCV NS3/4A protease due to these mutations, as well as compare interaction changes in wild-type. Subsequently, we identified that the mutant form of HCV NS3/4A protease affects the activity of Sofosbuvir. In this study, the resistance mechanism of Sofosbuvir at the atomic level is proposed. The proposed drug-resistance mechanism will provide valuable guidance for the design of HCV drugs.
Affiliation(s)
- Muhammad Shahab
- State Key Laboratories of Chemical Resources Engineering, Beijing University of Chemical Technology, Beijing 100029, PR China
| | - Abbas Khan
- Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, PR China
| | - Salman Ali Khan
- Tunneling Group, Biotechnology Centre, Doctoral School, Silesian University of Technology, Akademicka 2, 44-100, Gliwice, Poland
| | - Guojun Zheng
- State Key Laboratories of Chemical Resources Engineering, Beijing University of Chemical Technology, Beijing 100029, PR China.
|
20
|
Chen M, Jiang X, Zhang L, Chen X, Wen Y, Gu Z, Li X, Zheng M. The emergence of machine learning force fields in drug design. Med Res Rev 2024; 44:1147-1182. [PMID: 38173298 DOI: 10.1002/med.22008] [Received: 08/19/2023] [Revised: 11/29/2023] [Accepted: 12/05/2023] [Indexed: 01/05/2024]
Abstract
In the field of molecular simulation for drug design, traditional molecular mechanics force fields and quantum chemical theories have been instrumental but limited in terms of scalability and computational efficiency. To overcome these limitations, machine learning force fields (MLFFs) have emerged as a powerful tool capable of balancing accuracy with efficiency. MLFFs rely on the relationship between molecular structures and potential energy, bypassing the need for a preconceived notion of interaction representations. Their accuracy depends on the machine learning models used and on the quality and volume of the training datasets. With recent advances in equivariant neural networks and high-quality datasets, MLFFs have significantly improved their performance. This review explores MLFFs, emphasizing their potential in drug design. It elucidates MLFF principles, provides development and validation guidelines, and highlights successful MLFF implementations. It also addresses potential challenges in developing and applying MLFFs. The review concludes by illuminating the path ahead for MLFFs, outlining the challenges to be overcome and the opportunities to be harnessed. This inspires researchers to embrace MLFFs in their investigations as a new tool to perform molecular simulations in drug design.
Affiliation(s)
- Mingan Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Physical Science and Technology, ShanghaiTech University, Shanghai, China
- Lingang Laboratory, Shanghai, China
| | - Xinyu Jiang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
| | - Lehan Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
| | - Xiaoxu Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, China
| | - Yiming Wen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, China
| | - Zhiyong Gu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, China
| | - Xutong Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
| | - Mingyue Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, China
|
21
|
Nessler A, Okada O, Kinoshita Y, Nishimura K, Nagata H, Fukuzawa K, Yonemochi E, Schnieders MJ. Crystal Polymorph Search in the NPT Ensemble via a Deposition/Sublimation Alchemical Path. Cryst Growth Des 2024; 24:3205-3217. [PMID: 38659664 PMCID: PMC11036363 DOI: 10.1021/acs.cgd.3c01358] [Received: 11/14/2023] [Revised: 02/22/2024] [Accepted: 02/23/2024] [Indexed: 04/26/2024]
Abstract
The formulation of active pharmaceutical ingredients involves discovering stable crystal packing arrangements or polymorphs, each of which has distinct pharmaceutically relevant properties. Traditional experimental screening techniques utilizing various conditions are commonly supplemented with in silico crystal structure prediction (CSP) to inform the crystallization process and mitigate risk. Predictions are often based on advanced classical force fields or quantum mechanical calculations that model the crystal potential energy landscape but do not fully incorporate temperature, pressure, or solution conditions during the search procedure. This study proposes an innovative alchemical path that utilizes an advanced polarizable atomic multipole force field to predict crystal structures based on direct sampling of the NPT ensemble. The use of alchemical (i.e., nonphysical) intermediates, a novel Monte Carlo barostat, and an orthogonal space tempering bias combine to enhance the sampling efficiency of the deposition/sublimation phase transition. The proposed algorithm was applied to 2-((4-(2-(3,4-dichlorophenyl)ethyl)phenyl)amino)benzoic acid (Cambridge Crystallographic Data Centre ID: XAFPAY) as a case study. Each experimentally determined polymorph with one molecule in the asymmetric unit was successfully reproduced via approximately 1000 short 1 ns simulations per space group, where each simulation was initiated from random rigid body coordinates and unit cell parameters. Utilizing two threads of a recent Intel CPU (a Xeon Gold 6330 CPU at 2.00 GHz), 1 ns of sampling using the polarizable AMOEBA force field can be acquired in 4 h (equating to more than 300 ns/day using all 112 threads/56 cores of a dual CPU node) within the Force Field X software (https://ffx.biochem.uiowa.edu). These results demonstrate a step forward in the rigorous use of the NPT ensemble during the CSP search process and open the door to future algorithms that incorporate solution conditions using continuum solvation methods.
Affiliation(s)
- Aaron J. Nessler
- Department of Biomedical Engineering, University of Iowa, 103 South Capitol Street, 5601 Seamans Center for the Engineering Arts and Sciences, Iowa City, Iowa 52242, United States
| | - Okimasa Okada
- Sohyaku Innovative Research Division, Mitsubishi Tanabe Pharma Corporation, 1000 Kamoshida-cho, Aoba-ku, Yokohama, Kanagawa 227-0033, Japan
| | - Yuya Kinoshita
- Analytical Development, Pharmaceutical Sciences, Takeda Pharmaceutical Company Limited, 2-26-1, Muraoka-Higashi, Fujisawa 251-8555, Kanagawa, Japan
| | - Koki Nishimura
- Analytical Development, Pharmaceutical Sciences, Takeda Pharmaceutical Company Limited, 2-26-1, Muraoka-Higashi, Fujisawa 251-8555, Kanagawa, Japan
| | - Hiroomi Nagata
- CMC Modality Technology Laboratories, Production Technology and Supply Chain Management Division, Mitsubishi Tanabe Pharma Corporation, Osaka 541-8505, Japan
| | - Kaori Fukuzawa
- Graduate School of Pharmaceutical Sciences, Osaka University, 1-6 Yamadaoka, Suita, Osaka 565-0871, Japan
| | - Etsuo Yonemochi
- Department of Physical Chemistry, School of Pharmacy and Pharmaceutical Sciences, Hoshi University, 2-4-41 Ebara, Shinagawa-ku, Tokyo 142-8501, Japan
| | - Michael J. Schnieders
- Department of Biomedical Engineering, University of Iowa, 103 South Capitol Street, 5601 Seamans Center for the Engineering Arts and Sciences, Iowa City, Iowa 52242, United States
- Department of Biochemistry, University of Iowa, 51 Newton Road, 4-403 Bowen Science Building, Iowa City, Iowa 52242, United States
|
22
|
Tkaczyk S, Karwounopoulos J, Schöller A, Woodcock HL, Langer T, Boresch S, Wieder M. Reweighting from Molecular Mechanics Force Fields to the ANI-2x Neural Network Potential. J Chem Theory Comput 2024; 20:2719-2728. [PMID: 38527958 DOI: 10.1021/acs.jctc.3c01274] [Indexed: 03/27/2024]
Abstract
To achieve chemical accuracy in free energy calculations, it is necessary to accurately describe the system's potential energy surface and efficiently sample configurations from its Boltzmann distribution. While neural network potentials (NNPs) have shown significantly higher accuracy than classical molecular mechanics (MM) force fields, they have a limited range of applicability and are considerably slower than MM potentials, often by orders of magnitude. To address this challenge, Rufa et al. [Rufa et al. bioRxiv 2020, 10.1101/2020.07.29.227959.] suggested a two-stage approach that uses a fast and established MM alchemical energy protocol, followed by reweighting the results using NNPs, known as endstate correction or indirect free energy calculation. This study systematically investigates the accuracy and robustness of reweighting from an MM reference to a neural network target potential (ANI-2x) for an established data set in vacuum, using single-step free-energy perturbation (FEP) and nonequilibrium (NEQ) switching simulation. We assess the influence of longer switching lengths and the impact of slow degrees of freedom on outliers in the work distribution and compare the results to those of multistate equilibrium free energy simulations. Our results demonstrate that free energy calculations between NNPs and MM potentials should be preferably performed using NEQ switching simulations to obtain accurate free energy estimates. NEQ switching simulations between the MM potentials and NNPs are efficient, robust, and trivial to implement.
Affiliation(s)
- Sara Tkaczyk
- Department of Pharmaceutical Sciences, Pharmaceutical Chemistry Division, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna, 1090 Vienna, Austria
| | - Johannes Karwounopoulos
- Faculty of Chemistry, Institute of Computational Biological Chemistry, University of Vienna, Währingerstrasse 17, 1090 Vienna, Austria
- Vienna Doctoral School of Chemistry (DoSChem), University of Vienna, Währingerstrasse 42, 1090 Vienna, Austria
| | - Andreas Schöller
- Faculty of Chemistry, Institute of Computational Biological Chemistry, University of Vienna, Währingerstrasse 17, 1090 Vienna, Austria
- Vienna Doctoral School of Chemistry (DoSChem), University of Vienna, Währingerstrasse 42, 1090 Vienna, Austria
| | - H Lee Woodcock
- Department of Chemistry, University of South Florida, 4202 E. Fowler Ave., CHE205, Tampa, Florida 33620-5250, United States
| | - Thierry Langer
- Department of Pharmaceutical Sciences, Pharmaceutical Chemistry Division, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
| | - Stefan Boresch
- Faculty of Chemistry, Institute of Computational Biological Chemistry, University of Vienna, Währingerstrasse 17, 1090 Vienna, Austria
| | - Marcus Wieder
- Faculty of Chemistry, Institute of Computational Biological Chemistry, University of Vienna, Währingerstrasse 17, 1090 Vienna, Austria
|
23
|
Gilson MK, Kurtzman T. Free Energy Density of a Fluid and Its Role in Solvation and Binding. J Chem Theory Comput 2024; 20:2871-2887. [PMID: 38536144 PMCID: PMC11197885 DOI: 10.1021/acs.jctc.3c01173] [Indexed: 04/10/2024]
Abstract
The concept that a fluid has a position-dependent free energy density appears in the literature but has not been fully developed or accepted. We set this concept on an unambiguous theoretical footing via the following strategy. First, we set forth four desiderata that should be satisfied by any definition of the position-dependent free energy density, f(R), in a system comprising only a fluid and a rigid solute: its volume integral, plus the fixed internal energy of the solute, should be the system free energy; it deviates from its bulk value, f_bulk, near a solute but should asymptotically approach f_bulk with increasing distance from the solute; it should go to zero where the solvent density goes to zero; and it should be well-defined in the most general case of a fluid made up of flexible molecules with an arbitrary interaction potential. Second, we use statistical thermodynamics to formulate a definition of the free energy density that satisfies these desiderata. Third, we show how any free energy density satisfying the desiderata may be used to analyze molecular processes in solution. In particular, because the spatial integral of f(R) equals the free energy of the system, it can be used to compute free energy changes that result from the rearrangement of solutes as well as the forces exerted on the solutes by the solvent. This enables the use of a thermodynamic analysis of water in protein binding sites to inform ligand design. Finally, we discuss related literature and address published concerns regarding the thermodynamic plausibility of a position-dependent free energy density. The theory presented here has applications in theoretical and computational chemistry and may be further generalizable beyond fluids, such as to solids and macromolecules.
Affiliation(s)
- Michael K Gilson
- Skaggs School of Pharmacy and Pharmaceutical Sciences, and Department of Chemistry and Biochemistry, UC San Diego, La Jolla, CA, 92093, USA
- Tom Kurtzman
- PhD Programs in Chemistry, Biochemistry, and Biology, The Graduate Center of the City University of New York, New York, 10016, USA; Department of Chemistry, Lehman College, The City University of New York, Bronx, New York, 10468, USA
24
Draper MR, Waterman A, Dannatt JE, Patel P. Integrating multiscale and machine learning approaches towards the SAMPL9 log P challenge. Phys Chem Chem Phys 2024; 26:7907-7919. [PMID: 38376855 PMCID: PMC10938873 DOI: 10.1039/d3cp04140a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2024]
Abstract
The partition coefficient (log P) is an important physicochemical property that provides information regarding a molecule's pharmacokinetics, toxicity, and bioavailability. Methods to accurately predict the partition coefficient have the potential to accelerate drug design. In an effort to test current methods and explore new computational techniques, the statistical assessment of the modeling of proteins and ligands (SAMPL) has established a blind prediction challenge. The ninth iteration challenge was to predict the toluene-water partition coefficient (log Ptol/w) of sixteen drug molecules. Herein, three approaches are reported broadly under the categories of quantum mechanics (QM), molecular mechanics (MM), and data-driven machine learning (ML). The three blind submissions yield mean unsigned errors (MUE) ranging from 1.53-2.93 log Ptol/w units. The MUEs were reduced to 1.00 log Ptol/w for the QM methods. While MM and ML methods outperformed DFT approaches for challenge molecules with fewer rotational degrees of freedom, they suffered for the larger molecules in this dataset. Overall, DFT functionals paired with a triple-ζ basis set were the simplest and most effective tool to obtain quantitatively accurate partition coefficients.
Affiliation(s)
- Michael R Draper
- Chemistry Department, University of Dallas, Irving, Texas, 75062, USA.
- Asa Waterman
- Chemistry Department, University of Dallas, Irving, Texas, 75062, USA.
- Prajay Patel
- Chemistry Department, University of Dallas, Irving, Texas, 75062, USA.
25
Tropsha A, Isayev O, Varnek A, Schneider G, Cherkasov A. Integrating QSAR modelling and deep learning in drug discovery: the emergence of deep QSAR. Nat Rev Drug Discov 2024; 23:141-155. [PMID: 38066301 DOI: 10.1038/s41573-023-00832-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/21/2023] [Indexed: 02/08/2024]
Abstract
Quantitative structure-activity relationship (QSAR) modelling, an approach that was introduced 60 years ago, is widely used in computer-aided drug design. In recent years, progress in artificial intelligence techniques, such as deep learning, the rapid growth of databases of molecules for virtual screening and dramatic improvements in computational power have supported the emergence of a new field of QSAR applications that we term 'deep QSAR'. Marking a decade from the pioneering applications of deep QSAR to tasks involved in small-molecule drug discovery, we herein describe key advances in the field, including deep generative and reinforcement learning approaches in molecular design, deep learning models for synthetic planning and the application of deep QSAR models in structure-based virtual screening. We also reflect on the emergence of quantum computing, which promises to further accelerate deep QSAR applications and the need for open-source and democratized resources to support computer-aided drug design.
Affiliation(s)
- Artem Cherkasov
- University of British Columbia, Vancouver, BC, Canada.
- Photonic Inc., Coquitlam, BC, Canada.
26
Ding Y, Huang J. Implementation and Validation of an OpenMM Plugin for the Deep Potential Representation of Potential Energy. Int J Mol Sci 2024; 25:1448. [PMID: 38338727 PMCID: PMC10855459 DOI: 10.3390/ijms25031448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 01/08/2024] [Accepted: 01/11/2024] [Indexed: 02/12/2024] Open
Abstract
Machine learning potentials, particularly the deep potential (DP) model, have revolutionized molecular dynamics (MD) simulations, striking a balance between accuracy and computational efficiency. To facilitate the DP model's integration with the popular MD engine OpenMM, we have developed a versatile OpenMM plugin. This plugin supports a range of applications, from conventional MD simulations to alchemical free energy calculations and hybrid DP/MM simulations. Our extensive validation tests encompassed energy conservation in microcanonical ensemble simulations, fidelity in canonical ensemble generation, and the evaluation of the structural, transport, and thermodynamic properties of bulk water. The introduction of this plugin is expected to significantly expand the application scope of DP models within the MD simulation community, representing a major advancement in the field.
Affiliation(s)
- Ye Ding
- College of Life Sciences, Zhejiang University, Hangzhou 310027, China
- School of Life Sciences, Westlake University, Hangzhou 310024, China
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China
- Jing Huang
- School of Life Sciences, Westlake University, Hangzhou 310024, China
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China
27
Sweeney A, Mulvaney T, Maiorca M, Topf M. ChemEM: Flexible Docking of Small Molecules in Cryo-EM Structures. J Med Chem 2024; 67:199-212. [PMID: 38157562 PMCID: PMC10788898 DOI: 10.1021/acs.jmedchem.3c01134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Revised: 11/28/2023] [Accepted: 12/08/2023] [Indexed: 01/03/2024]
Abstract
Cryo-electron microscopy (cryo-EM), through resolution advancements, has become pivotal in structure-based drug discovery. However, most cryo-EM structures are solved at 3-4 Å resolution, posing challenges for small-molecule docking and structure-based virtual screening due to issues in the precise positioning of ligands and the surrounding side chains. We present ChemEM, a software package that employs cryo-EM data for the accurate docking of one or multiple ligands in a protein-binding site. Validated against a highly curated benchmark of high- and medium-resolution cryo-EM structures and the corresponding high-resolution controls, ChemEM displayed impressive performance, accurately placing ligands in all but one case, often surpassing cryo-EM PDB-deposited solutions. Even without including the cryo-EM density, the ChemEM scoring function outperformed the well-established AutoDock Vina score. Using ChemEM, we illustrate that valuable information can be extracted from maps at medium resolution and underline the utility of cryo-EM structures for drug discovery.
Affiliation(s)
- Aaron Sweeney
- Leibniz Institute of Virology (LIV), Hamburg 20251, Germany
- Centre for Structural Systems Biology, Hamburg 22607, Germany
- Universitätsklinikum Hamburg-Eppendorf (UKE), Hamburg 20246, Germany
- Thomas Mulvaney
- Leibniz Institute of Virology (LIV), Hamburg 20251, Germany
- Centre for Structural Systems Biology, Hamburg 22607, Germany
- Universitätsklinikum Hamburg-Eppendorf (UKE), Hamburg 20246, Germany
- Mauro Maiorca
- Leibniz Institute of Virology (LIV), Hamburg 20251, Germany
- Centre for Structural Systems Biology, Hamburg 22607, Germany
- Universitätsklinikum Hamburg-Eppendorf (UKE), Hamburg 20246, Germany
28
Setiadi J, Boothroyd S, Slochower DR, Dotson DL, Thompson MW, Wagner JR, Wang LP, Gilson MK. Tuning Potential Functions to Host-Guest Binding Data. J Chem Theory Comput 2024; 20:239-252. [PMID: 38147689 PMCID: PMC10838530 DOI: 10.1021/acs.jctc.3c01050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023]
Abstract
Software to more rapidly and accurately predict protein-ligand binding affinities is of high interest for early-stage drug discovery, and physics-based methods are among the most widely used technologies for this purpose. The accuracy of these methods depends critically on the accuracy of the potential functions that they use. Potential functions are typically trained against a combination of quantum chemical and experimental data. However, although binding affinities are among the most important quantities to predict, experimental binding affinities have not to date been integrated into the experimental data set used to train potential functions. In recent years, the use of host-guest complexes as simple and tractable models of binding thermodynamics has gained popularity due to their small size and simplicity, relative to protein-ligand systems. Host-guest complexes can also avoid ambiguities that arise in protein-ligand systems such as uncertain protonation states. Thus, experimental host-guest binding data are an appealing additional data type to integrate into the experimental data set used to optimize potential functions. Here, we report the extension of the Open Force Field Evaluator framework to enable the systematic calculation of host-guest binding free energies and their gradients with respect to force field parameters, coupled with the curation of 126 host-guest complexes with available experimental binding free energies. As an initial application of this novel infrastructure, we optimized generalized Born (GB) cavity radii for the OBC2 GB implicit solvent model against experimental data for 36 host-guest systems. This refitting led to a dramatic improvement in accuracy for both the training set and a separate test set with 90 additional host-guest systems. The optimized radii also showed encouraging transferability from host-guest systems to 59 protein-ligand systems. 
However, the new radii are significantly smaller than the baseline radii and lead to excessively favorable hydration free energies (HFEs). Thus, users of the OBC2 GB model currently may choose between GB cavity radii that yield more accurate binding affinities and GB cavity radii that yield more accurate HFEs. We suspect that achieving good accuracy on both will require more far-reaching adjustments to the GB model. We note that binding free-energy calculations using the OBC2 model in OpenMM gain about a 10× speedup relative to corresponding explicit solvent calculations, suggesting a future role for implicit solvent absolute binding free-energy (ABFE) calculations in virtual compound screening. This study proves the principle of using host-guest systems to train potential functions that are transferrable to protein-ligand systems and provides an infrastructure that enables a range of applications.
Affiliation(s)
- Jeffry Setiadi
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9255 Pharmacy Lane, La Jolla, California 92093, United States
- Simon Boothroyd
- Boothroyd Scientific Consulting Ltd., London WC2H 9JQ, U.K.
- Psivant Therapeutics, Boston, Massachusetts 02210, United States
- David L Dotson
- Datryllic LLC, Phoenix, Arizona 85003, United States
- The Open Force Field Consortium, Open Molecular Software Foundation, Davis, California 95616, United States
- Matthew W Thompson
- The Open Force Field Consortium, Open Molecular Software Foundation, Davis, California 95616, United States
- Jeffrey R Wagner
- The Open Force Field Consortium, Open Molecular Software Foundation, Davis, California 95616, United States
- Lee-Ping Wang
- Chemistry Department, University of California Davis, Davis, California 95616, United States
- Michael K Gilson
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9255 Pharmacy Lane, La Jolla, California 92093, United States
29
Hervø-Hansen S, Lin D, Kasahara K, Matubayasi N. Free-energy decomposition of salt effects on the solubilities of small molecules and the role of excluded-volume effects. Chem Sci 2024; 15:477-489. [PMID: 38179544 PMCID: PMC10763565 DOI: 10.1039/d3sc04617f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Accepted: 11/20/2023] [Indexed: 01/06/2024] Open
Abstract
The roles of cations and anions are different in the perturbation on solvation, and thus, the analyses of the separated contributions from cations and anions are useful to establish molecular pictures of ion-specific effects. In this work, we investigate the effects of cations, anions, and water separately in the solvation of n-alcohols and n-alkanes by free-energy decomposition. By utilising energy-representation theory of solvation, we address the contributions arising from the direct solute-solvent interactions and the excluded-volume effects. It is found that the change in solvation of n-alcohols and n-alkanes upon addition of salt depends primarily on the anion species. The direct interaction between the anion and solute is in agreement with the Setschenow coefficient in terms of the ranking of salting-in and salting-out for n-alkanes, which corresponds to the extent of accumulation of the anion on the solute surface. For each of the n-alcohols and n-alkanes examined, the excluded-volume component in the Setschenow coefficient is well correlated to the (total) Setschenow coefficient when the salt effects are concerned. The ranking of the excluded-volume component in the variation of the salt species is parallel to the water contribution, which is correlated further to the change in the water density upon the addition of the salt.
Affiliation(s)
- Stefan Hervø-Hansen
- Division of Chemical Engineering, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka 560-8531, Japan
- Daoyang Lin
- Division of Chemical Engineering, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka 560-8531, Japan
- Kento Kasahara
- Division of Chemical Engineering, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka 560-8531, Japan
- Nobuyuki Matubayasi
- Division of Chemical Engineering, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka 560-8531, Japan
30
Herz AM, Kellici T, Morao I, Michel J. Alchemical Free Energy Workflows for the Computation of Protein-Ligand Binding Affinities. Methods Mol Biol 2024; 2716:241-264. [PMID: 37702943 DOI: 10.1007/978-1-0716-3449-3_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/14/2023]
Abstract
Alchemical free energy methods can be used for the efficient computation of relative binding free energies during preclinical drug discovery stages. In recent years, this has been facilitated further by the implementation of workflows that enable non-experts to quickly and consistently set up the required simulations. Given the correct input structures, workflows handle the difficult aspects of setting up perturbations, including consistently defining the perturbable molecule, its atom mapping and topology generation, perturbation network generation, running of the simulations via different sampling methods, and analysis of the results. Different academic and commercial workflows are discussed, including FEW, FESetup, FEPrepare, CHARMM-GUI, Transformato, PMX, QLigFEP, TIES, ProFESSA, PyAutoFEP, BioSimSpace, FEP+, Flare, and Orion. These workflows differ in various aspects, such as mapping algorithms or enhanced sampling methods. Some workflows can accommodate more than one molecular dynamics (MD) engine and use external libraries for tasks. Differences between workflows can present advantages for different use cases; however, a lack of interoperability of the workflows' components hinders systematic comparisons.
Affiliation(s)
- Anna M Herz
- EaStChem School of Chemistry, Joseph Black Building, University of Edinburgh, Edinburgh, UK
- Tahsin Kellici
- Evotec (UK) Ltd., In Silico Research and Development, Abingdon, Oxfordshire, UK
- Merck & Co., Inc., Modelling and Informatics, West Point, PA, USA
- Inaki Morao
- Evotec (UK) Ltd., In Silico Research and Development, Abingdon, Oxfordshire, UK
- Julien Michel
- EaStChem School of Chemistry, Joseph Black Building, University of Edinburgh, Edinburgh, UK
31
Papadourakis M, Sinenka H, Matricon P, Hénin J, Brannigan G, Pérez-Benito L, Pande V, van Vlijmen H, de Graaf C, Deflorian F, Tresadern G, Cecchini M, Cournia Z. Alchemical Free Energy Calculations on Membrane-Associated Proteins. J Chem Theory Comput 2023; 19:7437-7458. [PMID: 37902715 PMCID: PMC11017255 DOI: 10.1021/acs.jctc.3c00365] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/31/2023] [Indexed: 10/31/2023]
Abstract
Membrane proteins have diverse functions within cells and are well-established drug targets. The advances in membrane protein structural biology have revealed drug and lipid binding sites on membrane proteins, while computational methods such as molecular simulations can resolve the thermodynamic basis of these interactions. Particularly, alchemical free energy calculations have shown promise in the calculation of reliable and reproducible binding free energies of protein-ligand and protein-lipid complexes in membrane-associated systems. In this review, we present an overview of representative alchemical free energy studies on G-protein-coupled receptors, ion channels, transporters as well as protein-lipid interactions, with emphasis on best practices and critical aspects of running these simulations. Additionally, we analyze challenges and successes when running alchemical free energy calculations on membrane-associated proteins. Finally, we highlight the value of alchemical free energy calculations in drug discovery and their applicability in the pharmaceutical industry.
Affiliation(s)
- Michail Papadourakis
- Biomedical Research Foundation, Academy of Athens, 4 Soranou Ephessiou, 11527 Athens, Greece
- Hryhory Sinenka
- Institut de Chimie de Strasbourg, UMR7177, CNRS, Université de Strasbourg, F-67083 Strasbourg Cedex, France
- Pierre Matricon
- Sosei Heptares, Steinmetz Building, Granta Park, Great Abington, Cambridge CB21 6DG, United Kingdom
- Jérôme Hénin
- Laboratoire de Biochimie Théorique UPR 9080, CNRS and Université Paris Cité, 75005 Paris, France
- Grace Brannigan
- Center for Computational and Integrative Biology, Rutgers University–Camden, Camden, New Jersey 08103, United States of America
- Department of Physics, Rutgers University–Camden, Camden, New Jersey 08102, United States of America
- Laura Pérez-Benito
- CADD, In Silico Discovery, Janssen Research & Development, Turnhoutseweg 30, 2340 Beerse, Belgium
- Vineet Pande
- CADD, In Silico Discovery, Janssen Research & Development, Turnhoutseweg 30, 2340 Beerse, Belgium
- Herman van Vlijmen
- CADD, In Silico Discovery, Janssen Research & Development, Turnhoutseweg 30, 2340 Beerse, Belgium
- Chris de Graaf
- Sosei Heptares, Steinmetz Building, Granta Park, Great Abington, Cambridge CB21 6DG, United Kingdom
- Francesca Deflorian
- Sosei Heptares, Steinmetz Building, Granta Park, Great Abington, Cambridge CB21 6DG, United Kingdom
- Gary Tresadern
- CADD, In Silico Discovery, Janssen Research & Development, Turnhoutseweg 30, 2340 Beerse, Belgium
- Marco Cecchini
- Institut de Chimie de Strasbourg, UMR7177, CNRS, Université de Strasbourg, F-67083 Strasbourg Cedex, France
- Zoe Cournia
- Biomedical Research Foundation, Academy of Athens, 4 Soranou Ephessiou, 11527 Athens, Greece
32
Lehtola S. A call to arms: Making the case for more reusable libraries. J Chem Phys 2023; 159:180901. [PMID: 37947507 DOI: 10.1063/5.0175165] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/05/2023] [Accepted: 10/23/2023] [Indexed: 11/12/2023] Open
Abstract
The traditional foundation of science lies on the cornerstones of theory and experiment. Theory is used to explain experiment, which in turn guides the development of theory. Since the advent of computers and the development of computational algorithms, computation has risen as the third cornerstone of science, joining theory and experiment on an equal footing. Computation has become an essential part of modern science, amending experiment by enabling accurate comparison of complicated theories to sophisticated experiments, as well as guiding by triage both the design and targets of experiments and the development of novel theories and computational methods. Like experiment, computation relies on continued investment in infrastructure: it requires both hardware (the physical computer on which the calculation is run) and software (the source code of the programs that perform the wanted simulations). In this Perspective, I discuss present-day challenges on the software side in computational chemistry, which arise from the fast-paced development of algorithms, programming models, as well as hardware. I argue that many of these challenges could be solved with reusable open source libraries, which are a public good, enhance the reproducibility of science, and accelerate the development and availability of state-of-the-art methods and improved software.
Affiliation(s)
- Susi Lehtola
- Department of Chemistry, University of Helsinki, P.O. Box 55, FI-00014 Helsinki, Finland
33
Ross GA, Lu C, Scarabelli G, Albanese SK, Houang E, Abel R, Harder ED, Wang L. The maximal and current accuracy of rigorous protein-ligand binding free energy calculations. Commun Chem 2023; 6:222. [PMID: 37838760 PMCID: PMC10576784 DOI: 10.1038/s42004-023-01019-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Accepted: 10/02/2023] [Indexed: 10/16/2023] Open
Abstract
Computational techniques can speed up the identification of hits and accelerate the development of candidate molecules for drug discovery. Among techniques for predicting relative binding affinities, the most consistently accurate is free energy perturbation (FEP), a class of rigorous physics-based methods. However, uncertainty remains about how accurate FEP is and can ever be. Here, we present what we believe to be the largest publicly available dataset of proteins and congeneric series of small molecules, and assess the accuracy of the leading FEP workflow. To ascertain the limit of achievable accuracy, we also survey the reproducibility of experimental relative affinity measurements. We find a wide variability in experimental accuracy and a correspondence between binding and functional assays. When careful preparation of protein and ligand structures is undertaken, FEP can achieve accuracy comparable to experimental reproducibility. Throughout, we highlight reliable protocols that can help maximize the accuracy of FEP in prospective studies.
Affiliation(s)
- Gregory A Ross
- Schrödinger Inc, New York, NY, USA.
- Isomorphic Labs, London, UK.
- Chao Lu
- Schrödinger Inc, New York, NY, USA
34
Lehner MT, Katzberger P, Maeder N, Schiebroek CC, Teetz J, Landrum GA, Riniker S. DASH: Dynamic Attention-Based Substructure Hierarchy for Partial Charge Assignment. J Chem Inf Model 2023; 63:6014-6028. [PMID: 37738206 PMCID: PMC10565818 DOI: 10.1021/acs.jcim.3c00800] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/25/2023] [Indexed: 09/24/2023]
Abstract
We present a robust and computationally efficient approach for assigning partial charges of atoms in molecules. The method is based on a hierarchical tree constructed from attention values extracted from a graph neural network (GNN), which was trained to predict atomic partial charges from accurate quantum-mechanical (QM) calculations. The resulting dynamic attention-based substructure hierarchy (DASH) approach provides fast assignment of partial charges with the same accuracy as the GNN itself, is software-independent, and can easily be integrated in existing parametrization pipelines, as shown for the Open force field (OpenFF). The implementation of the DASH workflow, the final DASH tree, and the training set are available as open source/open data from public repositories.
Affiliation(s)
- Niels Maeder
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Carl C.G. Schiebroek
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Jakob Teetz
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Gregory A. Landrum
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Sereina Riniker
- Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
35
Friedman AJ, Padgette HM, Kramer L, Liechty ET, Donovan GW, Fox JM, Shirts MR. Biophysical Rationale for the Selective Inhibition of PTP1B over TCPTP by Nonpolar Terpenoids. J Phys Chem B 2023; 127:8305-8316. [PMID: 37729547 PMCID: PMC10694825 DOI: 10.1021/acs.jpcb.3c03791] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2023]
Abstract
Protein tyrosine phosphatases (PTPs) are emerging drug targets for many diseases, including cancer, autoimmunity, and neurological disorders. A high degree of structural similarity between their catalytic domains, however, has hindered the development of selective pharmacological agents. Our previous research uncovered two unfunctionalized terpenoid inhibitors that selectively inhibit PTP1B over T-cell PTP (TCPTP), two PTPs with high sequence conservation. Here, we use molecular modeling, with supporting experimental validation, to study the molecular basis of this unusual selectivity. Molecular dynamics (MD) simulations suggest that PTP1B and TCPTP share a hydrogen-bond network that connects the active site to a distal allosteric pocket; this network stabilizes the closed conformation of the catalytically essential WPD loop, which it links to the L-11 loop and neighboring α3 and α7 helices on the other side of the catalytic domain. Terpenoid binding to either of two proximal C-terminal sites (an α site and a β site) can disrupt the allosteric network; however, binding to the α site forms a stable complex only in PTP1B. In TCPTP, two charged residues disfavor binding at the α site in favor of binding at the β site, which is conserved between the two proteins. Our findings thus indicate that minor amino acid differences at the poorly conserved α site enable selective binding, a property that might be enhanced with chemical elaboration, and illustrate more broadly how minor differences in the conservation of neighboring, yet functionally similar, allosteric sites can affect the selectivity of inhibitory scaffolds (e.g., fragments).
Affiliation(s)
- Anika J Friedman
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80309, United States
- Hannah M Padgette
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80309, United States
- Levi Kramer
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80309, United States
- Evan T Liechty
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80309, United States
- Gregory W Donovan
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80309, United States
- Jerome M Fox
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80309, United States
- Michael R Shirts
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80309, United States
36
Conflitti P, Raniolo S, Limongelli V. Perspectives on Ligand/Protein Binding Kinetics Simulations: Force Fields, Machine Learning, Sampling, and User-Friendliness. J Chem Theory Comput 2023; 19:6047-6061. [PMID: 37656199 PMCID: PMC10536999 DOI: 10.1021/acs.jctc.3c00641] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/14/2023] [Indexed: 09/02/2023]
Abstract
Computational techniques applied to drug discovery have gained considerable popularity for their ability to filter potentially active drugs from inactive ones, reducing the time scale and costs of preclinical investigations. The main focus of these studies has historically been the search for compounds endowed with high affinity for a specific molecular target to ensure the formation of stable and long-lasting complexes. Recent evidence has also correlated the in vivo drug efficacy with its binding kinetics, thus opening new fascinating scenarios for ligand/protein binding kinetic simulations in drug discovery. The present article examines the state of the art in the field, providing a brief summary of the most popular and advanced ligand/protein binding kinetics techniques and evaluating their current limitations and the potential solutions to reach more accurate kinetic models. Particular emphasis is put on the need for a paradigm change in the present methodologies toward ligand and protein parametrization, the force field problem, characterization of the transition states, the sampling issue, and algorithms' performance, user-friendliness, and data openness.
Affiliation(s)
- Paolo Conflitti
- Faculty of Biomedical Sciences, Euler Institute, Università della Svizzera italiana (USI), 6900 Lugano, Switzerland
- Stefano Raniolo
- Faculty of Biomedical Sciences, Euler Institute, Università della Svizzera italiana (USI), 6900 Lugano, Switzerland
- Vittorio Limongelli
- Faculty of Biomedical Sciences, Euler Institute, Università della Svizzera italiana (USI), 6900 Lugano, Switzerland
- Department of Pharmacy, University of Naples “Federico II”, 80131 Naples, Italy
37
Khuttan S, Azimi S, Wu JZ, Dick S, Wu C, Xu H, Gallicchio E. Taming multiple binding poses in alchemical binding free energy prediction: the β-cyclodextrin host-guest SAMPL9 blinded challenge. Phys Chem Chem Phys 2023; 25:24364-24376. [PMID: 37676233 DOI: 10.1039/d3cp02125d] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 09/08/2023]
Abstract
We apply the Alchemical Transfer Method (ATM) and a bespoke fixed partial charge force field to the SAMPL9 bCD host-guest binding free energy prediction challenge that comprises a combination of complexes formed between five phenothiazine guests and two cyclodextrin hosts. Multiple chemical forms, competing binding poses, and computational modeling challenges pose significant obstacles to obtaining reliable computational predictions for these systems. The phenothiazine guests exist in solution as racemic mixtures of enantiomers related by nitrogen inversions that bind the hosts in various binding poses, each requiring an individual free energy analysis. Due to the large size of the guests and the conformational reorganization of the hosts, which prevent a direct absolute binding free energy route, binding free energies are obtained by a series of absolute and relative binding alchemical steps for each chemical species in each binding pose. Metadynamics-accelerated conformational sampling was found to be necessary to address the poor convergence of some numerical estimates affected by conformational trapping. Despite these challenges, our blinded predictions quantitatively reproduced the experimental affinities for the β-cyclodextrin host and, to a lesser extent, those with a methylated derivative. The work illustrates the challenges of obtaining reliable free energy data in in silico drug design for even seemingly simple systems and introduces some of the technologies available to tackle them.
Affiliation(s)
- Sheenam Khuttan
  - Department of Chemistry, Brooklyn College of the City University of New York, New York, USA
  - PhD Program in Biochemistry, Graduate Center of the City University of New York, USA
- Solmaz Azimi
  - Department of Chemistry, Brooklyn College of the City University of New York, New York, USA
  - PhD Program in Biochemistry, Graduate Center of the City University of New York, USA
- Joe Z Wu
  - Department of Chemistry, Brooklyn College of the City University of New York, New York, USA
  - PhD Program in Chemistry, Graduate Center of the City University of New York, USA
- Emilio Gallicchio
  - Department of Chemistry, Brooklyn College of the City University of New York, New York, USA
  - PhD Program in Biochemistry, Graduate Center of the City University of New York, USA
  - PhD Program in Chemistry, Graduate Center of the City University of New York, USA
|
38
|
Talmazan RA, Podewitz M. PyConSolv: A Python Package for Conformer Generation of (Metal-Containing) Systems in Explicit Solvent. J Chem Inf Model 2023; 63:5400-5407. [PMID: 37606893 PMCID: PMC10498442 DOI: 10.1021/acs.jcim.3c00798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Indexed: 08/23/2023]
Abstract
We introduce PyConSolv, a freely available Python package that automates the generation of conformers of metal- and nonmetal-containing complexes in explicit solvent, through classical molecular dynamics simulations. Using a streamlined workflow and interfacing with widely used computational chemistry software, PyConSolv is an all-in-one tool for the generation of conformers in any solvent. Input requirements are minimal; only the geometry of the structure and the desired solvent in xyz (XMOL) format are needed. The package can also account for charged systems, by including arbitrary counterions in the simulation. A bonded model parametrization is performed automatically, utilizing AmberTools, ORCA, and Multiwfn software packages. PyConSolv provides a selection of preparametrized solvents and counterions for use in classical molecular dynamics simulations. We show the applicability of our package on a number of (transition-metal-containing) systems. The software is provided open source and free of charge.
Affiliation(s)
- R. A. Talmazan
  - Institute of Materials Chemistry, TU Wien, Getreidemarkt 9, A-1060 Wien, Austria
- M. Podewitz
  - Institute of Materials Chemistry, TU Wien, Getreidemarkt 9, A-1060 Wien, Austria
|
39
|
Seidel T, Permann C, Wieder O, Kohlbacher SM, Langer T. High-Quality Conformer Generation with CONFORGE: Algorithm and Performance Assessment. J Chem Inf Model 2023; 63:5549-5570. [PMID: 37624145 PMCID: PMC10498443 DOI: 10.1021/acs.jcim.3c00563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Indexed: 08/26/2023]
Abstract
Knowledge of the putative bound-state conformation of a molecule is an essential prerequisite for the successful application of many computer-aided drug design methods that aim to assess or predict its capability to bind to a particular target receptor. An established approach to predict bioactive conformers in the absence of receptor structure information is to sample the low-energy conformational space of the investigated molecules and derive representative conformer ensembles that can be expected to comprise members closely resembling possible bound-state ligand conformations. The high relevance of such conformer generation functionality led to the development of a wide panel of dedicated commercial and open-source software tools throughout the last decades. Several published benchmarking studies have shown that open-source tools usually lag behind their commercial competitors in many key aspects. In this work, we introduce the open-source conformer ensemble generator CONFORGE, which aims at delivering state-of-the-art performance for all types of organic molecules in drug-like chemical space. The ability of CONFORGE and several well-known commercial and open-source conformer ensemble generators to reproduce experimental 3D structures as well as their computational efficiency and robustness has been assessed thoroughly for both typical drug-like molecules and macrocyclic structures. For small molecules, CONFORGE clearly outperformed all other tested open-source conformer generators and performed at least equally well as the evaluated commercial generators in terms of both processing speed and accuracy. In the case of macrocyclic structures, CONFORGE achieved the best average accuracy among all benchmarked generators, with RDKit's generator coming close in second place.
Affiliation(s)
- Thomas Seidel
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Christian Permann
  - NeGeMac Research Platform, Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Oliver Wieder
  - Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Stefan M. Kohlbacher
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Thierry Langer
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
  - NeGeMac Research Platform, Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
  - Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
|
40
|
Bolhuis PG, Brotzakis ZF, Keller BG. Optimizing molecular potential models by imposing kinetic constraints with path reweighting. J Chem Phys 2023; 159:074102. [PMID: 37581416 DOI: 10.1063/5.0151166] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/19/2023] [Accepted: 06/19/2023] [Indexed: 08/16/2023] Open
Abstract
Empirical force fields employed in molecular dynamics simulations of complex systems are often optimized to reproduce experimentally determined structural and thermodynamic properties. In contrast, experimental knowledge about the interconversion rates between metastable states in such systems is hardly ever incorporated in a force field due to a lack of an efficient approach. Here, we introduce such a framework based on the relationship between dynamical observables, such as rate constants, and the underlying molecular model parameters using the statistical mechanics of trajectories. Given a prior ensemble of molecular dynamics trajectories produced with imperfect force field parameters, the approach allows for the optimal adaption of these parameters such that the imposed constraint of equally predicted and experimental rate constant is obeyed. To do so, the method combines the continuum path ensemble maximum caliber approach with path reweighting methods for stochastic dynamics. When multiple solutions are found, the method selects automatically the combination that corresponds to the smallest perturbation of the entire path ensemble, as required by the maximum entropy principle. To show the validity of the approach, we illustrate the method on simple test systems undergoing rare event dynamics. Next to simple 2D potentials, we explore particle models representing molecular isomerization reactions and protein-ligand unbinding. Besides optimal interaction parameters, the methodology gives physical insights into what parts of the model are most sensitive to the kinetics. We discuss the generality and broad implications of the methodology.
Affiliation(s)
- Peter G Bolhuis
  - van 't Hoff Institute for Molecular Sciences, University of Amsterdam, P.O. Box 94157, 1090 GD Amsterdam, The Netherlands
- Z Faidon Brotzakis
  - Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
- Bettina G Keller
  - Department of Biology, Chemistry, Pharmacy, Freie Universität Berlin, Arnimallee 22, D-14195 Berlin, Germany
|
41
|
Guterres H, Im W. CHARMM-GUI-Based Induced Fit Docking Workflow to Generate Reliable Protein-Ligand Binding Modes. J Chem Inf Model 2023; 63:4772-4779. [PMID: 37462607 PMCID: PMC10428204 DOI: 10.1021/acs.jcim.3c00416] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/15/2023] [Indexed: 08/15/2023]
Abstract
Molecular docking is a preferred method to predict ligand binding modes and their binding energy to target protein receptors, which is critical in early phase structure-based drug discovery. However, there is a persistent challenge in docking that can be attributed to the induced fit effect, as receptor binding sites undergo induced fit conformational changes upon ligand binding to achieve better binding modes. In this work, based on CHARMM-GUI LBS Finder & Refiner and High-Throughput Simulator, we present a straightforward CHARMM-GUI induced fit docking (CGUI-IFD) workflow to generate reliable protein-ligand binding modes. The CGUI-IFD workflow generates an ensemble of receptor binding site conformations through ligand-binding site (LBS) refinement, runs rigid receptor docking, and performs high-throughput molecular dynamics (MD) simulations of protein-ligand complex structures in explicit solvents. The results are evaluated based on the ligand root-mean-square deviation (RMSD)-based binding stability and the molecular mechanics generalized Born surface area binding energy. For a benchmark test, we used 258 cross-docking protein-ligand pairs across 41 target proteins from the Schrödinger IFD-MD data set. The application of CGUI-IFD on this data set shows an 80% success rate (within 2.5 Å RMSD from the experimental structures). We expect that the CGUI-IFD workflow can be useful to generate reliable ligand binding modes for cross-docking cases.
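The success criterion quoted in this abstract (ligand RMSD within 2.5 Å of the experimental structure) can be illustrated with a minimal sketch. The function names `rmsd` and `success_rate` are hypothetical and do not come from CHARMM-GUI; the sketch also assumes that both poses are already in the same reference frame with identical atom ordering, so it performs no superposition and no symmetry correction.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) tuples.

    Assumption: both poses share one reference frame, so no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))

def success_rate(rmsd_values, cutoff=2.5):
    """Fraction of poses whose RMSD (in angstrom) is at or below the cutoff."""
    return sum(1 for value in rmsd_values if value <= cutoff) / len(rmsd_values)

# Illustrative numbers only, not data from the cited study.
pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
print(rmsd(pose, reference))
print(success_rate([1.0, 2.0, 3.0, 4.0, 2.5]))
```

Whether the cutoff is inclusive is a choice made here for illustration; the abstract says only "within 2.5 Å".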
Affiliation(s)
- Hugo Guterres
  - Departments of Biological Sciences, Chemistry, Bioengineering, and Computer Science and Engineering, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Wonpil Im
  - Departments of Biological Sciences, Chemistry, Bioengineering, and Computer Science and Engineering, Lehigh University, Bethlehem, Pennsylvania 18015, United States
|
42
|
Horton JT, Boothroyd S, Behara PK, Mobley DL, Cole DJ. A transferable double exponential potential for condensed phase simulations of small molecules. Digital Discovery 2023; 2:1178-1187. [PMID: 38013814 PMCID: PMC10408570 DOI: 10.1039/d3dd00070b] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/20/2023] [Accepted: 07/07/2023] [Indexed: 11/29/2023]
Abstract
The Lennard-Jones potential is the most widely-used function for the description of non-bonded interactions in transferable force fields for the condensed phase. This is not because it has an optimal functional form, but rather it is a legacy resulting from when computational expense was a major consideration and this potential was particularly convenient numerically. At present, it persists because the effort that would be required to re-write molecular modelling software and train new force fields has, until now, been prohibitive. Here, we present Smirnoff-plugins as a flexible framework to extend the Open Force Field software stack to allow custom force field functional forms. We deploy Smirnoff-plugins with the automated Open Force Field infrastructure to train a transferable, small molecule force field based on the recently-proposed double exponential functional form, on over 1000 experimental condensed phase properties. Extensive testing of the resulting force field shows improvements in transfer free energies, with acceptable conformational energetics, run times and convergence properties compared to state-of-the-art Lennard-Jones based force fields.
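As a rough illustration of the two functional forms contrasted in this abstract, the sketch below evaluates a 12-6 Lennard-Jones energy next to one commonly written double exponential form. The abstract itself gives no equation, so the expression, the function names, and the default `alpha` and `beta` values are assumptions chosen for illustration, not the parameters trained in the cited work. Both functions are written so that the minimum lies at `r_min` with depth `epsilon`.

```python
import math

def lennard_jones(r, epsilon, r_min):
    """12-6 Lennard-Jones energy expressed through well depth and minimum position."""
    x = r_min / r
    return epsilon * (x ** 12 - 2.0 * x ** 6)

def double_exponential(r, epsilon, r_min, alpha=16.0, beta=4.0):
    """Double exponential energy (assumed form; alpha and beta are illustrative).

    alpha controls the steepness of the repulsive wall, beta the decay of the
    attractive tail; alpha must be larger than beta.
    """
    x = r / r_min
    repulsion = beta / (alpha - beta) * math.exp(alpha * (1.0 - x))
    attraction = alpha / (alpha - beta) * math.exp(beta * (1.0 - x))
    return epsilon * (repulsion - attraction)

# Illustrative parameters only (well depth 0.2, minimum at 3.5).
for r in (3.0, 3.5, 4.0, 6.0):
    print(r, lennard_jones(r, 0.2, 3.5), double_exponential(r, 0.2, 3.5))
```

One practical difference visible in this sketch is that the double exponential form stays finite at zero separation, whereas the Lennard-Jones expression diverges there.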
Affiliation(s)
- Joshua T Horton
  - School of Natural and Environmental Sciences, Newcastle University Newcastle upon Tyne NE1 7RU UK
- Pavan Kumar Behara
  - Department of Pharmaceutical Sciences, University of California Irvine California 92697 USA
- David L Mobley
  - Department of Pharmaceutical Sciences, University of California Irvine California 92697 USA
  - Department of Chemistry, University of California Irvine California 92697 USA
- Daniel J Cole
  - School of Natural and Environmental Sciences, Newcastle University Newcastle upon Tyne NE1 7RU UK
|
43
|
Baumann H, Dybeck E, McClendon CL, Pickard FC, Gapsys V, Pérez-Benito L, Hahn DF, Tresadern G, Mathiowetz AM, Mobley DL. Broadening the Scope of Binding Free Energy Calculations Using a Separated Topologies Approach. J Chem Theory Comput 2023; 19:5058-5076. [PMID: 37487138 PMCID: PMC10413862 DOI: 10.1021/acs.jctc.3c00282] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/10/2023] [Indexed: 07/26/2023]
Abstract
Binding free energy calculations predict the potency of compounds to protein binding sites in a physically rigorous manner and see broad application in prioritizing the synthesis of novel drug candidates. Relative binding free energy (RBFE) calculations have emerged as an industry-standard approach to achieve highly accurate rank-order predictions of the potency of related compounds; however, this approach requires that the ligands share a common scaffold and a common binding mode, restricting the methods' domain of applicability. This is a critical limitation since complex modifications to the ligands, especially core hopping, are very common in drug design. Absolute binding free energy (ABFE) calculations are an alternate method that can be used for ligands that are not congeneric. However, ABFE suffers from a known problem of long convergence times due to the need to sample additional degrees of freedom within each system, such as sampling rearrangements necessary to open and close the binding site. Here, we report on an alternative method for RBFE, called Separated Topologies (SepTop), which overcomes the issues in both of the aforementioned methods by enabling large scaffold changes between ligands with a convergence time comparable to traditional RBFE. Instead of only mutating atoms that vary between two ligands, this approach performs two absolute free energy calculations at the same time in opposite directions, one for each ligand. Defining the two ligands independently allows the comparison of the binding of diverse ligands without the artificial constraints of identical poses or a suitable atom-atom mapping. This approach also avoids the need to sample the unbound state of the protein, making it more efficient than absolute binding free energy calculations. Here, we introduce an implementation of SepTop. We developed a general and efficient protocol for running SepTop, and we demonstrated the method on four diverse, pharmaceutically relevant systems. 
We report the performance of the method, as well as our practical insights into the strengths, weaknesses, and challenges of applying this method in an industrial drug design setting. We find that the accuracy of the approach is sufficiently high to rank order ligands with an accuracy comparable to traditional RBFE calculations while maintaining the additional flexibility of SepTop.
Affiliation(s)
- Hannah M. Baumann
  - Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, United States
- Eric Dybeck
  - Pfizer Worldwide Research, Development, and Medical, 1 Portland Street, Cambridge, Massachusetts 02139, United States
- Christopher L. McClendon
  - Pfizer Worldwide Research, Development, and Medical, 1 Portland Street, Cambridge, Massachusetts 02139, United States
- Frank C. Pickard
  - Pfizer Worldwide Research, Development, and Medical, 1 Portland Street, Cambridge, Massachusetts 02139, United States
- Vytautas Gapsys
  - Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., Turnhoutseweg 30, B-2340 Beerse, Belgium
- Laura Pérez-Benito
  - Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., Turnhoutseweg 30, B-2340 Beerse, Belgium
- David F. Hahn
  - Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., Turnhoutseweg 30, B-2340 Beerse, Belgium
- Gary Tresadern
  - Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., Turnhoutseweg 30, B-2340 Beerse, Belgium
- Alan M. Mathiowetz
  - Pfizer Worldwide Research, Development, and Medical, 1 Portland Street, Cambridge, Massachusetts 02139, United States
- David L. Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, United States
  - Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
|
44
|
Seiferth D, Tucker SJ, Biggin PC. Limitations of non-polarizable force fields in describing anion binding poses in non-polar synthetic hosts. Phys Chem Chem Phys 2023. [PMID: 37365974 DOI: 10.1039/d3cp00479a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/28/2023]
Abstract
Transmembrane anion transport by synthetic ionophores has received increasing interest not only because of its relevance for understanding endogenous anion transport, but also because of potential implications for therapeutic routes in disease states where chloride transport is impaired. Computational studies can shed light on the binding recognition process and can deepen our mechanistic understanding of them. However, the ability of molecular mechanics methods to properly capture solvation and binding properties of anions is known to be challenging. Consequently, polarizable models have been suggested to improve the accuracy of such calculations. In this study, we calculate binding free energies for different anions to the synthetic ionophore, biotin[6]uril hexamethyl ester in acetonitrile and to biotin[6]uril hexaacid in water by employing non-polarizable and polarizable force fields. Anion binding shows strong solvent dependency consistent with experimental studies. In water, the binding strengths are iodide > bromide > chloride, and reversed in acetonitrile. These trends are well captured by both classes of force fields. However, the free energy profiles obtained from potential of mean force calculations and preferred binding positions of anions depend on the treatment of electrostatics. Results from simulations using the AMOEBA force-field, which recapitulate the observed binding positions, suggest strong effects from multipoles dominate with a smaller contribution from polarization. The oxidation status of the macrocycle was also found to influence anion recognition in water. Overall, these results have implications for the understanding of anion host interactions not just in synthetic ionophores, but also in narrow cavities of biological ion channels.
Affiliation(s)
- David Seiferth
  - Clarendon Laboratory, Department of Physics, University of Oxford, Oxford, OX1 3PU, UK
  - Structural Bioinformatics and Computational Biochemistry, Department of Biochemistry, University of Oxford, Oxford, OX1 3QU, UK
- Stephen J Tucker
  - Clarendon Laboratory, Department of Physics, University of Oxford, Oxford, OX1 3PU, UK
  - Kavli Institute for Nanoscience Discovery, University of Oxford, Oxford, UK
- Philip C Biggin
  - Structural Bioinformatics and Computational Biochemistry, Department of Biochemistry, University of Oxford, Oxford, OX1 3QU, UK
|
45
|
Boothroyd S, Behara PK, Madin OC, Hahn DF, Jang H, Gapsys V, Wagner JR, Horton JT, Dotson DL, Thompson MW, Maat J, Gokey T, Wang LP, Cole DJ, Gilson MK, Chodera JD, Bayly CI, Shirts MR, Mobley DL. Development and Benchmarking of Open Force Field 2.0.0: The Sage Small Molecule Force Field. J Chem Theory Comput 2023; 19:3251-3275. [PMID: 37167319 PMCID: PMC10269353 DOI: 10.1021/acs.jctc.3c00039] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 01/09/2023] [Indexed: 05/13/2023]
Abstract
We introduce the Open Force Field (OpenFF) 2.0.0 small molecule force field for drug-like molecules, code-named Sage, which builds upon our previous iteration, Parsley. OpenFF force fields are based on direct chemical perception, which generalizes easily to highly diverse sets of chemistries based on substructure queries. Like the previous OpenFF iterations, the Sage generation of OpenFF force fields was validated in protein-ligand simulations to be compatible with AMBER biopolymer force fields. In this work, we detail the methodology used to develop this force field, as well as the innovations and improvements introduced since the release of Parsley 1.0.0. One particularly significant feature of Sage is a set of improved Lennard-Jones (LJ) parameters retrained against condensed phase mixture data, the first refit of LJ parameters in the OpenFF small molecule force field line. Sage also includes valence parameters refit to a larger database of quantum chemical calculations than previous versions, as well as improvements in how this fitting is performed. Force field benchmarks show improvements in general metrics of performance against quantum chemistry reference data such as root-mean-square deviations (RMSD) of optimized conformer geometries, torsion fingerprint deviations (TFD), and improved relative conformer energetics (ΔΔE). We present a variety of benchmarks for these metrics against our previous force fields as well as in some cases other small molecule force fields. Sage also demonstrates improved performance in estimating physical properties, including comparison against experimental data from various thermodynamic databases for small molecule properties such as ΔHmix, ρ(x), ΔGsolv, and ΔGtrans. Additionally, we benchmarked against protein-ligand binding free energies (ΔGbind), where Sage yields results statistically similar to previous force fields. 
All the data is made publicly available along with complete details on how to reproduce the training results at https://github.com/openforcefield/openff-sage.
Affiliation(s)
- Pavan Kumar Behara
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
- Owen C. Madin
  - Chemical & Biological Engineering Department, University of Colorado Boulder, Boulder, Colorado 80309, United States
- David F. Hahn
  - Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse B-2340, Belgium
- Hyesu Jang
  - Chemistry Department, The University of California at Davis, Davis, California 95616, United States
  - OpenEye Scientific Software, Santa Fe, New Mexico 87508, United States
- Vytautas Gapsys
  - Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse B-2340, Belgium
  - Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, D-37077, Göttingen, Germany
- Jeffrey R. Wagner
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
  - The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
- Joshua T. Horton
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, U.K.
- David L. Dotson
  - The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
  - Datryllic LLC, Phoenix, Arizona 85003, United States
- Matthew W. Thompson
  - Chemical & Biological Engineering Department, University of Colorado Boulder, Boulder, Colorado 80309, United States
  - The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
- Jessica Maat
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Trevor Gokey
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Lee-Ping Wang
  - Chemistry Department, The University of California at Davis, Davis, California 95616, United States
- Daniel J. Cole
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, U.K.
- Michael K. Gilson
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, The University of California at San Diego, La Jolla, California 92093, United States
- John D. Chodera
  - Computational & Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Michael R. Shirts
  - Chemical & Biological Engineering Department, University of Colorado Boulder, Boulder, Colorado 80309, United States
- David L. Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
  - Department of Chemistry, University of California, Irvine, California 92697, United States
|
46
|
Morado J, Mortenson PN, Nissink JWM, Essex JW, Skylaris CK. Does a Machine-Learned Potential Perform Better Than an Optimally Tuned Traditional Force Field? A Case Study on Fluorohydrins. J Chem Inf Model 2023; 63:2810-2827. [PMID: 37071825 PMCID: PMC10170518 DOI: 10.1021/acs.jcim.2c01510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/20/2023]
Abstract
We present a comparative study that evaluates the performance of a machine learning potential (ANI-2x), a conventional force field (GAFF), and an optimally tuned GAFF-like force field in the modeling of a set of 10 γ-fluorohydrins that exhibit a complex interplay between intra- and intermolecular interactions in determining conformer stability. To benchmark the performance of each molecular model, we evaluated their energetic, geometric, and sampling accuracies relative to quantum-mechanical data. This benchmark involved conformational analysis both in the gas phase and chloroform solution. We also assessed the performance of the aforementioned molecular models in estimating nuclear spin-spin coupling constants by comparing their predictions to experimental data available in chloroform. The results and discussion presented in this study demonstrate that ANI-2x tends to predict stronger-than-expected hydrogen bonding and overstabilize global minima and shows problems related to inadequate description of dispersion interactions. Furthermore, while ANI-2x is a viable model for modeling in the gas phase, conventional force fields still play an important role, especially for condensed-phase simulations. Overall, this study highlights the strengths and weaknesses of each model, providing guidelines for the use and future development of force fields and machine learning potentials.
Affiliation(s)
- João Morado
  - School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
- Paul N Mortenson
  - Astex Pharmaceuticals, 436 Cambridge Science Park, Milton Road, Cambridge CB4 0QA, United Kingdom
- J Willem M Nissink
  - Computational Chemistry, Oncology R&D, AstraZeneca, Cambridge CB4 0WG, United Kingdom
- Jonathan W Essex
  - School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
- Chris-Kriton Skylaris
  - School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
|
47
|
Friedman AJ, Padgette HM, Kramer L, Liechty ET, Donovan GW, Fox JM, Shirts MR. A biophysical rationale for the selective inhibition of PTP1B over TCPTP by nonpolar terpenoids. bioRxiv: the preprint server for biology 2023:2023.04.17.537234. [PMID: 37131728 PMCID: PMC10153121 DOI: 10.1101/2023.04.17.537234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Protein tyrosine phosphatases (PTPs) are emerging drug targets for many diseases, including type 2 diabetes, obesity, and cancer. However, a high degree of structural similarity between the catalytic domains of these enzymes has made the development of selective pharmacological inhibitors an enormous challenge. Our previous research uncovered two unfunctionalized terpenoid inhibitors that selectively inhibit PTP1B over TCPTP, two PTPs with high sequence conservation. Here, we use molecular modeling with experimental validation to study the molecular basis of this unusual selectivity. Molecular dynamics (MD) simulations indicate that PTP1B and TCPTP contain a conserved hydrogen-bond network that connects the active site to a distal allosteric pocket; this network stabilizes the closed conformation of the catalytically influential WPD loop, which it links to the L-11 loop and the α3 and α7 helices (the C-terminal side of the catalytic domain). Terpenoid binding to either of two proximal allosteric sites (an α site and a β site) can disrupt the allosteric network. Interestingly, binding to the α site forms a stable complex with only PTP1B; in TCPTP, where two charged residues disfavor binding at the α site, the terpenoids bind to the β site, which is conserved between the two proteins. Our findings indicate that minor amino acid differences at the poorly conserved α site enable selective binding, a property that might be enhanced with chemical elaboration, and illustrate, more broadly, how minor differences in the conservation of neighboring (yet functionally similar) allosteric sites can have very different implications for inhibitor selectivity.
Affiliation(s)
- Anika J Friedman
  - University of Colorado Boulder Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO 80309
- Hannah M Padgette
  - University of Colorado Boulder Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO 80309
- Levi Kramer
  - University of Colorado Boulder Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO 80309
- Evan T Liechty
  - University of Colorado Boulder Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO 80309
- Gregory W Donovan
  - University of Colorado Boulder Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO 80309
- Jerome M Fox
  - University of Colorado Boulder Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO 80309
- Michael R Shirts
  - University of Colorado Boulder Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO 80309
|
48
|
Müller M, Hagg A, Strickstrock R, Hülsmann M, Asteroth A, Kirschner KN, Reith D. Determining Lennard-Jones Parameters Using Multiscale Target Data through Presampling-Enhanced, Surrogate-Assisted Global Optimization. J Chem Inf Model 2023; 63:1872-1881. [PMID: 36942658 DOI: 10.1021/acs.jcim.2c01231] [Indexed: 03/23/2023]
Abstract
Force field-based models are a Newtonian mechanics approximation of reality and are inherently noisy. Coupling models from different molecular scale domains (including single, gas-phase molecules up to multimolecule, condensed phase ensembles) is difficult, which is also the case for finding solutions that transfer well between the scales. In this contribution, we introduce a surrogate-assisted algorithm to optimize Lennard-Jones parameters for target data from different scale domains to overcome the difficulties named above. Specifically, our approach combines a surrogate-assisted global evolutionary optimization method with a presampling phase that takes advantage of one scale domain being less computationally expensive to evaluate. The algorithm's components were evaluated individually, elucidating their individual merits. Our findings show that the process of parametrizing force fields can significantly benefit from both the presampling method, which alleviates the need to have a good initial guess for the parameters, and the surrogate model, which improves efficiency.
Affiliation(s)
- Max Müller
  - Department of Computer Science, Bonn-Rhein-Sieg University of Applied Sciences, 53757 Sankt Augustin, Germany
- Alexander Hagg
  - Department of Electrical Engineering, Mechanical Engineering and Technical Journalism, Bonn-Rhein-Sieg University of Applied Sciences, 53757 Sankt Augustin, Germany
- Robin Strickstrock
  - Department of Electrical Engineering, Mechanical Engineering and Technical Journalism, Bonn-Rhein-Sieg University of Applied Sciences, 53757 Sankt Augustin, Germany
- Marco Hülsmann
  - Department of Computer Science, Bonn-Rhein-Sieg University of Applied Sciences, 53757 Sankt Augustin, Germany
- Alexander Asteroth
  - Department of Computer Science, Bonn-Rhein-Sieg University of Applied Sciences, 53757 Sankt Augustin, Germany
- Karl N Kirschner
  - Department of Computer Science, Bonn-Rhein-Sieg University of Applied Sciences, 53757 Sankt Augustin, Germany
- Dirk Reith
  - Department of Electrical Engineering, Mechanical Engineering and Technical Journalism, Bonn-Rhein-Sieg University of Applied Sciences, 53757 Sankt Augustin, Germany
  - Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757 Sankt Augustin, Germany
|
49
|
Charvati E, Sun H. Potential Energy Surfaces Sampled in Cremer-Pople Coordinates and Represented by Common Force Field Functionals for Small Cyclic Molecules. J Phys Chem A 2023; 127:2646-2663. [PMID: 36893434 DOI: 10.1021/acs.jpca.3c00095] [Indexed: 03/11/2023]
Abstract
The complex conformations of the cyclic moieties impact the physical and chemical properties of molecules. In this work, we chose 22 molecules of four-, five-, and six-membered rings and performed a thorough conformational sampling using Cremer-Pople coordinates. With consideration of symmetries, we obtained a total of 1504 conformational structures for four-membered, 5576 for five-membered, and 13509 for six-membered rings. All well-known and many less well-known conformers for each molecule were identified. We represented the potential energy surfaces (PESs) by fitting the data to common analytical force field (FF) functional forms. We found that the general features of PESs can be described by the essential FF functional forms; however, the accuracy of representation can be improved remarkably by including the torsion-bond and torsion-angle coupling terms. The best fit yields R-squared (R²) values close to 1.0 and mean absolute errors in energy less than 0.3 kcal/mol.
Affiliation(s)
- Evangelia Charvati
  - School of Chemistry and Chemical Engineering, Materials Genome Initiative Center, and Key Laboratory of Scientific and Engineering Computing of Ministry of Education, Shanghai Jiao Tong University, Shanghai 200240, China
- Huai Sun
  - School of Chemistry and Chemical Engineering, Materials Genome Initiative Center, and Key Laboratory of Scientific and Engineering Computing of Ministry of Education, Shanghai Jiao Tong University, Shanghai 200240, China
|
50
|
Crawford B, Timalsina U, Quach CD, Craven NC, Gilmer JB, McCabe C, Cummings PT, Potoff JJ. MoSDeF-GOMC: Python Software for the Creation of Scientific Workflows for the Monte Carlo Simulation Engine GOMC. J Chem Inf Model 2023; 63:1218-1228. [PMID: 36791286 DOI: 10.1021/acs.jcim.2c01498] [Indexed: 02/17/2023]
Abstract
MoSDeF-GOMC is a Python interface that connects the Monte Carlo software GOMC to the Molecular Simulation Design Framework (MoSDeF) ecosystem. MoSDeF-GOMC automates the process of generating initial coordinates, assigning force field parameters, and writing coordinate (PDB), connectivity (PSF), force field parameter, and simulation control files. The software lowers entry barriers for novice users while allowing advanced users to create complex workflows that encapsulate simulation setup, execution, and data analysis in a single script. All relevant simulation parameters are encoded within the workflow, ensuring reproducible simulations. MoSDeF-GOMC's capabilities are illustrated through a number of examples, including prediction of the adsorption isotherm for CO2 in IRMOF-1, free energies of hydration for neon and radon over a broad temperature range, and the vapor-liquid coexistence curve of a four-component surrogate for the jet fuel S-8. The MoSDeF-GOMC software is available on GitHub at https://github.com/GOMC-WSU/MoSDeF-GOMC.
Affiliation(s)
- Brad Crawford
  - Department of Chemical Engineering, Wayne State University, Detroit, Michigan 48202-4050, United States
- Umesh Timalsina
  - Institute for Software Integrated Systems (ISIS), Vanderbilt University, Nashville, Tennessee 37212, United States
- Co D Quach
  - Department of Chemical and Biomolecular Engineering, Vanderbilt University, Nashville, Tennessee 37235-1604, United States
  - Multiscale Modeling and Simulation (MuMS) Center, Vanderbilt University, Nashville, Tennessee 37212, United States
- Nicholas C Craven
  - Multiscale Modeling and Simulation (MuMS) Center, Vanderbilt University, Nashville, Tennessee 37212, United States
  - Interdisciplinary Material Science Program, Vanderbilt University, Nashville, Tennessee 37235-0106, United States
- Justin B Gilmer
  - Multiscale Modeling and Simulation (MuMS) Center, Vanderbilt University, Nashville, Tennessee 37212, United States
  - Interdisciplinary Material Science Program, Vanderbilt University, Nashville, Tennessee 37235-0106, United States
- Clare McCabe
  - Department of Chemical and Biomolecular Engineering, Vanderbilt University, Nashville, Tennessee 37235-1604, United States
  - Multiscale Modeling and Simulation (MuMS) Center, Vanderbilt University, Nashville, Tennessee 37212, United States
- Peter T Cummings
  - Department of Chemical and Biomolecular Engineering, Vanderbilt University, Nashville, Tennessee 37235-1604, United States
  - Multiscale Modeling and Simulation (MuMS) Center, Vanderbilt University, Nashville, Tennessee 37212, United States
- Jeffrey J Potoff
  - Department of Chemical Engineering, Wayne State University, Detroit, Michigan 48202-4050, United States
|