1
Wang L, Behara PK, Thompson MW, Gokey T, Wang Y, Wagner JR, Cole DJ, Gilson MK, Shirts MR, Mobley DL. The Open Force Field Initiative: Open Software and Open Science for Molecular Modeling. J Phys Chem B 2024; 128:7043-7067. [PMID: 38989715 DOI: 10.1021/acs.jpcb.4c01558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
Force fields are a key component of physics-based molecular modeling, describing the energies and forces in a molecular system as a function of the positions of the atoms and molecules involved. Here, we provide a review and scientific status report on the work of the Open Force Field (OpenFF) Initiative, which focuses on the science, infrastructure and data required to build the next generation of biomolecular force fields. We introduce the OpenFF Initiative and the related OpenFF Consortium, describe its approach to force field development and software, and discuss accomplishments to date as well as future plans. OpenFF releases both software and data under open and permissive licensing agreements to enable rapid application, validation, extension, and modification of its force fields and software tools. We discuss lessons learned to date in this new approach to force field development. We also highlight ways that other force field researchers can get involved, as well as some recent successes of outside researchers taking advantage of OpenFF tools and data.
Affiliation(s)
- Lily Wang
  - Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
- Pavan Kumar Behara
  - Center for Neurotherapeutics, University of California, Irvine, California 92697, United States
- Matthew W Thompson
  - Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
- Trevor Gokey
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Yuanqing Wang
  - Simons Center for Computational Physical Chemistry and Center for Data Science, New York, New York 10004, United States
- Jeffrey R Wagner
  - Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
- Daniel J Cole
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom
- Michael K Gilson
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, The University of California at San Diego, La Jolla, California 92093, United States
- Michael R Shirts
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80305, United States
- David L Mobley
  - Department of Chemistry, University of California, Irvine, California 92697, United States
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
2
Cheng Z, Bi H, Liu S, Chen J, Misquitta AJ, Yu K. Developing a Differentiable Long-Range Force Field for Proteins with E(3) Neural Network-Predicted Asymptotic Parameters. J Chem Theory Comput 2024; 20:5598-5608. [PMID: 38888427 DOI: 10.1021/acs.jctc.4c00337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2024]
Abstract
Accurately describing long-range interactions is a significant challenge in molecular dynamics (MD) simulations of proteins. High-quality long-range potential is also an important component of the range-separated machine learning force field. This study introduces a comprehensive asymptotic parameter database encompassing atomic multipole moments, polarizabilities, and dispersion coefficients. Leveraging active learning, our database comprehensively represents protein fragments with up to 8 heavy atoms, capturing their conformational diversity with merely 78,000 data points. Additionally, the E(3) neural network (E3NN) is employed to predict the asymptotic parameters directly from the local geometry. The E3NN models demonstrate exceptional accuracy and transferability across all asymptotic parameters, achieving an R² of 0.999 for both protein fragments and 20 amino acid dipeptide test sets. The long-range electrostatic and dispersion energies can be obtained using the E3NN-predicted parameters, with an error of 0.07 and 0.02 kcal/mol, respectively, when compared to symmetry-adapted perturbation theory (SAPT). Therefore, our force fields demonstrate the capability to accurately describe long-range interactions in proteins, paving the way for next-generation protein force fields.
Affiliation(s)
- Zheng Cheng
  - School of Mathematical Sciences, Peking University, Beijing 100871, China
  - AI for Science Institute, Beijing 100084, P. R. China
- Hangrui Bi
  - School of Mathematical Sciences, Peking University, Beijing 100871, China
  - DP Technology, Beijing 100080, P. R. China
- Siyuan Liu
  - DP Technology, Beijing 100080, P. R. China
- Junmin Chen
  - Tsinghua-Berkeley Shenzhen Institute, Shenzhen 518055, Guangdong, P. R. China
  - Tsinghua Shenzhen International Graduate School, Shenzhen 518055, Guangdong, P. R. China
- Alston J Misquitta
  - School of Physics and Astronomy, Queen Mary, University of London, London E1 4NS, U.K.
- Kuang Yu
  - Tsinghua-Berkeley Shenzhen Institute, Shenzhen 518055, Guangdong, P. R. China
  - Tsinghua Shenzhen International Graduate School, Shenzhen 518055, Guangdong, P. R. China
3
Kumar A, MacKerell AD. FFParam-v2.0: A Comprehensive Tool for CHARMM Additive and Drude Polarizable Force-Field Parameter Optimization and Validation. J Phys Chem B 2024; 128:4385-4395. [PMID: 38690986 PMCID: PMC11260432 DOI: 10.1021/acs.jpcb.4c01314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2024]
Abstract
Developing production quality CHARMM force-field (FF) parameters is a very detailed process involving a variety of calculations, many of which are specific for the molecule of interest. The first version of FFParam was developed as a standalone Python package designed for the optimization of electrostatic and bonded parameters of the CHARMM additive and polarizable Drude FFs by using quantum mechanical (QM) target data. The new version of FFParam has multiple new capabilities for FF parameter optimization and validation, with an emphasis on the ability to use condensed-phase target data in optimization. FFParam-v2 allows optimization of Lennard-Jones (LJ) parameters using potential energy scans of interactions between selected atoms in a molecule and noble gases, viz., He and Ne, and through condensed-phase calculations, from which experimental observables such as heats of vaporization and free energies of solvation may be obtained. This functionality serves as a gold standard for both optimizing parameters and validating the performance of the final parameters. A new bonded parameter optimization algorithm has been introduced to account for simultaneously optimizing multiple molecules sharing parameters. FFParam-v2 also supports the comparison of normal modes and the potential energy distribution of internal coordinates towards each normal mode obtained from QM and molecular mechanics calculations. Such comparison capability is vital to validate the balance among various bonded parameters that contribute to the complex normal modes of molecules. User interaction has been extended beyond the original graphical user interface to include command-line interface capabilities that allow for integration of FFParam in workflows, thereby facilitating the automation of parameter optimization. With these new functionalities, FFParam is a more comprehensive parameter optimization tool for both beginners and advanced users.
Affiliation(s)
- Anmol Kumar
  - Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, Baltimore, MD 21201, USA
- Alexander D. MacKerell
  - Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, Baltimore, MD 21201, USA
4
Orlando G, Serrano L, Schymkowitz J, Rousseau F. Integrating physics in deep learning algorithms: a force field as a PyTorch module. Bioinformatics 2024; 40:btae160. [PMID: 38514422 PMCID: PMC11007235 DOI: 10.1093/bioinformatics/btae160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 02/08/2024] [Accepted: 03/19/2024] [Indexed: 03/23/2024]
Abstract
MOTIVATION: Deep learning algorithms applied to structural biology often struggle to converge to meaningful solutions when limited data is available, since they are required to learn complex physical rules from examples. State-of-the-art force-fields, however, cannot interface with deep learning algorithms due to their implementation.
RESULTS: We present MadraX, a forcefield implemented as a differentiable PyTorch module, able to interact with deep learning algorithms in an end-to-end fashion.
AVAILABILITY AND IMPLEMENTATION: MadraX documentation, together with tutorials and installation guide, is available at madrax.readthedocs.io.
Affiliation(s)
- Gabriele Orlando
  - Switch Laboratory, VIB Center for Brain and Disease Research, VIB, Leuven 3000, Belgium
  - Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Leuven 3000, Belgium
  - Switch Laboratory, VIB Center for AI & Computational Biology, VIB, Leuven 3000, Belgium
- Luis Serrano
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr Aiguader 88, Barcelona 08003, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - ICREA, Pg. Lluis Companys 23, Barcelona 08010, Spain
- Joost Schymkowitz
  - Switch Laboratory, VIB Center for Brain and Disease Research, VIB, Leuven 3000, Belgium
  - Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Leuven 3000, Belgium
  - Switch Laboratory, VIB Center for AI & Computational Biology, VIB, Leuven 3000, Belgium
- Frederic Rousseau
  - Switch Laboratory, VIB Center for Brain and Disease Research, VIB, Leuven 3000, Belgium
  - Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Leuven 3000, Belgium
  - Switch Laboratory, VIB Center for AI & Computational Biology, VIB, Leuven 3000, Belgium
5
Chen J, Yu K. PhyNEO: A Neural-Network-Enhanced Physics-Driven Force Field Development Workflow for Bulk Organic Molecule and Polymer Simulations. J Chem Theory Comput 2024; 20:253-265. [PMID: 38118076 DOI: 10.1021/acs.jctc.3c01045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/22/2023]
Abstract
An accurate, generalizable, and transferable force field plays a crucial role in the molecular dynamics simulations of organic polymers and biomolecules. Conventional empirical force fields often fail to capture precise intermolecular interactions due to their negligence of important physics, such as polarization, charge penetration, many-body dispersion, etc. Moreover, the parameterization of these force fields relies heavily on top-down fittings, limiting their transferabilities to new systems where the experimental data are often unavailable. To address these challenges, we introduce a general and fully ab initio force field construction strategy, named PhyNEO. It features a hybrid approach that combines both the physics-driven and the data-driven methods and is able to generate a bulk potential with chemical accuracy using only quantum chemistry data of very small clusters. Careful separations of long-/short-range interactions and nonbonding/bonding interactions are the key to the success of PhyNEO. By such a strategy, we mitigate the limitations of pure data-driven methods in long-range interactions, thus largely increasing the data efficiency and the scalability of machine learning models. The new approach is thoroughly tested on poly(ethylene oxide) and polyethylene glycol systems, giving superior accuracies in both microscopic and bulk properties compared to conventional force fields. This work thus offers a promising framework for the development of advanced force fields in a wide range of organic molecular systems.
Affiliation(s)
- Junmin Chen
  - Tsinghua-Berkeley Shenzhen Institute, Tsinghua University, Shenzhen, Guangdong 518055, P. R. China
- Kuang Yu
  - Tsinghua-Berkeley Shenzhen Institute, Tsinghua University, Shenzhen, Guangdong 518055, P. R. China
  - Institute of Materials Research (iMR), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, Guangdong 518055, P. R. China