1
Devereux M, Boittier ED, Meuwly M. Systematic improvement of empirical energy functions in the era of machine learning. J Comput Chem 2024; 45:1899-1913. [PMID: 38695412 DOI: 10.1002/jcc.27367] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/31/2023] [Revised: 02/13/2024] [Accepted: 02/21/2024] [Indexed: 07/05/2024]
Abstract
The impact of targeted replacement of individual terms in empirical force fields is quantitatively assessed for pure water, dichloromethane (CH₂Cl₂), and solvated K⁺ and Cl⁻ ions. For the electrostatic interactions, point charges (PCs) and machine learning (ML)-based minimally distributed charges (MDCM) fitted to the molecular electrostatic potential are evaluated together with electrostatics based on the Coulomb integral. The impact of explicitly including second-order terms is investigated by adding a fragment molecular orbital (FMO)-derived polarization energy to an existing force field, in this case CHARMM. It is demonstrated that anisotropic electrostatics reduce the RMSE for water (by 1.4 kcal/mol), CH₂Cl₂ (by 0.8 kcal/mol) and for solvated Cl⁻ clusters (by 0.4 kcal/mol). An additional polarization term can be neglected for CH₂Cl₂ but further improves the models for pure water (by ~1.0 kcal/mol) and hydrated Cl⁻ (by 0.4 kcal/mol), and is key for solvated K⁺, reducing the RMSE by 2.3 kcal/mol. A 12-6 Lennard-Jones functional form performs satisfactorily with PC and MDCM electrostatics, but is not appropriate for descriptions that account for the electrostatic penetration energy. The importance of many-body contributions is assessed by comparing a strictly 2-body approach with self-consistent reference data. Two-body interactions suffice for CH₂Cl₂ whereas water and solvated K⁺ and Cl⁻ ions require explicit many-body corrections. Finally, a many-body-corrected dimer potential energy surface exceeds the accuracy attained using a conventional empirical force field, potentially reaching that of an FMO calculation. The present work systematically quantifies which terms improve the performance of an existing force field and what reference data to use for parametrizing these terms in a tractable fashion for ML fitting of pure and heterogeneous systems.
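For readers unfamiliar with the 12-6 Lennard-Jones form mentioned in the abstract, the following is a minimal sketch of the pair energy written in the CHARMM-style convention, E(r) = ε[(r_min/r)¹² − 2(r_min/r)⁶]. The function name and the numerical parameters are illustrative assumptions, not values taken from the paper.

```python
def lennard_jones_12_6(r, epsilon, r_min):
    """Return the 12-6 Lennard-Jones pair energy.

    r       : interatomic distance (same length unit as r_min)
    epsilon : well depth (energy unit of the result)
    r_min   : distance at which the energy reaches its minimum, -epsilon
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    x6 = (r_min / r) ** 6
    # Repulsive r^-12 term minus twice the attractive r^-6 term
    return epsilon * (x6 * x6 - 2.0 * x6)


# Hypothetical parameters, chosen only for illustration
epsilon = 0.15  # kcal/mol
r_min = 3.5     # Angstrom
print(lennard_jones_12_6(r_min, epsilon, r_min))  # energy at the minimum equals -epsilon
```

In this convention the energy crosses zero at r = r_min / 2^(1/6) and rises steeply at shorter distances.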
Affiliation(s)
- Mike Devereux
  - Department of Chemistry, University of Basel, Basel, Switzerland
- Eric D Boittier
  - Department of Chemistry, University of Basel, Basel, Switzerland
- Markus Meuwly
  - Department of Chemistry, University of Basel, Basel, Switzerland

2
Cao Y, Balduf T, Beachy MD, Bennett MC, Bochevarov AD, Chien A, Dub PA, Dyall KG, Furness JW, Halls MD, Hughes TF, Jacobson LD, Kwak HS, Levine DS, Mainz DT, Moore KB, Svensson M, Videla PE, Watson MA, Friesner RA. Quantum chemical package Jaguar: A survey of recent developments and unique features. J Chem Phys 2024; 161:052502. [PMID: 39092934 DOI: 10.1063/5.0213317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Accepted: 07/12/2024] [Indexed: 08/04/2024]
Abstract
This paper is dedicated to the quantum chemical package Jaguar, which is commercial software developed and distributed by Schrödinger, Inc. We discuss Jaguar's scientific features that are relevant to chemical research as well as describe those aspects of the program that are pertinent to the user interface, the organization of the computer code, and its maintenance and testing. Among the scientific topics that feature prominently in this paper are the quantum chemical methods grounded in the pseudospectral approach. A number of multistep workflows dependent on Jaguar are covered: prediction of protonation equilibria in aqueous solutions (particularly calculations of tautomeric stability and pKa), reactivity predictions based on automated transition state search, assembly of Boltzmann-averaged spectra such as vibrational and electronic circular dichroism, as well as nuclear magnetic resonance. Discussed also are quantum chemical calculations that are oriented toward materials science applications, in particular, prediction of properties of optoelectronic materials and organic semiconductors, and molecular catalyst design. The topic of treatment of conformations inevitably comes up in real world research projects and is considered as part of all the workflows mentioned above. In addition, we examine the role of machine learning methods in quantum chemical calculations performed by Jaguar, from auxiliary functions that return the approximate calculation runtime in a user interface, to prediction of actual molecular properties. The current work is second in a series of reviews of Jaguar, the first having been published more than ten years ago. Thus, this paper serves as a rare milestone on the path that is being traversed by Jaguar's development in more than thirty years of its existence.
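The Boltzmann-averaged spectra mentioned in the abstract can be illustrated with a generic sketch: each conformer's spectrum is weighted by exp(−ΔE/RT) and the weights are normalized. This is not Jaguar's implementation or API; the function names, the use of relative energies in kcal/mol, and the list-based spectra are assumptions made for illustration.

```python
import math

R_KCAL = 0.0019872041  # molar gas constant in kcal/(mol K)


def boltzmann_weights(energies, temperature=298.15):
    """Normalized Boltzmann populations for conformer energies in kcal/mol."""
    e_min = min(energies)  # shift by the minimum for numerical stability
    factors = [math.exp(-(e - e_min) / (R_KCAL * temperature)) for e in energies]
    total = sum(factors)
    return [f / total for f in factors]


def boltzmann_average(spectra, energies, temperature=298.15):
    """Population-weighted average of per-conformer spectra on a shared grid."""
    weights = boltzmann_weights(energies, temperature)
    n_points = len(spectra[0])
    return [
        sum(w * s[i] for w, s in zip(weights, spectra))
        for i in range(n_points)
    ]


# Two hypothetical conformers with made-up intensities on a three-point grid
spectra = [[1.0, 0.2, 0.0], [0.0, 0.5, 1.0]]
energies = [0.0, 0.8]  # kcal/mol relative energies (illustrative)
print(boltzmann_average(spectra, energies))
```

Lower-energy conformers dominate the average, which is why conformational sampling matters for the workflows the abstract describes.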
Affiliation(s)
- Yixiang Cao
  - Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Ty Balduf
  - Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Michael D Beachy
  - Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- M Chandler Bennett
  - Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Art D Bochevarov
  - Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Alan Chien
  - Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Pavel A Dub
  - Schrödinger, Inc., 9868 Scranton Road, Suite 3200, San Diego, California 92121, USA
- Kenneth G Dyall
  - Schrödinger, Inc., 101 SW Main St., Suite 1300, Portland, Oregon 97204, USA
- James W Furness
  - Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Mathew D Halls
  - Schrödinger, Inc., 9868 Scranton Road, Suite 3200, San Diego, California 92121, USA
- Thomas F Hughes
  - Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Leif D Jacobson
  - Schrödinger, Inc., 101 SW Main St., Suite 1300, Portland, Oregon 97204, USA
- H Shaun Kwak
  - Schrödinger, Inc., 101 SW Main St., Suite 1300, Portland, Oregon 97204, USA
- Daniel S Levine
  - Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Daniel T Mainz
  - Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Kevin B Moore
  - Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Mats Svensson
  - Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Pablo E Videla
  - Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Mark A Watson
  - Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Richard A Friesner
  - Department of Chemistry, Columbia University, 3000 Broadway, New York, New York 10027, USA

3
Bigi F, Pozdnyakov SN, Ceriotti M. Wigner kernels: Body-ordered equivariant machine learning without a basis. J Chem Phys 2024; 161:044116. [PMID: 39056390 DOI: 10.1063/5.0208746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2024] [Accepted: 06/10/2024] [Indexed: 07/28/2024]
Abstract
Machine-learning models based on a point-cloud representation of a physical object are ubiquitous in scientific applications and particularly well-suited to the atomic-scale description of molecules and materials. Among the many different approaches that have been pursued, the description of local atomic environments in terms of their discretized neighbor densities has been used widely and very successfully. We propose a novel density-based method, which involves computing "Wigner kernels." These are fully equivariant and body-ordered kernels that can be computed iteratively at a cost that is independent of the basis used to discretize the density and grows only linearly with the maximum body-order considered. Wigner kernels represent the infinite-width limit of feature-space models, whose dimensionality and computational cost instead scale exponentially with the increasing order of correlations. We present several examples of the accuracy of models based on Wigner kernels in chemical applications, for both scalar and tensorial targets, reaching an accuracy that is competitive with state-of-the-art deep-learning architectures. We discuss the broader relevance of these findings to equivariant geometric machine-learning.
Affiliation(s)
- Filippo Bigi
  - Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Sergey N Pozdnyakov
  - Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Michele Ceriotti
  - Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland

4
Plé T, Adjoua O, Lagardère L, Piquemal JP. FeNNol: An efficient and flexible library for building force-field-enhanced neural network potentials. J Chem Phys 2024; 161:042502. [PMID: 39051830 DOI: 10.1063/5.0217688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2024] [Accepted: 06/28/2024] [Indexed: 07/27/2024]
Abstract
Neural network interatomic potentials (NNPs) have recently proven to be powerful tools to accurately model complex molecular systems while bypassing the high numerical cost of ab initio molecular dynamics simulations. In recent years, numerous advances in model architectures as well as the development of hybrid models combining machine-learning (ML) with more traditional, physically motivated, force-field interactions have considerably increased the design space of ML potentials. In this paper, we present FeNNol, a new library for building, training, and running force-field-enhanced neural network potentials. It provides a flexible and modular system for building hybrid models, allowing us to easily combine state-of-the-art embeddings with ML-parameterized physical interaction terms without the need for explicit programming. Furthermore, FeNNol leverages the automatic differentiation and just-in-time compilation features of the Jax Python library to enable fast evaluation of NNPs, shrinking the performance gap between ML potentials and standard force-fields. This is demonstrated with the popular ANI-2x model reaching simulation speeds nearly on par with the AMOEBA polarizable force-field on commodity GPUs (graphics processing units). We hope that FeNNol will facilitate the development and application of new hybrid NNP architectures for a wide range of molecular simulation problems.
Affiliation(s)
- Thomas Plé
  - Sorbonne Université, LCT, UMR 7616 CNRS, 75005 Paris, France
- Olivier Adjoua
  - Sorbonne Université, LCT, UMR 7616 CNRS, 75005 Paris, France
- Louis Lagardère
  - Sorbonne Université, LCT, UMR 7616 CNRS, 75005 Paris, France

5
Zills F, Schäfer MR, Tovey S, Kästner J, Holm C. Machine learning-driven investigation of the structure and dynamics of the BMIM-BF₄ room temperature ionic liquid. Faraday Discuss 2024. [PMID: 39056186 DOI: 10.1039/d4fd00025k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/28/2024]
Abstract
Room-temperature ionic liquids are an exciting group of materials with the potential to revolutionize energy storage. Due to their chemical structure and means of interaction, they are challenging to study computationally. Classical descriptions of their inter- and intra-molecular interactions require time intensive parametrization of force-fields which is prone to assumptions. While ab initio molecular dynamics approaches can capture all necessary interactions, they are too slow to achieve the time and length scales required. In this work, we take a step towards addressing these challenges by applying state-of-the-art machine-learned potentials to the simulation of 1-butyl-3-methylimidazolium tetrafluoroborate. We demonstrate a learning-on-the-fly procedure to train machine-learned potentials from single-point density functional theory calculations before performing production molecular dynamics simulations. Obtained structural and dynamical properties are in good agreement with computational and experimental references. Furthermore, our results show that hybrid machine-learned potentials can contribute to an improved prediction accuracy by mitigating the inherent shortsightedness of the models. Given that room-temperature ionic liquids necessitate long simulations to address their slow dynamics, achieving an optimal balance between accuracy and computational cost becomes imperative. To facilitate further investigation of these materials, we have made our IPSuite-based training and simulation workflow publicly accessible, enabling easy replication or adaptation to similar systems.
Affiliation(s)
- Fabian Zills
  - Institute for Computational Physics, University of Stuttgart, 70569 Stuttgart, Germany
- Moritz René Schäfer
  - Institute for Theoretical Chemistry, University of Stuttgart, 70569 Stuttgart, Germany
- Samuel Tovey
  - Institute for Computational Physics, University of Stuttgart, 70569 Stuttgart, Germany
- Johannes Kästner
  - Institute for Theoretical Chemistry, University of Stuttgart, 70569 Stuttgart, Germany
- Christian Holm
  - Institute for Computational Physics, University of Stuttgart, 70569 Stuttgart, Germany

6
Slootman E, Poltavsky I, Shinde R, Cocomello J, Moroni S, Tkatchenko A, Filippi C. Accurate Quantum Monte Carlo Forces for Machine-Learned Force Fields: Ethanol as a Benchmark. J Chem Theory Comput 2024; 20:6020-6027. [PMID: 39003522 PMCID: PMC11270822 DOI: 10.1021/acs.jctc.4c00498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 05/31/2024] [Accepted: 06/03/2024] [Indexed: 07/15/2024]
Abstract
Quantum Monte Carlo (QMC) is a powerful method to calculate accurate energies and forces for molecular systems. In this work, we demonstrate how we can obtain accurate QMC forces for the fluxional ethanol molecule at room temperature by using either multideterminant Jastrow-Slater wave functions in variational Monte Carlo or just a single determinant in diffusion Monte Carlo. The excellent performance of our protocols is assessed against high-level coupled cluster calculations on a diverse set of representative configurations of the system. Finally, we train machine-learning force fields on the QMC forces and compare them to models trained on coupled cluster reference data, showing that a force field based on the diffusion Monte Carlo forces with a single determinant can faithfully reproduce coupled cluster power spectra in molecular dynamics simulations.
Affiliation(s)
- E. Slootman
  - MESA+ Institute for Nanotechnology, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
- I. Poltavsky
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- R. Shinde
  - MESA+ Institute for Nanotechnology, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
- J. Cocomello
  - MESA+ Institute for Nanotechnology, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
- S. Moroni
  - CNR-IOM DEMOCRITOS, Istituto Officina dei Materiali, and SISSA Scuola Internazionale Superiore di Studi Avanzati, Via Bonomea 265, I-34136 Trieste, Italy
- A. Tkatchenko
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- C. Filippi
  - MESA+ Institute for Nanotechnology, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands

7
Zhang H, Juraskova V, Duarte F. Modelling chemical processes in explicit solvents with machine learning potentials. Nat Commun 2024; 15:6114. [PMID: 39030199 PMCID: PMC11271496 DOI: 10.1038/s41467-024-50418-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 07/08/2024] [Indexed: 07/21/2024]
Abstract
Solvent effects influence all stages of the chemical processes, modulating the stability of intermediates and transition states, as well as altering reaction rates and product ratios. However, accurately modelling these effects remains challenging. Here, we present a general strategy for generating reactive machine learning potentials to model chemical processes in solution. Our approach combines active learning with descriptor-based selectors and automation, enabling the construction of data-efficient training sets that span the relevant chemical and conformational space. We apply this strategy to investigate a Diels-Alder reaction in water and methanol. The generated machine learning potentials enable us to obtain reaction rates that are in agreement with experimental data and analyse the influence of these solvents on the reaction mechanism. Our strategy offers an efficient approach to the routine modelling of chemical reactions in solution, opening up avenues for studying complex chemical processes in an efficient manner.
Affiliation(s)
- Hanwen Zhang
  - Chemistry Research Laboratory, Oxford, United Kingdom

8
van Gerwen P, Briling KR, Bunne C, Somnath VR, Laplaza R, Krause A, Corminboeuf C. 3DReact: Geometric Deep Learning for Chemical Reactions. J Chem Inf Model 2024. [PMID: 39007724 DOI: 10.1021/acs.jcim.4c00104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024]
Abstract
Geometric deep learning models, which incorporate the relevant molecular symmetries within the neural network architecture, have considerably improved the accuracy and data efficiency of predictions of molecular properties. Building on this success, we introduce 3DReact, a geometric deep learning model to predict reaction properties from three-dimensional structures of reactants and products. We demonstrate that the invariant version of the model is sufficient for existing reaction data sets. We illustrate its competitive performance on the prediction of activation barriers on the GDB7-22-TS, Cyclo-23-TS, and Proparg-21-TS data sets in different atom-mapping regimes. We show that, compared to existing models for reaction property prediction, 3DReact offers a flexible framework that exploits atom-mapping information, if available, as well as geometries of reactants and products (in an invariant or equivariant fashion). Accordingly, it performs systematically well across different data sets, atom-mapping regimes, as well as both interpolation and extrapolation tasks.
Affiliation(s)
- Puck van Gerwen
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  - National Center for Competence in Research - Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Ksenia R Briling
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Charlotte Bunne
  - National Center for Competence in Research - Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  - Learning & Adaptive Systems Group, Department of Computer Science, ETH Zurich, 8092 Zurich, Switzerland
- Vignesh Ram Somnath
  - National Center for Competence in Research - Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  - Learning & Adaptive Systems Group, Department of Computer Science, ETH Zurich, 8092 Zurich, Switzerland
- Ruben Laplaza
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  - National Center for Competence in Research - Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Andreas Krause
  - National Center for Competence in Research - Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  - Learning & Adaptive Systems Group, Department of Computer Science, ETH Zurich, 8092 Zurich, Switzerland
- Clemence Corminboeuf
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  - National Center for Competence in Research - Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland

9
Shermukhamedov S, Mamurjonova D, Maihom T, Probst M. Structure to Property: Chemical Element Embeddings for Predicting Electronic Properties of Crystals. J Chem Inf Model 2024. [PMID: 39007646 DOI: 10.1021/acs.jcim.3c01990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024]
Abstract
We present a new general-purpose machine learning model that is able to predict a variety of crystal properties, including Fermi level energy and band gap, as well as spectral ones such as electronic densities of states. The model is based on atomic representations that enable it to effectively capture complex information about each atom and its surrounding environment in a crystal. The accuracy achieved for band gaps exceeds results previously published. By design, our model is not restricted to the electronic properties discussed here but can be extended to fit diverse chemical descriptors. Its advantages are (a) its low computational requirements, making it an efficient tool for high-throughput screening of materials; and (b) the simplicity and flexibility of its architecture, facilitating implementation and interpretation, especially for researchers in the field of computational chemistry.
Affiliation(s)
- Dilorom Mamurjonova
  - Department of Inorganic Chemistry, Tashkent Chemical Technological Institute, 100011 Tashkent, Uzbekistan
- Thana Maihom
  - School of Molecular Science and Engineering, Vidyasirimedhi Institute of Science and Technology, 21201 Rayong, Thailand
  - Division of Chemistry, Department of Physical and Material Sciences, Faculty of Liberal Arts and Science, Kasetsart University, Kamphaeng Saen Campus, 73140 Nakhon Pathom, Thailand
- Michael Probst
  - Institute of Ion Physics and Applied Physics, University of Innsbruck, 6020 Innsbruck, Austria
  - School of Molecular Science and Engineering, Vidyasirimedhi Institute of Science and Technology, 21201 Rayong, Thailand

10
Shirani H, Hashemianzadeh SM. Quantum-level machine learning calculations of Levodopa. Comput Biol Chem 2024; 112:108146. [PMID: 39067350 DOI: 10.1016/j.compbiolchem.2024.108146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Revised: 06/20/2024] [Accepted: 07/08/2024] [Indexed: 07/30/2024]
Abstract
Many drug molecules contain functional groups, resulting in a torsional barrier corresponding to rotation around the bond linking the fragments. In medicinal chemistry and pharmaceutical sciences, including drug design studies, the exact calculation of the potential energy surface (PES) of these molecular torsions is extremely important. Machine learning (ML), including deep learning (DL), is currently one of the most rapidly evolving tools in computer-aided drug discovery and molecular simulations. In this work, we used the ANI-1x neural network potential as a quantum-level ML model to predict the PESs of the L-3,4-dihydroxyphenylalanine (Levodopa) antiparkinsonian drug molecule. The electronic energies and structural parameters calculated by density functional theory (DFT) using the ωB97X functional and all possible Pople basis sets indicated that the 6-31G(d) basis set, when used with the ωB97X functional, exhibits behavior similar to that of the ANI-1x model. The vibrational frequency analysis showed a linear correlation between DFT and ML data. All ANI-1x calculations were completed in a very short computing time. From this perspective, we expect the ANI-1x dataset applied in this work to be appreciably efficient and effective in computational structure-based drug design studies.
Affiliation(s)
- Hossein Shirani
  - Molecular Simulation Research Laboratory, Department of Chemistry, Iran University of Science and Technology, P.O. Box 16846-13114, Tehran, Iran
- Seyed Majid Hashemianzadeh
  - Molecular Simulation Research Laboratory, Department of Chemistry, Iran University of Science and Technology, P.O. Box 16846-13114, Tehran, Iran

11
Zhu D, Xin Z, Zheng S, Wang Y, Yang X. Addressing the Accuracy-Cost Trade-off in Material Property Prediction Using a Teacher-Student Strategy. J Chem Theory Comput 2024; 20:5743-5750. [PMID: 38875176 DOI: 10.1021/acs.jctc.4c00625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2024]
Abstract
Deep learning has catalyzed a transformative shift in material discovery, offering a key advantage over traditional experimental and theoretical methods by significantly reducing associated costs. Models adept at predicting properties from chemical compositions alone do not require structural information. However, this cost-efficient approach compromises model precision, particularly in Chemical Composition-based Property Prediction Models (CPMs), which are notably less accurate than Structure-based Property Prediction Models (SPMs). Addressing this challenge, our study introduces a novel Teacher-Student (TS) strategy, where a pretrained SPM serves as an instructive 'teacher' to enhance the CPM's precision. This TS strategy successfully harmonizes low-cost exploration with high accuracy, achieving a significant 47.1% reduction in relative error in scenarios involving 100 data entries. We also evaluate the effectiveness of the proposed strategy by employing perovskites as a case study. This method represents a significant advancement in the exploration and identification of valuable materials, leveraging CPM's potential while overcoming its precision limitations.
Affiliation(s)
- Dong Zhu
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Zhikuang Xin
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Siming Zheng
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Yangang Wang
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Xiaoyu Yang
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China

12
Cheng Z, Bi H, Liu S, Chen J, Misquitta AJ, Yu K. Developing a Differentiable Long-Range Force Field for Proteins with E(3) Neural Network-Predicted Asymptotic Parameters. J Chem Theory Comput 2024; 20:5598-5608. [PMID: 38888427 DOI: 10.1021/acs.jctc.4c00337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2024]
Abstract
Accurately describing long-range interactions is a significant challenge in molecular dynamics (MD) simulations of proteins. A high-quality long-range potential is also an important component of a range-separated machine learning force field. This study introduces a comprehensive asymptotic parameter database encompassing atomic multipole moments, polarizabilities, and dispersion coefficients. Leveraging active learning, our database comprehensively represents protein fragments with up to 8 heavy atoms, capturing their conformational diversity with merely 78,000 data points. Additionally, the E(3) neural network (E3NN) is employed to predict the asymptotic parameters directly from the local geometry. The E3NN models demonstrate exceptional accuracy and transferability across all asymptotic parameters, achieving an R² of 0.999 for both protein fragments and 20 amino acid dipeptide test sets. The long-range electrostatic and dispersion energies can be obtained using the E3NN-predicted parameters, with an error of 0.07 and 0.02 kcal/mol, respectively, when compared to symmetry-adapted perturbation theory (SAPT). Therefore, our force fields demonstrate the capability to accurately describe long-range interactions in proteins, paving the way for next-generation protein force fields.
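The R² figure quoted in the abstract is the coefficient of determination. As a minimal sketch of how such a score is commonly computed from reference and predicted values (the function name and the sample numbers are hypothetical, and the paper may use a different variant):

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares between reference and prediction
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares of the reference around its mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("reference values have zero variance")
    return 1.0 - ss_res / ss_tot


# Made-up reference and predicted values, for illustration only
reference = [0.10, -0.35, 0.42, 0.05]
predicted = [0.11, -0.33, 0.40, 0.06]
print(r_squared(reference, predicted))
```

A value of 1 means the predictions reproduce the reference exactly, while predicting the reference mean everywhere gives 0.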
Affiliation(s)
- Zheng Cheng
  - School of Mathematical Sciences, Peking University, Beijing 100871, China
  - AI for Science Institute, Beijing 100084, P. R. China
- Hangrui Bi
  - School of Mathematical Sciences, Peking University, Beijing 100871, China
  - DP Technology, Beijing 100080, P. R. China
- Siyuan Liu
  - DP Technology, Beijing 100080, P. R. China
- Junmin Chen
  - Tsinghua-Berkeley Shenzhen Institute, Shenzhen 518055, Guangdong, P. R. China
  - Tsinghua Shenzhen International Graduate School, Shenzhen 518055, Guangdong, P. R. China
- Alston J Misquitta
  - School of Physics and Astronomy, Queen Mary, University of London, London E1 4NS, U.K.
- Kuang Yu
  - Tsinghua-Berkeley Shenzhen Institute, Shenzhen 518055, Guangdong, P. R. China
  - Tsinghua Shenzhen International Graduate School, Shenzhen 518055, Guangdong, P. R. China

13
Arango AS, Park H, Tajkhorshid E. Topological Learning Approach to Characterizing Biological Membranes. J Chem Inf Model 2024; 64:5242-5252. [PMID: 38912752 DOI: 10.1021/acs.jcim.4c00552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/25/2024]
Abstract
Biological membranes play key roles in cellular compartmentalization, structure, and signaling pathways. At varying temperatures, individual membrane lipids sample from different configurations, a process that frequently leads to higher-order phase behavior and phenomena. Here, we present a persistent homology (PH)-based method for quantifying the structural features of individual and bulk lipids, providing local and contextual information on lipid tail organization. Our method leverages the mathematical machinery of algebraic topology and machine learning to infer temperature-dependent structural information on lipids from static coordinates. To train our model, we generated multiple molecular dynamics trajectories of dipalmitoyl-phosphatidylcholine membranes at varying temperatures. A fingerprint was then constructed for each set of lipid coordinates by PH filtration, in which interaction spheres were grown around the lipid atoms while tracking their intersections. The sphere filtration formed a simplicial complex that captures enduring key topological features of the configuration landscape using homology, yielding persistence data. Following fingerprint extraction for physiologically relevant temperatures, the persistence data were used to train an attention-based neural network for assignment of effective temperature values to selected membrane regions. Our persistent homology-based method captures the local structural effects, via effective temperature, of lipids adjacent to other membrane constituents, e.g., sterols and proteins. This topological learning approach can predict lipid effective temperatures from static coordinates across multiple spatial resolutions. The tool, called MembTDA, can be accessed at https://github.com/hyunp2/Memb-TDA.
Affiliation(s)
- Andres S Arango
- Theoretical and Computational Biophysics Group, NIH Resource Center for Macromolecular Modeling and Visualization, Beckman Institute for Advanced Science and Technology, Department of Biochemistry, and Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Hyun Park
- Theoretical and Computational Biophysics Group, NIH Resource Center for Macromolecular Modeling and Visualization, Beckman Institute for Advanced Science and Technology, Department of Biochemistry, and Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Emad Tajkhorshid
- Theoretical and Computational Biophysics Group, NIH Resource Center for Macromolecular Modeling and Visualization, Beckman Institute for Advanced Science and Technology, Department of Biochemistry, and Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
14
Fantasia A, Rovaris F, Abou El Kheir O, Marzegalli A, Lanzoni D, Pessina L, Xiao P, Zhou C, Li L, Henkelman G, Scalise E, Montalenti F. Development of a machine learning interatomic potential for exploring pressure-dependent kinetics of phase transitions in germanium. J Chem Phys 2024; 161:014110. [PMID: 38953439 DOI: 10.1063/5.0214588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Accepted: 06/15/2024] [Indexed: 07/04/2024]
Abstract
We introduce a data-driven potential aimed at the investigation of pressure-dependent phase transitions in bulk germanium, including the estimate of kinetic barriers. This is achieved by suitably building a database including several configurations along minimum energy paths, as computed using the solid-state nudged elastic band method. After training the model based on density functional theory (DFT)-computed energies, forces, and stresses, we provide validation and rigorously test the potential on unexplored paths. The resulting agreement with the DFT calculations is remarkable in a wide range of pressures. The potential is exploited in large-scale isothermal-isobaric simulations, displaying local nucleation in the R8 to β-Sn pressure-induced phase transformation, taken here as an illustrative example.
Affiliation(s)
- A Fantasia
- Department of Materials Science, University of Milano-Bicocca, 20125 Milano, Italy
- F Rovaris
- Department of Materials Science, University of Milano-Bicocca, 20125 Milano, Italy
- O Abou El Kheir
- Department of Materials Science, University of Milano-Bicocca, 20125 Milano, Italy
- A Marzegalli
- Department of Materials Science, University of Milano-Bicocca, 20125 Milano, Italy
- D Lanzoni
- Department of Materials Science, University of Milano-Bicocca, 20125 Milano, Italy
- L Pessina
- Department of Materials Science, University of Milano-Bicocca, 20125 Milano, Italy
- P Xiao
- Department of Physics and Atmospheric Science, Dalhousie University, 1453 Lord Dalhousie Drive, Halifax, Nova Scotia B3H 4R2, Canada
- C Zhou
- Department of Materials Science and Engineering, Southern University of Science and Technology, 1088 Xueyuan Avenue, 518055 Shenzhen, China
- L Li
- Department of Materials Science and Engineering, Southern University of Science and Technology, 1088 Xueyuan Avenue, 518055 Shenzhen, China
- G Henkelman
- Department of Chemistry, The University of Texas at Austin, 105 East 24th Street STOP A5300 Austin, Texas 78712, USA
- E Scalise
- Department of Materials Science, University of Milano-Bicocca, 20125 Milano, Italy
- F Montalenti
- Department of Materials Science, University of Milano-Bicocca, 20125 Milano, Italy
15
Huynh H, Le K, Vu L, Nguyen T, Holcomb M, Forli S, Phan H. Synergy of machine learning and density functional theory calculations for predicting experimental Lewis base affinity and Lewis polybase binding atoms. J Comput Chem 2024; 45:1552-1561. [PMID: 38500409 PMCID: PMC11099847 DOI: 10.1002/jcc.27329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 01/26/2024] [Accepted: 01/31/2024] [Indexed: 03/20/2024]
Abstract
Investigation of Lewis acid-base interactions has been conducted by ab initio calculations and machine learning (ML) models. This study aims to resolve two critical tasks that have not been quantitatively investigated. First, ML models developed from density functional theory (DFT) calculations predict experimental BF₃ affinity with Pearson correlation coefficients around 0.9 and mean absolute errors around 10 kJ mol⁻¹. The ML models are trained on DFT-calculated BF₃ affinities of more than 3000 adducts, with input features readily obtained with RDKit. Second, the ML models have the capability of predicting the relative strength of Lewis base binding atoms in Lewis polybases, which is either an extremely challenging task to conduct experimentally or a computationally expensive task for ab initio methods. The study demonstrates and solidifies the potential of combining DFT calculations and ML models to predict experimental properties, especially those that are scarce and impractical to acquire empirically.
Affiliation(s)
- Hieu Huynh
- Fulbright University Vietnam, Ho Chi Minh City 700000, Vietnam
- Khanh Le
- Fulbright University Vietnam, Ho Chi Minh City 700000, Vietnam
- Linh Vu
- Fulbright University Vietnam, Ho Chi Minh City 700000, Vietnam
- Trang Nguyen
- Fulbright University Vietnam, Ho Chi Minh City 700000, Vietnam
- Matthew Holcomb
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037 USA
- Stefano Forli
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037 USA
- Hung Phan
- Fulbright University Vietnam, Ho Chi Minh City 700000, Vietnam
- Soka University of America, Aliso Viejo, CA 92656, United States
16
Yang L, Guo Q, Zhang L. AI-assisted chemistry research: a comprehensive analysis of evolutionary paths and hotspots through knowledge graphs. Chem Commun (Camb) 2024; 60:6977-6987. [PMID: 38910536 DOI: 10.1039/d4cc01892c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/25/2024]
Abstract
Artificial intelligence (AI) offers transformative potential for chemical research through its ability to optimize reactions and processes, enhance energy efficiency, and reduce waste. AI-assisted chemical research (AI + chem) has become a global hotspot. To better understand the current research status of "AI + chem", this study conducted a scientific bibliometric investigation using CiteSpace. The Web of Science Core Collection was utilized to retrieve original articles related to "AI + chem" published from 2000 to 2024. The obtained data allowed for the visualization of the knowledge background, current research status, and latest knowledge structure of "AI + chem". "AI + chem" has entered a stage of explosive growth, and the number of papers will maintain long-term high-speed growth. This article systematically analyzes the latest progress in "AI + chem" and objectively predicts future trends, including molecular design, reaction prediction, materials design, drug design, and quantum chemistry. The outcomes of this study will provide readers with a comprehensive understanding of the overall landscape of "AI + chem".
Affiliation(s)
- Lin Yang
- School of Intellectual Property, Dalian University of Technology, Dalian 116024, Liaoning, P. R. China
- Qingle Guo
- School of Intellectual Property, Dalian University of Technology, Dalian 116024, Liaoning, P. R. China
- Lijing Zhang
- School of Chemistry, Dalian University of Technology, Dalian 116024, Liaoning, P. R. China
17
Aldossary A, Campos-Gonzalez-Angulo JA, Pablo-García S, Leong SX, Rajaonson EM, Thiede L, Tom G, Wang A, Avagliano D, Aspuru-Guzik A. In Silico Chemical Experiments in the Age of AI: From Quantum Chemistry to Machine Learning and Back. Adv Mater 2024; 36:e2402369. [PMID: 38794859 DOI: 10.1002/adma.202402369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 04/28/2024] [Indexed: 05/26/2024]
Abstract
Computational chemistry is an indispensable tool for understanding molecules and predicting chemical properties. However, traditional computational methods face significant challenges due to the difficulty of solving the Schrödinger equation and the increasing computational cost with the size of the molecular system. In response, there has been a surge of interest in applying artificial intelligence (AI) and machine learning (ML) techniques to in silico experiments. Integrating AI and ML into computational chemistry increases the scalability and speed of the exploration of chemical space. However, challenges remain, particularly regarding the reproducibility and transferability of ML models. This review highlights the evolution of ML in learning from, complementing, or replacing traditional computational chemistry for energy and property predictions. Starting from models trained entirely on numerical data, it follows a journey toward the ideal model that incorporates or learns the physical laws of quantum mechanics. This paper also reviews existing computational methods and ML models and their intertwining, outlines a roadmap for future research, and identifies areas for improvement and innovation. Ultimately, the goal is to develop AI architectures capable of predicting accurate and transferable solutions to the Schrödinger equation, thereby revolutionizing in silico experiments within chemistry and materials science.
Affiliation(s)
- Abdulrahman Aldossary
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Sergio Pablo-García
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Shi Xuan Leong
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Ella Miray Rajaonson
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Luca Thiede
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Gary Tom
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Andrew Wang
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Davide Avagliano
- Chimie ParisTech, PSL University, CNRS, Institute of Chemistry for Life and Health Sciences (iCLeHS UMR 8060), Paris, F-75005, France
- Alán Aspuru-Guzik
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Department of Materials Science & Engineering, University of Toronto, 184 College St., Toronto, ON, M5S 3E4, Canada
- Department of Chemical Engineering & Applied Chemistry, University of Toronto, 200 College St., Toronto, ON, M5S 3E5, Canada
- Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), 66118 University Ave., Toronto, M5G 1M1, Canada
- Acceleration Consortium, 80 St George St, Toronto, M5S 3H6, Canada
18
Han C, Zhang D, Xia S, Zhang Y. Accurate Prediction of NMR Chemical Shifts: Integrating DFT Calculations with Three-Dimensional Graph Neural Networks. J Chem Theory Comput 2024; 20:5250-5258. [PMID: 38842505 PMCID: PMC11209944 DOI: 10.1021/acs.jctc.4c00422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2024] [Revised: 05/25/2024] [Accepted: 05/29/2024] [Indexed: 06/07/2024]
Abstract
Computer prediction of NMR chemical shifts plays an increasingly important role in molecular structure assignment and elucidation for organic molecule studies. Density functional theory (DFT) and gauge-including atomic orbital (GIAO) have established a framework to predict NMR chemical shifts but often at a significant computational expense with a limited prediction accuracy. Recent advancements in deep learning methods, especially graph neural networks (GNNs), have shown promise in improving the accuracy of predicting experimental chemical shifts, either by using 2D molecular topological features or 3D conformational representation. This study presents a new 3D GNN model to predict ¹H and ¹³C chemical shifts, CSTShift, that combines atomic features with DFT-calculated shielding tensor descriptors, capturing both isotropic and anisotropic shielding effects. Utilizing the NMRShiftDB2 data set and conducting DFT optimization and GIAO calculations at the B3LYP/6-31G(d) level, we prepared the NMRShiftDB2-DFT data set of high-quality 3D structures and shielding tensors with corresponding experimentally measured ¹H and ¹³C chemical shifts. The developed CSTShift models achieve the state-of-the-art prediction performance on both the NMRShiftDB2-DFT test data set and external CHESHIRE data set. Further case studies on identifying correct structures from two groups of constitutional isomers show its capability for structure assignment and elucidation. The source code and data are accessible at https://yzhang.hpc.nyu.edu/IMA.
Affiliation(s)
- Chao Han
- Department of Chemistry, New York University, New York, New York 10003, United States
- Dongdong Zhang
- Department of Chemistry, New York University, New York, New York 10003, United States
- Song Xia
- Department of Chemistry, New York University, New York, New York 10003, United States
- Yingkai Zhang
- Department of Chemistry, New York University, New York, New York 10003, United States
- Simons Center for Computational Physical Chemistry at New York University, New York, New York 10003, United States
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
19
Zhang L, Pios SV, Martyka M, Ge F, Hou YF, Chen Y, Chen L, Jankowska J, Barbatti M, Dral PO. MLatom Software Ecosystem for Surface Hopping Dynamics in Python with Quantum Mechanical and Machine Learning Methods. J Chem Theory Comput 2024; 20:5043-5057. [PMID: 38836623 DOI: 10.1021/acs.jctc.4c00468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2024]
Abstract
We present an open-source MLatom@XACS software ecosystem for on-the-fly surface hopping nonadiabatic dynamics based on the Landau-Zener-Belyaev-Lebedev algorithm. The dynamics can be performed via Python API with a wide range of quantum mechanical (QM) and machine learning (ML) methods, including ab initio QM (CASSCF and ADC(2)), semiempirical QM methods (e.g., AM1, PM3, OMx, and ODMx), and many types of ML potentials (e.g., KREG, ANI, and MACE). Combinations of QM and ML methods can also be used. While the user can build their own combinations, we provide AIQM1, which is based on Δ-learning and can be used out-of-the-box. We showcase how AIQM1 reproduces the isomerization quantum yield of trans-azobenzene at a low cost. We provide example scripts that, in dozens of lines, enable the user to obtain the final population plots by simply providing the initial geometry of a molecule. Thus, those scripts perform geometry optimization, normal mode calculations, initial condition sampling, parallel trajectories propagation, population analysis, and final result plotting. Given the capabilities of MLatom to be used for training different ML models, this ecosystem can be seamlessly integrated into the protocols building ML models for nonadiabatic dynamics. In the future, a deeper and more efficient integration of MLatom with Newton-X will enable a vast range of functionalities for surface hopping dynamics, such as fewest-switches surface hopping, to facilitate similar workflows via the Python API.
Affiliation(s)
- Lina Zhang
- College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, Fujian 361005, China
- Sebastian V Pios
- Zhejiang Laboratory, Hangzhou, Zhejiang 311100, People's Republic of China
- Mikołaj Martyka
- Faculty of Chemistry, University of Warsaw, Pasteura 1, Warsaw 02-093, Poland
- Fuchun Ge
- College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, Fujian 361005, China
- Yi-Fan Hou
- College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, Fujian 361005, China
- Yuxinxin Chen
- College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, Fujian 361005, China
- Lipeng Chen
- Zhejiang Laboratory, Hangzhou, Zhejiang 311100, People's Republic of China
- Joanna Jankowska
- Faculty of Chemistry, University of Warsaw, Pasteura 1, Warsaw 02-093, Poland
- Mario Barbatti
- Aix Marseille University, CNRS, ICR, Marseille 13397, France
- Institut Universitaire de France, Paris 75231, France
- Pavlo O Dral
- College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, Fujian 361005, China
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen, Fujian 361005, China
- Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen, Fujian 361005, China
20
Park Y, Kim J, Hwang S, Han S. Scalable Parallel Algorithm for Graph Neural Network Interatomic Potentials in Molecular Dynamics Simulations. J Chem Theory Comput 2024; 20:4857-4868. [PMID: 38813770 DOI: 10.1021/acs.jctc.4c00190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2024]
Abstract
Message-passing graph neural network interatomic potentials (GNN-IPs), particularly those with equivariant representations such as NequIP, are attracting significant attention due to their data efficiency and high accuracy. However, parallelizing GNN-IPs poses challenges because multiple message-passing layers complicate data communication within the spatial decomposition method, which is preferred by many molecular dynamics (MD) packages. In this article, we propose an efficient parallelization scheme compatible with GNN-IPs and develop a package, SevenNet (Scalable EquiVariance-Enabled Neural NETwork), based on the NequIP architecture. For MD simulations, SevenNet interfaces with the LAMMPS package. Through benchmark tests on a 32-GPU cluster with examples of SiO₂, SevenNet achieves over 80% parallel efficiency in weak-scaling scenarios and exhibits nearly ideal strong-scaling performance as long as GPUs are fully utilized. However, the strong-scaling performance significantly declines with suboptimal GPU utilization, particularly affecting parallel efficiency in cases involving lightweight models or simulations with small numbers of atoms. We also pretrain SevenNet with a vast data set from the Materials Project (dubbed "SevenNet-0") and assess its performance on generating amorphous Si₃N₄ containing more than 100,000 atoms. By developing scalable GNN-IPs, this work aims to bridge the gap between advanced machine-learning models and large-scale MD simulations, offering researchers a powerful tool to explore complex material systems with high accuracy and efficiency.
Affiliation(s)
- Yutack Park
- Department of Materials Science and Engineering and Research Institute of Advanced Materials, Seoul National University, Seoul 08826, Korea
- Jaesun Kim
- Department of Materials Science and Engineering and Research Institute of Advanced Materials, Seoul National University, Seoul 08826, Korea
- Seungwoo Hwang
- Department of Materials Science and Engineering and Research Institute of Advanced Materials, Seoul National University, Seoul 08826, Korea
- Seungwu Han
- Department of Materials Science and Engineering and Research Institute of Advanced Materials, Seoul National University, Seoul 08826, Korea
- Korea Institute for Advanced Study, Seoul 02455, Korea
21
Chen Y, Pios SV, Gelin MF, Chen L. Accelerating Molecular Vibrational Spectra Simulations with a Physically Informed Deep Learning Model. J Chem Theory Comput 2024; 20:4703-4710. [PMID: 38825857 DOI: 10.1021/acs.jctc.4c00173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
In recent years, machine learning (ML) surrogate models have emerged as an indispensable tool to accelerate simulations of physical and chemical processes. However, there is still a lack of ML models that can accurately predict molecular vibrational spectra. Here, we present a highly efficient multitask ML surrogate model termed Vibrational Spectra Neural Network (VSpecNN), to accurately calculate infrared (IR) and Raman spectra based on dipole moments and polarizabilities obtained on-the-fly via ML-enhanced molecular dynamics simulations. The methodology is applied to pyrazine, a prototypical polyatomic chromophore. The VSpecNN-predicted energies are well within chemical accuracy (1 kcal/mol), and the errors for VSpecNN-predicted forces are only half of those obtained from a popular high-performance ML model. Compared to the ab initio reference, the VSpecNN-predicted frequencies of IR and Raman spectra differ by less than 5.87 cm⁻¹, and the intensities of IR spectra and the depolarization ratios of Raman spectra are well reproduced. The VSpecNN model developed in this work highlights the importance of constructing highly accurate neural network potentials for predicting molecular vibrational spectra.
Affiliation(s)
- Maxim F Gelin
- School of Science, Hangzhou Dianzi University, Hangzhou 310018, China
22
Xie Q, Horsfield AP. Coordinate-Free and Low-Order Scaling Machine Learning Model for Atomic Partial Charge Prediction for Any Size of Molecules. J Chem Inf Model 2024; 64:4419-4425. [PMID: 38757521 PMCID: PMC11167589 DOI: 10.1021/acs.jcim.4c00376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Revised: 05/03/2024] [Accepted: 05/08/2024] [Indexed: 05/18/2024]
Abstract
The atomic partial charge is of great importance in many fields, such as chemistry and drug-target recognition. However, conventional quantum-based computing of atomic charges is relatively slow, limiting further applications of atomic charge analysis. With the help of machine learning methods, various kinds of models have appeared to speed up atomic charge calculations. However, there are still some concerning problems. Some models based on geometric coordinates require high-accuracy geometry optimization as a preprocessing step, while other models have a limitation on the size of input molecules that narrows the applications of the model. Here, we propose a machine learning atomic charge model based on a message-passing featurizer. This preprocessing featurizer can quickly extract atomic environment information from a molecule according to the connectivity inside the molecule. The resulting descriptor can be used with a neural network to quickly predict the atomic partial charge. The model is able to automatically adapt to any size of molecule while remaining efficient and achieves a root-mean-square error in the Hirshfeld charge prediction of 0.018 e, with an overall time complexity of O(n²). Thus, this model could enlarge the range of applications of atomic partial charge to more fields and cases.
Affiliation(s)
- Qin Xie
- Department of Materials, Imperial College London, SW7 2AZ London, U.K.
23
Lei YK, Yagi K, Sugita Y. Learning QM/MM potential using equivariant multiscale model. J Chem Phys 2024; 160:214109. [PMID: 38828815 DOI: 10.1063/5.0205123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2024] [Accepted: 05/09/2024] [Indexed: 06/05/2024]
Abstract
The machine learning (ML) method emerges as an efficient and precise surrogate model for high-level electronic structure theory. Its application has been limited to closed chemical systems without considering external potentials from the surrounding environment. To address this limitation and incorporate the influence of external potentials, polarization effects, and long-range interactions between a chemical system and its environment, the first two terms of the Taylor expansion of an electrostatic operator have been used as extra input to the existing ML model to represent the electrostatic environments. However, high-order electrostatic interaction is often essential to account for external potentials from the environment. The existing models based only on invariant features cannot capture significant distribution patterns of the external potentials. Here, we propose a novel ML model that includes high-order terms of the Taylor expansion of an electrostatic operator and uses an equivariant model, which can generate a high-order tensor covariant with rotations as a base model. Therefore, we can use the multipole-expansion equation to derive a useful representation by accounting for polarization and intermolecular interaction. Moreover, to deal with long-range interactions, we follow the same strategy adopted to derive long-range interactions between a target system and its environment media. Our model achieves higher prediction accuracy and transferability among various environment media with these modifications.
Affiliation(s)
- Yao-Kun Lei
- Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
- RIKEN Interdisciplinary Theoretical and Mathematical Sciences Program (iTHEMS), Wako, Saitama 351-0198, Japan
- Kiyoshi Yagi
- Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
- Yuji Sugita
- Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
- RIKEN Interdisciplinary Theoretical and Mathematical Sciences Program (iTHEMS), Wako, Saitama 351-0198, Japan
- Laboratory for Biomolecular Function Simulation, RIKEN Center for Biosystems Dynamics Research, Kobe, Hyogo 650-0047, Japan
24
Tiwary P. Modeling prebiotic chemistries with quantum accuracy at classical costs. Proc Natl Acad Sci U S A 2024; 121:e2408742121. [PMID: 38809708 PMCID: PMC11161769 DOI: 10.1073/pnas.2408742121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2024]
Affiliation(s)
- Pratyush Tiwary
- Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742
- Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742
- University of Maryland Institute for Health Computing, Bethesda, MD 20852
25
Ben Mahmoud C, Gardner JLA, Deringer VL. Data as the next challenge in atomistic machine learning. Nat Comput Sci 2024; 4:384-387. [PMID: 38866969 DOI: 10.1038/s43588-024-00636-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2024]
26
Xiang G, Yao S, Peng Y, Deng H, Wu X, Wang K, Li Y, Wu F. An effective cross-scenario remote heart rate estimation network based on global-local information and video transformer. Phys Eng Sci Med 2024; 47:729-739. [PMID: 38504066 DOI: 10.1007/s13246-024-01401-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Accepted: 02/06/2024] [Indexed: 03/21/2024]
Abstract
Remote photoplethysmography (rPPG) technology is a non-contact physiological signal measurement method, characterized by non-invasiveness and ease of use. It has broad application potential in medical health, human factors engineering, and other fields. However, current rPPG technology is highly susceptible to variations in lighting conditions, head pose changes, and partial occlusions, posing significant challenges for its widespread application. In order to improve the accuracy of remote heart rate estimation and enhance model generalization, we propose PulseFormer, a dual-path network based on transformer. By integrating local and global information and utilizing fast and slow paths, PulseFormer effectively captures the temporal variations of key regions and spatial variations of the global area, facilitating the extraction of rPPG feature information while mitigating the impact of background noise variations. Heart rate estimation results on the popular rPPG dataset show that PulseFormer achieves state-of-the-art performance on public datasets. Additionally, we establish a dataset containing facial expressions and synchronized physiological signals in driving scenarios and test the pre-trained model from the public dataset on this collected dataset. The results indicate that PulseFormer exhibits strong generalization capabilities across different data distributions in cross-scenario settings. Therefore, this model is applicable for heart rate estimation of individuals in various scenarios.
Affiliation(s)
- Guoliang Xiang
- Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic & Transportation Engineering, Central South University, Changsha, 410075, China
- Song Yao
- Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic & Transportation Engineering, Central South University, Changsha, 410075, China
| | - Yong Peng
- Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic & Transportation Engineering, Central South University, Changsha, 410075, China.
| | - Hanwen Deng
- Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic & Transportation Engineering, Central South University, Changsha, 410075, China
| | - Xianhui Wu
- Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic & Transportation Engineering, Central South University, Changsha, 410075, China
| | - Kui Wang
- Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic & Transportation Engineering, Central South University, Changsha, 410075, China
| | - Yingli Li
- Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic & Transportation Engineering, Central South University, Changsha, 410075, China
| | - Fan Wu
- Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic & Transportation Engineering, Central South University, Changsha, 410075, China
| |
Collapse
|
27
|
Yang Y, Zhang S, Ranasinghe KD, Isayev O, Roitberg AE. Machine Learning of Reactive Potentials. Annu Rev Phys Chem 2024; 75:371-395. [PMID: 38941524 DOI: 10.1146/annurev-physchem-062123-024417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/30/2024]
Abstract
In the past two decades, machine learning potentials (MLPs) have driven significant developments in chemical, biological, and material sciences. The construction and training of MLPs enable fast and accurate simulations and analysis of thermodynamic and kinetic properties. This review focuses on the application of MLPs to reaction systems with consideration of bond breaking and formation. We review the development of MLP models, primarily with neural network and kernel-based algorithms, and recent applications of reactive MLPs (RMLPs) to systems at different scales. We show how RMLPs are constructed, how they speed up the calculation of reactive dynamics, and how they facilitate the study of reaction trajectories, reaction rates, free energy calculations, and many other calculations. Different data sampling strategies applied in building RMLPs are also discussed with a focus on how to collect structures for rare events and how to further improve their performance with active learning.
Affiliation(s)
- Yinuo Yang
  - Department of Chemistry, University of Florida, Gainesville, Florida
- Shuhao Zhang
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Olexandr Isayev
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Adrian E Roitberg
  - Department of Chemistry, University of Florida, Gainesville, Florida
|
28
|
Yuan S, Han X, Zhang J, Xie Z, Fan C, Xiao Y, Gao YQ, Yang YI. Generating High-Precision Force Fields for Molecular Dynamics Simulations to Study Chemical Reaction Mechanisms Using Molecular Configuration Transformer. J Phys Chem A 2024; 128:4378-4390. [PMID: 38759697 DOI: 10.1021/acs.jpca.4c01267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/19/2024]
Abstract
Theoretical studies on chemical reaction mechanisms have been crucial in organic chemistry. Traditionally, calculating the manually constructed molecular conformations of transition states for chemical reactions using quantum chemical calculations is the most commonly used method. However, this way is heavily dependent on individual experience and chemical intuition. In our previous study, we proposed a research paradigm that used enhanced sampling in molecular dynamics simulations to study chemical reactions. This approach can directly simulate the entire process of a chemical reaction. However, the computational speed limited the use of high-precision potential energy functions for simulations. To address this issue, we presented a scheme for training high-precision force fields for molecular modeling using a previously developed graph-neural-network-based molecular model, molecular configuration transformer. This potential energy function allowed for highly accurate simulations at a low computational cost, leading to more precise calculations of the mechanism of chemical reactions. We applied this approach to study a Claisen rearrangement reaction and a carbonyl insertion reaction catalyzed by manganese.
Affiliation(s)
- Sihao Yuan
  - Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518132, China
- Xu Han
  - Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Jun Zhang
  - Changping Laboratory, Beijing 102200, China
- Zhaoxin Xie
  - Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518132, China
- Cheng Fan
  - Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518132, China
- Yunlong Xiao
  - Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Yi Qin Gao
  - Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518132, China
  - Changping Laboratory, Beijing 102200, China
- Yi Isaac Yang
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518132, China
|
29
|
Kalayan J, Ramzan I, Williams CD, Bryce RA, Burton NA. A neural network potential based on pairwise resolved atomic forces and energies. J Comput Chem 2024; 45:1143-1151. [PMID: 38284556 DOI: 10.1002/jcc.27313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/23/2023] [Accepted: 01/05/2024] [Indexed: 01/30/2024]
Abstract
Molecular simulations have become a key tool in molecular and materials design. Machine learning (ML)-based potential energy functions offer the prospect of simulating complex molecular systems efficiently at quantum chemical accuracy. In previous work, we have introduced the ML-based PairF-Net approach to neural network potentials, that adopts a pairwise interatomic scheme to predicting forces within a molecular system. Here, we further develop the PairF-Net model to intrinsically incorporate energy conservation and couple the model to a molecular mechanical (MM) environment within the OpenMM package. The updated PairF-Net model yields energy and force predictions and dynamical distributions in good agreement with the rMD17 dataset of ten small organic molecules in the gas-phase. We further show that these in vacuo ML models of small molecules can be applied to force predictions in aqueous solution via hybrid ML/MM simulations. We present a new benchmark dataset for these ten molecules in solution, obtained from QM/MM simulations, which we denote as rMD17-aq (https://zenodo.org/records/10048644); and assess the ability of PairF-Net to reproduce the molecular energy, atomic forces and dynamical distributions of these solution conformations via ML/MM simulations.
Affiliation(s)
- Jas Kalayan
  - Division of Pharmacy and Optometry, School of Health Sciences, University of Manchester, Manchester, UK
- Ismaeel Ramzan
  - Division of Pharmacy and Optometry, School of Health Sciences, University of Manchester, Manchester, UK
  - Neural Circuits and Computations Unit, RIKEN Center for Brain Science, Wako, Japan
- Christopher D Williams
  - Division of Pharmacy and Optometry, School of Health Sciences, University of Manchester, Manchester, UK
- Richard A Bryce
  - Division of Pharmacy and Optometry, School of Health Sciences, University of Manchester, Manchester, UK
- Neil A Burton
  - Department of Chemistry, University of Manchester, Manchester, UK
|
30
|
Pelaez RP, Simeon G, Galvelis R, Mirarchi A, Eastman P, Doerr S, Thölke P, Markland TE, De Fabritiis G. TorchMD-Net 2.0: Fast Neural Network Potentials for Molecular Simulations. J Chem Theory Comput 2024; 20:4076-4087. [PMID: 38743033 DOI: 10.1021/acs.jctc.4c00253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Achieving a balance between computational speed, prediction accuracy, and universal applicability in molecular simulations has been a persistent challenge. This paper presents substantial advancements in TorchMD-Net software, a pivotal step forward in the shift from conventional force fields to neural network-based potentials. The evolution of TorchMD-Net into a more comprehensive and versatile framework is highlighted, incorporating cutting-edge architectures such as TensorNet. This transformation is achieved through a modular design approach, encouraging customized applications within the scientific community. The most notable enhancement is a significant improvement in computational efficiency, achieving a very remarkable acceleration in the computation of energy and forces for TensorNet models, with performance gains ranging from 2× to 10× over previous, nonoptimized, iterations. Other enhancements include highly optimized neighbor search algorithms that support periodic boundary conditions and smooth integration with existing molecular dynamics frameworks. Additionally, the updated version introduces the capability to integrate physical priors, further enriching its application spectrum and utility in research. The software is available at https://github.com/torchmd/torchmd-net.
Affiliation(s)
- Raul P Pelaez
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
- Guillem Simeon
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
- Raimondas Galvelis
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
  - Acellera Labs, C Dr Trueta 183, 08005 Barcelona, Spain
- Antonio Mirarchi
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
- Peter Eastman
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Stefan Doerr
  - Acellera Labs, C Dr Trueta 183, 08005 Barcelona, Spain
- Thomas E Markland
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Gianni De Fabritiis
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
  - Acellera Labs, C Dr Trueta 183, 08005 Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010 Barcelona, Spain
|
31
|
Rezaee M, Ekrami S, Hashemianzadeh SM. Comparing ANI-2x, ANI-1ccx neural networks, force field, and DFT methods for predicting conformational potential energy of organic molecules. Sci Rep 2024; 14:11791. [PMID: 38783010 PMCID: PMC11116541 DOI: 10.1038/s41598-024-62242-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 05/15/2024] [Indexed: 05/25/2024] [Open Access]
Abstract
In this study, the conformational potential energy surfaces of Amylmetacresol, Benzocaine, Dopamine, Betazole, and Betahistine molecules were scanned and analyzed using the neural network architecture ANI-2x and ANI-1ccx, the force field method OPLS, and density functional theory with the exchange-correlation functional B3LYP and the basis set 6-31G(d). The ANI-1ccx and ANI-2x methods demonstrated the highest accuracy in predicting torsional energy profiles, effectively capturing the minimum and maximum values of these profiles. Conformational potential energy values calculated by B3LYP and the OPLS force field method differ from those calculated by ANI-1ccx and ANI-2x, which account for non-bonded intramolecular interactions, since the B3LYP functional and OPLS force field weakly consider van der Waals and other intramolecular forces in torsional energy profiles. For a more comprehensive analysis, electronic parameters such as dipole moment, HOMO, and LUMO energies for different torsional angles were calculated at two levels of theory, B3LYP/6-31G(d) and ωB97X/6-31G(d). These calculations confirmed that ANI predictions are more accurate than density functional theory calculations with B3LYP functional and OPLS force field for determining potential energy surfaces. This research successfully addressed the challenges in determining conformational potential energy levels and shows how machine learning and deep neural networks offer a more accurate, cost-effective, and rapid alternative for predicting torsional energy profiles.
Affiliation(s)
- Mozafar Rezaee
  - Molecular Simulation Research Laboratory, Department of Chemistry, Iran University of Science and Technology, Tehran, Iran
- Saeid Ekrami
  - CNRS, LCPME, Université de Lorraine, 54000, Nancy, France
- Seyed Majid Hashemianzadeh
  - Molecular Simulation Research Laboratory, Department of Chemistry, Iran University of Science and Technology, Tehran, Iran
|
32
|
Xiang W, Zhong F, Ni L, Zheng M, Li X, Shi Q, Wang D. Gram matrix: an efficient representation of molecular conformation and learning objective for molecular pretraining. Brief Bioinform 2024; 25:bbae340. [PMID: 38990515 PMCID: PMC11238115 DOI: 10.1093/bib/bbae340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2024] [Revised: 06/05/2024] [Accepted: 06/28/2024] [Indexed: 07/12/2024] [Open Access]
Abstract
Accurate prediction of molecular properties is fundamental in drug discovery and development, providing crucial guidance for effective drug design. A critical factor in achieving accurate molecular property prediction lies in the appropriate representation of molecular structures. Presently, prevalent deep learning-based molecular representations rely on 2D structure information as the primary molecular representation, often overlooking essential three-dimensional (3D) conformational information due to the inherent limitations of 2D structures in conveying atomic spatial relationships. In this study, we propose employing the Gram matrix as a condensed representation of 3D molecular structures and for efficient pretraining objectives. Subsequently, we leverage this matrix to construct a novel molecular representation model, Pre-GTM, which inherently encapsulates 3D information. The model accurately predicts the 3D structure of a molecule by estimating the Gram matrix. Our findings demonstrate that Pre-GTM model outperforms the baseline Graphormer model and other pretrained models in the QM9 and MoleculeNet quantitative property prediction task. The integration of the Gram matrix as a condensed representation of 3D molecular structure, incorporated into the Pre-GTM model, opens up promising avenues for its potential application across various domains of molecular research, including drug design, materials science, and chemical engineering.
Affiliation(s)
- Feisheng Zhong
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
  - Fujian Key Laboratory of Drug Target Discovery and Structural and Functional Research, School of Pharmacy, Fujian Medical University, Fuzhou 350122, China
- Lin Ni
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Nanjing University of Chinese Medicine, 138 Xianlin Road, Nanjing 210023, China
- Mingyue Zheng
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
  - Nanjing University of Chinese Medicine, 138 Xianlin Road, Nanjing 210023, China
- Xutong Li
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
- Qian Shi
  - Lingang Laboratory, Shanghai 200031, China
|
33
|
Wang G, Wang C, Zhang X, Li Z, Zhou J, Sun Z. Machine learning interatomic potential: Bridge the gap between small-scale models and realistic device-scale simulations. iScience 2024; 27:109673. [PMID: 38646181 PMCID: PMC11033164 DOI: 10.1016/j.isci.2024.109673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024] [Open Access]
Abstract
Machine learning interatomic potential (MLIP) overcomes the challenges of high computational costs in density-functional theory and the relatively low accuracy in classical large-scale molecular dynamics, facilitating more efficient and precise simulations in materials research and design. In this review, the current state of the four essential stages of MLIP is discussed, including data generation methods, material structure descriptors, six unique machine learning algorithms, and available software. Furthermore, the applications of MLIP in various fields are investigated, notably in phase-change memory materials, structure searching, material properties predicting, and the pre-trained universal models. Eventually, the future perspectives, consisting of standard datasets, transferability, generalization, and trade-off between accuracy and complexity in MLIPs, are reported.
Affiliation(s)
- Guanjie Wang
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
  - School of Integrated Circuit Science and Engineering, Beihang University, Beijing 100191, China
- Changrui Wang
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- Xuanguang Zhang
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- Zefeng Li
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- Jian Zhou
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- Zhimei Sun
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
|
34
|
van Gerwen P, Briling KR, Calvino Alonso Y, Franke M, Corminboeuf C. Benchmarking machine-readable vectors of chemical reactions on computed activation barriers. Digit Discov 2024; 3:932-943. [PMID: 38756222 PMCID: PMC11094696 DOI: 10.1039/d3dd00175j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Accepted: 02/28/2024] [Indexed: 05/18/2024]
Abstract
In recent years, there has been a surge of interest in predicting computed activation barriers, to enable the acceleration of the automated exploration of reaction networks. Consequently, various predictive approaches have emerged, ranging from graph-based models to methods based on the three-dimensional structure of reactants and products. In tandem, many representations have been developed to predict experimental targets, which may hold promise for barrier prediction as well. Here, we bring together all of these efforts and benchmark various methods (Morgan fingerprints, the DRFP, the CGR representation-based Chemprop, SLATM_d, B²R²_l, EquiReact and language model BERT + RXNFP) for the prediction of computed activation barriers on three diverse datasets.
Affiliation(s)
- Puck van Gerwen
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne 1015 Lausanne Switzerland
  - National Center for Competence in Research-Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne 1015 Lausanne Switzerland
- Ksenia R Briling
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne 1015 Lausanne Switzerland
- Yannick Calvino Alonso
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne 1015 Lausanne Switzerland
- Malte Franke
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne 1015 Lausanne Switzerland
- Clemence Corminboeuf
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne 1015 Lausanne Switzerland
  - National Center for Competence in Research-Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne 1015 Lausanne Switzerland
|
35
|
Dumortier L, Chizallet C, Creton B, de Bruin T, Verstraelen T. Managing Expectations and Imbalanced Training Data in Reactive Force Field Development: An Application to Water Adsorption on Alumina. J Chem Theory Comput 2024; 20:3779-3797. [PMID: 38639642 DOI: 10.1021/acs.jctc.3c01009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/20/2024]
Abstract
ReaxFF is a computationally efficient model for reactive molecular dynamics simulations that has been applied to a wide variety of chemical systems. When ReaxFF parameters are not yet available for a chemistry of interest, they must be (re)optimized, for which one defines a set of training data that the new ReaxFF parameters should reproduce. ReaxFF training sets typically contain diverse properties with different units, some of which are more abundant (by orders of magnitude) than others. To find the best parameters, one conventionally minimizes a weighted sum of squared errors over all of the data in the training set. One of the challenges in such numerical optimizations is to assign weights so that the optimized parameters represent a good compromise among all the requirements defined in the training set. This work introduces a new loss function, called Balanced Loss, and a workflow that replaces weight assignment with a more manageable procedure. The training data are divided into categories with corresponding "tolerances", i.e., acceptable root-mean-square errors for the categories, which define the expectations for the optimized ReaxFF parameters. Through the Log-Sum-Exp form of Balanced Loss, the parameter optimization is also a validation of one's expectations, providing meaningful feedback that can be used to reconfigure the tolerances if needed. The new methodology is demonstrated with a nontrivial parametrization of ReaxFF for water adsorption on alumina. This results in a new force field that reproduces both the rare and frequent properties of a validation set not used for training. We also demonstrate the robustness of the new force field with a molecular dynamics simulation of water desorption from a γ-Al2O3 slab model.
Affiliation(s)
- Loïc Dumortier
  - IFP Energies nouvelles, 1 et 4 Avenue de Bois-Préau, 92852 Rueil-Malmaison, France
  - Center for Molecular Modeling (CMM), Ghent University, Technologiepark-Zwijnaarde 46, Zwijnaarde, B-9052 Ghent, Belgium
- Céline Chizallet
  - IFP Energies nouvelles, Rond-point de l'échangeur de Solaize, BP3, 69360 Solaize, France
- Benoit Creton
  - IFP Energies nouvelles, 1 et 4 Avenue de Bois-Préau, 92852 Rueil-Malmaison, France
- Theodorus de Bruin
  - IFP Energies nouvelles, 1 et 4 Avenue de Bois-Préau, 92852 Rueil-Malmaison, France
- Toon Verstraelen
  - Center for Molecular Modeling (CMM), Ghent University, Technologiepark-Zwijnaarde 46, Zwijnaarde, B-9052 Ghent, Belgium
|
36
|
Guibourg P, Dontot L, Anglade PM, Gervais B. DFTB Simulation of Charged Clusters Using Machine Learning Charge Inference. J Chem Theory Comput 2024; 20:4007-4018. [PMID: 38690586 DOI: 10.1021/acs.jctc.4c00107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/02/2024]
Abstract
We present a modification to self-consistent charge density functional-based tight binding (SCC-DFTB), which allows computation based on approximate atomic charges. We obtain these charges by means of a machine learning (ML) process that combines a Coulomb model with a neural network. This allows us to avoid the SCC cycles in the SCC-DFTB calculation while maintaining its accuracy. The main input of the model is the atomic positions characterized by a set of atom-centered symmetry functions. The charge inference from our ML algorithm is as close as 10⁻² units of charge from the exact SCC solution. Our ML-DFTB approach provides a good approximation of the density matrix and of the energy and forces with only a single diagonalization. This is a significant computational saving with respect to the complete SCC algorithm, which allows us to investigate a bigger ensemble of atoms. We show the quality of our approach in the case of charged silicon carbide (SiC) clusters. The ML-DFTB potential energy surface (PES) mimics the SCC-DFTB PES rather well, despite its simplicity. This allows us to obtain the same geometric structure ordering with respect to energy for small clusters. The dissociation barriers for ion emission are well-reproduced, which opens the way to investigating ion field emission and charged cluster stability. The ML-DFTB approach is obviously not limited to charged clusters or SiC materials. It opens a new route to investigate larger clusters than those investigated by standard SCC-DFTB, as well as surface and solid-state chemistry at the atomic level.
Affiliation(s)
- Paul Guibourg
  - Laboratoire Cimap, UMR6252─Université de Caen Normandie, École Nationale Supérieure d'Ingénieures de Caen, Commissariat à l'Énergie Atomique, Centre National de la Recherche Scientifique, 6 Boulevard Du Maréchal Juin, 14050 Caen Cedex, France
- Léo Dontot
  - Laboratoire Cimap, UMR6252─Université de Caen Normandie, École Nationale Supérieure d'Ingénieures de Caen, Commissariat à l'Énergie Atomique, Centre National de la Recherche Scientifique, 6 Boulevard Du Maréchal Juin, 14050 Caen Cedex, France
- Pierre-Matthieu Anglade
  - Laboratoire Cimap, UMR6252─Université de Caen Normandie, École Nationale Supérieure d'Ingénieures de Caen, Commissariat à l'Énergie Atomique, Centre National de la Recherche Scientifique, 6 Boulevard Du Maréchal Juin, 14050 Caen Cedex, France
- Benoit Gervais
  - Laboratoire Cimap, UMR6252─Université de Caen Normandie, École Nationale Supérieure d'Ingénieures de Caen, Commissariat à l'Énergie Atomique, Centre National de la Recherche Scientifique, 6 Boulevard Du Maréchal Juin, 14050 Caen Cedex, France
|
37
|
Wan K, He J, Shi X. Construction of High Accuracy Machine Learning Interatomic Potential for Surface/Interface of Nanomaterials-A Review. Adv Mater 2024; 36:e2305758. [PMID: 37640376 DOI: 10.1002/adma.202305758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 08/24/2023] [Indexed: 08/31/2023]
Abstract
The inherent discontinuity and unique dimensional attributes of nanomaterial surfaces and interfaces bestow them with various exceptional properties. These properties, however, also introduce difficulties for both experimental and computational studies. The advent of machine learning interatomic potential (MLIP) addresses some of the limitations associated with empirical force fields, presenting a valuable avenue for accurate simulations of these surfaces/interfaces of nanomaterials. Central to this approach is the idea of capturing the relationship between system configuration and potential energy, leveraging the proficiency of machine learning (ML) to precisely approximate high-dimensional functions. This review offers an in-depth examination of MLIP principles and their execution and elaborates on their applications in the realm of nanomaterial surface and interface systems. The prevailing challenges faced by this potent methodology are also discussed.
Affiliation(s)
- Kaiwei Wan
  - Laboratory of Theoretical and Computational Nanoscience, National Center for Nanoscience and Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Jianxin He
  - Laboratory of Theoretical and Computational Nanoscience, National Center for Nanoscience and Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Xinghua Shi
  - Laboratory of Theoretical and Computational Nanoscience, National Center for Nanoscience and Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
|
38
|
Chen M, Jiang X, Zhang L, Chen X, Wen Y, Gu Z, Li X, Zheng M. The emergence of machine learning force fields in drug design. Med Res Rev 2024; 44:1147-1182. [PMID: 38173298 DOI: 10.1002/med.22008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2023] [Revised: 11/29/2023] [Accepted: 12/05/2023] [Indexed: 01/05/2024]
Abstract
In the field of molecular simulation for drug design, traditional molecular mechanic force fields and quantum chemical theories have been instrumental but limited in terms of scalability and computational efficiency. To overcome these limitations, machine learning force fields (MLFFs) have emerged as a powerful tool capable of balancing accuracy with efficiency. MLFFs rely on the relationship between molecular structures and potential energy, bypassing the need for a preconceived notion of interaction representations. Their accuracy depends on the machine learning models used, and the quality and volume of training data sets. With recent advances in equivariant neural networks and high-quality datasets, MLFFs have significantly improved their performance. This review explores MLFFs, emphasizing their potential in drug design. It elucidates MLFF principles, provides development and validation guidelines, and highlights successful MLFF implementations. It also addresses potential challenges in developing and applying MLFFs. The review concludes by illuminating the path ahead for MLFFs, outlining the challenges to be overcome and the opportunities to be harnessed. This inspires researchers to embrace MLFFs in their investigations as a new tool to perform molecular simulations in drug design.
Affiliation(s)
- Mingan Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Physical Science and Technology, ShanghaiTech University, Shanghai, China
- Lingang Laboratory, Shanghai, China
| | - Xinyu Jiang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
| | - Lehan Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
| | - Xiaoxu Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, China
| | - Yiming Wen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, China
| | - Zhiyong Gu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, China
| | - Xutong Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
| | - Mingyue Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, China
| |
|
39
|
Song K, Upadhyay M, Meuwly M. OH-Formation following vibrationally induced reaction dynamics of H2COO. Phys Chem Chem Phys 2024; 26:12698-12708. [PMID: 38602285 DOI: 10.1039/d4cp00739e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/12/2024]
Abstract
The reaction dynamics of H2COO to form HCOOH and dioxirane as first steps for OH-elimination is quantitatively investigated. Using a machine learned potential energy surface (PES) at the CASPT2/aug-cc-pVTZ level of theory, vibrational excitation along the CH-normal mode νCH with energies up to 40.0 kcal mol⁻¹ (∼5νCH) leads almost exclusively to HCOOH which further decomposes into OH + HCO. Although the barrier to form dioxirane is only 21.4 kcal mol⁻¹, the reaction probability to form dioxirane is two orders of magnitude lower if the CH-stretch mode is excited. Following the dioxirane-formation pathway is facile, however, if the COO-bend vibration is excited together with energies equivalent to ∼2νCH or ∼3νCOO. For OH-formation in the atmosphere the pathway through HCOOH is probably most relevant because the alternative pathways (through dioxirane or formic acid) involve several intermediates that can de-excite through collisions, relax via internal vibrational relaxation (IVR), or pass through loose and vulnerable transition states (formic acid). This work demonstrates how, by selectively exciting particular vibrational modes, it is possible to dial into desired reaction channels with a high degree of specificity.
Affiliation(s)
- Kaisheng Song
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland.
- School of Chemistry and Chemical Engineering & Chongqing Key Laboratory of Theoretical and Computational Chemistry, Chongqing University, Chongqing 401331, China
| | - Meenu Upadhyay
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland.
| | - Markus Meuwly
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland.
| |
|
40
|
Houston PL, Qu C, Yu Q, Pandey P, Conte R, Nandi A, Bowman JM. No Headache for PIPs: A PIP Potential for Aspirin Runs Much Faster and with Similar Precision Than Other Machine-Learned Potentials. J Chem Theory Comput 2024; 20:3008-3018. [PMID: 38593438 DOI: 10.1021/acs.jctc.4c00054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/11/2024]
Abstract
Assessments of machine-learning (ML) potentials are an important aspect of the rapid development of this field. We recently reported an assessment of the linear-regression permutationally invariant polynomial (PIP) method for ethanol, using the widely used (revised) rMD17 data set. We demonstrated that the PIP approach outperformed numerous other methods, e.g., ANI, PhysNet, sGDML, and p-KRR, with respect to precision and notably with respect to speed [Houston et al., J. Chem. Phys. 2022, 156, 044120]. Here, we extend this assessment to the 21-atom aspirin molecule, using the rMD17 data set, with a focus on the speed of evaluation. Both energies and forces are used for training, and the precision of several PIPs is examined for both. Normal mode frequencies, the methyl torsional potential, and 1d vibrational energies for an OH stretch are presented. We show that the PIP approach achieves the level of precision obtained from other ML methods, e.g., atom-centered neural network methods, linear regression ACE, and kernel methods, as reported by Kovács et al. in J. Chem. Theory Comput. 2021, 17, 7696-7711. More significantly, we show that the PIP PESs run much faster than all other ML methods, whose timings were evaluated in that paper. We also show that the PIP PES extrapolates well enough to describe several internal motions of aspirin, including an OH stretch.
Affiliation(s)
- Paul L Houston
- Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States
- Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Chen Qu
- Independent Researcher, Toronto, Ontario M9B0E3, Canada
| | - Qi Yu
- Department of Chemistry, Fudan University, Shanghai 200438, P. R. China
| | - Priyanka Pandey
- Department of Chemistry, Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
| | - Riccardo Conte
- Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
| | - Apurba Nandi
- Department of Chemistry, Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
- Department of Physics and Materials Science, University of Luxembourg, Luxembourg City L-1511, Luxembourg
| | - Joel M Bowman
- Department of Chemistry, Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
| |
|
41
|
Jin H, Merz KM. Modeling Zinc Complexes Using Neural Networks. J Chem Inf Model 2024; 64:3140-3148. [PMID: 38587510 DOI: 10.1021/acs.jcim.4c00095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
Understanding the energetic landscapes of large molecules is necessary for the study of chemical and biological systems. Recently, deep learning has greatly accelerated the development of models based on quantum chemistry, making it possible to build potential energy surfaces and explore chemical space. However, most of this work has focused on organic molecules due to the simplicity of their electronic structures as well as the availability of data sets. In this work, we build a deep learning architecture to model the energetics of zinc organometallic complexes. To achieve this, we have compiled a configurationally and conformationally diverse data set of zinc complexes using metadynamics to overcome the limitations of traditional sampling methods. In terms of the neural network potentials, our results indicate that for zinc complexes, partial charges play an important role in modeling the long-range interactions with a neural network. Our developed model outperforms semiempirical methods in predicting the relative energy of zinc conformers, yielding a mean absolute error (MAE) of 1.32 kcal/mol with reference to the double-hybrid PWPB95 method.
Affiliation(s)
- Hongni Jin
- Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
| | - Kenneth M Merz
- Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
| |
|
42
|
Gallegos M, Isamura BK, Popelier PLA, Martín Pendás Á. An Unsupervised Machine Learning Approach for the Automatic Construction of Local Chemical Descriptors. J Chem Inf Model 2024; 64:3059-3079. [PMID: 38498942 DOI: 10.1021/acs.jcim.3c01906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/20/2024]
Abstract
Condensing the many physical variables defining a chemical system into a fixed-size array poses a significant challenge in the development of chemical Machine Learning (ML). Atom Centered Symmetry Functions (ACSFs) offer an intuitive featurization approach by means of a tedious and labor-intensive selection of tunable parameters. In this work, we implement an unsupervised ML strategy relying on a Gaussian Mixture Model (GMM) to automatically optimize the ACSF parameters. GMMs effortlessly decompose the vastness of the chemical and conformational spaces into well-defined radial and angular clusters, which are then used to build tailor-made ACSFs. The unsupervised exploration of the space has demonstrated general applicability across a diverse range of systems, spanning from various unimolecular landscapes to heterogeneous databases. The impact of the sampling technique and temperature on space exploration is also addressed, highlighting the particularly advantageous role of high-temperature Molecular Dynamics (MD) simulations. The reliability of the resulting features is assessed through the estimation of the atomic charges of a prototypical capped amino acid and a heterogeneous collection of CHON molecules. The automatically constructed ACSFs serve as high-quality descriptors, consistently yielding typical prediction errors below 0.010 electrons bound for the reported atomic charges. Altering the spatial distribution of the functions with respect to the cluster highlights the critical role of symmetry rupture in achieving significantly improved features. More specifically, using two separate functions to describe the lower and upper tails of the cluster results in the best performing models with errors as low as 0.006 electrons. 
Finally, the effectiveness of finely tuned features was checked across different architectures, unveiling the superior performance of Gaussian Process (GP) models over Feed Forward Neural Networks (FFNNs), particularly in low-data regimes, with nearly a 2-fold increase in prediction quality. Altogether, this approach paves the way toward an easier construction of local chemical descriptors, while providing valuable insights into how radial and angular spaces should be mapped. Finally, this work opens the possibility of encoding many-body information beyond angular terms into upcoming ML features.
Affiliation(s)
- Miguel Gallegos
- Department of Analytical and Physical Chemistry, University of Oviedo, Oviedo E-33006, Spain
| | | | - Paul L A Popelier
- Department of Chemistry, The University of Manchester, Oxford Road, Manchester M13 9PL, U.K
| | - Ángel Martín Pendás
- Department of Analytical and Physical Chemistry, University of Oviedo, Oviedo E-33006, Spain
| |
|
43
|
Atz K, Cotos L, Isert C, Håkansson M, Focht D, Hilleke M, Nippa DF, Iff M, Ledergerber J, Schiebroek CCG, Romeo V, Hiss JA, Merk D, Schneider P, Kuhn B, Grether U, Schneider G. Prospective de novo drug design with deep interactome learning. Nat Commun 2024; 15:3408. [PMID: 38649351 PMCID: PMC11035696 DOI: 10.1038/s41467-024-47613-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Accepted: 04/02/2024] [Indexed: 04/25/2024] Open
Abstract
De novo drug design aims to generate molecules from scratch that possess specific chemical and pharmacological properties. We present a computational approach utilizing interactome-based deep learning for ligand- and structure-based generation of drug-like molecules. This method capitalizes on the unique strengths of both graph neural networks and chemical language models, offering an alternative to the need for application-specific reinforcement, transfer, or few-shot learning. It enables the "zero-shot" construction of compound libraries tailored to possess specific bioactivity, synthesizability, and structural novelty. In order to proactively evaluate the deep interactome learning framework for protein structure-based drug design, potential new ligands targeting the binding site of the human peroxisome proliferator-activated receptor (PPAR) subtype gamma are generated. The top-ranking designs are chemically synthesized and computationally, biophysically, and biochemically characterized. Potent PPAR partial agonists are identified, demonstrating favorable activity and the desired selectivity profiles for both nuclear receptors and off-target interactions. Crystal structure determination of the ligand-receptor complex confirms the anticipated binding mode. This successful outcome positively advocates interactome-based de novo design for application in bioorganic and medicinal chemistry, enabling the creation of innovative bioactive molecules.
Affiliation(s)
- Kenneth Atz
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Leandro Cotos
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Clemens Isert
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Maria Håkansson
- SARomics Biostructures AB, Medicon Village, SE-223 81, Lund, Sweden
| | - Dorota Focht
- SARomics Biostructures AB, Medicon Village, SE-223 81, Lund, Sweden
| | - Mattis Hilleke
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - David F Nippa
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, CH-4070, Basel, Switzerland
- Department of Pharmacy, Ludwig-Maximilians-Universität München, Butenandtstrasse 5, 81377, Munich, Germany
| | - Michael Iff
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Jann Ledergerber
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Carl C G Schiebroek
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Valentina Romeo
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, CH-4070, Basel, Switzerland
| | - Jan A Hiss
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Daniel Merk
- Department of Pharmacy, Ludwig-Maximilians-Universität München, Butenandtstrasse 5, 81377, Munich, Germany
| | - Petra Schneider
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Bernd Kuhn
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, CH-4070, Basel, Switzerland
| | - Uwe Grether
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, CH-4070, Basel, Switzerland
| | - Gisbert Schneider
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland.
| |
|
44
|
Zills F, Schäfer MR, Segreto N, Kästner J, Holm C, Tovey S. Collaboration on Machine-Learned Potentials with IPSuite: A Modular Framework for Learning-on-the-Fly. J Phys Chem B 2024; 128:3662-3676. [PMID: 38568231 DOI: 10.1021/acs.jpcb.3c07187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/19/2024]
Abstract
The field of machine learning potentials has experienced a rapid surge in progress, thanks to advances in machine learning theory, algorithms, and hardware capabilities. While the underlying methods are continuously evolving, the infrastructure for their deployment has lagged. The community, due to these rapid developments, frequently finds itself split into groups built around different implementations of machine-learned potentials. In this work, we introduce IPSuite, a Python-driven software package designed to connect different methods and algorithms from the comprehensive field of machine-learned potentials into a single platform while also providing a collaborative infrastructure, helping ensure reproducibility. Furthermore, the data management infrastructure of the IPSuite code enables simple model sharing and deployment in simulations. Currently, IPSuite supports six state-of-the-art machine learning approaches for the fitting of interatomic potentials as well as a variety of methods for the selection of training data, running of ab initio calculations, learning-on-the-fly strategies, model evaluation, and simulation deployment.
Affiliation(s)
- Fabian Zills
- Institute for Computational Physics, University of Stuttgart, 70569 Stuttgart, Germany
| | - Moritz René Schäfer
- Institute for Theoretical Chemistry, University of Stuttgart, 70569 Stuttgart, Germany
| | - Nico Segreto
- Institute for Theoretical Chemistry, University of Stuttgart, 70569 Stuttgart, Germany
| | - Johannes Kästner
- Institute for Theoretical Chemistry, University of Stuttgart, 70569 Stuttgart, Germany
| | - Christian Holm
- Institute for Computational Physics, University of Stuttgart, 70569 Stuttgart, Germany
| | - Samuel Tovey
- Institute for Computational Physics, University of Stuttgart, 70569 Stuttgart, Germany
| |
|
45
|
Pan X, Snyder R, Wang JN, Lander C, Wickizer C, Van R, Chesney A, Xue Y, Mao Y, Mei Y, Pu J, Shao Y. Training machine learning potentials for reactive systems: A Colab tutorial on basic models. J Comput Chem 2024; 45:638-647. [PMID: 38082539 PMCID: PMC10923003 DOI: 10.1002/jcc.27269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 11/10/2023] [Accepted: 11/11/2023] [Indexed: 01/18/2024]
Abstract
In the last several years, there has been a surge in the development of machine learning potential (MLP) models for describing molecular systems. We are interested in a particular area of this field - the training of system-specific MLPs for reactive systems - with the goal of using these MLPs to accelerate free energy simulations of chemical and enzyme reactions. To help new members in our labs become familiar with the basic techniques, we have put together a self-guided Colab tutorial (https://cc-ats.github.io/mlp_tutorial/), which we expect to be also useful to other young researchers in the community. Our tutorial begins with the introduction of simple feedforward neural network (FNN) and kernel-based (using Gaussian process regression, GPR) models by fitting the two-dimensional Müller-Brown potential. Subsequently, two simple descriptors are presented for extracting features of molecular systems: symmetry functions (including the ANI variant) and embedding neural networks (such as DeepPot-SE). Lastly, these features will be fed into FNN and GPR models to reproduce the energies and forces for the molecular configurations in a Claisen rearrangement reaction.
Affiliation(s)
- Xiaoliang Pan
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
| | - Ryan Snyder
- Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, USA
| | - Jia-Ning Wang
- State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
| | - Chance Lander
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
| | - Carly Wickizer
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
| | - Richard Van
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Laboratory of Computational Biology, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD 20824, USA
| | - Andrew Chesney
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
| | - Yuanfei Xue
- State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
| | - Yuezhi Mao
- Department of Chemistry and Biochemistry, San Diego State University, San Diego, CA 92182, USA
| | - Ye Mei
- State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
- Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi 030006, China
| | - Jingzhi Pu
- Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, USA
| | - Yihan Shao
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
| |
|
46
|
Fong KD, Sumić B, O’Neill N, Schran C, Grey CP, Michaelides A. The Interplay of Solvation and Polarization Effects on Ion Pairing in Nanoconfined Electrolytes. Nano Lett 2024; 24. [PMID: 38592099 PMCID: PMC11057028 DOI: 10.1021/acs.nanolett.4c00890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 04/03/2024] [Accepted: 04/03/2024] [Indexed: 04/10/2024]
Abstract
The nature of ion-ion interactions in electrolytes confined to nanoscale pores has important implications for energy storage and separation technologies. However, the physical effects dictating the structure of nanoconfined electrolytes remain debated. Here we employ machine-learning-based molecular dynamics simulations to investigate ion-ion interactions with density functional theory level accuracy in a prototypical confined electrolyte, aqueous NaCl within graphene slit pores. We find that the free energy of ion pairing in highly confined electrolytes deviates substantially from that in bulk solutions, observing a decrease in contact ion pairing but an increase in solvent-separated ion pairing. These changes arise from an interplay of ion solvation effects and graphene's electronic structure. Notably, the behavior observed from our first-principles-level simulations is not reproduced even qualitatively with the classical force fields conventionally used to model these systems. The insight provided in this work opens new avenues for predicting and controlling the structure of nanoconfined electrolytes.
Affiliation(s)
- Kara D. Fong
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Barbara Sumić
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Niamh O’Neill
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Christoph Schran
- Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge CB3 0HE, United Kingdom
| | - Clare P. Grey
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Angelos Michaelides
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| |
|
47
|
Tkaczyk S, Karwounopoulos J, Schöller A, Woodcock HL, Langer T, Boresch S, Wieder M. Reweighting from Molecular Mechanics Force Fields to the ANI-2x Neural Network Potential. J Chem Theory Comput 2024; 20:2719-2728. [PMID: 38527958 DOI: 10.1021/acs.jctc.3c01274] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/27/2024]
Abstract
To achieve chemical accuracy in free energy calculations, it is necessary to accurately describe the system's potential energy surface and efficiently sample configurations from its Boltzmann distribution. While neural network potentials (NNPs) have shown significantly higher accuracy than classical molecular mechanics (MM) force fields, they have a limited range of applicability and are considerably slower than MM potentials, often by orders of magnitude. To address this challenge, Rufa et al. [Rufa et al. bioRxiv 2020, 10.1101/2020.07.29.227959.] suggested a two-stage approach that uses a fast and established MM alchemical energy protocol, followed by reweighting the results using NNPs, known as endstate correction or indirect free energy calculation. This study systematically investigates the accuracy and robustness of reweighting from an MM reference to a neural network target potential (ANI-2x) for an established data set in vacuum, using single-step free-energy perturbation (FEP) and nonequilibrium (NEQ) switching simulation. We assess the influence of longer switching lengths and the impact of slow degrees of freedom on outliers in the work distribution and compare the results to those of multistate equilibrium free energy simulations. Our results demonstrate that free energy calculations between NNPs and MM potentials should be preferably performed using NEQ switching simulations to obtain accurate free energy estimates. NEQ switching simulations between the MM potentials and NNPs are efficient, robust, and trivial to implement.
Affiliation(s)
- Sara Tkaczyk
- Department of Pharmaceutical Sciences, Pharmaceutical Chemistry Division, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna, 1090 Vienna, Austria
| | - Johannes Karwounopoulos
- Faculty of Chemistry, Institute of Computational Biological Chemistry, University of Vienna, Währingerstrasse 17, 1090 Vienna, Austria
- Vienna Doctoral School of Chemistry (DoSChem), University of Vienna, Währingerstrasse 42, 1090 Vienna, Austria
| | - Andreas Schöller
- Faculty of Chemistry, Institute of Computational Biological Chemistry, University of Vienna, Währingerstrasse 17, 1090 Vienna, Austria
- Vienna Doctoral School of Chemistry (DoSChem), University of Vienna, Währingerstrasse 42, 1090 Vienna, Austria
| | - H Lee Woodcock
- Department of Chemistry, University of South Florida, 4202 E. Fowler Ave., CHE205, Tampa, Florida 33620-5250, United States
| | - Thierry Langer
- Department of Pharmaceutical Sciences, Pharmaceutical Chemistry Division, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
| | - Stefan Boresch
- Faculty of Chemistry, Institute of Computational Biological Chemistry, University of Vienna, Währingerstrasse 17, 1090 Vienna, Austria
| | - Marcus Wieder
- Faculty of Chemistry, Institute of Computational Biological Chemistry, University of Vienna, Währingerstrasse 17, 1090 Vienna, Austria
| |
|
48
|
Sai L, Fu L, Zhao J. Predicting Binding Energies and Electronic Properties of Boron Nitride Fullerenes Using a Graph Convolutional Network. J Chem Inf Model 2024; 64:2645-2653. [PMID: 38117935 DOI: 10.1021/acs.jcim.3c01708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/22/2023]
Abstract
As isoelectronic counterparts of carbon fullerenes, medium-sized boron nitride clusters also prefer cage structures composed of even-sized polygons. As the cluster size increases, the number of cage isomers grows rapidly, and determining the ground state structure requires a tremendous amount of DFT calculations. Herein, we develop a graph convolutional network (GCN) that can describe the energy of a (BN)n cage by its topology connection. We define a vertex feature vector on a dual polyhedron by the permutation of the neighbor vertices' degree and aggregate the information on vertices by two graph convolutional layers to learn the local feature of the dual polyhedron. The GCN is trained on (BN)28 and subsequently tested on (BN)23 and (BN)24 data sets, which satisfactorily reproduce the order of isomer energies from DFT calculations. We further employ the trained GCN to predict the ground state structures within the size range of n = 25-32, which agree well with DFT results. Using the same GCN framework, we also successfully trained the highest-occupied or lowest-unoccupied orbital energies of (BN)28 isomers. The present graph convolutional network establishes a direct mapping between the topological connection and the energetic or electronic properties of a cage-like cluster or molecule.
Affiliation(s)
- Linwei Sai
- Department of Mathematics, Hohai University, Changzhou 213200, China
| | - Li Fu
- Key Laboratory of Materials Modification by Laser, Ion and Electron Beams, Dalian University of Technology, Ministry of Education, Dalian 116024, China
| | - Jijun Zhao
- Key Laboratory of Materials Modification by Laser, Ion and Electron Beams, Dalian University of Technology, Ministry of Education, Dalian 116024, China
| |
|
49
|
Unke OT, Stöhr M, Ganscha S, Unterthiner T, Maennel H, Kashubin S, Ahlin D, Gastegger M, Medrano Sandonas L, Berryman JT, Tkatchenko A, Müller KR. Biomolecular dynamics with machine-learned quantum-mechanical force fields trained on diverse chemical fragments. Sci Adv 2024; 10:eadn4397. [PMID: 38579003 DOI: 10.1126/sciadv.adn4397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Accepted: 02/29/2024] [Indexed: 04/07/2024]
Abstract
The GEMS method enables molecular dynamics simulations of large heterogeneous systems at ab initio quality.
Affiliation(s)
- Oliver T Unke
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - DFG Cluster of Excellence "Unifying Systems in Catalysis" (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
- Martin Stöhr
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Stefan Ganscha
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Thomas Unterthiner
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Hartmut Maennel
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Sergii Kashubin
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Daniel Ahlin
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Michael Gastegger
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - DFG Cluster of Excellence "Unifying Systems in Catalysis" (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
  - BASLEARN - TU Berlin/BASF Joint Lab for Machine Learning, Technische Universität Berlin, 10587 Berlin, Germany
- Leonardo Medrano Sandonas
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Joshua T Berryman
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Alexandre Tkatchenko
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Klaus-Robert Müller
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul 02841, Korea
  - Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123 Saarbrücken, Germany
  - BIFOLD - Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
50
Xie P, Car R, E W. Ab initio generalized Langevin equation. Proc Natl Acad Sci U S A 2024; 121:e2308668121. [PMID: 38551836 PMCID: PMC10998567 DOI: 10.1073/pnas.2308668121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Accepted: 02/22/2024] [Indexed: 04/08/2024] Open
Abstract
We introduce a machine learning-based approach called ab initio generalized Langevin equation (AIGLE) to model the dynamics of slow collective variables (CVs) in materials and molecules. In this scheme, the parameters are learned from atomistic simulations based on ab initio quantum mechanical models. Force field, memory kernel, and noise generator are constructed in the context of the Mori-Zwanzig formalism, under the constraint of the fluctuation-dissipation theorem. Combined with deep potential molecular dynamics and electronic density functional theory, this approach opens the way to multiscale modeling in a variety of situations. Here, we demonstrate this capability with a study of two mesoscale processes in crystalline lead titanate, namely the field-driven dynamics of a planar ferroelectric domain wall, and the dynamics of an extensive lattice of coarse-grained electric dipoles. In the first case, AIGLE extends the reach of ab initio simulations to a regime of noise-driven motions not accessible to molecular dynamics. In the second case, AIGLE deals with an extensive set of CVs by adopting a local approximation for the memory kernel and retaining only short-range noise correlations. The scheme is computationally more efficient than molecular dynamics by several orders of magnitude and mimics the microscopic dynamics at low frequencies where it reproduces accurately the dominant far-infrared absorption frequency.
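For orientation, the generic textbook form of a generalized Langevin equation for a single collective variable $q$ with effective mass $M$ is given below, together with the fluctuation-dissipation constraint that ties the noise to the memory kernel. The symbols $M$, $F$, $K$, and $R$ are chosen here for illustration. The abstract does not state the exact parametrization used in AIGLE, so this is a sketch of the general structure only.

$$
M\,\ddot{q}(t) = F\big(q(t)\big) - \int_0^{t} K(t-s)\,\dot{q}(s)\,\mathrm{d}s + R(t),
\qquad
\langle R(t)\,R(t') \rangle = k_{\mathrm{B}} T\, K\big(|t-t'|\big).
$$

Here $F$ plays the role of the force field, $K$ is the memory kernel, and $R$ is the noise, which correspond to the three components the abstract says are learned from atomistic data.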
Affiliation(s)
- Pinchen Xie
  - Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ 08544
- Roberto Car
  - Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ 08544
  - Department of Chemistry and Princeton Materials Institute, Princeton University, Princeton, NJ 08544
  - Department of Physics, Princeton University, Princeton, NJ 08544
- Weinan E
  - AI for Science Institute, Beijing 100080, China
  - Center for Machine Learning Research and School of Mathematical Sciences, Peking University, Beijing 100084, China