1
Simeon G, Mirarchi A, Pelaez RP, Galvelis R, De Fabritiis G. Broadening the Scope of Neural Network Potentials through Direct Inclusion of Additional Molecular Attributes. J Chem Theory Comput 2025; 21:1831-1837. [PMID: 39933873 DOI: 10.1021/acs.jctc.4c01625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/13/2025]
Abstract
Most state-of-the-art neural network potentials do not account for molecular attributes other than atomic numbers and positions, which limits their range of applicability by design. In this work, we demonstrate the importance of including additional electronic attributes in neural network potential representations with a minimal architectural change to TensorNet, a state-of-the-art equivariant model based on Cartesian rank-2 tensor representations. By performing experiments on both custom-made and public benchmarking data sets, we show that this modification resolves input degeneracy issues stemming from the use of atomic numbers and positions alone, while enhancing the model's predictive accuracy across diverse chemical systems with different charge or spin states. This is accomplished without tailored strategies or the inclusion of physics-based energy terms, while maintaining efficiency and accuracy. These findings should furthermore encourage researchers to train and use models incorporating these additional representations.
Affiliation(s)
- Guillem Simeon
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
- Antonio Mirarchi
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
- Raul P Pelaez
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
- Raimondas Galvelis
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
  - Acellera Labs, C Dr Trueta 183, 08005 Barcelona, Spain
- Gianni De Fabritiis
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
  - Acellera Labs, C Dr Trueta 183, 08005 Barcelona, Spain
  - Institució Catalana de Recerca I Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010 Barcelona, Spain
2
Chen J, Gao Q, Huang M, Yu K. Application of modern artificial intelligence techniques in the development of organic molecular force fields. Phys Chem Chem Phys 2025; 27:2294-2319. [PMID: 39820957 DOI: 10.1039/d4cp02989e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2025]
Abstract
The molecular force field (FF) determines the accuracy of molecular dynamics (MD) and is one of the major bottlenecks that limit the application of MD in molecular design. Recently, artificial intelligence (AI) techniques, such as machine-learning potentials (MLPs), have been rapidly reshaping the landscape of MD. Meanwhile, organic molecular systems feature unique characteristics and require more careful treatment in model construction, optimization, and validation. While an accurate and generic organic molecular force field is still missing, significant progress has been made with the facilitation of AI, warranting a promising future. In this review, we provide an overview of the various types of AI techniques used in molecular FF development and discuss both the advantages and weaknesses of these methodologies. We show how AI methods provide unprecedented capabilities in many tasks such as potential fitting, atom typification, and automatic optimization. Meanwhile, it is also worth noting that more efforts are needed to improve the transferability of the model, develop a more comprehensive database, and establish more standardized validation procedures. With these discussions, we hope to inspire more efforts to solve the existing problems, eventually leading to the birth of next-generation generic organic FFs.
Affiliation(s)
- Junmin Chen
  - Institute of Materials Research (IMR), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
  - Tsinghua-Berkeley Shenzhen Institute (TBSI), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Qian Gao
  - Institute of Materials Research (IMR), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Miaofei Huang
  - Institute of Materials Research (IMR), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Kuang Yu
  - Institute of Materials Research (IMR), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
  - Tsinghua-Berkeley Shenzhen Institute (TBSI), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
3
Arattu Thodika A, Pan X, Shao Y, Nam K. Machine Learning Quantum Mechanical/Molecular Mechanical Potentials: Evaluating Transferability in Dihydrofolate Reductase-Catalyzed Reactions. J Chem Theory Comput 2025; 21:817-832. [PMID: 39815393 PMCID: PMC11781312 DOI: 10.1021/acs.jctc.4c01487] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2024] [Revised: 12/30/2024] [Accepted: 01/03/2025] [Indexed: 01/18/2025]
Abstract
Integrating machine learning potentials (MLPs) with quantum mechanical/molecular mechanical (QM/MM) free energy simulations has emerged as a powerful approach for studying enzymatic catalysis. However, its practical application has been hindered by the time-consuming process of generating the necessary training, validation, and test data for MLP models through QM/MM simulations. Furthermore, the entire process needs to be repeated for each specific enzyme system and reaction. To overcome this bottleneck, trained MLPs must exhibit transferability across different enzyme environments and reacting species, thereby eliminating the need for retraining with each new enzyme variant. In this study, we explore this potential by evaluating the transferability of a pretrained ΔMLP model across different enzyme mutations within the MM environment using the QM/MM-based ML architecture developed by Pan et al. (J. Chem. Theory Comput. 2021, 17(9), 5745-5758). The study includes scenarios such as single point substitutions, a homologous enzyme from a different species, and even a transition to an aqueous environment, where the last two systems have MM environments that are substantially different from that used in MLP training. The results show that the ΔMLP model effectively captures and predicts the effects of enzyme mutations on electrostatic interactions, producing reliable free energy profiles of enzyme-catalyzed reactions without the need for retraining. The study also identified notable limitations in transferability, particularly when transitioning from enzyme to water-rich MM environments. Overall, this study demonstrates the robustness of Pan et al.'s QM/MM-based ML architecture for application to diverse enzyme systems, as well as the need for further research and the development of more sophisticated MLP models and training methods.
Affiliation(s)
- Abdul Raafik Arattu Thodika
  - Department of Chemistry and Biochemistry, University of Texas at Arlington, Arlington, Texas 76019, United States
- Xiaoliang Pan
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73019, United States
- Yihan Shao
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73019, United States
- Kwangho Nam
  - Department of Chemistry and Biochemistry, University of Texas at Arlington, Arlington, Texas 76019, United States
  - Division of Data Science, University of Texas at Arlington, Arlington, Texas 76019, United States
4
Stolte N, Daru J, Forbert H, Marx D, Behler J. Random Sampling Versus Active Learning Algorithms for Machine Learning Potentials of Quantum Liquid Water. J Chem Theory Comput 2025; 21:886-899. [PMID: 39808506 DOI: 10.1021/acs.jctc.4c01382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2025]
Abstract
Training accurate machine learning potentials requires electronic structure data comprehensively covering the configurational space of the system of interest. As the construction of this data is computationally demanding, many schemes for identifying the most important structures have been proposed. Here, we compare the performance of high-dimensional neural network potentials (HDNNPs) for quantum liquid water at ambient conditions trained to data sets constructed using random sampling as well as various flavors of active learning based on query by committee. Contrary to the common understanding of active learning, we find that for a given data set size, random sampling leads to smaller test errors for structures not included in the training process. In our analysis, we show that this can be related to small energy offsets caused by a bias in structures added in active learning, which can be overcome by using instead energy correlations as an error measure that is invariant to such shifts. Still, all HDNNPs yield very similar and accurate structural properties of quantum liquid water, which demonstrates the robustness of the training procedure with respect to the training set construction algorithm even when trained to as few as 200 structures. However, we find that for active learning based on preliminary potentials, a reasonable initial data set is important to avoid an unnecessary extension of the covered configuration space to less relevant regions.
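The remark about energy offsets can be illustrated with a small synthetic sketch. This is not the authors' data or code: the arrays, the noise level, and the `offset` value are made-up assumptions, and the Pearson coefficient stands in for whatever correlation measure the paper actually uses. It only shows the general point that a constant shift inflates the RMSE while leaving a correlation-based measure unchanged.

```python
import numpy as np


def rmse(pred, ref):
    """Root-mean-square error; sensitive to a constant energy offset."""
    return float(np.sqrt(np.mean((pred - ref) ** 2)))


def pearson(pred, ref):
    """Pearson correlation; invariant to adding a constant to pred."""
    return float(np.corrcoef(pred, ref)[0, 1])


# Synthetic "reference" and "predicted" energies (hypothetical values).
rng = np.random.default_rng(0)
ref = rng.normal(size=200)
pred = ref + rng.normal(scale=0.05, size=200)

# Apply a constant offset, mimicking a systematic energy shift.
offset = 0.5
shifted = pred + offset

rmse_plain = rmse(pred, ref)
rmse_shifted = rmse(shifted, ref)
r_plain = pearson(pred, ref)
r_shifted = pearson(shifted, ref)

print(f"RMSE without/with offset: {rmse_plain:.3f} / {rmse_shifted:.3f}")
print(f"Pearson r without/with offset: {r_plain:.6f} / {r_shifted:.6f}")
```

Forces, which are derivatives of the energy, are likewise unaffected by a constant offset, which is consistent with the abstract's observation that structural properties remain accurate.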
Affiliation(s)
- Nore Stolte
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, Bochum 44780, Germany
- János Daru
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, Bochum 44780, Germany
  - Department of Organic Chemistry, Eötvös Loránd University, Budapest 1117, Hungary
- Harald Forbert
  - Center for Solvation Science ZEMOS, Ruhr-Universität Bochum, Bochum 44780, Germany
- Dominik Marx
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, Bochum 44780, Germany
- Jörg Behler
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, Bochum 44780, Germany
  - Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, Bochum 44780, Germany
5
Chong S, Bigi F, Grasselli F, Loche P, Kellner M, Ceriotti M. Prediction rigidities for data-driven chemistry. Faraday Discuss 2025; 256:322-344. [PMID: 39319702 PMCID: PMC11423580 DOI: 10.1039/d4fd00101j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Accepted: 08/22/2024] [Indexed: 09/26/2024]
Abstract
The widespread application of machine learning (ML) to the chemical sciences is making it very important to understand how ML models learn to correlate chemical structures with their properties, and what can be done to improve the training efficiency whilst guaranteeing interpretability and transferability. In this work, we demonstrate the wide utility of prediction rigidities, a family of metrics derived from the loss function, in understanding the robustness of ML model predictions. We show that the prediction rigidities allow the assessment of the model not only at the global level, but also on the local or the component-wise level at which the intermediate (e.g. atomic, body-ordered, or range-separated) predictions are made. We leverage these metrics to understand the learning behavior of different ML models, and to guide efficient dataset construction for model training. We finally implement the formalism for an ML model targeting a coarse-grained system to demonstrate the applicability of the prediction rigidities to an even broader class of atomistic modeling problems.
Affiliation(s)
- Sanggyu Chong
  - Laboratory of Computational Science and Modeling, Institute of Materials, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Filippo Bigi
  - Laboratory of Computational Science and Modeling, Institute of Materials, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Federico Grasselli
  - Laboratory of Computational Science and Modeling, Institute of Materials, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Philip Loche
  - Laboratory of Computational Science and Modeling, Institute of Materials, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Matthias Kellner
  - Laboratory of Computational Science and Modeling, Institute of Materials, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Michele Ceriotti
  - Laboratory of Computational Science and Modeling, Institute of Materials, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
6
Kulichenko M, Nebgen B, Lubbers N, Smith JS, Barros K, Allen AEA, Habib A, Shinkle E, Fedik N, Li YW, Messerly RA, Tretiak S. Data Generation for Machine Learning Interatomic Potentials and Beyond. Chem Rev 2024; 124:13681-13714. [PMID: 39572011 DOI: 10.1021/acs.chemrev.4c00572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/26/2024]
Abstract
The field of data-driven chemistry is undergoing an evolution, driven by innovations in machine learning models for predicting molecular properties and behavior. Recent strides in ML-based interatomic potentials (MLIPs) have paved the way for accurate modeling of diverse chemical and structural properties at the atomic level. The key determinant defining MLIP reliability remains the quality of the training data. A paramount challenge lies in constructing training sets that capture specific domains in the vast chemical and structural space. This Review navigates the intricate landscape of essential components and integrity of training data that ensure the extensibility and transferability of the resulting models. We delve into the details of active learning, discussing its various facets and implementations. We outline different types of uncertainty quantification applied to atomistic data acquisition and the correlations between estimated uncertainty and true error. The role of atomistic data samplers in generating diverse and informative structures is highlighted. Furthermore, we discuss data acquisition via modified and surrogate potential energy surfaces as an innovative approach to diversify training data. The Review also provides a list of publicly available data sets that cover essential domains of chemical space.
Affiliation(s)
- Maksim Kulichenko
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Benjamin Nebgen
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Nicholas Lubbers
  - Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Justin S Smith
  - NVIDIA Corporation, Santa Clara, California 95051, United States
- Kipton Barros
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
  - Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Alice E A Allen
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
  - Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Adela Habib
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Emily Shinkle
  - Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Nikita Fedik
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Ying Wai Li
  - Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Richard A Messerly
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Sergei Tretiak
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
  - Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
  - Center for Integrated Nanotechnologies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
7
Thomsen B, Nagai Y, Kobayashi K, Hamada I, Shiga M. Self-learning path integral hybrid Monte Carlo with mixed ab initio and machine learning potentials for modeling nuclear quantum effects in water. J Chem Phys 2024; 161:204109. [PMID: 39601285 DOI: 10.1063/5.0230464] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2024] [Accepted: 10/28/2024] [Indexed: 11/29/2024]
Abstract
The introduction of machine learned potentials (MLPs) has greatly expanded the space available for studying nuclear quantum effects computationally with ab initio path integral (PI) accuracy, since MLPs promise an accuracy comparable to that of ab initio methods at a fraction of the cost. One of the challenges in the development of MLPs is the need for a large and diverse training set calculated by ab initio methods. This dataset should ideally cover the entire phase space, while not searching this space using ab initio methods, as this would be counterproductive and generally intractable with respect to computational time. In this paper, we present the self-learning PI hybrid Monte Carlo method using a mixed ab initio and ML potential (SL-PIHMC-MIX), where the mixed potential allows for the study of larger systems and the extension of the original SL-HMC method [Nagai et al., Phys. Rev. B 102, 041124 (2020)] to PI methods and larger systems. While the MLPs generated by this method can be directly applied to run long-time ML-PIMD simulations, we demonstrate that using PIHMC-MIX with the trained MLPs allows for an exact reproduction of the structure obtained from ab initio PIMD. Specifically, we find that the PIHMC-MIX simulations require only 5000 evaluations of the 32-bead structure, compared to the 100 000 evaluations needed for the ab initio PIMD result.
Affiliation(s)
- Bo Thomsen
  - CCSE, Japan Atomic Energy Agency, 178-4-4, Wakashiba, Kashiwa, Chiba 277-0871, Japan
- Yuki Nagai
  - Information Technology Center, The University of Tokyo, 6-2-3 Kashiwanoha, Kashiwa, Chiba 277-0882, Japan
- Keita Kobayashi
  - CCSE, Japan Atomic Energy Agency, 178-4-4, Wakashiba, Kashiwa, Chiba 277-0871, Japan
- Ikutaro Hamada
  - Department of Precision Engineering, Graduate School of Engineering, Osaka University, 2-1, Yamadaoka, Suita, Osaka 565-0871, Japan
- Motoyuki Shiga
  - CCSE, Japan Atomic Energy Agency, 178-4-4, Wakashiba, Kashiwa, Chiba 277-0871, Japan
8
Méndez E, Laria D, Hunt D. Proton quantal delocalization and H/D translocations in (MeOH)nH+ (n = 2, 3). J Chem Phys 2024; 161:174303. [PMID: 39484904 DOI: 10.1063/5.0234264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2024] [Accepted: 10/11/2024] [Indexed: 11/03/2024]
Abstract
In this study, we present results from path integral molecular dynamics simulations that describe the characteristics of the quantum spatial delocalizations of protons participating in OH bonds in (MeOH)2H+ and in (MeOH)3H+. The characterization was carried out by examining the overall structures of the corresponding isomorphic polymers. To introduce full flexibility in the force treatment, we have adopted a neural network fitting procedure based on second-order Møller-Plesset perturbation theory predictions. For the dimer case, we found that the spatial extent of the shared connective proton can be portrayed in terms of a prolate-like structure with typical dimensions of ∼0.1 Å. On the other hand, the dangling polymers lie confined within a thin spherical layer, spread over length scales of the order of ∼0.25 Å. In contrast, connective protons in (MeOH)3H+ exhibit larger delocalizations along the O-H bond and more localized ones along perpendicular directions, compared to their dangling counterparts. We also examined the characteristics of the relative propensities of H and D isotopes to be localized in dangling and connective positions. Physical interpretations of the different thermodynamic trends are provided in terms of the local geometrical characteristics and of the strengths of the corresponding intermolecular connectivities.
Affiliation(s)
- Emilio Méndez
  - Sorbonne Université CNRS, Physico-chimie des Electrolytes et Nanosystèmes Interfaciaux, PHENIX, F-75005 Paris, France
- Daniel Laria
  - Departamento de Física de la Materia Condensada, GIyA, CAC-CNEA, 1650 San Martín, Buenos Aires, Argentina
  - Departamento de Química Inorgánica, Analítica y Química-Física, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Universitaria, Pabellón II, 1428 Buenos Aires, Argentina
- Diego Hunt
  - Departamento de Física de la Materia Condensada, GIyA, CAC-CNEA, 1650 San Martín, Buenos Aires, Argentina
  - Instituto de Nanociencia y Nanotecnología, CNEA-CONICET, Buenos Aires, Argentina
9
Montero de Hijes P, Dellago C, Jinnouchi R, Kresse G. Density isobar of water and melting temperature of ice: Assessing common density functionals. J Chem Phys 2024; 161:131102. [PMID: 39360681 DOI: 10.1063/5.0227514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2024] [Accepted: 09/06/2024] [Indexed: 10/04/2024]
Abstract
We investigate the density isobar of water and the melting temperature of ice using six different density functionals. Machine-learning potentials are employed to ensure computational affordability. Our findings reveal significant discrepancies between various base functionals. Notably, even the choice of damping can result in substantial differences. Overall, the outcomes obtained through density functional theory are not entirely satisfactory across most utilized functionals. All functionals exhibit significant deviations either in the melting temperature or equilibrium volume, with most of them even predicting an incorrect volume difference between ice and water. Our heuristic analysis indicates that a hybrid functional with 25% exact exchange and van der Waals damping averaged between zero and Becke-Johnson dampings yields the closest agreement with experimental data. This study underscores the necessity for further enhancements in the treatment of van der Waals interactions and, more broadly, density functional theory to enable accurate quantitative predictions for molecular liquids.
Affiliation(s)
- Pablo Montero de Hijes
  - University of Vienna, Faculty of Physics, Kolingasse 14, A-1090 Vienna, Austria
  - University of Vienna, Faculty of Earth Sciences, Geography and Astronomy, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Christoph Dellago
  - University of Vienna, Faculty of Physics, Kolingasse 14, A-1090 Vienna, Austria
- Ryosuke Jinnouchi
  - Toyota Central R&D Labs., Inc., 41-1 Yokomichi, Nagakute, Aichi 480-1192, Japan
- Georg Kresse
  - University of Vienna, Faculty of Physics, Kolingasse 14, A-1090 Vienna, Austria
  - VASP Software GmbH, Berggasse 21, A-1090 Vienna, Austria
10
Tu NTP, Williamson S, Johnson ER, Rowley CN. Modeling Intermolecular Interactions with Exchange-Hole Dipole Moment Dispersion Corrections to Neural Network Potentials. J Phys Chem B 2024; 128:8290-8302. [PMID: 39166778 DOI: 10.1021/acs.jpcb.4c02882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/23/2024]
Abstract
Neural network potentials (NNPs) are an innovative approach for calculating the potential energy and forces of a chemical system. In principle, these methods are capable of modeling large systems with an accuracy approaching that of a high-level ab initio calculation, but with a much smaller computational cost. Due to their training to density-functional theory (DFT) data and neglect of long-range interactions, some classes of NNPs require an additional term to include London dispersion physics. In this Perspective, we discuss the requirements for a dispersion model for use with an NNP, focusing on the MLXDM (Machine Learned eXchange-Hole Dipole Moment) model developed by our groups. This model is based on the DFT-based XDM dispersion correction, which calculates interatomic dispersion coefficients in terms of atomic moments and polarizabilities, both of which can be approximated effectively using neural networks.
Affiliation(s)
- Siri Williamson
  - Department of Chemistry, Carleton University, Ottawa, Ontario K1S 5B6, Canada
- Erin R Johnson
  - Department of Chemistry, Dalhousie University, Halifax, Nova Scotia B3H 4J3, Canada
11
Gubler M, Finkler JA, Schäfer MR, Behler J, Goedecker S. Accelerating Fourth-Generation Machine Learning Potentials Using Quasi-Linear Scaling Particle Mesh Charge Equilibration. J Chem Theory Comput 2024; 20. [PMID: 39151921 PMCID: PMC11360134 DOI: 10.1021/acs.jctc.4c00334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Revised: 07/16/2024] [Accepted: 07/17/2024] [Indexed: 08/19/2024]
Abstract
Machine learning potentials (MLPs) have revolutionized the field of atomistic simulations by describing atomic interactions with the accuracy of electronic structure methods at a small fraction of the cost. Most current MLPs construct the energy of a system as a sum of atomic energies, which depend on information about the atomic environments provided in the form of predefined or learnable feature vectors. If, in addition, nonlocal phenomena like long-range charge transfer are important, fourth-generation MLPs need to be used, which include a charge equilibration (Qeq) step to take the global structure of the system into account. This Qeq can significantly increase the computational cost and thus can become a computational bottleneck for large systems. In this Article, we present a highly efficient formulation of Qeq that does not require the explicit computation of the Coulomb matrix elements, resulting in a quasi-linear scaling method. Moreover, our approach also allows for the efficient calculation of energy derivatives, which explicitly consider the global structure-dependence of the atomic charges as obtained from Qeq. Due to its generality, the method is not restricted to MLPs and can also be applied within a variety of other force fields.
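For orientation, the charge equilibration step mentioned above can be sketched in its textbook dense form. This is not the authors' particle mesh method. It assumes point charges, open boundaries, and a bare 1/r Coulomb kernel in arbitrary units, and the function name `qeq_charges` and all parameter values are hypothetical. The sketch builds the explicit Coulomb matrix and solves the constrained linear system directly, which is exactly the cost the abstract says the quasi-linear formulation avoids.

```python
import numpy as np


def qeq_charges(positions, chi, hardness, total_charge=0.0):
    """Minimise E(q) = sum_i chi_i q_i + 1/2 sum_i J_i q_i^2
    + 1/2 sum_{i != j} q_i q_j / r_ij  subject to sum_i q_i = total_charge.

    Dense reference version: O(N^2) memory, O(N^3) solve.
    """
    n = len(chi)
    # Pairwise distances; the diagonal is set to inf so 1/r vanishes there.
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)

    # Bordered matrix: Coulomb block, hardness on the diagonal, and one
    # extra row/column for the Lagrange multiplier of the charge constraint.
    a = np.zeros((n + 1, n + 1))
    a[:n, :n] = 1.0 / dist
    a[np.arange(n), np.arange(n)] = hardness
    a[:n, n] = 1.0
    a[n, :n] = 1.0

    b = np.concatenate([-np.asarray(chi, dtype=float), [total_charge]])
    solution = np.linalg.solve(a, b)
    return solution[:n]  # the last entry is the Lagrange multiplier


# Hypothetical three-atom example (arbitrary units).
positions = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
chi = np.array([0.3, -0.1, -0.2])      # electronegativities (made up)
hardness = np.array([2.0, 2.0, 2.0])   # atomic hardnesses (made up)

charges = qeq_charges(positions, chi, hardness, total_charge=0.0)
print(charges, charges.sum())
```

At the solution, the derivative of the energy with respect to each charge takes the same value on every atom, which is the equalisation condition that gives the method its name.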
Affiliation(s)
- Moritz Gubler
  - Department of Physics, University of Basel, Klingelbergstrasse 82, CH-4056 Basel, Switzerland
- Jonas A. Finkler
  - Department of Physics, University of Basel, Klingelbergstrasse 82, CH-4056 Basel, Switzerland
- Moritz R. Schäfer
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany
  - Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
- Jörg Behler
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany
  - Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
- Stefan Goedecker
  - Department of Physics, University of Basel, Klingelbergstrasse 82, CH-4056 Basel, Switzerland
12
Malosso C, Manko N, Izzo MG, Baroni S, Hassanali A. Evidence of ferroelectric features in low-density supercooled water from ab initio deep neural-network simulations. Proc Natl Acad Sci U S A 2024; 121:e2407295121. [PMID: 39083416 PMCID: PMC11317578 DOI: 10.1073/pnas.2407295121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2024] [Accepted: 06/30/2024] [Indexed: 08/02/2024]
Abstract
Over the last decade, an increasing body of evidence has emerged, supporting the existence of a metastable liquid-liquid critical point in supercooled water whereby two distinct liquid phases of different densities coexist. Analyzing long molecular dynamics simulations performed using deep neural-network force fields trained to accurate quantum mechanical data, we demonstrate that the low-density liquid phase displays a strong propensity toward spontaneous polarization, as witnessed by large and long-lived collective dipole fluctuations. Our findings suggest that the dynamical stability of the low-density phase, and hence the transition from high-density to low-density liquid, is triggered by a collective process involving an accumulation of rotational angular jumps, which could ignite large dipole fluctuations. This dynamical transition involves subtle changes in the electronic polarizability of water molecules, which affect their rotational mobility within the two phases. These findings hold the potential for catalyzing activity in the search for dielectric-based probes of the putative second critical point.
Affiliation(s)
- Cesare Malosso
  - Scuola Internazionale Superiore di Studi Avanzati, Trieste 34136, Italy
- Natalia Manko
  - Condensed Matter and Statistical Physics (CMSP), The Abdus Salam Centre for Theoretical Physics, Trieste 34151, Italy
- Maria Grazia Izzo
  - Scuola Internazionale Superiore di Studi Avanzati, Trieste 34136, Italy
- Stefano Baroni
  - Scuola Internazionale Superiore di Studi Avanzati, Trieste 34136, Italy
  - Consiglio Nazionale delle Ricerche-Istituto Officina dei Materiali, Scuola Internazionale Superiore di Studi Avanzati Unit, Trieste 34136, Italy
- Ali Hassanali
  - Condensed Matter and Statistical Physics (CMSP), The Abdus Salam Centre for Theoretical Physics, Trieste 34151, Italy
13
Frank JT, Unke OT, Müller KR, Chmiela S. A Euclidean transformer for fast and stable machine learned force fields. Nat Commun 2024; 15:6539. [PMID: 39107296 PMCID: PMC11303804 DOI: 10.1038/s41467-024-50620-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/25/2023] [Accepted: 07/10/2024] [Indexed: 08/10/2024]
Abstract
Recent years have seen vast progress in the development of machine learned force fields (MLFFs) based on ab-initio reference calculations. Despite achieving low test errors, the reliability of MLFFs in molecular dynamics (MD) simulations is facing growing scrutiny due to concerns about instability over extended simulation timescales. Our findings suggest a potential connection between robustness to cumulative inaccuracies and the use of equivariant representations in MLFFs, but the computational cost associated with these representations can limit this advantage in practice. To address this, we propose a transformer architecture called SO3KRATES that combines sparse equivariant representations (Euclidean variables) with a self-attention mechanism that separates invariant and equivariant information, eliminating the need for expensive tensor products. SO3KRATES achieves a unique combination of accuracy, stability, and speed that enables insightful analysis of quantum properties of matter on extended time and system size scales. To showcase this capability, we generate stable MD trajectories for flexible peptides and supra-molecular structures with hundreds of atoms. Furthermore, we investigate the PES topology for medium-sized chainlike molecules (e.g., small peptides) by exploring thousands of minima. Remarkably, SO3KRATES demonstrates the ability to strike a balance between the conflicting demands of stability and the emergence of new minimum-energy conformations beyond the training data, which is crucial for realistic exploration tasks in the field of biochemistry.
Affiliation(s)
- J Thorben Frank
  - Machine Learning Group, TU Berlin, Berlin, Germany
  - BIFOLD, Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
- Klaus-Robert Müller
  - Machine Learning Group, TU Berlin, Berlin, Germany
  - BIFOLD, Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
  - Google DeepMind, Berlin, Germany
  - Department of Artificial Intelligence, Korea University, Seoul, Korea
  - Max Planck Institut für Informatik, Saarbrücken, Germany
- Stefan Chmiela
  - Machine Learning Group, TU Berlin, Berlin, Germany
  - BIFOLD, Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
14
Aldossary A, Campos-Gonzalez-Angulo JA, Pablo-García S, Leong SX, Rajaonson EM, Thiede L, Tom G, Wang A, Avagliano D, Aspuru-Guzik A. In Silico Chemical Experiments in the Age of AI: From Quantum Chemistry to Machine Learning and Back. Advanced Materials (Deerfield Beach, Fla.) 2024; 36:e2402369. [PMID: 38794859 DOI: 10.1002/adma.202402369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 04/28/2024] [Indexed: 05/26/2024]
Abstract
Computational chemistry is an indispensable tool for understanding molecules and predicting chemical properties. However, traditional computational methods face significant challenges due to the difficulty of solving the Schrödinger equation and a computational cost that increases with the size of the molecular system. In response, there has been a surge of interest in leveraging artificial intelligence (AI) and machine learning (ML) techniques for in silico experiments. Integrating AI and ML into computational chemistry increases the scalability and speed of the exploration of chemical space. However, challenges remain, particularly regarding the reproducibility and transferability of ML models. This review highlights the evolution of ML in learning from, complementing, or replacing traditional computational chemistry for energy and property predictions. Starting from models trained entirely on numerical data, a journey is set forth toward the ideal model incorporating or learning the physical laws of quantum mechanics. This paper also reviews existing computational methods and ML models and their intertwining, outlines a roadmap for future research, and identifies areas for improvement and innovation. Ultimately, the goal is to develop AI architectures capable of predicting accurate and transferable solutions to the Schrödinger equation, thereby revolutionizing in silico experiments within chemistry and materials science.
Affiliation(s)
- Abdulrahman Aldossary
  - Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Sergio Pablo-García
  - Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Shi Xuan Leong
  - Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Ella Miray Rajaonson
  - Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Luca Thiede
  - Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Gary Tom
  - Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Andrew Wang
  - Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Davide Avagliano
  - Chimie ParisTech, PSL University, CNRS, Institute of Chemistry for Life and Health Sciences (iCLeHS UMR 8060), Paris, F-75005, France
- Alán Aspuru-Guzik
  - Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
  - Department of Materials Science & Engineering, University of Toronto, 184 College St., Toronto, ON, M5S 3E4, Canada
  - Department of Chemical Engineering & Applied Chemistry, University of Toronto, 200 College St., Toronto, ON, M5S 3E5, Canada
  - Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), 66118 University Ave., Toronto, M5G 1M1, Canada
  - Acceleration Consortium, 80 St George St, Toronto, M5S 3H6, Canada
15
Rezaee M, Ekrami S, Hashemianzadeh SM. Comparing ANI-2x, ANI-1ccx neural networks, force field, and DFT methods for predicting conformational potential energy of organic molecules. Sci Rep 2024; 14:11791. [PMID: 38783010 PMCID: PMC11116541 DOI: 10.1038/s41598-024-62242-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 05/15/2024] [Indexed: 05/25/2024] Open
Abstract
In this study, the conformational potential energy surfaces of Amylmetacresol, Benzocaine, Dopamine, Betazole, and Betahistine molecules were scanned and analyzed using the neural network architecture ANI-2x and ANI-1ccx, the force field method OPLS, and density functional theory with the exchange-correlation functional B3LYP and the basis set 6-31G(d). The ANI-1ccx and ANI-2x methods demonstrated the highest accuracy in predicting torsional energy profiles, effectively capturing the minimum and maximum values of these profiles. Conformational potential energy values calculated by B3LYP and the OPLS force field method differ from those calculated by ANI-1ccx and ANI-2x, which account for non-bonded intramolecular interactions, since the B3LYP functional and OPLS force field weakly consider van der Waals and other intramolecular forces in torsional energy profiles. For a more comprehensive analysis, electronic parameters such as dipole moment, HOMO, and LUMO energies for different torsional angles were calculated at two levels of theory, B3LYP/6-31G(d) and ωB97X/6-31G(d). These calculations confirmed that ANI predictions are more accurate than density functional theory calculations with B3LYP functional and OPLS force field for determining potential energy surfaces. This research successfully addressed the challenges in determining conformational potential energy levels and shows how machine learning and deep neural networks offer a more accurate, cost-effective, and rapid alternative for predicting torsional energy profiles.
Affiliation(s)
- Mozafar Rezaee
  - Molecular Simulation Research Laboratory, Department of Chemistry, Iran University of Science and Technology, Tehran, Iran
- Saeid Ekrami
  - CNRS, LCPME, Université de Lorraine, 54000, Nancy, France
- Seyed Majid Hashemianzadeh
  - Molecular Simulation Research Laboratory, Department of Chemistry, Iran University of Science and Technology, Tehran, Iran
16
Montero de Hijes P, Dellago C, Jinnouchi R, Schmiedmayer B, Kresse G. Comparing machine learning potentials for water: Kernel-based regression and Behler-Parrinello neural networks. J Chem Phys 2024; 160:114107. [PMID: 38506284 DOI: 10.1063/5.0197105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Accepted: 03/03/2024] [Indexed: 03/21/2024] Open
Abstract
In this paper, we investigate the performance of different machine learning potentials (MLPs) in predicting key thermodynamic properties of water using RPBE + D3. Specifically, we scrutinize kernel-based regression and high-dimensional neural networks trained on a highly accurate dataset consisting of about 1500 structures, as well as a smaller dataset, about half the size, obtained using only on-the-fly learning. This study reveals that despite minor differences between the MLPs, their agreement on observables such as the diffusion constant and pair-correlation functions is excellent, especially for the large training dataset. Variations in the predicted density isobars, albeit somewhat larger, are also acceptable, particularly given the errors inherent to approximate density functional theory. Overall, this study emphasizes the relevance of the database over the fitting method. Finally, this study underscores the limitations of root mean square errors and the need for comprehensive testing, advocating the use of multiple MLPs for enhanced certainty, particularly when simulating complex thermodynamic properties that may not be fully captured by simpler tests.
Affiliation(s)
- Pablo Montero de Hijes
  - University of Vienna, Faculty of Physics, Kolingasse 14, A-1090 Vienna, Austria
  - University of Vienna, Faculty of Earth Sciences, Geography and Astronomy, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Christoph Dellago
  - University of Vienna, Faculty of Physics, Kolingasse 14, A-1090 Vienna, Austria
- Ryosuke Jinnouchi
  - Toyota Central R&D Labs., Inc., 41-1 Yokomichi, Nagakute, Aichi 480-1192, Japan
- Georg Kresse
  - University of Vienna, Faculty of Physics, Kolingasse 14, A-1090 Vienna, Austria
  - VASP Software GmbH, Berggasse 21, A-1090 Vienna, Austria
17
Song Z, Han J, Henkelman G, Li L. Charge-Optimized Electrostatic Interaction Atom-Centered Neural Network Algorithm. J Chem Theory Comput 2024; 20:2088-2097. [PMID: 38380601 DOI: 10.1021/acs.jctc.3c01254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2024]
Abstract
Machine-learning algorithms have been proposed to capture electrostatic interactions by using effective partial charges. These algorithms often rely on a pretrained model for partial charge prediction using density functional theory-calculated partial charges as references, which introduces complexity to the force field model. The accuracy of the trained model also depends on the reliability of charge partition methods, which can be dependent on the specific system and methodology employed. In this study, we propose an atom-centered neural network (ANN) algorithm that eliminates the need for reference charges. Our algorithm requires only a single NN model for each element to obtain both atomic energy and charges. These atomic charges are then employed to compute electrostatic energies using the Ewald summation algorithm. Subsequently, the force field model is trained on total energy and forces, with the inclusion of electrostatic energy. To evaluate the performance of our algorithm, we conducted tests on three benchmark systems, including a Ge slab with an O adatom system, a TiO2 crystalline system, and a Pd-O nanoparticle system. Our results demonstrate reasonably accurate predictions of partial charges and electrostatic interactions. This algorithm provides a self-consistent charge prediction strategy and possibilities for robust and reliable modeling of electrostatic interactions in machine-learning potentials.
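The step from predicted atomic charges to an electrostatic energy can be illustrated with a minimal sketch. It is not the method of the paper: the abstract uses Ewald summation for periodic systems, whereas the sketch below swaps in a direct pairwise Coulomb sum for an isolated cluster, in atomic units. The function name `coulomb_energy` and the example charges and positions are hypothetical.

```python
import math
from itertools import combinations


def coulomb_energy(charges, positions):
    """Direct pairwise Coulomb energy (atomic units) of an isolated cluster.

    This is a stand-in for the Ewald summation named in the abstract and is
    only valid for non-periodic systems.
    """
    energy = 0.0
    for i, j in combinations(range(len(charges)), 2):
        r_ij = math.dist(positions[i], positions[j])
        energy += charges[i] * charges[j] / r_ij
    return energy


# Hypothetical example: two opposite unit charges 2 bohr apart.
charges = [1.0, -1.0]
positions = [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)]
e_elec = coulomb_energy(charges, positions)
```

In a scheme like the one described, such an electrostatic term would be added to the short-range atomic energies before the total energy is compared with reference data, so the charges are fitted indirectly through total energies and forces.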
Affiliation(s)
- Zichen Song
  - Shenzhen Key Laboratory of Micro/Nano-Porous Functional Materials (SKLPM), Department of Materials Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
  - Department of Materials Science and Engineering, City University of Hong Kong, Kowloon, Hong Kong, China
- Jian Han
  - Department of Materials Science and Engineering, City University of Hong Kong, Kowloon, Hong Kong, China
- Graeme Henkelman
  - Department of Chemistry, the University of Texas at Austin, Austin, Texas 78712, United States
  - Institute for Computational Engineering and Sciences, the University of Texas at Austin, Austin, Texas 78712, United States
- Lei Li
  - Shenzhen Key Laboratory of Micro/Nano-Porous Functional Materials (SKLPM), Department of Materials Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
18
Xia J, Zhang Y, Jiang B. Accuracy Assessment of Atomistic Neural Network Potentials: The Impact of Cutoff Radius and Message Passing. J Phys Chem A 2023; 127:9874-9883. [PMID: 37943102 DOI: 10.1021/acs.jpca.3c06024] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/10/2023]
Abstract
Atomistic neural network potentials have achieved great success in accelerating atomistic simulations in complicated systems in recent years. They are typically based on the atomic decomposition of total properties, truncating the interatomic correlations to a local environment within a given cutoff radius. A more recently developed message passing (MP) neural network framework can, in principle, incorporate nonlocal effects through iteratively correlating some atoms outside the cutoff sphere with atoms inside, a process referred to as MP. However, how the model accuracy depends on the cutoff radius and the MP process has rarely been discussed. In this work, we investigate this dependence using a recursively embedded atom neural network method that possesses both local and MP features, in two representative systems: liquid H2O and solid Al2O3. We focus on how these settings influence predictions for structural and vibrational properties, namely, radial distribution functions (RDFs) and vibrational density of states (VDOSs). We find that while MP lowers test errors of energy and forces in general, it may not improve the prediction for RDFs and/or VDOSs if direct interatomic correlations in the local environment are insufficiently described. A cutoff radius exceeding the first neighbor shell is necessary, beyond which involving MP quickly enhances the model accuracy until convergence. This is a potentially more efficient way to increase the model accuracy than directly increasing the cutoff radius, especially with more memory savings in the GPU implementation. Our findings also suggest that using the mean test error as the measure of the model accuracy alone is inadequate.
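How message passing extends the reach of a model beyond its cutoff sphere can be sketched with a toy neighbor list. This is an illustration under stated assumptions, not the recursively embedded atom neural network of the paper: the helper names `neighbor_list` and `reachable`, the five-atom chain, and the convention that one hop equals the purely local description (each further hop standing for one message-passing iteration) are all hypothetical.

```python
import math


def neighbor_list(positions, cutoff):
    """Map each atom index to the set of atoms within `cutoff` of it."""
    n = len(positions)
    return {
        i: {
            j
            for j in range(n)
            if j != i and math.dist(positions[i], positions[j]) <= cutoff
        }
        for i in range(n)
    }


def reachable(neighbors, start, n_hops):
    """Atoms whose information can reach `start` within `n_hops` neighbor hops."""
    seen = {start}
    frontier = {start}
    for _ in range(n_hops):
        # Expand by one shell of neighbors, skipping atoms already visited.
        frontier = {j for i in frontier for j in neighbors[i]} - seen
        seen |= frontier
    return seen - {start}


# Hypothetical chain of five atoms spaced 1.0 apart, with a cutoff of 1.5.
positions = [(float(x), 0.0, 0.0) for x in range(5)]
nbrs = neighbor_list(positions, cutoff=1.5)
local_only = reachable(nbrs, 0, 1)
with_one_extra_hop = reachable(nbrs, 0, 2)
```

In this toy chain, atom 0 sees only atom 1 directly, while one additional hop brings in atom 2 without enlarging the cutoff, which is the trade-off between cutoff radius and message passing that the abstract examines.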
Affiliation(s)
- Junfan Xia
  - Key Laboratory of Precision and Intelligent Chemistry, Department of Chemical Physics, University of Science and Technology of China, Hefei, Anhui 230026, China
- Yaolong Zhang
  - École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Bin Jiang
  - Key Laboratory of Precision and Intelligent Chemistry, Department of Chemical Physics, University of Science and Technology of China, Hefei, Anhui 230026, China
19
Tokita AM, Behler J. How to train a neural network potential. J Chem Phys 2023; 159:121501. [PMID: 38127396 DOI: 10.1063/5.0160326] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 05/31/2023] [Accepted: 07/24/2023] [Indexed: 12/23/2023] Open
Abstract
The introduction of modern Machine Learning Potentials (MLPs) has led to a paradigm change in the development of potential energy surfaces for atomistic simulations. By providing efficient access to energies and forces, they allow us to perform large-scale simulations of extended systems, which are not directly accessible by demanding first-principles methods. In these simulations, MLPs can reach the accuracy of electronic structure calculations, provided that they have been properly trained and validated using a suitable set of reference data. Due to their highly flexible functional form, the construction of MLPs has to be done with great care. In this Tutorial, we describe the necessary key steps for training reliable MLPs, from data generation via training to final validation. The procedure, which is illustrated for the example of a high-dimensional neural network potential, is general and applicable to many types of MLPs.
Affiliation(s)
- Alea Miako Tokita
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany and Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
- Jörg Behler
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany and Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
20
Bore SL, Paesani F. Realistic phase diagram of water from "first principles" data-driven quantum simulations. Nat Commun 2023; 14:3349. [PMID: 37291095 PMCID: PMC10250386 DOI: 10.1038/s41467-023-38855-1] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 01/10/2023] [Accepted: 05/12/2023] [Indexed: 06/10/2023] Open
Abstract
Since the experimental characterization of the low-pressure region of water's phase diagram in the early 1900s, scientists have been on a quest to understand the thermodynamic stability of ice polymorphs on the molecular level. In this study, we demonstrate that combining the MB-pol data-driven many-body potential for water, which was rigorously derived from "first principles" and exhibits chemical accuracy, with advanced enhanced-sampling algorithms, which correctly describe the quantum nature of molecular motion and thermodynamic equilibria, enables computer simulations of water's phase diagram with an unprecedented level of realism. Besides providing fundamental insights into how enthalpic, entropic, and nuclear quantum effects shape the free-energy landscape of water, we demonstrate that recent progress in "first principles" data-driven simulations, which rigorously encode many-body molecular interactions, has opened the door to realistic computational studies of complex molecular systems, bridging the gap between experiments and simulations.
Affiliation(s)
- Sigbjørn Løland Bore
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA, 92093, USA
- Francesco Paesani
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA, 92093, USA
  - Materials Science and Engineering, University of California San Diego, La Jolla, CA, 92093, USA
  - Halicioğlu Data Science Institute, University of California San Diego, La Jolla, CA, 92093, USA
  - San Diego Supercomputer Center, University of California San Diego, La Jolla, CA, 92093, USA
21
Han B, Isborn CM, Shi L. Incorporating Polarization and Charge Transfer into a Point-Charge Model for Water Using Machine Learning. J Phys Chem Lett 2023; 14:3869-3877. [PMID: 37067482 DOI: 10.1021/acs.jpclett.3c00036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Rigid nonpolarizable water models with fixed point charges have been widely employed in molecular dynamics simulations due to their efficiency and reasonable accuracy for the potential energy surface. However, the dipole moment surface of water is not necessarily well-described by the same fixed charges, leading to failure in reproducing dipole-related properties. Here, we developed a machine-learning model trained against electronic structure data to assign point charges for water, and the resulting dipole moment surface significantly improved the predictions of the dielectric constant and the low-frequency IR spectrum of liquid water. Our analysis reveals that within our atom-centered point-charge description of the dipole moment surface, the intermolecular charge transfer is the major source of the peak intensity at 200 cm⁻¹, whereas the intramolecular polarization controls the enhancement of the dielectric constant. The effects of exact Hartree-Fock exchange in the hybrid density functional on these properties are also discussed.
Affiliation(s)
- Bowen Han
  - Chemistry and Biochemistry, University of California, Merced, California 95343, United States
- Christine M Isborn
  - Chemistry and Biochemistry, University of California, Merced, California 95343, United States
- Liang Shi
  - Chemistry and Biochemistry, University of California, Merced, California 95343, United States
22
Méndez E, Videla PE, Laria D. Collective Proton Transfers in Cyclic Water-Ammonia Tetramers: A Path Integral Machine-Learning Study. J Phys Chem A 2023; 127:1839-1848. [PMID: 36794937 DOI: 10.1021/acs.jpca.2c07994] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2023]
Abstract
We present results from machine-learning-based path integral molecular dynamics simulations that describe isomerization paths articulated via collective proton transfers along mixed, cyclic tetramers combining water and ammonia at cryogenic conditions. The net result of such isomerizations is a reverse of the chirality of the global hydrogen-bonding architecture along the different cyclic moieties. In monocomponent tetramers, the classical free energy profiles associated with these isomerizations present the usual symmetric double-well characteristics whereas the reactive paths exhibit full concertedness among the different intermolecular transfer processes. Contrastingly, in mixed water/ammonia tetramers, the incorporation of a second component introduces imbalances in the strengths of the different hydrogen bonds leading to a partial loss of concertedness, most notably at the vicinity of the transition state. As such, the highest and lowest degrees of progression are registered along OH···N and O···HN coordinations, respectively. These characteristics lead to polarized transition state scenarios akin to solvent-separated ion-pair configurations. The explicit incorporation of nuclear quantum effects promotes drastic depletions in the activation free energies and modifications in the overall shape of the profiles which include central plateau-like stages, indicating the prevalence of deep tunneling regimes. On the other hand, the quantum treatment of the nuclei partially restores the degree of concertedness among the evolutions of the individual transfers.
Affiliation(s)
- Emilio Méndez
  - Departamento de Química Inorgánica, Analítica y Química-Física and INQUIMAE-CONICET, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires Ciudad Universitaria, Pabellón II, 1428 Buenos Aires, Argentina
- Pablo E Videla
  - Department of Chemistry and Energy Sciences Institute, Yale University, 225 Prospect Street, New Haven, Connecticut 06520, United States
- Daniel Laria
  - Departamento de Química Inorgánica, Analítica y Química-Física and INQUIMAE-CONICET, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires Ciudad Universitaria, Pabellón II, 1428 Buenos Aires, Argentina
  - Departamento de Física de la Materia Condensada, Comisión Nacional de Energía Atómica, Avenida Libertador 8250, 1429 Buenos Aires, Argentina
23
Arab F, Nazari F, Illas F. Artificial Neural Network-Derived Unified Six-Dimensional Potential Energy Surface for Tetra Atomic Isomers of the Biogenic [H, C, N, O] System. J Chem Theory Comput 2023; 19:1186-1196. [PMID: 36735891 PMCID: PMC9979606 DOI: 10.1021/acs.jctc.2c00915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
Recognition of different structural patterns in different potential energy surface regions, such as in isomerizing quasilinear tetra atomic molecules, is important for understanding the details of underlying physics and chemistry. In this respect, using three variants of artificial neural networks (ANNs), we investigated the six-dimensional (6-D) singlet potential energy surfaces (PES) of tetra atomic isomers of the biogenic [H, C, N, O] system. At first, we constructed a separate ANN potential for each of the studied isomers. In the next step, a comparative assessment of the separate ANN models led to the setting up of a unified 6-D singlet PES equally and accurately describing all studied isomers. The constructed unified model yields relative energies comparable to those obtained either from the gold standard CCSD(T) method or from separate ANNs for each of the studied isomers. The accuracy of the unified singlet PES is on the order of 10⁻⁴ Hartree (0.1 kcal/mol). The developed PES in this work captures the main features of nonlinear and quasilinear tetra atomic isomers of this biogenic system.
Affiliation(s)
- Fatemeh Arab
  - Department of Chemistry, Institute for Advanced Studies in Basic Sciences, Zanjan 45137-66731, Iran
- Fariba Nazari
  - Department of Chemistry, Institute for Advanced Studies in Basic Sciences, Zanjan 45137-66731, Iran
  - Center of Climate Change and Global Warming, Institute for Advanced Studies in Basic Sciences, Zanjan 45137-66731, Iran
- Francesc Illas
  - Departament de Ciència de Materials i Química Física & Institut de Química Teòrica i Computacional (IQTCUB), Universitat de Barcelona, C/Martí i Franquès 1, 08028 Barcelona, Spain
24
Abstract
Diffusion Monte Carlo (DMC) is one of the most accurate techniques available for calculating the electronic properties of molecules and materials, yet it often remains a challenge to economically compute forces using this technique. As a result, ab initio molecular dynamics simulations and geometry optimizations that employ Diffusion Monte Carlo forces are often out of reach. One potential approach for accelerating the computation of "DMC forces" is to machine learn these forces from DMC energy calculations. In this work, we employ Behler-Parrinello Neural Networks to learn DMC forces from DMC energy calculations for geometry optimization and molecular dynamics simulations of small molecules. We illustrate the unique challenges that stem from learning forces without explicit force data and from noisy energy data by making rigorous comparisons of potential energy surface, dynamics, and optimization predictions among ab initio density functional theory (DFT) simulations and machine-learning models trained on DFT energies with forces, DFT energies without forces, and DMC energies without forces. We show for three small molecules (C2, H2O, and CH3Cl) that machine-learned DMC dynamics can reproduce average bond lengths and angles within a few percent of known experimental results at one hundredth of the typical cost. Our work describes a much-needed means of performing dynamics simulations on high-accuracy, DMC PESs and for generating DMC-quality molecular geometries given current algorithmic constraints.
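The idea of obtaining forces from a model that was fitted to energies alone rests on the relation F = -dE/dr. The sketch below illustrates it with a central finite difference on a one-dimensional Morse curve, which here merely stands in for a learned energy model. The Morse parameters, the step size `h`, and the function names are hypothetical choices; the paper itself uses Behler-Parrinello neural networks, whose gradients would normally be taken analytically.

```python
import math


def morse_energy(r, d_e=1.0, a=1.0, r_e=1.0):
    """Morse potential, used as a stand-in for an energy-only learned model."""
    return d_e * (1.0 - math.exp(-a * (r - r_e))) ** 2


def force_from_energy(energy_fn, r, h=1e-5):
    """Central finite-difference estimate of the force F = -dE/dr."""
    return -(energy_fn(r + h) - energy_fn(r - h)) / (2.0 * h)


# At the minimum the force vanishes; a stretched bond feels a restoring force.
f_equilibrium = force_from_energy(morse_energy, 1.0)
f_stretched = force_from_energy(morse_energy, 1.5)
```

Because differentiation amplifies errors, noise in the reference energies carries over into the derived forces, which is one of the difficulties the abstract points to when learning from stochastic DMC energies.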
Affiliation(s)
- Cancan Huang
  - Department of Chemistry, Brown University, Providence, Rhode Island 02912, United States
- Brenda M Rubenstein
  - Department of Chemistry, Brown University, Providence, Rhode Island 02912, United States
25
Cameron AR, Proud AJ, Pearson JK. Machine Learned Composite Methods for Electronic Structure Theory. J Chem Theory Comput 2023; 19:51-60. [PMID: 36507875 DOI: 10.1021/acs.jctc.2c00564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022]
Abstract
Because of the prohibitive scaling of ab initio techniques for modeling chemical species with high accuracy, they are not generally tractable for large systems. It is therefore of considerable interest to develop high-accuracy computational models with low computational cost that can afford predictions of electronic structure and properties of macromolecular species. Composite methods, as first introduced by Pople [Pople, J. A.; Head-Gordon, M.; Fox, D. J.; Raghavachari, K.; Curtiss, L. A. J. Chem. Phys. 1989, 90, 5622.], are an intuitive solution to this problem as they seek to systematically increase accuracy in model chemistries by taking advantage of favorable error cancellation among reasonably low-cost models. By linearly combining a series of carefully chosen model chemistries, the result of a prohibitive-scaling correlated model chemistry with a large basis set may be approximated with relatively good fidelity. However, the full extent to which the choice of low-cost models dictates the predictive accuracy of composite methods is not known, and a full exploration of all model chemistries would be advantageous for the design and validation of a generalizable composite method for widespread application. Here, we show that remarkable accuracy can be generally achieved with composite methods that are more judiciously constructed, leading to increased accuracy with significantly reduced computational cost. By designing a systematic procedure for the automated generation and assessment of over 10 billion unique composite methods, we have extensively explored the space of modern model chemistries to elucidate important design principles in the construction of reliable composite procedures. We anticipate our work to be the starting point in the pursuit of creative approaches to modeling large chemical systems with high accuracy by using novel combinatorial modeling.
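The linear combination of model chemistries mentioned in the abstract can be written as E_composite = Σ c_i E_i. The sketch below evaluates such a combination for the familiar additive pattern "expensive method in a small basis plus a cheap basis-set correction". The labels, coefficients, and energies (in Hartree) are made-up illustrative values, not data or coefficients from the paper, and `composite_energy` is a hypothetical helper.

```python
def composite_energy(energies, coefficients):
    """Return sum_i c_i * E_i over matching model-chemistry labels."""
    if energies.keys() != coefficients.keys():
        raise ValueError("energies and coefficients must share the same labels")
    return sum(coefficients[label] * energies[label] for label in energies)


# Hypothetical component energies in Hartree.
energies = {
    "high-level/small-basis": -76.240,
    "low-level/small-basis": -76.020,
    "low-level/large-basis": -76.060,
}
# Additive scheme: E(high/small) + [E(low/large) - E(low/small)].
coefficients = {
    "high-level/small-basis": 1.0,
    "low-level/small-basis": -1.0,
    "low-level/large-basis": 1.0,
}
e_composite = composite_energy(energies, coefficients)
```

A systematic search such as the one the abstract describes would vary both the choice of components and their coefficients; this sketch only shows how a single candidate combination is evaluated.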
Affiliation(s)
- Andrew R Cameron
  - Institute for Quantum Computing, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
  - Department of Physics & Astronomy, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
- Adam J Proud
  - Department of Chemistry, University of Prince Edward Island, 550 University Avenue, Charlottetown, Prince Edward Island C1A 4P3, Canada
- Jason K Pearson
  - Department of Chemistry, University of Prince Edward Island, 550 University Avenue, Charlottetown, Prince Edward Island C1A 4P3, Canada
26
Shanavas Rasheeda D, Martín Santa Daría A, Schröder B, Mátyus E, Behler J. High-dimensional neural network potentials for accurate vibrational frequencies: the formic acid dimer benchmark. Phys Chem Chem Phys 2022; 24:29381-29392. [PMID: 36459127 DOI: 10.1039/d2cp03893e] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/25/2022]
Abstract
In recent years, machine learning potentials (MLP) for atomistic simulations have attracted a lot of attention in chemistry and materials science. Many new approaches have been developed with the primary aim to transfer the accuracy of electronic structure calculations to large condensed systems containing thousands of atoms. In spite of these advances, the reliability of modern MLPs in reproducing the subtle details of the multi-dimensional potential-energy surface is still difficult to assess for such systems. On the other hand, moderately sized systems enabling the application of tools for thorough and systematic quality-control are nowadays rarely investigated. In this work we use benchmark-quality harmonic and anharmonic vibrational frequencies as a sensitive probe for the validation of high-dimensional neural network potentials. For the case of the formic acid dimer, a frequently studied model system for which stringent spectroscopic data became recently available, we show that high-quality frequencies can be obtained from state-of-the-art calculations in excellent agreement with coupled cluster theory and experimental data.
Affiliation(s)
- Dilshana Shanavas Rasheeda
- Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, 37077 Göttingen, Germany.
| | - Alberto Martín Santa Daría
- ELTE, Eötvös Loránd University, Institute of Chemistry, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
| | - Benjamin Schröder
- Universität Göttingen, Institut für Physikalische Chemie, Tammannstraße 6, 37077 Göttingen, Germany
| | - Edit Mátyus
- ELTE, Eötvös Loránd University, Institute of Chemistry, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
| | - Jörg Behler
- Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, 37077 Göttingen, Germany.
| |
|
27
|
Zhang Y, Lin Q, Jiang B. Atomistic neural network representations for chemical dynamics simulations of molecular, condensed phase, and interfacial systems: Efficiency, representability, and generalization. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2022. [DOI: 10.1002/wcms.1645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Affiliation(s)
- Yaolong Zhang
- Department of Chemical Physics, School of Chemistry and Materials Science, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, China
| | - Qidong Lin
- Department of Chemical Physics, School of Chemistry and Materials Science, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, China
| | - Bin Jiang
- Department of Chemical Physics, School of Chemistry and Materials Science, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, China
| |
|
28
|
Zhang X, Tian Y, Chen L, Hu X, Zhou Z. Machine Learning: A New Paradigm in Computational Electrocatalysis. J Phys Chem Lett 2022; 13:7920-7930. [PMID: 35980765 DOI: 10.1021/acs.jpclett.2c01710] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Designing and screening novel electrocatalysts, understanding electrocatalytic mechanisms at an atomic level, and uncovering scientific insights lie at the center of the development of electrocatalysis. Despite certain success in experiments and computations, it is still difficult to achieve the above objectives due to the complexity of electrocatalytic systems and the vastness of the chemical space for candidate electrocatalysts. With the advantage of machine learning (ML) and increasing interest in electrocatalysis for energy conversion and storage, data-driven scientific research motivated by artificial intelligence (AI) has provided new opportunities to discover promising electrocatalysts, investigate dynamic reaction processes, and extract knowledge from huge data. In this Perspective, we summarize the recent applications of ML in electrocatalysis, including the screening of electrocatalysts and simulation of electrocatalytic processes. Furthermore, interpretable machine learning methods for electrocatalysis are discussed to accelerate knowledge generation. Finally, the blueprint of machine learning is envisaged for future development of electrocatalysis.
Affiliation(s)
- Xu Zhang
- School of Chemical Engineering, Zhengzhou University, Zhengzhou 450001, P. R. China
| | - Yun Tian
- School of Chemical Engineering, Zhengzhou University, Zhengzhou 450001, P. R. China
| | - Letian Chen
- School of Materials Science and Engineering, Institute of New Energy Material Chemistry, Renewable Energy Conversion and Storage Center (ReCast), Key Laboratory of Advanced Energy Chemistry (Ministry of Education), Nankai University, Tianjin 300350, P. R. China
| | - Xu Hu
- School of Materials Science and Engineering, Institute of New Energy Material Chemistry, Renewable Energy Conversion and Storage Center (ReCast), Key Laboratory of Advanced Energy Chemistry (Ministry of Education), Nankai University, Tianjin 300350, P. R. China
| | - Zhen Zhou
- School of Chemical Engineering, Zhengzhou University, Zhengzhou 450001, P. R. China
- School of Materials Science and Engineering, Institute of New Energy Material Chemistry, Renewable Energy Conversion and Storage Center (ReCast), Key Laboratory of Advanced Energy Chemistry (Ministry of Education), Nankai University, Tianjin 300350, P. R. China
| |
|
29
|
Westermayr J, Chaudhuri S, Jeindl A, Hofmann OT, Maurer RJ. Long-range dispersion-inclusive machine learning potentials for structure search and optimization of hybrid organic-inorganic interfaces. DIGITAL DISCOVERY 2022; 1:463-475. [PMID: 36091414 PMCID: PMC9358753 DOI: 10.1039/d2dd00016d] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 02/25/2022] [Accepted: 06/03/2022] [Indexed: 12/16/2022]
Abstract
The computational prediction of the structure and stability of hybrid organic-inorganic interfaces provides important insights into the measurable properties of electronic thin film devices, coatings, and catalyst surfaces and plays an important role in their rational design. However, the rich diversity of molecular configurations and the important role of long-range interactions in such systems make it difficult to use machine learning (ML) potentials to facilitate structure exploration that otherwise requires computationally expensive electronic structure calculations. We present an ML approach that enables fast, yet accurate, structure optimizations by combining two different types of deep neural networks trained on high-level electronic structure data. The first model is a short-ranged interatomic ML potential trained on local energies and forces, while the second is an ML model of effective atomic volumes derived from atoms-in-molecules partitioning. The latter can be used to connect short-range potentials to well-established density-dependent long-range dispersion correction methods. For two systems, specifically gold nanoclusters on diamond (110) surfaces and organic π-conjugated molecules on silver (111) surfaces, we train models on sparse structure relaxation data from density functional theory and show the ability of the models to deliver highly efficient structure optimizations and semi-quantitative energy predictions of adsorption structures.
Affiliation(s)
- Julia Westermayr
- Department of Chemistry, University of Warwick, Coventry CV4 7AL, UK
| | - Shayantan Chaudhuri
- Department of Chemistry, University of Warwick, Coventry CV4 7AL, UK
- Centre for Doctoral Training in Diamond Science and Technology, University of Warwick, Coventry CV4 7AL, UK
| | - Andreas Jeindl
- Institute of Solid State Physics, Graz University of Technology, 8010 Graz, Austria
| | - Oliver T Hofmann
- Institute of Solid State Physics, Graz University of Technology, 8010 Graz, Austria
| | | |
|
30
|
Méndez E, Videla PE, Laria D. Equilibrium and Dynamical Characteristics of Hydrogen Bond Bifurcations in Water-Water and Water-Ammonia Dimers: A Path Integral Molecular Dynamics Study. J Phys Chem A 2022; 126:4721-4733. [PMID: 35834556 DOI: 10.1021/acs.jpca.2c02525] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
We present path integral molecular dynamics results that describe the effects of nuclear quantum fluctuations on equilibrium and dynamical characteristics pertaining to bifurcation pathways in hydrogen bonded dimers combining water and ammonia, at cryogenic temperatures of the order of 20 K. Along these isomerizations, the hydrogen atoms in the molecules acting as hydrogen-bond donors interchange their original dangling/connective characters. Our results reveal that the resulting quantum transition paths comprise three stages: the initial and final ones involve overall rotations during which the two protons retain their classical-like characteristics. Effects from quantum fluctuation are clearly manifested in the changes operated at the intermediate passages over transition states, as the spatial extents of the protons stretch over typical lengths comparable to the distances between connective and dangling basins of attractions. Consequently, the classical over-the-hill path is replaced by a tunneling controlled mechanism which, within the path integral perspective, can be cast in terms of concerted inter-basin migrations of polymer beads from dangling-to-connective and from connective-to-dangling, at practically no energy costs. We also estimated the characteristic timescales describing such interconversions within the approximate ring polymer rate theory. Effects derived from full and partial deuteration are also discussed.
Affiliation(s)
- Emilio Méndez
- Departamento de Química Inorgánica, Analítica y Química-Física and INQUIMAE-CONICET, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Universitaria, Pabellón II, 1428 Buenos Aires, Argentina
| | - Pablo E Videla
- Department of Chemistry and Energy Sciences Institute, Yale University, 225 Prospect Street, New Haven, Connecticut 06520, United States
| | - Daniel Laria
- Departamento de Física de la Materia Condensada, Comisión Nacional de Energía Atómica, Avenida Libertador 8250, 1429 Buenos Aires, Argentina
| |
|
31
|
Cao L, Zeng J, Wang B, Zhu T, Zhang JZH. Ab initio neural network MD simulation of thermal decomposition of a high energy material CL-20/TNT. Phys Chem Chem Phys 2022; 24:11801-11811. [PMID: 35506927 PMCID: PMC9173692 DOI: 10.1039/d2cp00710j] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
CL-20 (2,4,6,8,10,12-hexanitro-2,4,6,8,10,12-hexaazaisowurtzitane, also known as HNIW) is one of the most powerful energetic materials. However, its high sensitivity to environmental stimuli greatly reduces its safety and severely limits its application. In this work, ab initio based neural network potential (NNP) energy surfaces for both β-CL-20 and CL-20/TNT co-crystals were constructed. To accurately simulate the thermal decomposition processes of these two crystal systems, reactive molecular dynamics simulations based on the NNPs were performed. Many important intermediate species and their associated reaction paths during the decomposition had been identified in the simulations and the direct results on detonation temperatures of both systems were provided. The simulations also showed clearly that 2,4,6-trinitrotoluene (TNT) molecules in the co-crystal act as a buffer to slow down the chain reactions triggered by nitrogen dioxide and this effect is more significant at lower temperatures. Specifically, the addition of TNT molecules in the CL-20/TNT co-crystal introduces intermolecular hydrogen bonds between CL-20 and TNT molecules in the system, thereby increasing the thermal stability of the co-crystal. The current reactive molecular dynamics simulation is performed based on the NNP which helps in accelerating the speed of ab initio molecular dynamics (AIMD) simulation by more than 3 orders of magnitude while preserving the accuracy of density functional theory (DFT) calculations. This enabled us to perform longer-time simulations at more realistic temperatures that traditional AIMD methods cannot achieve. With the advantage of the NNP in its powerful fitting ability and transferability, the NNP-based MD simulation can be widely applied to energetic material systems.
Affiliation(s)
- Liqun Cao
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China.
| | - Jinzhe Zeng
- Department of Chemistry and Chemical Biology, Institute for Quantitative Biomedicine, Rutgers, the State University of New Jersey, Piscataway 08854-8076, NJ, USA
| | - Bo Wang
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China.
| | - Tong Zhu
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China.
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
| | - John Z H Zhang
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China.
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
- Department of Chemistry, New York University, New York 10003, USA
- Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi, 030006, China
| |
|
32
|
Li Z, Meidani K, Yadav P, Barati Farimani A. Graph neural networks accelerated molecular dynamics. J Chem Phys 2022; 156:144103. [DOI: 10.1063/5.0083060] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/26/2023] Open
Abstract
Molecular Dynamics (MD) simulation is a powerful tool for understanding the dynamics and structure of matter. Since the resolution of MD is atomic-scale, achieving long timescale simulations with femtosecond integration is very expensive. In each MD step, numerous iterative computations are performed to calculate energy based on different types of interaction and their corresponding spatial gradients. These repetitive computations can be learned and surrogated by a deep learning model, such as a Graph Neural Network (GNN). In this work, we developed a GNN Accelerated MD (GAMD) model that directly predicts forces, given the state of the system (atom positions, atom types), bypassing the evaluation of potential energy. By training the GNN on a variety of data sources (simulation data derived from classical MD and density functional theory), we show that GAMD can predict the dynamics of two typical molecular systems, Lennard-Jones system and water system, in the NVT ensemble with velocities regulated by a thermostat. We further show that GAMD’s learning and inference are agnostic to the scale, where it can scale to much larger systems at test time. We also perform a comprehensive benchmark test comparing our implementation of GAMD to production-level MD software, showing GAMD’s competitive performance on the large-scale simulation.
Affiliation(s)
- Zijie Li
- Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
| | - Kazem Meidani
- Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
| | - Prakarsh Yadav
- Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
| | - Amir Barati Farimani
- Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
- Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
| |
|
33
|
Zeng C, Chen X, Peterson AA. A nearsighted force-training approach to systematically generate training data for the machine learning of large atomic structures. J Chem Phys 2022; 156:064104. [DOI: 10.1063/5.0079314] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open
Affiliation(s)
- Cheng Zeng
- School of Engineering, Brown University, Providence, Rhode Island 02912, USA
| | - Xi Chen
- School of Engineering, Brown University, Providence, Rhode Island 02912, USA
| | - Andrew A. Peterson
- School of Engineering, Brown University, Providence, Rhode Island 02912, USA
| |
|
34
|
Abstract
In the past two decades, machine learning potentials (MLPs) have reached a level of maturity that now enables applications to large-scale atomistic simulations of a wide range of systems in chemistry, physics, and materials science. Different machine learning algorithms have been used with great success in the construction of these MLPs. In this review, we discuss an important group of MLPs relying on artificial neural networks to establish a mapping from the atomic structure to the potential energy. In spite of this common feature, there are important conceptual differences among MLPs, which concern the dimensionality of the systems, the inclusion of long-range electrostatic interactions, global phenomena like nonlocal charge transfer, and the type of descriptor used to represent the atomic structure, which can be either predefined or learnable. A concise overview is given along with a discussion of the open challenges in the field. Expected final online publication date for the Annual Review of Physical Chemistry, Volume 73 is April 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Emir Kocer
- Institut für Physikalische Chemie, Theoretische Chemie, Universität Göttingen, Göttingen, Germany
| | - Tsz Wai Ko
- Institut für Physikalische Chemie, Theoretische Chemie, Universität Göttingen, Göttingen, Germany
| | - Jörg Behler
- Institut für Physikalische Chemie, Theoretische Chemie, Universität Göttingen, Göttingen, Germany
| |
|
35
|
Unke OT, Chmiela S, Gastegger M, Schütt KT, Sauceda HE, Müller KR. SpookyNet: Learning force fields with electronic degrees of freedom and nonlocal effects. Nat Commun 2021; 12:7273. [PMID: 34907176 PMCID: PMC8671403 DOI: 10.1038/s41467-021-27504-0] [Citation(s) in RCA: 116] [Impact Index Per Article: 29.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2021] [Accepted: 11/16/2021] [Indexed: 01/12/2023] Open
Abstract
Machine-learned force fields combine the accuracy of ab initio methods with the efficiency of conventional force fields. However, current machine-learned force fields typically ignore electronic degrees of freedom, such as the total charge or spin state, and assume chemical locality, which is problematic when molecules have inconsistent electronic states, or when nonlocal effects play a significant role. This work introduces SpookyNet, a deep neural network for constructing machine-learned force fields with explicit treatment of electronic degrees of freedom and nonlocality, modeled via self-attention in a transformer architecture. Chemically meaningful inductive biases and analytical corrections built into the network architecture allow it to properly model physical limits. SpookyNet improves upon the current state-of-the-art (or achieves similar performance) on popular quantum chemistry data sets. Notably, it is able to generalize across chemical and conformational space and can leverage the learned chemical insights, e.g. by predicting unknown spin states, thus helping to close a further important remaining gap for today's machine learning models in quantum chemistry.
Affiliation(s)
- Oliver T Unke
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany.
- DFG Cluster of Excellence "Unifying Systems in Catalysis" (UniSysCat), Technische Universität Berlin, 10623, Berlin, Germany.
| | - Stefan Chmiela
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
| | - Michael Gastegger
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
- DFG Cluster of Excellence "Unifying Systems in Catalysis" (UniSysCat), Technische Universität Berlin, 10623, Berlin, Germany
| | - Kristof T Schütt
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
| | - Huziel E Sauceda
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
- BASLEARN, BASF-TU joint Lab, Technische Universität Berlin, 10587, Berlin, Germany
| | - Klaus-Robert Müller
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany.
- Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul, 02841, Korea.
- Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123, Saarbrücken, Germany.
- BIFOLD-Berlin Institute for the Foundations of Learning and Data, Berlin, Germany.
- Google Research, Brain team, Berlin, Germany.
| |
|
36
|
Saleh Y, Sanjay V, Iske A, Yachmenev A, Küpper J. Active learning of potential-energy surfaces of weakly bound complexes with regression-tree ensembles. J Chem Phys 2021; 155:144109. [PMID: 34654290 DOI: 10.1063/5.0057051] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
Several pool-based active learning (AL) algorithms were employed to model potential-energy surfaces (PESs) with a minimum number of electronic structure calculations. Theoretical and empirical results suggest that superior strategies can be obtained by sampling molecular structures corresponding to large uncertainties in their predictions while at the same time not deviating much from the true distribution of the data. To model PESs in an AL framework, we propose to use a regression version of stochastic query by forest, a hybrid method that samples points corresponding to large uncertainties while avoiding collecting too many points from sparse regions of space. The algorithm is implemented with decision trees that come with relatively small computational costs. We empirically show that this algorithm requires around half the data to converge to the same accuracy in comparison to the uncertainty-based query-by-committee algorithm. Moreover, the algorithm is fully automatic and does not require any prior knowledge of the PES. Simulations on a 6D PES of pyrrole(H₂O) show that <15 000 configurations are enough to build a PES with a generalization error of 16 cm⁻¹, whereas the final model with around 50 000 configurations has a generalization error of 11 cm⁻¹.
Affiliation(s)
- Yahya Saleh
- Center for Free-Electron Laser Science CFEL, Deutsches Elektronen-Synchrotron DESY, Notkestraße 85, 22607 Hamburg, Germany
| | - Vishnu Sanjay
- Center for Free-Electron Laser Science CFEL, Deutsches Elektronen-Synchrotron DESY, Notkestraße 85, 22607 Hamburg, Germany
| | - Armin Iske
- Department of Mathematics, Universität Hamburg, Bundesstraße 55, 20146 Hamburg, Germany
| | - Andrey Yachmenev
- Center for Free-Electron Laser Science CFEL, Deutsches Elektronen-Synchrotron DESY, Notkestraße 85, 22607 Hamburg, Germany
| | - Jochen Küpper
- Center for Free-Electron Laser Science CFEL, Deutsches Elektronen-Synchrotron DESY, Notkestraße 85, 22607 Hamburg, Germany
| |
|
37
|
Guan S, Shang C, Liu Z. Structure and Dynamics of Energy Materials from Machine Learning Simulations: A Topical Review. CHINESE J CHEM 2021. [DOI: 10.1002/cjoc.202100299] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Affiliation(s)
- Shu‐Hui Guan
- Shanghai Academy of Agricultural Sciences, Shanghai 201403, China
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200438, China
| | - Cheng Shang
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200438, China
| | - Zhi‐Pan Liu
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200438, China
| |
|
38
|
Deringer VL, Bartók AP, Bernstein N, Wilkins DM, Ceriotti M, Csányi G. Gaussian Process Regression for Materials and Molecules. Chem Rev 2021; 121:10073-10141. [PMID: 34398616 PMCID: PMC8391963 DOI: 10.1021/acs.chemrev.1c00022] [Citation(s) in RCA: 300] [Impact Index Per Article: 75.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/08/2021] [Indexed: 12/18/2022]
Abstract
We provide an introduction to Gaussian process regression (GPR) machine-learning methods in computational materials science and chemistry. The focus of the present review is on the regression of atomistic properties: in particular, on the construction of interatomic potentials, or force fields, in the Gaussian Approximation Potential (GAP) framework; beyond this, we also discuss the fitting of arbitrary scalar, vectorial, and tensorial quantities. Methodological aspects of reference data generation, representation, and regression, as well as the question of how a data-driven model may be validated, are reviewed and critically discussed. A survey of applications to a variety of research questions in chemistry and materials science illustrates the rapid growth in the field. A vision is outlined for the development of the methodology in the years to come.
Affiliation(s)
- Volker L. Deringer
- Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, United Kingdom
| | - Albert P. Bartók
- Department of Physics and Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
| | - Noam Bernstein
- Center for Computational Materials Science, U.S. Naval Research Laboratory, Washington D.C. 20375, United States
| | - David M. Wilkins
- Atomistic Simulation Centre, School of Mathematics and Physics, Queen’s University Belfast, Belfast BT7 1NN, Northern Ireland, United Kingdom
| | - Michele Ceriotti
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
- National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Gábor Csányi
- Engineering Laboratory, University of Cambridge, Cambridge CB2 1PZ, United Kingdom
| |
|
39
|
Unke O, Chmiela S, Sauceda HE, Gastegger M, Poltavsky I, Schütt KT, Tkatchenko A, Müller KR. Machine Learning Force Fields. Chem Rev 2021; 121:10142-10186. [PMID: 33705118 PMCID: PMC8391964 DOI: 10.1021/acs.chemrev.0c01111] [Citation(s) in RCA: 489] [Impact Index Per Article: 122.3] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2020] [Indexed: 12/27/2022]
Abstract
In recent years, the use of machine learning (ML) in computational chemistry has enabled numerous advances previously out of reach due to the computational complexity of traditional electronic-structure methods. One of the most promising applications is the construction of ML-based force fields (FFs), with the aim to narrow the gap between the accuracy of ab initio methods and the efficiency of classical FFs. The key idea is to learn the statistical relation between chemical structure and potential energy without relying on a preconceived notion of fixed chemical bonds or knowledge about the relevant interactions. Such universal ML approximations are in principle only limited by the quality and quantity of the reference data used to train them. This review gives an overview of applications of ML-FFs and the chemical insights that can be obtained from them. The core concepts underlying ML-FFs are described in detail, and a step-by-step guide for constructing and testing them from scratch is given. The text concludes with a discussion of the challenges that remain to be overcome by the next generation of ML-FFs.
Affiliation(s)
- Oliver T. Unke
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- DFG Cluster of Excellence “Unifying Systems in Catalysis” (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
| | - Stefan Chmiela
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
| | - Huziel E. Sauceda
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- BASLEARN, BASF-TU Joint Lab, Technische Universität Berlin, 10587 Berlin, Germany
| | - Michael Gastegger
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- DFG Cluster of Excellence “Unifying Systems in Catalysis” (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
- BASLEARN, BASF-TU Joint Lab, Technische Universität Berlin, 10587 Berlin, Germany
| | - Igor Poltavsky
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Kristof T. Schütt
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Klaus-Robert Müller
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- BIFOLD-Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
- Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul 02841, Korea
- Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123 Saarbrücken, Germany
- Google Research, Brain Team, Berlin, Germany
| |
|
40
|
Poltavsky I, Tkatchenko A. Machine Learning Force Fields: Recent Advances and Remaining Challenges. J Phys Chem Lett 2021; 12:6551-6564. [PMID: 34242032 DOI: 10.1021/acs.jpclett.1c01204] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
In chemistry and physics, machine learning (ML) methods promise transformative impacts by advancing modeling and improving our understanding of complex molecules and materials. Each ML method comprises a mathematically well-defined procedure, and an increasingly larger number of easy-to-use ML packages for modeling atomistic systems are becoming available. In this Perspective, we discuss the general aspects of ML techniques in the context of creating ML force fields. We describe common features of ML modeling and quantum-mechanical approximations, so-called global and local ML models, and the physical differences behind these two classes of approaches. Finally, we describe the recent developments and emerging directions in the field of ML-driven molecular modeling. This Perspective aims to inspire interdisciplinary collaborations crossing the borders between physical chemistry, chemical physics, computer science, and data science.
Affiliation(s)
- Igor Poltavsky
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| |
|
41
|
Miksch AM, Morawietz T, Kästner J, Urban A, Artrith N. Strategies for the construction of machine-learning potentials for accurate and efficient atomic-scale simulations. Mach Learn Sci Technol 2021. [DOI: 10.1088/2632-2153/abfd96] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/12/2022] Open
Abstract
Recent advances in machine-learning interatomic potentials have enabled the efficient modeling of complex atomistic systems with an accuracy that is comparable to that of conventional quantum-mechanics based methods. At the same time, the construction of new machine-learning potentials can seem a daunting task, as it involves data-science techniques that are not yet common in chemistry and materials science. Here, we provide a tutorial-style overview of strategies and best practices for the construction of artificial neural network (ANN) potentials. We illustrate the most important aspects of (a) data collection, (b) model selection, (c) training and validation, and (d) testing and refinement of ANN potentials on the basis of practical examples. Current research in the areas of active learning and delta learning is also discussed in the context of ANN potentials. This tutorial review aims at equipping computational chemists and materials scientists with the required background knowledge for ANN potential construction and application, with the intention to accelerate the adoption of the method, so that it can facilitate exciting research that would otherwise be challenging with conventional strategies.
|
42
|
Zubatiuk T, Nebgen B, Lubbers N, Smith JS, Zubatyuk R, Zhou G, Koh C, Barros K, Isayev O, Tretiak S. Machine learned Hückel theory: Interfacing physics and deep neural networks. J Chem Phys 2021; 154:244108. [PMID: 34241371 DOI: 10.1063/5.0052857] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 11/14/2022] Open
Abstract
The Hückel Hamiltonian is an incredibly simple tight-binding model known for its ability to capture qualitative physics phenomena arising from electron interactions in molecules and materials. Part of its simplicity arises from using only two types of empirically fit physics-motivated parameters: the first describes the orbital energies on each atom and the second describes electronic interactions and bonding between atoms. By replacing these empirical parameters with machine-learned dynamic values, we vastly increase the accuracy of the extended Hückel model. The dynamic values are generated with a deep neural network, which is trained to reproduce orbital energies and densities derived from density functional theory. The resulting model retains interpretability, while the deep neural network parameterization is smooth and accurate and reproduces insightful features of the original empirical parameterization. Overall, this work shows the promise of utilizing machine learning to formulate simple, accurate, and dynamically parameterized physics models.
Affiliation(s)
- Tetiana Zubatiuk
- Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
| | - Benjamin Nebgen
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87544, USA
| | - Nicholas Lubbers
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
| | - Justin S Smith
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87544, USA
| | - Roman Zubatyuk
- Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
| | - Guoqing Zhou
- Department of Physics and Astronomy, University of Southern California, Los Angeles, California 90089, USA
| | - Christopher Koh
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87544, USA
| | - Kipton Barros
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87544, USA
| | - Olexandr Isayev
- Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
| | - Sergei Tretiak
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87544, USA
| |
|
43
|
Xu M, Zhu T, Zhang JZH. Automatically Constructed Neural Network Potentials for Molecular Dynamics Simulation of Zinc Proteins. Front Chem 2021; 9:692200. [PMID: 34222200 PMCID: PMC8249736 DOI: 10.3389/fchem.2021.692200] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/07/2021] [Accepted: 05/10/2021] [Indexed: 11/13/2022] Open
Abstract
The development of accurate and efficient potential energy functions for the molecular dynamics simulation of metalloproteins has long been a great challenge for the theoretical chemistry community. An artificial neural network provides the possibility to develop potential energy functions with both the efficiency of the classical force fields and the accuracy of the quantum chemical methods. In this work, neural network potentials were automatically constructed by using the ESOINN-DP method for typical zinc proteins. For the four most common zinc coordination modes in proteins, the potential energy, atomic forces, and atomic charges predicted by neural network models show great agreement with quantum mechanics calculations and the neural network potential can maintain the coordination geometry correctly. In addition, MD simulation and energy optimization with the neural network potential can be readily used for structural refinement. The neural network potential is not limited by the function form and complex parameterization process, and important quantum effects such as polarization and charge transfer can be accurately considered. The algorithm proposed in this work can also be directly applied to proteins containing other metal ions.
Affiliation(s)
- Mingyuan Xu
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Key Laboratory of Green Chemistry and Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, China
| | - Tong Zhu
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Key Laboratory of Green Chemistry and Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, China
| | - John Z. H. Zhang
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Key Laboratory of Green Chemistry and Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, China
- Department of Chemistry, New York University, New York, NY, United States
- Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, China
| |
|
44
|
Lam ST, Li QJ, Ballinger R, Forsberg C, Li J. Modeling LiF and FLiBe Molten Salts with Robust Neural Network Interatomic Potential. ACS Appl Mater Interfaces 2021; 13:24582-24592. [PMID: 34019760 DOI: 10.1021/acsami.1c00604] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 06/12/2023]
Abstract
Lithium-based molten salts have attracted significant attention due to their applications in energy storage, advanced fission reactors, and fusion devices. Lithium fluorides and particularly 66.6%LiF-33.3%BeF2 (Flibe) are of considerable interest in nuclear systems, as they show an excellent combination of favorable heat transfer, neutron moderation, and transmutation characteristics. For nuclear salts, the range of possible local structures, compositions, and thermodynamic conditions presents significant challenges in atomistic modeling. In this work, we demonstrate that atom-centered neural network interatomic potentials (NNIPs) provide a fast method for performing molecular dynamics of molten salts that is as accurate as ab initio molecular dynamics. For LiF, these potentials are able to accurately reproduce ab initio interactions of dimers, crystalline solids under deformation, crystalline LiF near the melting point, and liquid LiF at high temperatures. For Flibe, NNIPs accurately predict the structures and dynamics at normal operating conditions, high-temperature-pressure conditions, and in the crystalline solid phase. Furthermore, we show that NNIP-based molecular dynamics of molten salts are scalable to reach long time scales (e.g., nanosecond) and large system sizes (e.g., 10^5 atoms) while maintaining ab initio density functional theory accuracy and providing more than 3 orders of magnitude of computational speedup for calculating structure and transport properties.
Affiliation(s)
- Stephen T Lam
- Department of Chemical Engineering, University of Massachusetts Lowell, Lowell, Massachusetts 01854, United States
- Department of Nuclear Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Qing-Jie Li
- Department of Nuclear Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Ronald Ballinger
- Department of Nuclear Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Charles Forsberg
- Department of Nuclear Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Ju Li
- Department of Nuclear Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
45
|
Laurens G, Rabary M, Lam J, Peláez D, Allouche AR. Infrared spectra of neutral polycyclic aromatic hydrocarbons based on machine learning potential energy surface and dipole mapping. Theor Chem Acc 2021. [DOI: 10.1007/s00214-021-02773-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/16/2022]
|
46
|
Xu J, Cao XM, Hu P. Perspective on computational reaction prediction using machine learning methods in heterogeneous catalysis. Phys Chem Chem Phys 2021; 23:11155-11179. [PMID: 33972971 DOI: 10.1039/d1cp01349a] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Indexed: 12/17/2022]
Abstract
Heterogeneous catalysis plays a significant role in the modern chemical industry. Towards the rational design of novel catalysts, understanding reactions over surfaces is the most essential aspect. Typical industrial catalytic processes such as syngas conversion and methane utilisation can generate a large reaction network comprising thousands of intermediates and reaction pairs. This complexity not only arises from the permutation of transformations between species but also from the extra reaction channels offered by distinct surface sites. Despite the success in investigating surface reactions at the atomic scale, the huge computational expense of ab initio methods hinders the exploration of such complicated reaction networks. With the proliferation of catalysis studies, machine learning as an emerging tool can take advantage of the accumulated reaction data to emulate the output of ab initio methods towards swift reaction prediction. Here, we briefly summarise the conventional workflow of reaction prediction, including reaction network generation, ab initio thermodynamics and microkinetic modelling. An overview of the frequently used regression models in machine learning is presented. As a promising alternative to full ab initio calculations, machine learning interatomic potentials are highlighted. Furthermore, we survey applications assisted by these methods for accelerating reaction prediction, exploring reaction networks, and computational catalyst design. Finally, we envisage future directions in computationally investigating reactions and implementing machine learning algorithms in heterogeneous catalysis.
Affiliation(s)
- Jiayan Xu
- Key Laboratory for Advanced Materials and Joint International Research Laboratory of Precision Chemistry and Molecular Engineering, Feringa Nobel Prize Scientist Joint Research Center, Frontiers Science Center for Materiobiology and Dynamic Chemistry, Centre for Computational Chemistry and Research Institute of Industrial Catalysis, School of Chemistry and Molecular Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, P. R. China, and School of Chemistry and Chemical Engineering, Queen's University Belfast, Belfast BT9 5AG, UK
| | - Xiao-Ming Cao
- Key Laboratory for Advanced Materials and Joint International Research Laboratory of Precision Chemistry and Molecular Engineering, Feringa Nobel Prize Scientist Joint Research Center, Frontiers Science Center for Materiobiology and Dynamic Chemistry, Centre for Computational Chemistry and Research Institute of Industrial Catalysis, School of Chemistry and Molecular Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, P. R. China
| | - P Hu
- Key Laboratory for Advanced Materials and Joint International Research Laboratory of Precision Chemistry and Molecular Engineering, Feringa Nobel Prize Scientist Joint Research Center, Frontiers Science Center for Materiobiology and Dynamic Chemistry, Centre for Computational Chemistry and Research Institute of Industrial Catalysis, School of Chemistry and Molecular Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, P. R. China, and School of Chemistry and Chemical Engineering, Queen's University Belfast, Belfast BT9 5AG, UK
| |
|
47
|
Paleico ML, Behler J. A bin and hash method for analyzing reference data and descriptors in machine learning potentials. Mach Learn Sci Technol 2021. [DOI: 10.1088/2632-2153/abe663] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/27/2022] Open
Abstract
In recent years the development of machine learning potentials (MLPs) has become a very active field of research. Numerous approaches have been proposed, which allow one to perform extended simulations of large systems at a small fraction of the computational costs of electronic structure calculations. The key to the success of modern MLPs is the close-to first principles quality description of the atomic interactions. This accuracy is reached by using very flexible functional forms in combination with high-level reference data from electronic structure calculations. These data sets can include up to hundreds of thousands of structures covering millions of atomic environments to ensure that all relevant features of the potential energy surface are well represented. The handling of such large data sets is nowadays becoming one of the main challenges in the construction of MLPs. In this paper we present a method, the bin-and-hash (BAH) algorithm, to overcome this problem by enabling the efficient identification and comparison of large numbers of multidimensional vectors. Such vectors emerge in multiple contexts in the construction of MLPs. Examples are the comparison of local atomic environments to identify and avoid unnecessary redundant information in the reference data sets that is costly in terms of both the electronic structure calculations as well as the training process, the assessment of the quality of the descriptors used as structural fingerprints in many types of MLPs, and the detection of possibly unreliable data points. The BAH algorithm is illustrated for the example of high-dimensional neural network potentials using atom-centered symmetry functions for the geometrical description of the atomic environments, but the method is general and can be combined with any current type of MLP.
|
48
|
Morawietz T, Artrith N. Machine learning-accelerated quantum mechanics-based atomistic simulations for industrial applications. J Comput Aided Mol Des 2021; 35:557-586. [PMID: 33034008 PMCID: PMC8018928 DOI: 10.1007/s10822-020-00346-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 07/14/2020] [Accepted: 09/26/2020] [Indexed: 01/13/2023]
Abstract
Atomistic simulations have become an invaluable tool for industrial applications ranging from the optimization of protein-ligand interactions for drug discovery to the design of new materials for energy applications. Here we review recent advances in the use of machine learning (ML) methods for accelerated simulations based on a quantum mechanical (QM) description of the system. We show how recent progress in ML methods has dramatically extended the applicability range of conventional QM-based simulations, allowing the calculation of industrially relevant properties with enhanced accuracy, at reduced computational cost, and for length and time scales that would have otherwise not been accessible. We illustrate the benefits of ML-accelerated atomistic simulations for industrial R&D processes by showcasing relevant applications from two very different areas, drug discovery (pharmaceuticals) and energy materials. Writing from the perspective of both a molecular and a materials modeling scientist, this review aims to provide a unified picture of the impact of ML-accelerated atomistic simulations on the pharmaceutical, chemical, and materials industries and gives an outlook on the exciting opportunities that could emerge in the future.
Affiliation(s)
- Tobias Morawietz
- Bayer AG, Pharmaceuticals, R&D, Digital Technologies, Computational Molecular Design, 42096 Wuppertal, Germany
| | - Nongnuch Artrith
- Department of Chemical Engineering, Columbia University, New York, NY 10027 USA
| |
|
49
|
Affiliation(s)
- Jörg Behler
- Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, 37077 Göttingen, Germany
| |
|
50
|
Automated discovery of a robust interatomic potential for aluminum. Nat Commun 2021; 12:1257. [PMID: 33623036 PMCID: PMC7902823 DOI: 10.1038/s41467-021-21376-0] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 03/12/2020] [Accepted: 01/15/2021] [Indexed: 11/22/2022] Open
Abstract
Machine learning, trained on quantum mechanics (QM) calculations, is a powerful tool for modeling potential energy surfaces. A critical factor is the quality and diversity of the training dataset. Here we present a highly automated approach to dataset construction and demonstrate the method by building a potential for elemental aluminum (ANI-Al). In our active learning scheme, the ML potential under development is used to drive non-equilibrium molecular dynamics simulations with time-varying applied temperatures. Whenever a configuration is reached for which the ML uncertainty is large, new QM data is collected. The ML model is periodically retrained on all available QM data. The final ANI-Al potential makes very accurate predictions of the radial distribution function in the melt, the liquid-solid coexistence curve, and crystal properties such as defect energies and barriers. We perform a 1.3M atom shock simulation and show that ANI-Al force predictions shine in their agreement with new reference DFT calculations. The accuracy of a machine-learned potential is limited by the quality and diversity of the training dataset. Here the authors propose an active learning approach to automatically construct general-purpose machine-learning potentials, demonstrated for the aluminum case.
|