1
Voss J. Machine learning for accuracy in density functional approximations. J Comput Chem 2024; 45:1829-1845. [PMID: 38668453 DOI: 10.1002/jcc.27366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 02/16/2024] [Accepted: 03/25/2024] [Indexed: 07/21/2024]
Abstract
Machine learning techniques have found their way into computational chemistry as indispensable tools to accelerate atomistic simulations and materials design. In addition, machine learning approaches hold the potential to boost the predictive power of computationally efficient electronic structure methods, such as density functional theory, to chemical accuracy and to correct for fundamental errors in density functional approaches. Here, recent progress in applying machine learning to improve the accuracy of density functional and related approximations is reviewed. Promises and challenges in devising machine learning models transferable between different chemistries and materials classes are discussed with the help of examples applying promising models to systems far outside their training sets.
Affiliation(s)
- Johannes Voss
- SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, California, USA
2
Aldossary A, Campos-Gonzalez-Angulo JA, Pablo-García S, Leong SX, Rajaonson EM, Thiede L, Tom G, Wang A, Avagliano D, Aspuru-Guzik A. In Silico Chemical Experiments in the Age of AI: From Quantum Chemistry to Machine Learning and Back. Adv Mater 2024; 36:e2402369. [PMID: 38794859 DOI: 10.1002/adma.202402369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 04/28/2024] [Indexed: 05/26/2024]
Abstract
Computational chemistry is an indispensable tool for understanding molecules and predicting chemical properties. However, traditional computational methods face significant challenges due to the difficulty of solving the Schrödinger equation and the computational cost that grows with the size of the molecular system. In response, there has been a surge of interest in leveraging artificial intelligence (AI) and machine learning (ML) techniques for in silico experiments. Integrating AI and ML into computational chemistry increases the scalability and speed of the exploration of chemical space. However, challenges remain, particularly regarding the reproducibility and transferability of ML models. This review highlights the evolution of ML in learning from, complementing, or replacing traditional computational chemistry for energy and property predictions. Starting from models trained entirely on numerical data, it traces a journey toward the ideal model incorporating or learning the physical laws of quantum mechanics. This paper also reviews existing computational methods and ML models and their intertwining, outlines a roadmap for future research, and identifies areas for improvement and innovation. Ultimately, the goal is to develop AI architectures capable of predicting accurate and transferable solutions to the Schrödinger equation, thereby revolutionizing in silico experiments within chemistry and materials science.
Affiliation(s)
- Abdulrahman Aldossary
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Sergio Pablo-García
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Shi Xuan Leong
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Ella Miray Rajaonson
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Luca Thiede
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Gary Tom
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Andrew Wang
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Davide Avagliano
- Chimie ParisTech, PSL University, CNRS, Institute of Chemistry for Life and Health Sciences (iCLeHS UMR 8060), Paris, F-75005, France
- Alán Aspuru-Guzik
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Department of Materials Science & Engineering, University of Toronto, 184 College St., Toronto, ON, M5S 3E4, Canada
- Department of Chemical Engineering & Applied Chemistry, University of Toronto, 200 College St., Toronto, ON, M5S 3E5, Canada
- Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), 66118 University Ave., Toronto, M5G 1M1, Canada
- Acceleration Consortium, 80 St George St, Toronto, M5S 3H6, Canada
3
Pathirage PDVS, Phillips JT, Vogiatzis KD. Exploration of the Two-Electron Excitation Space with Data-Driven Coupled Cluster. J Phys Chem A 2024. [PMID: 38422511 DOI: 10.1021/acs.jpca.3c06600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2024]
Abstract
Computational cost limits the applicability of post-Hartree-Fock methods such as coupled-cluster on larger molecular systems. The data-driven coupled-cluster (DDCC) method applies machine learning to predict the coupled-cluster two-electron amplitudes (t2) using data from second-order perturbation theory (MP2). One major limitation of the DDCC models is the size of training sets that increases exponentially with the system size. Effective sampling of the amplitude space can resolve this issue. Five different amplitude selection techniques that reduce the amount of data used for training were evaluated, an approach that also prevents model overfitting and increases the portability of data-driven coupled-cluster singles and doubles to more complex molecules or larger basis sets. In combination with a localized orbital formalism to predict the CCSD t2 amplitudes, we have achieved a 10-fold error reduction for energy calculations.
Affiliation(s)
- P D Varuna S Pathirage
- Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996-1600, United States
- Justin T Phillips
- Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996-1600, United States
- Konstantinos D Vogiatzis
- Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996-1600, United States
4
Ng WP, Liang Q, Yang J. Low-Data Deep Quantum Chemical Learning for Accurate MP2 and Coupled-Cluster Correlations. J Chem Theory Comput 2023; 19:5439-5449. [PMID: 37506400 DOI: 10.1021/acs.jctc.3c00518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2023]
Abstract
Accurate ab initio prediction of electronic energies by explicitly solving post-Hartree-Fock equations is very expensive for macromolecules. We here exploit the physically justified local correlation feature in a compact basis of small molecules and construct an expressive low-data deep neural network (dNN) model to obtain machine-learned electron correlation energies on par with MP2 and CCSD levels of theory for more complex molecules and different datasets that are not represented in the training set. We show that our dNN-powered model is data efficient and makes highly transferable predictions across alkanes of various lengths, organic molecules with non-covalent and biomolecular interactions, as well as water clusters of different sizes and morphologies. In particular, by training on 800 (H2O)8 clusters with the local correlation descriptors, accurate MP2/cc-pVTZ correlation energies up to (H2O)128 can be predicted with a small random error within chemical accuracy from exact values, while a majority of prediction deviations are attributed to an intrinsically systematic error. Our results reveal that an extremely compact local correlation feature set, which is poor for any direct post-Hartree-Fock calculations, nevertheless has a prominent advantage in preserving important electron correlation patterns for making accurate transferable predictions across distinct molecular compositions, bond types, and geometries.
Affiliation(s)
- Wai-Pan Ng
- Department of Chemistry, The University of Hong Kong, Hong Kong 999077, P. R. China
- Hong Kong Quantum AI Lab Limited, Hong Kong 999077, P. R. China
- Qiujiang Liang
- Department of Chemistry, The University of Hong Kong, Hong Kong 999077, P. R. China
- Jun Yang
- Department of Chemistry, The University of Hong Kong, Hong Kong 999077, P. R. China
- Hong Kong Quantum AI Lab Limited, Hong Kong 999077, P. R. China
5
Nakai H, Kobayashi M, Yoshikawa T, Seino J, Ikabata Y, Nishimura Y. Divide-and-Conquer Linear-Scaling Quantum Chemical Computations. J Phys Chem A 2023; 127:589-618. [PMID: 36630608 DOI: 10.1021/acs.jpca.2c06965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/13/2023]
Abstract
Fragmentation and embedding schemes are of great importance when applying quantum-chemical calculations to more complex and attractive targets. The divide-and-conquer (DC)-based quantum-chemical model is a fragmentation scheme that can be connected to embedding schemes. This feature article explains several DC-based schemes developed by the authors over the last two decades, which were inspired by the pioneering study of the DC self-consistent field (SCF) method by Yang and Lee (J. Chem. Phys. 1995, 103, 5674-5678). First, the theoretical aspects of the DC-based SCF, electron correlation, excited-state, and nuclear orbital methods are described, followed by the two-component relativistic theory, quantum-mechanical molecular dynamics simulation, and the introduction of three programs, including DC-based schemes. Illustrative applications confirmed the accuracy and feasibility of the DC-based schemes.
Affiliation(s)
- Hiromi Nakai
- Department of Chemistry and Biochemistry, School of Advanced Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku, Tokyo 169-8555, Japan
- Waseda Research Institute for Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku, Tokyo 169-8555, Japan
- Masato Kobayashi
- Department of Chemistry, Faculty of Science, Hokkaido University, Kita 10 Nishi 8, Kita-ku, Sapporo, Hokkaido 060-0810, Japan
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, Hokkaido 001-0021, Japan
- Takeshi Yoshikawa
- Faculty of Pharmaceutical Sciences, Toho University, 2-2-1 Miyama, Funabashi, Chiba 274-8510, Japan
- Junji Seino
- Department of Chemistry and Biochemistry, School of Advanced Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku, Tokyo 169-8555, Japan
- Waseda Research Institute for Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku, Tokyo 169-8555, Japan
- Yasuhiro Ikabata
- Information and Media Center, Toyohashi University of Technology, 1-1 Hibarigaoka, Tempaku-cho, Toyohashi, Aichi 441-8580, Japan
- Department of Computer Science and Engineering, Toyohashi University of Technology, 1-1 Hibarigaoka, Tempaku-cho, Toyohashi, Aichi 441-8580, Japan
- Yoshifumi Nishimura
- Waseda Research Institute for Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku, Tokyo 169-8555, Japan
6
Lunghi A, Sanvito S. Computational design of magnetic molecules and their environment using quantum chemistry, machine learning and multiscale simulations. Nat Rev Chem 2022; 6:761-781. [PMID: 37118096 DOI: 10.1038/s41570-022-00424-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Accepted: 08/15/2022] [Indexed: 11/09/2022]
Abstract
Having served as a playground for fundamental studies on the physics of d and f electrons for almost a century, magnetic molecules are now becoming increasingly important for technological applications, such as magnetic resonance, data storage, spintronics and quantum information. All of these applications require the preservation and control of spins in time, an ability hampered by the interaction with the environment, namely with other spins, conduction electrons, molecular vibrations and electromagnetic fields. Thus, the design of a novel magnetic molecule with tailored properties is a formidable task, which does not only concern its electronic structures but also calls for a deep understanding of the interaction among all the degrees of freedom at play. This Review describes how state-of-the-art ab initio computational methods, combined with data-driven approaches to materials modelling, can be integrated into a fully multiscale strategy capable of defining design rules for magnetic molecules.
7
Ketkaew R, Creazzo F, Luber S. Machine Learning-Assisted Discovery of Hidden States in Expanded Free Energy Space. J Phys Chem Lett 2022; 13:1797-1805. [PMID: 35171614 DOI: 10.1021/acs.jpclett.1c04004] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/14/2023]
Abstract
Collective variables (CVs) are crucial parameters in enhanced sampling calculations and strongly impact the quality of the obtained free energy surface. However, many existing CVs are unique to and dependent on the system they are constructed with, making the developed CV non-transferable to other systems. Herein, we develop a non-instructor-led deep autoencoder neural network (DAENN) for discovering general-purpose CVs. The DAENN is used to train a model by learning molecular representations upon unbiased trajectories that contain only the reactant conformers. The prior knowledge of nonconstraint reactants coupled with the here-introduced topology variable and loss-like penalty function are only required to make the biasing method able to expand its configurational (phase) space to unexplored energy basins. Our developed autoencoder is efficient and relatively inexpensive to use in terms of a priori knowledge, enabling one to automatically search for hidden CVs of the reaction of interest.
Affiliation(s)
- Rangsiman Ketkaew
- Department of Chemistry, University of Zurich, Winterthurerstrasse 190, CH-8057 Zürich, Switzerland
- Fabrizio Creazzo
- Department of Chemistry, University of Zurich, Winterthurerstrasse 190, CH-8057 Zürich, Switzerland
- Sandra Luber
- Department of Chemistry, University of Zurich, Winterthurerstrasse 190, CH-8057 Zürich, Switzerland
8
Han R, Ketkaew R, Luber S. A Concise Review on Recent Developments of Machine Learning for the Prediction of Vibrational Spectra. J Phys Chem A 2022; 126:801-812. [PMID: 35133168 DOI: 10.1021/acs.jpca.1c10417] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 11/30/2022]
Abstract
Machine learning has become more and more popular in computational chemistry, as well as in the important field of spectroscopy. In this concise review, we walk the reader through a short summary of machine learning algorithms and a comprehensive discussion on the connection between machine learning methods and vibrational spectroscopy, particularly for the case of infrared and Raman spectroscopy. We also briefly discuss state-of-the-art molecular representations which serve as meaningful inputs for machine learning to predict vibrational spectra. In addition, this review provides an overview of the transferability and best practices of machine learning in the prediction of vibrational spectra as well as possible future research directions.
Affiliation(s)
- Ruocheng Han
- Department of Chemistry, University of Zurich, Winterthurerstrasse 190, CH-8057 Zürich, Switzerland
- Rangsiman Ketkaew
- Department of Chemistry, University of Zurich, Winterthurerstrasse 190, CH-8057 Zürich, Switzerland
- Sandra Luber
- Department of Chemistry, University of Zurich, Winterthurerstrasse 190, CH-8057 Zürich, Switzerland
9
Han R, Luber S. Fast Estimation of Møller-Plesset Correlation Energies Based on Atomic Contributions. J Phys Chem Lett 2021; 12:5324-5331. [PMID: 34061529 DOI: 10.1021/acs.jpclett.1c00900] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/12/2023]
Abstract
Dynamic correlation plays an important role in the accurate calculation of chemical compounds such as the description of equilibrium structures in chemical systems. A model for the fast estimation of dynamic correlation energy is introduced in this work. This model is based on the idea of decomposition of the contribution of dynamic correlation energy calculated by nth order Møller-Plesset perturbation (MPn) theory with respect to atomic regions. Multiple levels of theory, including MP2, MP2.5, and MP4, are used as the reference, and the corresponding correlation energy densities are calculated. The proposed model is concise, fast, and promising for practical use, such as the prediction of reaction energies. It can also work as a baseline model or pretrained model for follow-up studies of machine learning.
Affiliation(s)
- R Han
- Department of Chemistry A, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- S Luber
- Department of Chemistry A, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland