1
Li S, Xie BB, Yin BW, Liu L, Shen L, Fang WH. Construction of Highly Accurate Machine Learning Potential Energy Surfaces for Excited-State Dynamics Simulations Based on Low-Level Data Sets. J Phys Chem A 2024; 128:5516-5524. [PMID: 38954640 DOI: 10.1021/acs.jpca.4c02028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2024]
Abstract
Machine learning can effectively predict the potential energies of molecules when high-quality data sets are available. Its application to the construction of ground- and excited-state potential energy surfaces is attractive for accelerating nonadiabatic molecular dynamics simulations of photochemical reactions. Because of the huge computational cost of excited-state electronic structure calculations, the construction of a high-quality data set becomes a bottleneck. In the present work, we first built two data sets. One was obtained from surface hopping dynamics simulations at the semiempirical OM2/MRCI level. The other was extracted from previously reported dynamics trajectories at the CASSCF level. The ground- and excited-state potential energy surfaces of ethylene-bridged azobenzene at the CASSCF computational level were constructed based on the former low-level data set. Although non-neural network machine learning methods can achieve good or modest performance during the training process, only neural network models provide reliable predictions on the latter external test data set. The BPNN and SchNet combined with the Δ-ML scheme and the force term in the loss functions are recommended for dynamics simulations. Then, we performed excited-state dynamics simulations of the photoisomerization of ethylene-bridged azobenzene on machine learning potential energy surfaces. Compared with the lifetimes of the first excited state (S1) estimated at different computational levels, our results on the E isomer are in good agreement with the high-level estimation. However, the overestimated S1 lifetime of the Z isomer is not improved. This suggests that smaller errors during the training process do not necessarily translate to more accurate predictions of high-level potential energies or better performance in nonadiabatic dynamics simulations, at least in the present case.
Affiliation(s)
- Shuai Li
  - Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, P. R. China
- Bin-Bin Xie
  - Hangzhou Institute of Advanced Studies, Zhejiang Normal University, Hangzhou 311231, Zhejiang, P. R. China
- Bo-Wen Yin
  - Hangzhou Institute of Advanced Studies, Zhejiang Normal University, Hangzhou 311231, Zhejiang, P. R. China
- Lihong Liu
  - Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, P. R. China
- Lin Shen
  - Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, P. R. China
  - Yantai-Jingshi Institute of Material Genome Engineering, Yantai 265505, Shandong, P. R. China
- Wei-Hai Fang
  - Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, P. R. China
  - Shandong Laboratory of Yantai Advanced Materials and Green Manufacturing, Yantai 264006, Shandong, P. R. China
2
Shakiba M, Akimov AV. Machine-Learned Kohn-Sham Hamiltonian Mapping for Nonadiabatic Molecular Dynamics. J Chem Theory Comput 2024; 20:2992-3007. [PMID: 38581699 DOI: 10.1021/acs.jctc.4c00008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2024]
Abstract
In this work, we report a simple, efficient, and scalable machine-learning (ML) approach for mapping non-self-consistent Kohn-Sham Hamiltonians constructed with one kind of density functional to the nearly self-consistent Hamiltonians constructed with another kind of density functional. This approach is designed as a fast surrogate Hamiltonian calculator for use in long nonadiabatic dynamics simulations of large atomistic systems. In this approach, the input and output features are Hamiltonian matrices computed from different levels of theory. We demonstrate that the developed ML-based Hamiltonian mapping method (1) speeds up the calculations by several orders of magnitude, (2) is conceptually simpler than alternative ML approaches, (3) is applicable to different systems and sizes and can be used for mapping Hamiltonians constructed with arbitrary density functionals, (4) requires only modest training data, learns fast, and generates molecular orbitals and their energies with an accuracy nearly matching that of conventional calculations, and (5) when applied to nonadiabatic dynamics simulations of excitation energy relaxation in large systems, yields the corresponding time scales within the margin of error of the conventional calculations. Using this approach, we explore the excitation energy relaxation in C60 fullerene and Si75H64 quantum dot structures and derive qualitative and quantitative insights into dynamics in these systems.
Affiliation(s)
- Mohammad Shakiba
  - Department of Chemistry, University at Buffalo, The State University of New York, Buffalo, New York 14260, United States
- Alexey V Akimov
  - Department of Chemistry, University at Buffalo, The State University of New York, Buffalo, New York 14260, United States
3
Dral PO. AI in computational chemistry through the lens of a decade-long journey. Chem Commun (Camb) 2024; 60:3240-3258. [PMID: 38444290 DOI: 10.1039/d4cc00010b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2024]
Abstract
This article gives a perspective on the progress of AI tools in computational chemistry through the lens of the author's decade-long contributions put in the wider context of the trends in this rapidly expanding field. This progress over the last decade is tremendous: while a decade ago we had a glimpse of what was to come through many proof-of-concept studies, now we witness the emergence of many AI-based computational chemistry tools that are mature enough to make faster and more accurate simulations increasingly routine. Such simulations in turn allow us to validate and even revise experimental results, deepen our understanding of the physicochemical processes in nature, and design better materials, devices, and drugs. The rapid introduction of powerful AI tools gives rise to unique challenges and opportunities that are discussed in this article too.
Affiliation(s)
- Pavlo O Dral
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
4
Li X, Lubbers N, Tretiak S, Barros K, Zhang Y. Machine Learning Framework for Modeling Exciton Polaritons in Molecular Materials. J Chem Theory Comput 2024; 20:891-901. [PMID: 38168674 DOI: 10.1021/acs.jctc.3c01068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
A light-matter hybrid quasiparticle, called a polariton, is formed when molecules are strongly coupled to an optical cavity. Recent experiments have shown that polariton chemistry can manipulate chemical reactions. Polariton chemistry is a collective phenomenon, and its effects increase with the number of molecules in a cavity. However, simulating an ensemble of molecules in the excited state coupled to a cavity mode is theoretically and computationally challenging. Recent advances in machine learning (ML) techniques have shown promising capabilities in modeling ground-state chemical systems. This work presents a general protocol to predict excited-state properties, such as energies, transition dipoles, and nonadiabatic coupling vectors with the hierarchically interacting particle neural network. ML predictions are then applied to compute the potential energy surfaces and electronic spectra of a prototype azomethane molecule in the collective coupling scenario. These computational tools provide a much-needed framework to model and understand many molecules' emerging excited-state polariton chemistry.
Affiliation(s)
- Xinyang Li
  - Physics and Chemistry of Materials, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Nicholas Lubbers
  - Information Sciences, Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Sergei Tretiak
  - Physics and Chemistry of Materials, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
  - Center for Integrated Nanotechnologies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Kipton Barros
  - Physics and Chemistry of Materials, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Yu Zhang
  - Physics and Chemistry of Materials, Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
5
Barrett R, Westermayr J. Reinforcement Learning for Traversing Chemical Structure Space: Optimizing Transition States and Minimum Energy Paths of Molecules. J Phys Chem Lett 2024; 15:349-356. [PMID: 38170921 PMCID: PMC10788951 DOI: 10.1021/acs.jpclett.3c02771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 12/05/2023] [Accepted: 12/08/2023] [Indexed: 01/05/2024]
Abstract
In recent years, deep learning has made remarkable strides, surpassing human capabilities in tasks, such as strategy games, and it has found applications in complex domains, including protein folding. In the realm of quantum chemistry, machine learning methods have primarily served as predictive tools or design aids using generative models, while reinforcement learning remains in its early stages of exploration. This work introduces an actor-critic reinforcement learning framework suitable for diverse optimization tasks, such as searching for molecular structures with specific properties within conformational spaces. As an example, we show an implementation of this scheme for calculating minimum energy pathways of a Claisen rearrangement reaction and a number of SN2 reactions. The results show that the algorithm is able to accurately predict minimum energy pathways and, thus, transition states, providing the first steps in using actor-critic methods to study chemical reactions.
Affiliation(s)
- Rhyan Barrett
  - Institute of Chemistry, Faculty of Chemistry and Mineralogy, University of Leipzig, Johannisallee 29, 04103 Leipzig, Germany
- Julia Westermayr
  - Institute of Chemistry, Faculty of Chemistry and Mineralogy, University of Leipzig, Johannisallee 29, 04103 Leipzig, Germany
  - Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI), Dresden/Leipzig, Humboldtstraße 25, 04105 Leipzig, Germany
6
Chen Z, Yam VWW. Encoding Hole-Particle Information in the Multi-Channel MolOrbImage for Machine-Learned Excited-State Energies of Large Photofunctional Materials. J Am Chem Soc 2023; 145:24098-24107. [PMID: 37874942 DOI: 10.1021/jacs.3c07766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2023]
Abstract
We present a novel class of one-electron multi-channel molecular orbital images (MolOrbImages) designed for the prediction of excited-state energetics in conjunction with the state-of-the-art VGG-type machine-learning architecture. By representing hole and particle states in the excitation process as channels of MolOrbImages, the revised VGG model achieves excellent prediction accuracy for both low-lying singlet and triplet states, with mean absolute errors (MAEs) of <0.08 and <0.1 eV for QM9 molecules and large photofunctional materials with up to 560 atoms, respectively. Remarkably, the model demonstrates exceptional performance (MAE < 1 kcal/mol) for the T1 state of QM9 molecules, making it a non-system-specific model that approaches chemical accuracy. The general rules attained, for instance, the improved performance with well-defined MO energies and the reduced overfitting concern via the inclusion of physically insightful hole-particle information, provide invaluable guidelines for the further design of orbital-based descriptors targeting molecular excited states.
Affiliation(s)
- Ziyong Chen
  - Institute of Molecular Functional Materials and Department of Chemistry, The University of Hong Kong, Pokfulam Road, Hong Kong, China
- Vivian Wing-Wah Yam
  - Institute of Molecular Functional Materials and Department of Chemistry, The University of Hong Kong, Pokfulam Road, Hong Kong, China
  - Hong Kong Quantum AI Lab Ltd., Hong Kong Science Park, Hong Kong, China
7
Hagg A, Kirschner KN. Open-Source Machine Learning in Computational Chemistry. J Chem Inf Model 2023; 63:4505-4532. [PMID: 37466636 PMCID: PMC10430767 DOI: 10.1021/acs.jcim.3c00643] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/04/2023] [Indexed: 07/20/2023]
Abstract
The field of computational chemistry has seen a significant increase in the integration of machine learning concepts and algorithms. In this Perspective, we surveyed 179 open-source software projects, with corresponding peer-reviewed papers published within the last 5 years, to better understand the topics within the field being investigated by machine learning approaches. For each project, we provide a short description, the link to the code, the accompanying license type, and whether the training data and resulting models are made publicly available. Based on those deposited in GitHub repositories, the most popular employed Python libraries are identified. We hope that this survey will serve as a resource to learn about machine learning or specific architectures thereof by identifying accessible codes with accompanying papers on a topic basis. To this end, we also include computational chemistry open-source software for generating training data and fundamental Python libraries for machine learning. Based on our observations and considering the three pillars of collaborative machine learning work, open data, open source (code), and open models, we provide some suggestions to the community.
Affiliation(s)
- Alexander Hagg
  - Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
  - Department of Electrical Engineering, Mechanical Engineering and Technical Journalism, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
- Karl N. Kirschner
  - Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
  - Department of Computer Science, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
8
Chen MS, Mao Y, Snider A, Gupta P, Montoya-Castillo A, Zuehlsdorff TJ, Isborn CM, Markland TE. Elucidating the Role of Hydrogen Bonding in the Optical Spectroscopy of the Solvated Green Fluorescent Protein Chromophore: Using Machine Learning to Establish the Importance of High-Level Electronic Structure. J Phys Chem Lett 2023; 14:6610-6619. [PMID: 37459252 DOI: 10.1021/acs.jpclett.3c01444] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 07/28/2023]
Abstract
Hydrogen bonding interactions with chromophores in chemical and biological environments play a key role in determining their electronic absorption and relaxation processes, which are manifested in their linear and multidimensional optical spectra. For chromophores in the condensed phase, the large number of atoms needed to simulate the environment has traditionally prohibited the use of high-level excited-state electronic structure methods. By leveraging transfer learning, we show how to construct machine-learned models to accurately predict the high-level excitation energies of a chromophore in solution from only 400 high-level calculations. We show that when the electronic excitations of the green fluorescent protein chromophore in water are treated using EOM-CCSD embedded in a DFT description of the solvent, the optical spectrum is correctly captured, and that this improvement arises from correctly treating the coupling of the electronic transition to electric fields, which leads to a larger response upon hydrogen bonding between the chromophore and water.
Affiliation(s)
- Michael S Chen
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Yuezhi Mao
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Andrew Snider
  - Chemistry and Biochemistry, University of California Merced, Merced, California 95343, United States
- Prachi Gupta
  - Chemistry and Biochemistry, University of California Merced, Merced, California 95343, United States
- Andrés Montoya-Castillo
  - Department of Chemistry, University of Colorado, Boulder, Boulder, Colorado 80309, United States
- Tim J Zuehlsdorff
  - Department of Chemistry, Oregon State University, Corvallis, Oregon 97331, United States
- Christine M Isborn
  - Chemistry and Biochemistry, University of California Merced, Merced, California 95343, United States
- Thomas E Markland
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
9
McNaughton AD, Joshi RP, Knutson CR, Fnu A, Luebke KJ, Malerich JP, Madrid PB, Kumar N. Machine Learning Models for Predicting Molecular UV-Vis Spectra with Quantum Mechanical Properties. J Chem Inf Model 2023; 63:1462-1471. [PMID: 36847578 DOI: 10.1021/acs.jcim.2c01662] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 03/01/2023]
Abstract
Accurate understanding of ultraviolet-visible (UV-vis) spectra is critical for the high-throughput synthesis of compounds for drug discovery. Experimentally determining UV-vis spectra can become expensive when dealing with a large quantity of novel compounds. This provides us an opportunity to drive computational advances in molecular property predictions using quantum mechanics and machine learning methods. In this work, we use both quantum mechanically (QM) predicted and experimentally measured UV-vis spectra as input to devise four different machine learning architectures, UVvis-SchNet, UVvis-DTNN, UVvis-Transformer, and UVvis-MPNN, and assess the performance of each method. We find that the UVvis-MPNN model outperforms the other models when using optimized 3D coordinates and QM predicted spectra as input features. This model has the highest performance for predicting UV-vis spectra with a training RMSE of 0.06 and validation RMSE of 0.08. Most importantly, our model can be used for the challenging task of predicting differences in the UV-vis spectral signatures of regioisomers.
Affiliation(s)
- Andrew D McNaughton
  - Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Rajendra P Joshi
  - Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Carter R Knutson
  - Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Anubhav Fnu
  - Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Kevin J Luebke
  - SRI International, 333 Ravenswood Avenue, Menlo Park, California 94025, United States
- Jeremiah P Malerich
  - SRI International, 333 Ravenswood Avenue, Menlo Park, California 94025, United States
- Peter B Madrid
  - SRI International, 333 Ravenswood Avenue, Menlo Park, California 94025, United States
- Neeraj Kumar
  - Pacific Northwest National Laboratory, Richland, Washington 99354, United States
10
de Armas-Morejón CM, Montero-Cabrera LA, Rubio A, Jornet-Somoza J. Electronic Descriptors for Supervised Spectroscopic Predictions. J Chem Theory Comput 2023; 19:1818-1826. [PMID: 36877528 DOI: 10.1021/acs.jctc.2c01039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2023]
Abstract
Spectroscopic properties of molecules hold great importance for the description of the molecular response under the effect of UV/vis electromagnetic radiation. Computationally expensive ab initio (e.g., MultiConfigurational SCF, Coupled Cluster) or TDDFT methods are commonly used by the quantum chemistry community to compute these properties. In this work, we propose a (supervised) Machine Learning approach to model the absorption spectra of organic molecules. Several supervised ML methods have been tested, such as Kernel Ridge Regression (KRR), Multilayer Perceptron Neural Networks (MLP), and Convolutional Neural Networks. [Ramakrishnan et al. J. Chem. Phys. 2015, 143, 084111. Ghosh et al. Adv. Sci. 2019, 6, 1801367.] The use of only geometrical-atomic number descriptors (e.g., Coulomb Matrix) proved to be insufficient for accurate training. [Ramakrishnan et al. J. Chem. Phys. 2015, 143, 084111.] Inspired by TDDFT, we propose to use a set of electronic descriptors obtained from low-cost DFT methods: orbital energy differences (Δϵia = ϵa - ϵi), the transition dipole moment between occupied and unoccupied Kohn-Sham orbitals (⟨ϕi|r|ϕa⟩), and, when relevant, the charge-transfer character of monoexcitations (Ria). We demonstrate that with these electronic descriptors and the use of Neural Networks we can predict not only a density of excited states but also get a very good estimation of the absorption spectrum and charge-transfer character of the electronic excited states, reaching results close to chemical accuracy (∼2 kcal/mol or ∼0.1 eV).
Affiliation(s)
- Carlos Manuel de Armas-Morejón
  - Nano-Bio Spectroscopy Group, Departamento de Polímeros y Materiales Avanzados: Física, Química y Tecnología, Universidad del País Vasco UPV/EHU, 20018 San Sebastián, Spain
  - Laboratorio de Química Computacional y Teórica, Facultad de Química, Universidad de La Habana, 10400 La Habana, Cuba
- Luis A Montero-Cabrera
  - Laboratorio de Química Computacional y Teórica, Facultad de Química, Universidad de La Habana, 10400 La Habana, Cuba
  - Donostia International Physics Center, Manuel Lardizabal Ibilbidea, 4, 20018 Donostia, Spain
- Angel Rubio
  - Nano-Bio Spectroscopy Group, Departamento de Polímeros y Materiales Avanzados: Física, Química y Tecnología, Universidad del País Vasco UPV/EHU, 20018 San Sebastián, Spain
  - Theory Department, Max Planck Institute for the Structure and Dynamics of Matter and Center for Free-Electron Laser Science, Luruper Chaussee 149, 22761 Hamburg, Germany
- Joaquim Jornet-Somoza
  - Nano-Bio Spectroscopy Group, Departamento de Polímeros y Materiales Avanzados: Física, Química y Tecnología, Universidad del País Vasco UPV/EHU, 20018 San Sebastián, Spain
  - Theory Department, Max Planck Institute for the Structure and Dynamics of Matter and Center for Free-Electron Laser Science, Luruper Chaussee 149, 22761 Hamburg, Germany
11
Chen Z, Yam VWW. Machine-Learned Electronically Excited States with the MolOrbImage Generated from the Molecular Ground State. J Phys Chem Lett 2023; 14:1955-1961. [PMID: 36787423 DOI: 10.1021/acs.jpclett.3c00014] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/18/2023]
Abstract
We present a general machine learning framework for probing the electronic state properties using the novel quantum descriptor MolOrbImage. Each pixel of the MolOrbImage records the quantum information generated by the integration of the physical operator with a pair of bra and ket molecular orbital (MO) states. Inspired by the success of deep convolutional neural networks (NNs) in computer vision, we have implemented the convolutional-layer-dominated MO-NN model. Using the orbital energy and electron repulsion integral MolOrbImages, the MO-NN model achieves promising prediction accuracies against the ADC(2)/cc-pVTZ reference for transition energies to both low-lying singlet [mean absolute error (MAE) < 0.16 eV] and triplet (MAE < 0.14 eV) states. An apparent improvement in the prediction of oscillator strength, which has been shown to be challenging previously, has been demonstrated in this study. Moreover, the transferability test indicates the remarkable extrapolation capacity of the MO-NN model to describe the out of data set systems.
Affiliation(s)
- Ziyong Chen
  - Institute of Molecular Functional Materials and Department of Chemistry, The University of Hong Kong, Pokfulam Road, Hong Kong 999077, China
- Vivian Wing-Wah Yam
  - Institute of Molecular Functional Materials and Department of Chemistry, The University of Hong Kong, Pokfulam Road, Hong Kong 999077, China
  - Hong Kong Quantum AI Lab Ltd., Hong Kong Science Park, Hong Kong 999077, China
12
Schienbein P. Spectroscopy from Machine Learning by Accurately Representing the Atomic Polar Tensor. J Chem Theory Comput 2023; 19:705-712. [PMID: 36695707 PMCID: PMC9933433 DOI: 10.1021/acs.jctc.2c00788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
Abstract
Vibrational spectroscopy is a key technique to elucidate microscopic structure and dynamics. Without the aid of theoretical approaches, it is, however, often difficult to understand such spectra at a microscopic level. Ab initio molecular dynamics has repeatedly proved to be suitable for this purpose; however, the computational cost can be daunting. Here, the E(3)-equivariant neural network e3nn is used to fit the atomic polar tensor of liquid water a posteriori on top of existing molecular dynamics simulations. Notably, the introduced methodology is general and thus transferable to any other system as well. The target property is most fundamental and gives access to the IR spectrum, and more importantly, it is a highly powerful tool to directly assign IR spectral features to nuclear motion, a connection which has been pursued in the past but only using severe approximations due to the prohibitive computational cost. The herein introduced methodology overcomes this bottleneck. To benchmark the machine learning model, the IR spectrum of liquid water is calculated, indeed showing excellent agreement with the explicit reference calculation. In conclusion, the presented methodology gives a new route to calculate accurate IR spectra from molecular dynamics simulations and will facilitate the understanding of such spectra on a microscopic level.
13
Boeije Y, Olivucci M. From a one-mode to a multi-mode understanding of conical intersection mediated ultrafast organic photochemical reactions. Chem Soc Rev 2023; 52:2643-2687. [PMID: 36970950 DOI: 10.1039/d2cs00719c] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 03/29/2023]
Abstract
This review discusses how ultrafast organic photochemical reactions are controlled by conical intersections, highlighting that decay to the ground-state at multiple points of the intersection space results in their multi-mode character.
Affiliation(s)
- Yorrick Boeije
  - Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands
- Massimo Olivucci
  - Chemistry Department, University of Siena, Via Aldo Moro n. 2, 53100 Siena, Italy
  - Chemistry Department, Bowling Green State University, Overman Hall, Bowling Green, Ohio 43403, USA
14
Reiser P, Neubert M, Eberhard A, Torresi L, Zhou C, Shao C, Metni H, van Hoesel C, Schopmans H, Sommer T, Friederich P. Graph neural networks for materials science and chemistry. Communications Materials 2022; 3:93. [PMID: 36468086 PMCID: PMC9702700 DOI: 10.1038/s43246-022-00315-6] [Citation(s) in RCA: 65] [Impact Index Per Article: 32.5] [Received: 05/10/2022] [Accepted: 11/07/2022] [Indexed: 05/14/2023]
Abstract
Machine learning plays an increasingly important role in many areas of chemistry and materials science, being used to predict materials properties, accelerate simulations, design new structures, and predict synthesis routes of new materials. Graph neural networks (GNNs) are one of the fastest growing classes of machine learning models. They are of particular relevance for chemistry and materials science, as they directly work on a graph or structural representation of molecules and materials and therefore have full access to all relevant information required to characterize materials. In this Review, we provide an overview of the basic principles of GNNs, widely used datasets, and state-of-the-art architectures, followed by a discussion of a wide range of recent applications of GNNs in chemistry and materials science, and concluding with a road-map for the further development and application of GNNs.
Affiliation(s)
- Patrick Reiser
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
  - Institute of Nanotechnology, Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
- Marlen Neubert
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
- André Eberhard
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
- Luca Torresi
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
- Chen Zhou
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
- Chen Shao
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
  - Present Address: Institute for Applied Informatics and Formal Description Systems, Karlsruhe Institute of Technology, Kaiserstr. 89, 76133 Karlsruhe, Germany
- Houssam Metni
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
  - ECPM, Université de Strasbourg, 25 Rue Becquerel, 67087 Strasbourg, France
- Clint van Hoesel
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
  - Department of Applied Physics, Eindhoven University of Technology, Groene Loper 19, 5612 AP Eindhoven, The Netherlands
- Henrik Schopmans
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
  - Institute of Nanotechnology, Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
- Timo Sommer
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
  - Institute for Theory of Condensed Matter, Karlsruhe Institute of Technology, Wolfgang-Gaede-Str. 1, 76131 Karlsruhe, Germany
  - Present Address: School of Chemistry, Trinity College Dublin, College Green, Dublin 2, Ireland
- Pascal Friederich
  - Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am Fasanengarten 5, 76131 Karlsruhe, Germany
  - Institute of Nanotechnology, Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
15
|
Zhang Y, Lin Q, Jiang B. Atomistic neural network representations for chemical dynamics simulations of molecular, condensed phase, and interfacial systems: Efficiency, representability, and generalization. WIREs Comput Mol Sci 2022. [DOI: 10.1002/wcms.1645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Affiliation(s)
- Yaolong Zhang
- Department of Chemical Physics, School of Chemistry and Materials Science, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, China
- Qidong Lin
- Department of Chemical Physics, School of Chemistry and Materials Science, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, China
- Bin Jiang
- Department of Chemical Physics, School of Chemistry and Materials Science, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, China

16
Shmilovich K, Willmott D, Batalov I, Kornbluth M, Mailoa J, Kolter JZ. Orbital Mixer: Using Atomic Orbital Features for Basis-Dependent Prediction of Molecular Wavefunctions. J Chem Theory Comput 2022; 18:6021-6030. [PMID: 36122312 DOI: 10.1021/acs.jctc.2c00555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Leveraging ab initio data at scale has enabled the development of machine learning models capable of extremely accurate and fast molecular property prediction. A central paradigm of many previous studies focuses on generating predictions for only a fixed set of properties. Recent lines of research instead aim to explicitly learn the electronic structure via molecular wavefunctions, from which other quantum chemical properties can be directly derived. While previous methods generate predictions as a function of only the atomic configuration, in this work we present an alternate approach that directly purposes basis-dependent information to predict molecular electronic structure. Our model, Orbital Mixer, is composed entirely of multi-layer perceptrons (MLPs) using MLP-Mixer layers within a simple, intuitive, and scalable architecture that achieves competitive Hamiltonian and molecular orbital energy and coefficient prediction accuracies compared to the state-of-the-art.
Affiliation(s)
- Kirill Shmilovich
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Devin Willmott
- Bosch Center for Artificial Intelligence, Pittsburgh, Pennsylvania 15222, United States
- Ivan Batalov
- Bosch Center for Artificial Intelligence, Pittsburgh, Pennsylvania 15222, United States
- Mordechai Kornbluth
- Bosch Research and Technology Center, Cambridge, Massachusetts 02139, United States
- Jonathan Mailoa
- Tencent Quantum Laboratory, Shenzhen, Guangdong 518057, China
- J Zico Kolter
- Bosch Center for Artificial Intelligence, Pittsburgh, Pennsylvania 15222, United States; Carnegie Mellon University, Pittsburgh, Pennsylvania 15222, United States

17
Beckmann R, Brieuc F, Schran C, Marx D. Infrared Spectra at Coupled Cluster Accuracy from Neural Network Representations. J Chem Theory Comput 2022; 18:5492-5501. [PMID: 35998360 DOI: 10.1021/acs.jctc.2c00511] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 12/22/2022]
Abstract
Infrared spectroscopy is key to elucidating molecular structures, monitoring reactions, and observing conformational changes, while providing information on both structural and dynamical properties. This makes the accurate prediction of infrared spectra based on first-principle theories a highly desirable pursuit. Molecular dynamics simulations have proven to be a particularly powerful approach for this task, albeit requiring the computation of energies, forces and dipole moments for a large number of molecular configurations as a function of time. This explains why highly accurate first-principles methods, such as coupled cluster theory, have so far been inapplicable for the prediction of fully anharmonic vibrational spectra of large systems at finite temperatures. Here, we push cutting-edge machine learning techniques forward by using neural network representations of energies, forces, and in particular dipoles to predict such infrared spectra fully at "gold standard" coupled cluster accuracy as demonstrated for protonated water clusters as large as the protonated water hexamer, in its extended Zundel configuration. Furthermore, we show that this methodology can be used beyond the scope of the data considered during the development of the neural network models, allowing for the computation of finite-temperature infrared spectra of large systems inaccessible to explicit coupled cluster calculations. This substantially expands the hitherto existing limits of accuracy, speed, and system size for theoretical spectroscopy and opens up a multitude of avenues for the prediction of vibrational spectra and the understanding of complex intra- and intermolecular couplings.
Affiliation(s)
- Richard Beckmann
- Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Fabien Brieuc
- Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Christoph Schran
- Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Dominik Marx
- Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany

18
Westermayr J, Gastegger M, Vörös D, Panzenboeck L, Joerg F, González L, Marquetand P. Deep learning study of tyrosine reveals that roaming can lead to photodamage. Nat Chem 2022; 14:914-919. [PMID: 35655007 DOI: 10.1038/s41557-022-00950-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/13/2021] [Accepted: 04/13/2022] [Indexed: 01/12/2023]
Abstract
Amino acids are among the building blocks of life, forming peptides and proteins, and have been carefully 'selected' to prevent harmful reactions caused by light. To prevent photodamage, molecules relax from electronic excited states to the ground state faster than the harmful reactions can occur; however, such photochemistry is not fully understood, in part because theoretical simulations of such systems are extremely expensive, with only smaller chromophores accessible. Here, we study the excited-state dynamics of tyrosine using a method based on deep neural networks that leverages the physics underlying quantum chemical data and combines different levels of theory. We reveal unconventional and dynamically controlled 'roaming' dynamics in excited tyrosine that are beyond chemical intuition and compete with other ultrafast deactivation mechanisms. Our findings suggest that the roaming atoms are radicals that can lead to photodamage, offering a new perspective on the photostability and photodamage of biological systems.
Affiliation(s)
- Julia Westermayr
- Faculty of Chemistry, Institute of Theoretical Chemistry, University of Vienna, Vienna, Austria; Department of Chemistry, University of Warwick, Coventry, UK
- Michael Gastegger
- Machine Learning Group, Technical University of Berlin, Berlin, Germany
- Dóra Vörös
- Faculty of Chemistry, Institute of Theoretical Chemistry, University of Vienna, Vienna, Austria
- Lisa Panzenboeck
- Faculty of Chemistry, Institute of Theoretical Chemistry, University of Vienna, Vienna, Austria; Faculty of Chemistry, Department of Analytical Chemistry, University of Vienna, Vienna, Austria
- Florian Joerg
- Faculty of Chemistry, Institute of Theoretical Chemistry, University of Vienna, Vienna, Austria; Faculty of Chemistry, Institute of Computational Biological Chemistry, University of Vienna, Vienna, Austria
- Leticia González
- Faculty of Chemistry, Institute of Theoretical Chemistry, University of Vienna, Vienna, Austria; Vienna Research Platform on Accelerating Photoreaction Discovery, University of Vienna, Vienna, Austria
- Philipp Marquetand
- Faculty of Chemistry, Institute of Theoretical Chemistry, University of Vienna, Vienna, Austria; Vienna Research Platform on Accelerating Photoreaction Discovery, University of Vienna, Vienna, Austria; Research Network Data Science @ Uni Vienna, University of Vienna, Vienna, Austria

19
Chen Z, Bononi FC, Sievers CA, Kong WY, Donadio D. UV-Visible Absorption Spectra of Solvated Molecules by Quantum Chemical Machine Learning. J Chem Theory Comput 2022; 18:4891-4902. [PMID: 35913220 DOI: 10.1021/acs.jctc.1c01181] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 11/30/2022]
Abstract
Predicting UV-visible absorption spectra is essential to understand photochemical processes and design energy materials. Quantum chemical methods can deliver accurate calculations of UV-visible absorption spectra, but they are computationally expensive, especially for large systems or when one computes line shapes from thermal averages. Here, we present an approach to predict UV-visible absorption spectra of solvated aromatic molecules by quantum chemistry (QC) and machine learning (ML). We show that an ML model, trained on the high-level QC calculation of the excitation energy of a set of aromatic molecules, can accurately predict the line shape of the lowest-energy UV-visible absorption band of several related molecules with less than 0.1 eV deviation with respect to reference experimental spectra. Applying linear decomposition analysis on the excitation energies, we unveil that our ML models probe vertical excitations of these aromatic molecules primarily by learning the atomic environment of their phenyl rings, which aligns with the physical origin of the π → π* electronic transition. Our study provides an effective workflow that combines ML with quantum chemical methods to accelerate the calculations of UV-visible absorption spectra for various molecular systems.
Affiliation(s)
- Zekun Chen
- Department of Chemistry, University of California Davis 95616, California, United States
- Fernanda C Bononi
- Department of Chemistry, University of California Davis 95616, California, United States
- Charles A Sievers
- Department of Chemistry, University of California Davis 95616, California, United States
- Wang-Yeuk Kong
- Department of Chemistry, University of California Davis 95616, California, United States
- Davide Donadio
- Department of Chemistry, University of California Davis 95616, California, United States

20
Golze D, Hirvensalo M, Hernández-León P, Aarva A, Etula J, Susi T, Rinke P, Laurila T, Caro MA. Accurate Computational Prediction of Core-Electron Binding Energies in Carbon-Based Materials: A Machine-Learning Model Combining Density-Functional Theory and GW. Chem Mater 2022; 34:6240-6254. [PMID: 35910537 PMCID: PMC9330771 DOI: 10.1021/acs.chemmater.1c04279] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 12/13/2021] [Revised: 06/30/2022] [Indexed: 06/15/2023]
Abstract
We present a quantitatively accurate machine-learning (ML) model for the computational prediction of core-electron binding energies, from which X-ray photoelectron spectroscopy (XPS) spectra can be readily obtained. Our model combines density functional theory (DFT) with GW and uses kernel ridge regression for the ML predictions. We apply the new approach to disordered materials and small molecules containing carbon, hydrogen, and oxygen and obtain qualitative and quantitative agreement with experiment, resolving spectral features within 0.1 eV of reference experimental spectra. The method only requires the user to provide a structural model for the material under study to obtain an XPS prediction within seconds. Our new tool is freely available online through the XPS Prediction Server.
Affiliation(s)
- Dorothea Golze
- Faculty of Chemistry and Food Chemistry, Technische Universität Dresden, 01062 Dresden, Germany
- Department of Applied Physics, Aalto University, 02150 Espoo, Finland
- Markus Hirvensalo
- Department of Applied Physics, Aalto University, 02150 Espoo, Finland
- Anja Aarva
- Department of Electrical Engineering and Automation, Aalto University, 02150 Espoo, Finland
- Jarkko Etula
- Department of Chemistry and Materials Science, Aalto University, 02150 Espoo, Finland
- Toma Susi
- University of Vienna, Faculty of Physics, Boltzmanngasse 5, 1090 Vienna, Austria
- Patrick Rinke
- Department of Applied Physics, Aalto University, 02150 Espoo, Finland
- Tomi Laurila
- Department of Electrical Engineering and Automation, Aalto University, 02150 Espoo, Finland
- Department of Chemistry and Materials Science, Aalto University, 02150 Espoo, Finland
- Miguel A. Caro
- Department of Electrical Engineering and Automation, Aalto University, 02150 Espoo, Finland

21
Li J, Lopez SA. A Look Inside the Black Box of Machine Learning Photodynamics Simulations. Acc Chem Res 2022; 55:1972-1984. [PMID: 35796602 DOI: 10.1021/acs.accounts.2c00288] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 12/21/2022]
Abstract
Conspectus: Photochemical reactions are of great importance in chemistry, biology, and materials science because they take advantage of a renewable energy source, mild reaction conditions, and high atom economy. Light absorption can excite molecules to a higher energy electronic state of the same spin multiplicity. The following nonadiabatic processes induce molecular transformations that afford exotic molecular architectures and high-energy isomers that are inaccessible by thermal means. Computational simulations now complement time-resolved instrumentation to reveal ultrafast excited-state mechanistic information for photochemical reactions that is essential in disentangling elusive spectroscopic features, excited-state lifetimes, and excited-state mechanistic critical points. Nonadiabatic molecular dynamics (NAMD), powered by surface hopping techniques, is among the most widely applied techniques to model the photochemical reactions of medium-sized molecules. However, the computational efficiency is limited because of the requisite thousands of multiconfigurational quantum-chemical calculations multiplied by hundreds of trajectories. Machine learning (ML) has emerged as a revolutionary force in computational chemistry to predict the outcome of the resource-intensive multiconfigurational calculations on the fly. An ML potential trained with a substantial set of quantum-chemical calculations can predict the energies and forces with errors under chemical accuracy at a negligible cost. The integration of ML potentials in NAMD dramatically extends the maximum simulation time scale by ∼10 000-fold to the nanosecond regime. In this Account, we present a comprehensive demonstration of ML photodynamics simulations and summarize our most recent applications in resolving complex photochemical reactions. First, we address three fundamental components of ML techniques for photodynamics simulations: the quantum-chemical data set, the ML potential, and NAMD.
Second, we describe best practices in building training data and our procedure toward training the ML photodynamics model with our recent literature contributions. We introduce a convenient training data generation scheme combining Wigner sampling and geometrical interpolation. It trains reliable and effective ML potentials suitable for subsequent active learning to detect undersampled data. We demonstrate how active learning automatically discovers new mechanistic pathways and reproduces experimental results. We point out that atomic permutation is an essential data augmentation approach to improve the learnability of distance-based molecular descriptors for highly symmetric molecules. Third, we demonstrate the utility of ML-photodynamics by showing the results of ML photodynamics simulations of (1) photo-torquoselective 4π disrotatory electrocyclic ring closing of norbornyl cyclohexadiene, which reveals a thermal conversion from experimentally unobserved intermediates to the reactant in 1 ns; (2) [2 + 2] photocycloaddition of substituted [3]-syn-ladderdienes in competition with 4π and 6π electrocyclic ring-opening reactions, uncovering substituent effects to explain the reported increased quantum yield of substituted cubane precursors; and (3) photochemical 4π disrotatory electrocyclic reactions of fluorobenzenes in nanoseconds with XMS-CASPT2-level training data. We expect this Account to broaden understanding of ML photodynamics and inspire future developments and applications to increasingly large molecules within complex environments on long time scales.
Affiliation(s)
- Jingbai Li
- Department of Chemistry and Chemical Biology, Northeastern University, Boston, Massachusetts 02115, United States
- Steven A Lopez
- Department of Chemistry and Chemical Biology, Northeastern University, Boston, Massachusetts 02115, United States

22
Singh K, Münchmeyer J, Weber L, Leser U, Bande A. Graph Neural Networks for Learning Molecular Excitation Spectra. J Chem Theory Comput 2022; 18:4408-4417. [PMID: 35671364 DOI: 10.1021/acs.jctc.2c00255] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/29/2022]
Abstract
Machine learning (ML) approaches have demonstrated the ability to predict molecular spectra at a fraction of the computational cost of traditional theoretical chemistry methods while maintaining high accuracy. Graph neural networks (GNNs) are particularly promising in this regard, but different types of GNNs have not yet been systematically compared. In this work, we benchmark and analyze five different GNNs for the prediction of excitation spectra from the QM9 dataset of organic molecules. We compare the GNN performance in the obvious runtime measurements, prediction accuracy, and analysis of outliers in the test set. Moreover, through TMAP clustering and statistical analysis, we are able to highlight clear hotspots of high prediction errors as well as optimal spectra prediction for molecules with certain functional groups. This in-depth benchmarking and subsequent analysis protocol lays down a recipe for comparing different ML methods and evaluating dataset quality.
Affiliation(s)
- Kanishka Singh
- Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Hahn-Meitner-Platz 1, Berlin 10409, Germany; Institute of Chemistry and Biochemistry, Freie Universität Berlin, Arnimallee 22, Berlin 14195, Germany
- Jannes Münchmeyer
- Deutsches GeoForschungsZentrum GFZ, Telegrafenberg, 14473 Potsdam, Germany; Humboldt-Universität zu Berlin, Unter den Linden 6, 10117 Berlin, Germany
- Leon Weber
- Humboldt-Universität zu Berlin, Unter den Linden 6, 10117 Berlin, Germany; Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Robert-Rössle-Straße 10, Berlin 13125, Germany
- Ulf Leser
- Humboldt-Universität zu Berlin, Unter den Linden 6, 10117 Berlin, Germany
- Annika Bande
- Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Hahn-Meitner-Platz 1, Berlin 10409, Germany

23
Cignoni E, Cupellini L, Mennucci B. A fast method for electronic couplings in embedded multichromophoric systems. J Phys Condens Matter 2022; 34:304004. [PMID: 35552268 DOI: 10.1088/1361-648x/ac6f3c] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/17/2022] [Accepted: 05/12/2022] [Indexed: 06/15/2023]
Abstract
Electronic couplings are key to understanding exciton delocalization and transport in natural and artificial light harvesting processes. We develop a method to compute couplings in multichromophoric aggregates embedded in complex environments without running expensive quantum chemical calculations. We use a transition charge approximation to represent the quantum mechanical transition densities of the chromophores and an atomistic and polarizable classical model to describe the environment atoms. We extend our framework to estimate transition charges directly from the chromophore geometry, i.e., bypassing completely the quantum mechanical calculations using a regression approach. The method allows accurate couplings to be computed rapidly for a large number of geometries along molecular dynamics trajectories.
Affiliation(s)
- Edoardo Cignoni
- Dipartimento di Chimica e Chimica Industriale, University of Pisa, via G. Moruzzi 13, 56124, Pisa, Italy
- Lorenzo Cupellini
- Dipartimento di Chimica e Chimica Industriale, University of Pisa, via G. Moruzzi 13, 56124, Pisa, Italy
- Benedetta Mennucci
- Dipartimento di Chimica e Chimica Industriale, University of Pisa, via G. Moruzzi 13, 56124, Pisa, Italy

24
Cerdán L, Roca-Sanjuán D. Reconstruction of Nuclear Ensemble Approach Electronic Spectra Using Probabilistic Machine Learning. J Chem Theory Comput 2022; 18:3052-3064. [PMID: 35481363 PMCID: PMC9097286 DOI: 10.1021/acs.jctc.2c00004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2022] [Indexed: 11/29/2022]
Abstract
The theoretical prediction of molecular electronic spectra by means of quantum mechanical (QM) computations is fundamental to gain a deep insight into many photophysical and photochemical processes. A computational strategy that is attracting significant attention is the so-called Nuclear Ensemble Approach (NEA), that relies on generating a representative ensemble of nuclear geometries around the equilibrium structure and computing the vertical excitation energies (ΔE) and oscillator strengths (f) and phenomenologically broadening each transition with a line-shaped function with empirical full-width δ. Frequently, the choice of δ is carried out by visually finding the trade-off between artificial vibronic features (small δ) and over-smoothing of electronic signatures (large δ). Nevertheless, this approach is not satisfactory, as it relies on a subjective perception and may lead to spectral inaccuracies overall when the number of sampled configurations is limited due to an excessive computational burden (high-level QM methods, complex systems, solvent effects, etc.). In this work, we have developed and tested a new approach to reconstruct NEA spectra, dubbed GMM-NEA, based on the use of Gaussian Mixture Models (GMMs), a probabilistic machine learning algorithm, that circumvents the phenomenological broadening assumption and, in turn, the use of δ altogether. We show that GMM-NEA systematically outperforms other data-driven models to automatically select δ overall for small datasets. In addition, we report the use of an algorithm to detect anomalous QM computations (outliers) that can affect the overall shape and uncertainty of the NEA spectra. Finally, we apply GMM-NEA to predict the photolysis rate for HgBrOOH, a compound involved in Earth's atmospheric chemistry.
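The conventional NEA broadening step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation and not the GMM-NEA method: it assumes a unit-area Gaussian line shape whose full width at half maximum plays the role of δ, omits physical prefactors, and uses a hypothetical three-geometry ensemble. The function and variable names are invented for the example.

```python
import math


def nea_spectrum(grid, excitation_energies, oscillator_strengths, fwhm):
    """Ensemble-averaged, Gaussian-broadened line shape on an energy grid.

    Assumption: each transition is broadened by a unit-area Gaussian whose
    full width at half maximum is `fwhm` (the empirical delta in the text).
    Physical prefactors are omitted, so the result is in arbitrary units.
    """
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))  # FWHM -> std. dev.
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    n_geometries = len(excitation_energies)
    spectrum = []
    for energy in grid:
        total = 0.0
        for delta_e, f in zip(excitation_energies, oscillator_strengths):
            total += f * norm * math.exp(-0.5 * ((energy - delta_e) / sigma) ** 2)
        spectrum.append(total / n_geometries)  # average over sampled geometries
    return spectrum


# Hypothetical ensemble: one transition per geometry, energies in eV.
grid = [i * 0.005 for i in range(2001)]  # 0 to 10 eV
spectrum = nea_spectrum(grid, [4.0, 5.0, 6.0], [0.1, 0.2, 0.3], fwhm=0.3)
```

With so few sampled geometries, a small `fwhm` leaves isolated spikes and a large one merges them, which is the trade-off in choosing δ that the abstract describes.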
Affiliation(s)
- Luis Cerdán
- Institut de Ciència Molecular, Universitat de València, València 46071, Spain
- Daniel Roca-Sanjuán
- Institut de Ciència Molecular, Universitat de València, València 46071, Spain

25
Rankine CD, Penfold TJ. Accurate, affordable, and generalizable machine learning simulations of transition metal x-ray absorption spectra using the XANESNET deep neural network. J Chem Phys 2022; 156:164102. [PMID: 35490005 DOI: 10.1063/5.0087255] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 12/21/2022]
Abstract
The affordable, accurate, and generalizable prediction of spectroscopic observables plays a key role in the analysis of increasingly complex experiments. In this article, we develop and deploy a deep neural network, XANESNET, for predicting the lineshape of first-row transition metal K-edge x-ray absorption near-edge structure (XANES) spectra. XANESNET predicts the spectral intensities using only information about the local coordination geometry of the transition metal complexes encoded in a feature vector of weighted atom-centered symmetry functions. We address in detail the calibration of the feature vector for the particularities of the problem at hand, and we explore the individual feature importance to reveal the physical insight that XANESNET obtains at the Fe K-edge. XANESNET relies on only a few judiciously selected features: radial information on the first and second coordination shells suffices along with angular information sufficient to separate satisfactorily key coordination geometries. The feature importance is found to reflect the XANES spectral window under consideration and is consistent with the expected underlying physics. We subsequently apply XANESNET at nine first-row transition metal (Ti-Zn) K-edges. It can be optimized in as little as a minute, predicts instantaneously, and provides K-edge XANES spectra with an average accuracy of ∼±2%-4% in which the positions of prominent peaks are matched with a >90% hit rate to sub-eV (∼0.8 eV) error.
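To make the idea of a weighted atom-centered symmetry function concrete, here is a minimal sketch of one radial feature. It is not taken from the XANESNET code: it assumes the common Gaussian-times-cosine-cutoff form, uses the neighbor's atomic number as the element weight, and all names and parameter values are hypothetical.

```python
import math


def cosine_cutoff(r, r_cut):
    """Smoothly decays from 1 at r = 0 to 0 at r = r_cut."""
    if r >= r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)


def weighted_radial_sf(center, neighbors, eta, mu, r_cut):
    """One weighted radial symmetry function around an absorbing atom.

    `neighbors` is a list of (atomic_number, (x, y, z)) tuples; the atomic
    number is used as the element weight (an assumption of this sketch).
    """
    value = 0.0
    for z, position in neighbors:
        r = math.dist(center, position)  # distance to the absorbing atom
        value += z * math.exp(-eta * (r - mu) ** 2) * cosine_cutoff(r, r_cut)
    return value


# Hypothetical environment: one oxygen at 2.0 Å, one nitrogen beyond the cutoff.
feature = weighted_radial_sf(
    (0.0, 0.0, 0.0),
    [(8, (2.0, 0.0, 0.0)), (7, (0.0, 7.0, 0.0))],
    eta=1.0, mu=2.0, r_cut=6.0,
)
```

A feature vector would collect several such values for different `eta` and `mu` settings; atoms beyond `r_cut` contribute nothing, which keeps the description local to the coordination shells.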
Affiliation(s)
- C D Rankine
- Chemistry-School of Natural and Environmental Sciences, Newcastle University, Newcastle Upon Tyne NE1 7RU, United Kingdom
- T J Penfold
- Chemistry-School of Natural and Environmental Sciences, Newcastle University, Newcastle Upon Tyne NE1 7RU, United Kingdom

26
Watson L, Rankine CD, Penfold TJ. Beyond structural insight: a deep neural network for the prediction of Pt L2/3-edge X-ray absorption spectra. Phys Chem Chem Phys 2022; 24:9156-9167. [PMID: 35393987 DOI: 10.1039/d2cp00567k] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/22/2023]
Abstract
X-ray absorption spectroscopy at the L2/3 edge can be used to obtain detailed information about the local electronic and geometric structure of transition metal complexes. By virtue of the dipole selection rules, the transition metal L2/3 edge usually exhibits two distinct spectral regions: (i) the "white line", which is dominated by bound electronic transitions from metal-centred 2p orbitals into unoccupied orbitals with d character; the intensity and shape of this band consequently reflects the d density of states (d-DOS), which is strongly modulated by mixing with ligand orbitals involved in chemical bonding, and (ii) the post-edge, where oscillations encode the local geometric structure around the X-ray absorption site. In this Article, we extend our recently-developed XANESNET deep neural network (DNN) beyond the K-edge to predict X-ray absorption spectra at the Pt L2/3 edge. We demonstrate that XANESNET is able to predict Pt L2/3-edge X-ray absorption spectra, including both the parts containing electronic and geometric structural information. The performance of our DNN in practical situations is demonstrated by application to two Pt complexes, and by simulating the transient spectrum of a photoexcited dimeric Pt complex. Our discussion includes an analysis of the feature importance in our DNN which demonstrates the role of key features and assists with interpreting the performance of the network.
Affiliation(s)
- Luke Watson
- Chemistry - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Conor D Rankine
- Chemistry - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Thomas J Penfold
- Chemistry - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, UK

27
Zaverkin V, Holzmüller D, Schuldt R, Kästner J. Predicting properties of periodic systems from cluster data: A case study of liquid water. J Chem Phys 2022; 156:114103. [DOI: 10.1063/5.0078983] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/14/2022]
Abstract
The accuracy of the training data limits the accuracy of bulk properties from machine-learned potentials. For example, hybrid functionals or wave-function-based quantum chemical methods are readily available for cluster data but effectively out of scope for periodic structures. We show that local, atom-centered descriptors for machine-learned potentials enable the prediction of bulk properties from cluster model training data, agreeing reasonably well with predictions from bulk training data. We demonstrate such transferability by studying structural and dynamical properties of bulk liquid water with density functional theory and have found an excellent agreement with experimental and theoretical counterparts.
Affiliation(s)
- Viktor Zaverkin
- Institute for Theoretical Chemistry, University of Stuttgart, Pfaffenwaldring 55, 70569 Stuttgart, Germany
| | - David Holzmüller
- Institute for Stochastics and Applications, University of Stuttgart, Pfaffenwaldring 57, 70569 Stuttgart, Germany
| | - Robin Schuldt
- Institute for Theoretical Chemistry, University of Stuttgart, Pfaffenwaldring 55, 70569 Stuttgart, Germany
| | - Johannes Kästner
- Institute for Theoretical Chemistry, University of Stuttgart, Pfaffenwaldring 55, 70569 Stuttgart, Germany
| |
|
28
|
Shao J, Liu Y, Yan J, Yan ZY, Wu Y, Ru Z, Liao JY, Miao X, Qian L. Prediction of Maximum Absorption Wavelength Using Deep Neural Networks. J Chem Inf Model 2022; 62:1368-1375. [PMID: 35290042 DOI: 10.1021/acs.jcim.1c01449] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/16/2022]
Abstract
Fluorescent molecules are important tools in biological detection, and numerous efforts have been made to develop compounds to meet the desired photophysical properties. For example, tuning the wavelength allows an appropriate penetration depth with minimal interference from the autofluorescence/scattering for a better signal-to-noise contrast. However, there are limited guidelines to rationally design or computationally predict the optical properties from first principles, and factors like the solvent effects will make it more complicated. Herein, we established a database (SMFluo1) of 1181 solvated small-molecule fluorophores covering the ultraviolet-visible-near-infrared absorption window and developed new machine learning models based on deep neural networks for accurately predicting photophysical parameters. The optimal system was applied to 120 out-of-sample compounds, and it exhibited remarkable accuracy with a mean relative error of 1.52%. In this new paradigm, a deep learning algorithm is promising to complement conventional theoretical and experimental studies of fluorophores and to greatly accelerate the discovery of new dyes. Due to its simplicity and efficiency, data from newly developed fluorophores can be easily supplemented to this system to further improve the accuracy across various dye families.
Affiliation(s)
- Jinning Shao
- Institute of Drug Metabolism and Pharmaceutical Analysis, Zhejiang Province Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Cancer Center, & Hangzhou Institute of Innovative Medicine, Zhejiang University, Hangzhou, China 310058
| | - Yue Liu
- Center for Data Science, Zhejiang University, Hangzhou, China 310058; Polytechnic Institute, Zhejiang University, Hangzhou, China 310058
| | - Jiaqi Yan
- Institute of Drug Metabolism and Pharmaceutical Analysis, Zhejiang Province Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Cancer Center, & Hangzhou Institute of Innovative Medicine, Zhejiang University, Hangzhou, China 310058
| | - Ze-Yi Yan
- Institute of Drug Metabolism and Pharmaceutical Analysis, Zhejiang Province Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Cancer Center, & Hangzhou Institute of Innovative Medicine, Zhejiang University, Hangzhou, China 310058; Polytechnic Institute, Zhejiang University, Hangzhou, China 310058
| | - Yangyang Wu
- Center for Data Science, Zhejiang University, Hangzhou, China 310058
| | - Zhongying Ru
- Center for Data Science, Zhejiang University, Hangzhou, China 310058; Polytechnic Institute, Zhejiang University, Hangzhou, China 310058
| | - Jia-Yu Liao
- Institute of Drug Metabolism and Pharmaceutical Analysis, Zhejiang Province Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Cancer Center, & Hangzhou Institute of Innovative Medicine, Zhejiang University, Hangzhou, China 310058; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Hangzhou, China 310018
| | - Xiaoye Miao
- Center for Data Science, Zhejiang University, Hangzhou, China 310058
| | - Linghui Qian
- Institute of Drug Metabolism and Pharmaceutical Analysis, Zhejiang Province Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Cancer Center, & Hangzhou Institute of Innovative Medicine, Zhejiang University, Hangzhou, China 310058
| |
|
29
|
Gupta A, Chakraborty S, Ghosh D, Ramakrishnan R. Data-driven modeling of S0 → S1 excitation energy in the BODIPY chemical space: High-throughput computation, quantum machine learning, and inverse design. J Chem Phys 2021; 155:244102. [PMID: 34972385 DOI: 10.1063/5.0076787] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/13/2022] Open
Abstract
Derivatives of BODIPY are popular fluorophores due to their synthetic feasibility, structural rigidity, high quantum yield, and tunable spectroscopic properties. While the characteristic absorption maximum of BODIPY is at 2.5 eV, combinations of functional groups and substitution sites can shift the peak position by ±1 eV. Time-dependent long-range corrected hybrid density functional methods can model the lowest excitation energies offering a semi-quantitative precision of ±0.3 eV. Alas, the chemical space of BODIPYs stemming from combinatorial introduction of even a few dozen substituents is too large for brute-force high-throughput modeling. To navigate this vast space, we select 77 412 molecules and train a kernel-based quantum machine learning model providing <2% hold-out error. Further reuse of the results presented here to navigate the entire BODIPY universe comprising over 253 giga (253 × 10⁹) molecules is demonstrated by inverse-designing candidates with desired target excitation energies.
Affiliation(s)
- Amit Gupta
- Centre for Interdisciplinary Sciences, Tata Institute of Fundamental Research, Hyderabad 500107, India
| | - Sabyasachi Chakraborty
- Centre for Interdisciplinary Sciences, Tata Institute of Fundamental Research, Hyderabad 500107, India
| | - Debashree Ghosh
- Indian Association for the Cultivation of Science, Kolkata 700032, India
| | - Raghunathan Ramakrishnan
- Centre for Interdisciplinary Sciences, Tata Institute of Fundamental Research, Hyderabad 500107, India
| |
|
30
|
Lin S, Peng D, Yang W, Gu FL, Lan Z. Theoretical studies on triplet-state driven dissociation of formaldehyde by quasi-classical molecular dynamics simulation on machine-learning potential energy surface. J Chem Phys 2021; 155:214105. [PMID: 34879677 PMCID: PMC8654486 DOI: 10.1063/5.0067176] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/14/2021] [Accepted: 11/09/2021] [Indexed: 11/15/2022] Open
Abstract
The H-atom dissociation of formaldehyde on the lowest triplet state (T1) is studied by quasi-classical molecular dynamics simulations on a high-dimensional machine-learning potential energy surface (PES) model. An atomic-energy based deep-learning neural network (NN) is used to represent the PES function, and the weighted atom-centered symmetry functions are employed as inputs of the NN model to satisfy the translational, rotational, and permutational symmetries, and to capture the geometry features of each atom and its individual chemical environment. Several standard technical tricks are used in the construction of the NN-PES, including the application of a clustering algorithm in the formation of the training dataset, the examination of the reliability of the NN-PES model by different fitted NN models, and the detection of the out-of-confidence region by the confidence interval of the training dataset. The accuracy of the full-dimensional NN-PES model is examined by two benchmark calculations with respect to ab initio data. Both the NN and electronic-structure calculations give a similar H-atom dissociation reaction pathway on the T1 state in the intrinsic reaction coordinate analysis. The small-scale trial dynamics simulations based on the NN-PES and the ab initio PES give highly consistent results. After confirming the accuracy of the NN-PES, a large number of trajectories are calculated in the quasi-classical dynamics, which allows us to get a better understanding of the T1-driven H-atom dissociation dynamics efficiently. In particular, the dynamics from different initial conditions can be easily simulated with a rather low computational cost. The influence of the mode-specific vibrational excitations on the H-atom dissociation dynamics driven by the T1 state is explored. The results show that the vibrational excitations on symmetric C-H stretching, asymmetric C-H stretching, and C=O stretching motions always clearly enhance the H-atom dissociation probability.
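The symmetry requirement mentioned in this abstract can be illustrated with a minimal sketch of a radial atom-centered symmetry function. This is not the authors' code: the function names, the parameter values, the optional weighting by atomic number, and the rough formaldehyde-like coordinates are all assumptions made for illustration only.

```python
import numpy as np

def cosine_cutoff(r, r_c):
    """Smooth cutoff that decays to zero at the cutoff radius r_c."""
    return np.where(r < r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)

def radial_symmetry_function(positions, center, eta, r_s, r_c, weights=None):
    """Radial symmetry function of atom `center`.

    Sums Gaussian terms over all other atoms. `weights` is an optional
    per-atom factor (here atomic numbers, an assumed choice of weighting).
    """
    positions = np.asarray(positions, dtype=float)
    n = len(positions)
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    mask = np.arange(n) != center          # exclude the central atom itself
    r = np.linalg.norm(positions[mask] - positions[center], axis=1)
    terms = w[mask] * np.exp(-eta * (r - r_s) ** 2) * cosine_cutoff(r, r_c)
    return float(np.sum(terms))

# Illustrative, approximate formaldehyde-like geometry in angstrom: C, O, H, H.
geom = np.array([[0.0, 0.0, 0.0],
                 [0.0, 0.0, 1.2],
                 [0.94, 0.0, -0.54],
                 [-0.94, 0.0, -0.54]])
Z = np.array([6.0, 8.0, 1.0, 1.0])

g_ref = radial_symmetry_function(geom, 0, eta=1.0, r_s=1.0, r_c=6.0, weights=Z)

# Rotate the molecule about the z axis and translate it.
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta), np.cos(theta), 0.0],
                [0.0, 0.0, 1.0]])
moved = geom @ rot.T + np.array([1.0, -2.0, 0.5])
g_moved = radial_symmetry_function(moved, 0, eta=1.0, r_s=1.0, r_c=6.0, weights=Z)

# Relabel the neighbours (swap O with the first H), keeping weights attached.
perm = [0, 2, 1, 3]
g_perm = radial_symmetry_function(geom[perm], 0, eta=1.0, r_s=1.0, r_c=6.0, weights=Z[perm])
```

Because the descriptor depends only on interatomic distances and a sum over neighbours, `g_ref`, `g_moved`, and `g_perm` agree to numerical precision.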
Affiliation(s)
| | | | - Weitao Yang
- Department of Chemistry, Duke University, Durham, North Carolina 27708, USA
| | - Feng Long Gu
- Authors to whom correspondence should be addressed: and
| | - Zhenggang Lan
- Authors to whom correspondence should be addressed: and
| |
|
31
|
Pinheiro M, Ge F, Ferré N, Dral PO, Barbatti M. Choosing the right molecular machine learning potential. Chem Sci 2021; 12:14396-14413. [PMID: 34880991 PMCID: PMC8580106 DOI: 10.1039/d1sc03564a] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 06/29/2021] [Accepted: 09/14/2021] [Indexed: 11/21/2022] Open
Abstract
Quantum-chemistry simulations based on potential energy surfaces of molecules provide invaluable insight into the physicochemical processes at the atomistic level and yield such important observables as reaction rates and spectra. Machine learning potentials promise to significantly reduce the computational cost and hence enable otherwise unfeasible simulations. However, the surging number of such potentials begs the question of which one to choose or whether we still need to develop yet another one. Here, we address this question by evaluating the performance of popular machine learning potentials in terms of accuracy and computational cost. In addition, we deliver structured information for non-specialists in machine learning to guide them through the maze of acronyms, recognize each potential's main features, and judge what they could expect from each one.
Affiliation(s)
- Max Pinheiro
- Aix Marseille University, CNRS, ICR Marseille France
| | - Fuchun Ge
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, College of Chemistry and Chemical Engineering, Xiamen University China
| | - Nicolas Ferré
- Aix Marseille University, CNRS, ICR Marseille France
| | - Pavlo O Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, College of Chemistry and Chemical Engineering, Xiamen University China
| | - Mario Barbatti
- Aix Marseille University, CNRS, ICR Marseille France
- Institut Universitaire de France 75231 Paris France
| |
|
32
|
Westermayr J, Marquetand P. Machine Learning for Electronically Excited States of Molecules. Chem Rev 2021; 121:9873-9926. [PMID: 33211478 PMCID: PMC8391943 DOI: 10.1021/acs.chemrev.0c00749] [Citation(s) in RCA: 162] [Impact Index Per Article: 54.0] [Received: 07/17/2020] [Indexed: 12/11/2022]
Abstract
Electronically excited states of molecules are at the heart of photochemistry, photophysics, as well as photobiology and also play a role in material science. Their theoretical description requires highly accurate quantum chemical calculations, which are computationally expensive. In this review, we focus on not only how machine learning is employed to speed up such excited-state simulations but also how this branch of artificial intelligence can be used to advance this exciting research field in all its aspects. Discussed applications of machine learning for excited states include excited-state dynamics simulations, static calculations of absorption spectra, as well as many others. In order to put these studies into context, we discuss the promises and pitfalls of the involved machine learning techniques. Since the latter are mostly based on quantum chemistry calculations, we also provide a short introduction into excited-state electronic structure methods and approaches for nonadiabatic dynamics simulations and describe tricks and problems when using them in machine learning for excited states of molecules.
Affiliation(s)
- Julia Westermayr
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
| | - Philipp Marquetand
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
- Vienna Research Platform on Accelerating Photoreaction Discovery, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
- Data Science @ Uni Vienna, University of Vienna, Währinger Strasse 29, 1090 Vienna, Austria
| |
|
33
|
Abstract
Chemical compound space (CCS), the set of all theoretically conceivable combinations of chemical elements and (meta-)stable geometries that make up matter, is colossal. The first-principles based virtual sampling of this space, for example, in search of novel molecules or materials which exhibit desirable properties, is therefore prohibitive for all but the smallest subsets and simplest properties. We review studies aimed at tackling this challenge using modern machine learning techniques based on (i) synthetic data, typically generated using quantum mechanics based methods, and (ii) model architectures inspired by quantum mechanics. Such Quantum mechanics based Machine Learning (QML) approaches combine the numerical efficiency of statistical surrogate models with an ab initio view on matter. They rigorously reflect the underlying physics in order to reach universality and transferability across CCS. While state-of-the-art approximations to quantum problems impose severe computational bottlenecks, recent QML based developments indicate the possibility of substantial acceleration without sacrificing the predictive power of quantum mechanics.
Affiliation(s)
- Bing Huang
- Faculty of Physics, University of Vienna, 1090 Vienna, Austria
| | - O. Anatole von Lilienfeld
- Faculty of Physics, University of Vienna, 1090 Vienna, Austria
- Institute of Physical Chemistry and National Center for Computational Design and Discovery of Novel Materials (MARVEL), Department of Chemistry, University of Basel, 4056 Basel, Switzerland
| |
|
34
|
Abstract
Electronically excited states of molecules are at the heart of photochemistry, photophysics, as well as photobiology and also play a role in material science. Their theoretical description requires highly accurate quantum chemical calculations, which are computationally expensive. In this review, we focus on not only how machine learning is employed to speed up such excited-state simulations but also how this branch of artificial intelligence can be used to advance this exciting research field in all its aspects. Discussed applications of machine learning for excited states include excited-state dynamics simulations, static calculations of absorption spectra, as well as many others. In order to put these studies into context, we discuss the promises and pitfalls of the involved machine learning techniques. Since the latter are mostly based on quantum chemistry calculations, we also provide a short introduction into excited-state electronic structure methods and approaches for nonadiabatic dynamics simulations and describe tricks and problems when using them in machine learning for excited states of molecules.
Affiliation(s)
- Julia Westermayr
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
| | - Philipp Marquetand
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
- Vienna Research Platform on Accelerating Photoreaction Discovery, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
- Data Science @ Uni Vienna, University of Vienna, Währinger Strasse 29, 1090 Vienna, Austria
| |
|
35
|
Westermayr J, Maurer RJ. Physically inspired deep learning of molecular excitations and photoemission spectra. Chem Sci 2021; 12:10755-10764. [PMID: 34447563 PMCID: PMC8372319 DOI: 10.1039/d1sc01542g] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 03/17/2021] [Accepted: 06/29/2021] [Indexed: 12/29/2022] Open
Abstract
Modern functional materials consist of large molecular building blocks with significant chemical complexity which limits spectroscopic property prediction with accurate first-principles methods. Consequently, a targeted design of materials with tailored optoelectronic properties by high-throughput screening is bound to fail without efficient methods to predict molecular excited-state properties across chemical space. In this work, we present a deep neural network that predicts charged quasiparticle excitations for large and complex organic molecules with a rich elemental diversity and a size well out of reach of accurate many body perturbation theory calculations. The model exploits the fundamental underlying physics of molecular resonances as eigenvalues of a latent Hamiltonian matrix and is thus able to accurately describe multiple resonances simultaneously. The performance of this model is demonstrated for a range of organic molecules across chemical composition space and configuration space. We further showcase the model capabilities by predicting photoemission spectra at the level of the GW approximation for previously unseen conjugated molecules.
Affiliation(s)
- Julia Westermayr
- Department of Chemistry, University of Warwick Gibbet Hill Road Coventry CV4 7AL UK
| | - Reinhard J Maurer
- Department of Chemistry, University of Warwick Gibbet Hill Road Coventry CV4 7AL UK
| |
|
36
|
Westermayr J, Gastegger M, Schütt KT, Maurer RJ. Perspective on integrating machine learning into computational chemistry and materials science. J Chem Phys 2021; 154:230903. [PMID: 34241249 DOI: 10.1063/5.0047760] [Citation(s) in RCA: 67] [Impact Index Per Article: 22.3] [Indexed: 01/12/2023] Open
Abstract
Machine learning (ML) methods are being used in almost every conceivable area of electronic structure theory and molecular simulation. In particular, ML has become firmly established in the construction of high-dimensional interatomic potentials. Not a day goes by without another proof of principle being published on how ML methods can represent and predict quantum mechanical properties, be they observable, such as molecular polarizabilities, or not, such as atomic charges. As ML is becoming pervasive in electronic structure theory and molecular simulation, we provide an overview of how atomistic computational modeling is being transformed by the incorporation of ML approaches. From the perspective of the practitioner in the field, we assess how common workflows to predict structure, dynamics, and spectroscopy are affected by ML. Finally, we discuss how a tighter and lasting integration of ML methods with computational chemistry and materials science can be achieved and what it will mean for research practice, software development, and postgraduate training.
Affiliation(s)
- Julia Westermayr
- Department of Chemistry, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, United Kingdom
| | - Michael Gastegger
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
| | - Kristof T Schütt
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
| | - Reinhard J Maurer
- Department of Chemistry, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, United Kingdom
| |
|
37
|
Abstract
Theoretical simulations of electronic excitations and associated processes in molecules are indispensable for fundamental research and technological innovations. However, such simulations are notoriously challenging to perform with quantum mechanical methods. Advances in machine learning open many new avenues for assisting molecular excited-state simulations. In this Review, we track such progress, assess the current state of the art and highlight the critical issues to solve in the future. We overview a broad range of machine learning applications in excited-state research, which include the prediction of molecular properties, improvements of quantum mechanical methods for the calculations of excited-state properties and the search for new materials. Machine learning approaches can help us understand hidden factors that influence photo-processes, leading to a better control of such processes and new rules for the design of materials for optoelectronic applications.
|
38
|
Gupta A, Chakraborty S, Ramakrishnan R. Revving up 13C NMR shielding predictions across chemical space: benchmarks for atoms-in-molecules kernel machine learning with new data for 134 kilo molecules. Machine Learning: Science and Technology 2021. [DOI: 10.1088/2632-2153/abe347] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/28/2022] Open
Abstract
The requirement for accelerated and quantitatively accurate screening of nuclear magnetic resonance spectra across the small molecules chemical compound space is two-fold: (1) a robust ‘local’ machine learning (ML) strategy capturing the effect of the neighborhood on an atom’s ‘near-sighted’ property—chemical shielding; (2) an accurate reference dataset generated with a state-of-the-art first-principles method for training. Herein we report the QM9-NMR dataset comprising isotropic shielding of over 0.8 million C atoms in 134k molecules of the QM9 dataset in gas and five common solvent phases. Using these data for training, we present benchmark results for the prediction transferability of kernel-ridge regression models with popular local descriptors. Our best model, trained on 100k samples, accurately predicts isotropic shielding of 50k ‘hold-out’ atoms with a mean error of less than 1.9 ppm. For the rapid prediction of new query molecules, the models were trained on geometries from an inexpensive theory. Furthermore, by using a Δ-ML strategy, we quench the error below 1.4 ppm. Finally, we test the transferability on non-trivial benchmark sets that include benchmark molecules comprising 10–17 heavy atoms and drugs.
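The Δ-ML strategy with kernel-ridge regression mentioned in this abstract can be sketched as follows: the model learns only the difference between an expensive target and a cheap baseline, and the prediction is the baseline plus the learned correction. The two analytic functions below are synthetic stand-ins for the low- and high-level theories, the scalar input replaces the local atomic descriptors of the paper, and the kernel width and regularization are arbitrary assumptions.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between two sets of scalar inputs."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_krr(x_train, y_train, sigma, lam):
    """Solve (K + lam * I) alpha = y for the regression coefficients."""
    k = gaussian_kernel(x_train, x_train, sigma)
    return np.linalg.solve(k + lam * np.eye(len(x_train)), y_train)

def predict_krr(x_query, x_train, alpha, sigma):
    """Evaluate the fitted kernel model at the query points."""
    return gaussian_kernel(x_query, x_train, sigma) @ alpha

# Synthetic stand-ins (assumptions, not real shielding data).
def low_level(x):
    return np.sin(x)

def high_level(x):
    return np.sin(x) + 0.1 * x ** 2

x_train = np.linspace(0.0, 3.0, 20)
x_test = np.linspace(0.1, 2.9, 15)
sigma, lam = 0.5, 1e-6

# Train on the difference between the two levels, not on the target itself.
alpha = fit_krr(x_train, high_level(x_train) - low_level(x_train), sigma, lam)

# Delta-ML prediction: cheap baseline plus learned correction.
delta_pred = low_level(x_test) + predict_krr(x_test, x_train, alpha, sigma)

baseline_mae = np.mean(np.abs(high_level(x_test) - low_level(x_test)))
delta_mae = np.mean(np.abs(high_level(x_test) - delta_pred))
```

Comparing `delta_mae` with `baseline_mae` shows how much of the gap between the two levels the correction recovers on this toy problem; the numbers carry no meaning beyond the illustration.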
|
39
|
Ceriotti M, Clementi C, Anatole von Lilienfeld O. Machine learning meets chemical physics. J Chem Phys 2021; 154:160401. [PMID: 33940847 DOI: 10.1063/5.0051418] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Indexed: 12/13/2022] Open
Abstract
Over recent years, the use of statistical learning techniques applied to chemical problems has gained substantial momentum. This is particularly apparent in the realm of physical chemistry, where the balance between empiricism and physics-based theory has traditionally been rather in favor of the latter. In this guest Editorial for the special topic issue on "Machine Learning Meets Chemical Physics," a brief rationale is provided, followed by an overview of the topics covered. We conclude by making some general remarks.
Affiliation(s)
- Michele Ceriotti
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Cecilia Clementi
- Department of Physics, Freie Universität Berlin, Arnimallee 14, 14195 Berlin, Germany
| | | |
|
40
|
Fonseca G, Poltavsky I, Vassilev-Galindo V, Tkatchenko A. Improving molecular force fields across configurational space by combining supervised and unsupervised machine learning. J Chem Phys 2021; 154:124102. [DOI: 10.1063/5.0035530] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 11/15/2022] Open
Affiliation(s)
- Gregory Fonseca
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg, Luxembourg
| | - Igor Poltavsky
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg, Luxembourg
| | - Valentin Vassilev-Galindo
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg, Luxembourg
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg, Luxembourg
| |
|
41
|
Wang Y, Guan Y, Guo H, Yarkony DR. Enabling complete multichannel nonadiabatic dynamics: A global representation of the two-channel coupled, 1,2 ¹A and 1 ³A states of NH3 using neural networks. J Chem Phys 2021; 154:094121. [PMID: 33685133 DOI: 10.1063/5.0037684] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 11/14/2022] Open
Abstract
Global coupled three-state two-channel potential energy and property/interaction (dipole and spin-orbit coupling) surfaces for the dissociation of NH3(Ã) into NH + H2 and NH2 + H are reported. The permutational invariant polynomial-neural network approach is used to simultaneously fit and diabatize the electronic Hamiltonian by fitting the energies, energy gradients, and derivative couplings of the two coupled lowest-lying singlet states as well as fitting the energy and energy gradients of the lowest-lying triplet state. The key issue in fitting property matrix elements in the diabatic basis is that the diabatic surfaces must be smooth, that is, the diabatization must remove spikes in the original adiabatic property surfaces attributable to the switch of electronic wavefunctions at the conical intersection seam. Here, we employ the fit potential energy matrix to transform properties in the adiabatic representation to a quasi-diabatic representation and remove the discontinuity near the conical intersection seam. The property matrix elements can then be fit with smooth neural network functions. The coupled potential energy surfaces along with the dipole and spin-orbit coupling surfaces will enable more accurate and complete treatment of optical transitions, as well as nonadiabatic internal conversion and intersystem crossing.
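The step of transforming a property from the adiabatic to a quasi-diabatic representation with the fit potential energy matrix can be sketched for two states. The eigenvectors of the diabatic potential matrix define the rotation, and the same rotation is applied to the property matrix. The matrices below are hypothetical numbers in arbitrary units, and the function name is an assumption, not part of the authors' software.

```python
import numpy as np

def adiabatic_to_diabatic(prop_adiabatic, h_diabatic):
    """Rotate a property matrix from the adiabatic to the diabatic basis.

    The columns of `u` are the eigenvectors of the diabatic potential
    matrix, so that u.T @ h_diabatic @ u is diagonal (adiabatic energies).
    """
    energies, u = np.linalg.eigh(h_diabatic)
    return u @ prop_adiabatic @ u.T, energies

# Hypothetical two-state diabatic potential matrix at one geometry.
h_d = np.array([[0.00, 0.05],
                [0.05, 0.02]])

# Hypothetical smooth diabatic dipole matrix used to build a test case.
mu_d_true = np.array([[1.0, 0.2],
                      [0.2, -0.5]])

# Forward transformation: what an adiabatic calculation would report.
energies, u = np.linalg.eigh(h_d)
mu_a = u.T @ mu_d_true @ u

# Back transformation recovers the diabatic property matrix.
mu_d, adiabatic_energies = adiabatic_to_diabatic(mu_a, h_d)
```

In practice the rotation would be evaluated at every geometry from the fitted potential matrix, which is what moves the rapid variation near the intersection seam out of the property surfaces and into the rotation.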
Affiliation(s)
- Yuchen Wang
- Department of Chemistry, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | - Yafu Guan
- Department of Chemistry, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | - Hua Guo
- Department of Chemistry and Chemical Biology, University of New Mexico, Albuquerque, New Mexico 87131, USA
| | - David R Yarkony
- Department of Chemistry, Johns Hopkins University, Baltimore, Maryland 21218, USA
| |
|
42
|
Ha JK, Kim K, Min SK. Machine Learning-Assisted Excited State Molecular Dynamics with the State-Interaction State-Averaged Spin-Restricted Ensemble-Referenced Kohn-Sham Approach. J Chem Theory Comput 2021; 17:694-702. [PMID: 33470100 DOI: 10.1021/acs.jctc.0c01261] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/18/2022]
Abstract
We present a machine learning-assisted excited state molecular dynamics (ML-ESMD) based on the ensemble density functional theory framework. Since we represent a diabatic Hamiltonian in terms of generalized valence bond ansatz within the state-interaction state-averaged spin-restricted ensemble-referenced Kohn-Sham (SI-SA-REKS) method, we can avoid singularities near conical intersections, which are crucial in excited state molecular dynamics simulations. We train the diabatic Hamiltonian elements and their analytical gradients with the SchNet architecture to construct machine learning models, while the phase freedom of off-diagonal elements of the Hamiltonian is cured by introducing the phase-less loss function. Our machine learning models show reasonable accuracy with mean absolute errors of ∼0.1 kcal/mol and ∼0.5 kcal/mol/Å for the diabatic Hamiltonian elements and their gradients, respectively, for penta-2,4-dieniminium cation. Moreover, by exploiting the diabatic representation, our models can predict correct conical intersection structures and their topologies. In addition, our ML-ESMD simulations give results almost identical to direct dynamics at the same level of theory.
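The abstract does not give the form of the phase-less loss, so the following is only one common way to make a loss insensitive to the arbitrary sign of an off-diagonal element: for each geometry, the squared error is evaluated for both sign choices of the reference and the smaller value is kept. The function name and the array layout (one row per geometry, holding the coupling and its gradient components, which share a single sign) are assumptions.

```python
import numpy as np

def phaseless_loss(pred, ref):
    """Sign-invariant squared error for off-diagonal Hamiltonian data.

    pred, ref: arrays of shape (n_geometries, n_components). Each row holds
    the coupling and its gradient components, which share one overall sign.
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    err_plus = np.sum((pred - ref) ** 2, axis=1)    # reference sign kept
    err_minus = np.sum((pred + ref) ** 2, axis=1)   # reference sign flipped
    return float(np.mean(np.minimum(err_plus, err_minus)))

# Hypothetical reference data for two geometries.
ref = np.array([[0.1, 0.3],
                [-0.2, 0.4]])

# A prediction that matches the reference up to a sign flip on the second row.
pred = ref * np.array([[1.0], [-1.0]])

loss_phaseless = phaseless_loss(pred, ref)
loss_plain = float(np.mean(np.sum((pred - ref) ** 2, axis=1)))
```

Here `loss_phaseless` vanishes while the plain squared error does not, which is the behaviour a sign-invariant loss is meant to provide when the reference data carry arbitrary phases.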
Affiliation(s)
- Jong-Kwon Ha
- Department of Chemistry, School of Natural Science, Ulsan National Institute of Science and Technology (UNIST), 50 UNIST-gil, Ulju-gun, Ulsan 44919, South Korea
| | - Kicheol Kim
- Department of Chemistry, School of Natural Science, Ulsan National Institute of Science and Technology (UNIST), 50 UNIST-gil, Ulju-gun, Ulsan 44919, South Korea
| | - Seung Kyu Min
- Department of Chemistry, School of Natural Science, Ulsan National Institute of Science and Technology (UNIST), 50 UNIST-gil, Ulju-gun, Ulsan 44919, South Korea
| |
|