1
Wu W, Leonardis A, Jiao J, Jiang J, Chen L. Transformer-Based Models for Predicting Molecular Structures from Infrared Spectra Using Patch-Based Self-Attention. J Phys Chem A 2025. [PMID: 39951543 DOI: 10.1021/acs.jpca.4c05665] [Indexed: 02/16/2025]
Abstract
Infrared (IR) spectroscopy, a type of vibrational spectroscopy, provides extensive molecular structure details and is a highly effective technique for chemists to determine molecular structures. However, analyzing experimental spectra has always been challenging due to the specialized knowledge required and the variability of spectra under different experimental conditions. Here, we propose a transformer-based model with a patch-based self-attention spectrum embedding layer, designed to prevent the loss of spectral information while maintaining simplicity and effectiveness. To further enhance the model's understanding of IR spectra, we introduce a data augmentation approach, which selectively introduces vertical noise only at absorption peaks. Our approach not only achieves state-of-the-art performance on simulated data sets but also attains a top-1 accuracy of 55% on real experimental spectra, surpassing the previous state-of-the-art by approximately 10%. Additionally, our model demonstrates proficiency in analyzing intricate and variable fingerprint regions, effectively extracting critical structural information.
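The two ideas named in this abstract, vertical noise applied only at absorption peaks and a patch-based view of the spectrum, can be illustrated with a minimal sketch. The abstract does not say how peaks are located, how the noise is distributed, or how patches are sized, so the threshold rule, the multiplicative Gaussian noise, and the names `augment_peaks` and `to_patches` are all assumptions made for illustration, not the authors' implementation.

```python
import random


def augment_peaks(spectrum, threshold=0.1, noise_scale=0.05, seed=None):
    """Return a copy of `spectrum` with vertical noise added only at peaks.

    Assumption: a point counts as part of a peak when its absorbance
    exceeds `threshold`. Baseline points are copied unchanged.
    """
    rng = random.Random(seed)
    augmented = []
    for absorbance in spectrum:
        if absorbance > threshold:
            # Multiplicative Gaussian noise, clipped so absorbance stays >= 0.
            absorbance = max(0.0, absorbance * (1.0 + rng.gauss(0.0, noise_scale)))
        augmented.append(absorbance)
    return augmented


def to_patches(spectrum, patch_size):
    """Split a spectrum into non-overlapping patches of `patch_size` points.

    A trailing remainder shorter than `patch_size` is dropped.
    """
    n_patches = len(spectrum) // patch_size
    return [spectrum[i * patch_size:(i + 1) * patch_size] for i in range(n_patches)]
```

In a transformer pipeline each patch would then be projected to an embedding vector before self-attention; that projection is omitted here.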
Affiliation(s)
- Wenjin Wu
- State Key Laboratory of Precision and Intelligent Chemistry, University of Science and Technology of China, Hefei 230026, China
- School of Computer Science, University of Birmingham, Birmingham B15 2TT, U.K
| | - Aleš Leonardis
- School of Computer Science, University of Birmingham, Birmingham B15 2TT, U.K
| | - Jianbo Jiao
- School of Computer Science, University of Birmingham, Birmingham B15 2TT, U.K
| | - Jun Jiang
- State Key Laboratory of Precision and Intelligent Chemistry, University of Science and Technology of China, Hefei 230026, China
| | - Linjiang Chen
- State Key Laboratory of Precision and Intelligent Chemistry, University of Science and Technology of China, Hefei 230026, China
- School of Computer Science, University of Birmingham, Birmingham B15 2TT, U.K
- School of Chemistry, University of Birmingham, Birmingham B15 2TT, U.K
2
Thompson TR, Staab JK, Chilton NF. Approximate Hamiltonians from a Linear Vibronic Coupling Model for Solution-Phase Spin Dynamics. J Chem Theory Comput 2025; 21:1222-1229. [PMID: 39824753 PMCID: PMC11823414 DOI: 10.1021/acs.jctc.4c01437] [Received: 10/25/2024] [Revised: 01/06/2025] [Accepted: 01/07/2025] [Indexed: 01/20/2025]
Abstract
The linear vibronic coupling (LVC) model is an approach for approximating how a molecular Hamiltonian changes in response to small changes in molecular geometry. The LVC framework thus has the ability to approximate molecular Hamiltonians at low computational expense but with quality approaching multiconfigurational ab initio calculations, when the change in geometry compared to the reference calculation used to parametrize it is small. Here, we show how the LVC approach can be used to project approximate spin Hamiltonians of a solvated lanthanide complex along a room-temperature molecular dynamics trajectory. As expected, the LVC approximation is less accurate as the geometry diverges from that at which the model was parametrized. We examine the accuracy of the predicted Hamiltonians by performing time-dependent quantum simulations of the spin dynamics of the molecule, with reference to the dynamics obtained using spin Hamiltonians projected from ab initio calculations at each step. We find that quantitatively accurate behavior is obtained when LVC parametrizations are performed at least every 10 fs during the trajectory.
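The core of a linear vibronic coupling model is a first-order expansion of the Hamiltonian matrix around a reference geometry. The sketch below shows that expansion only; the function name `lvc_hamiltonian`, the nested-list matrix layout, and the omission of the state-independent harmonic term are assumptions for illustration, and nothing here reproduces the parametrization or projection procedure used in the paper.

```python
def lvc_hamiltonian(h_ref, couplings, displacements):
    """First-order estimate of a Hamiltonian matrix at a displaced geometry.

    h_ref:         reference Hamiltonian as a list of rows (n x n).
    couplings:     one n x n matrix of linear coupling constants per
                   coordinate, i.e. dH/dq_k evaluated at the reference.
    displacements: displacement q_k from the reference along each coordinate.

    Computes H(q) ~ H_ref + sum_k q_k * (dH/dq_k). The approximation
    degrades as the displacements grow.
    """
    if len(couplings) != len(displacements):
        raise ValueError("need one coupling matrix per displacement")
    n = len(h_ref)
    # Copy the reference so the caller's matrix is not modified.
    h = [list(row) for row in h_ref]
    for coupling, q in zip(couplings, displacements):
        for i in range(n):
            for j in range(n):
                h[i][j] += q * coupling[i][j]
    return h
```

Re-parametrizing along a trajectory, as the abstract describes, amounts to refreshing `h_ref` and `couplings` at a new reference geometry before the displacements become large.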
Affiliation(s)
- Toby R. C. Thompson
- Department of Chemistry, The University of Manchester, Manchester M13 9PL, U.K.
| | - Jakob K. Staab
- Department of Chemistry, The University of Manchester, Manchester M13 9PL, U.K.
- Department of Chemistry “Ugo Schiff”, INSTM Research Unit, Università degli Studi di Firenze, 50019 Sesto Fiorentino, Italy
| | - Nicholas F. Chilton
- Department of Chemistry, The University of Manchester, Manchester M13 9PL, U.K.
- Research School of Chemistry, Australian National University, Canberra, Australian Capital Territory 2601, Australia
3
Pracht P, Pillai Y, Kapil V, Csányi G, Gönnheimer N, Vondrák M, Margraf JT, Wales DJ. Efficient Composite Infrared Spectroscopy: Combining the Double-Harmonic Approximation with Machine Learning Potentials. J Chem Theory Comput 2024; 20:10986-11004. [PMID: 39665618 DOI: 10.1021/acs.jctc.4c01157] [Indexed: 12/13/2024]
Abstract
Vibrational spectroscopy is a cornerstone technique for molecular characterization and offers an ideal target for the computational investigation of molecular materials. Building on previous comprehensive assessments of efficient methods for infrared (IR) spectroscopy, this study investigates the predictive accuracy and computational efficiency of gas-phase IR spectra calculations, accessible through a combination of modern semiempirical quantum mechanical and transferable machine learning potentials. A composite approach for IR spectra prediction based on the double-harmonic approximation, utilizing harmonic vibrational frequencies in combination with squared derivatives of the molecular dipole moment, is employed. This approach allows for methodical flexibility in the calculation of IR intensities from molecular dipoles and the corresponding vibrational modes. Various methods are systematically tested to suggest a suitable protocol with an emphasis on computational efficiency. Among these methods, semiempirical extended tight-binding (xTB) models, classical charge equilibrium models, and machine learning potentials trained for dipole moment prediction are assessed across a diverse data set of organic molecules. We particularly focus on the recently reported foundational machine learning potential MACE-OFF23 to address the accuracy limitations of conventional low-cost quantum mechanical and force-field methods. This study aims to establish a standard for the efficient computational prediction of IR spectra, facilitating the rapid and reliable identification of unknown compounds and advancing automated high-throughput analytical workflows in chemistry.
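In the double-harmonic approximation, each normal mode contributes a line at its harmonic frequency with an intensity proportional to the squared norm of the dipole derivative along that mode. The sketch below illustrates only that relationship: physical prefactors are dropped, so the intensities are relative, and the Lorentzian broadening, the default width, and the function names are assumptions for illustration rather than the protocol of the paper. The composite idea in the abstract corresponds to taking the frequencies and the dipole derivatives from different methods before combining them this way.

```python
import math


def mode_intensities(dipole_derivatives):
    """Relative IR intensity per mode: squared norm of d(mu)/dQ_k.

    dipole_derivatives: one (dx, dy, dz) tuple per normal mode.
    Physical prefactors are omitted, so only ratios are meaningful.
    """
    return [sum(c * c for c in d) for d in dipole_derivatives]


def lorentzian_spectrum(frequencies, intensities, grid, fwhm=10.0):
    """Broaden a stick spectrum onto `grid` with area-normalized Lorentzians.

    frequencies and grid share one unit (for example cm^-1); fwhm is the
    full width at half maximum in that unit (an assumed default).
    """
    half_width = fwhm / 2.0
    spectrum = []
    for x in grid:
        value = 0.0
        for freq, intensity in zip(frequencies, intensities):
            value += intensity * (half_width / math.pi) / ((x - freq) ** 2 + half_width ** 2)
        spectrum.append(value)
    return spectrum
```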
Affiliation(s)
- Philipp Pracht
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Yuthika Pillai
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K
| | - Venkat Kapil
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K
- Department of Physics and Astronomy, University College London, 17-19 Gordon Street, London WC1H 0AH, U.K
- Thomas Young Centre & London Centre for Nanotechnology, 19 Gordon Street, London WC1H 0AH, U.K
| | - Gábor Csányi
- Engineering Laboratory, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, U.K
| | - Nils Gönnheimer
- University of Bayreuth, Bavarian Center for Battery Technology (BayBatt), 95448 Bayreuth, Germany
| | - Martin Vondrák
- University of Bayreuth, Bavarian Center for Battery Technology (BayBatt), 95448 Bayreuth, Germany
| | - Johannes T Margraf
- University of Bayreuth, Bavarian Center for Battery Technology (BayBatt), 95448 Bayreuth, Germany
| | - David J Wales
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K
4
Na GS. Deep Learning for Generating Phase-Conditioned Infrared Spectra. Anal Chem 2024; 96:19659-19669. [PMID: 39575882 DOI: 10.1021/acs.analchem.4c04786] [Indexed: 12/11/2024]
Abstract
Infrared (IR) spectroscopy is an efficient method for identifying unknown chemical compounds. To accelerate IR spectrum analysis, various calculation and machine learning methods for simulating IR spectra of molecules have been studied in chemical science. However, existing calculation and machine learning methods assumed a rigid constraint that all molecules are in the gas phase, i.e., they overlooked the phase dependency of the IR spectra. In this paper, we propose an efficient phase-aware machine learning method to generate phase-conditioned IR spectra from 2D molecular structures. To this end, we devised a phase-aware graph neural network and combined it with a transformer decoder. To the best of our knowledge, the proposed method is the first IR spectrum generator that can generate the phase-conditioned IR spectra of real-world complex molecules. The proposed method outperformed state-of-the-art methods in the tasks of generating IR spectra on a benchmark dataset containing 11,546 experimentally measured IR spectra of 10,288 unique molecules. All implementations of the proposed method are publicly available at https://github.com/ngs00/PASGeN.
Affiliation(s)
- Gyoung S Na
- Korea Research Institute of Chemical Technology, Daejeon 34114, Republic of Korea
5
Thiemann FL, O'Neill N, Kapil V, Michaelides A, Schran C. Introduction to machine learning potentials for atomistic simulations. J Phys Condens Matter 2024; 37:073002. [PMID: 39577092 DOI: 10.1088/1361-648x/ad9657] [Received: 07/16/2024] [Accepted: 11/22/2024] [Indexed: 11/24/2024]
Abstract
Machine learning potentials have revolutionised the field of atomistic simulations in recent years and are becoming a mainstay in the toolbox of computational scientists. This paper aims to provide an overview of and introduction to machine learning potentials and their practical application to scientific problems. We provide a systematic guide for developing machine learning potentials, reviewing chemical descriptors, regression models, data generation and validation approaches. We begin with an emphasis on the earlier generation of models, such as high-dimensional neural network potentials and Gaussian approximation potentials, to provide historical perspective and guide the reader towards the understanding of recent developments, which are discussed in detail thereafter. Furthermore, we refer to relevant expert reviews, open-source software, and practical examples, further lowering the barrier to exploring these methods. The paper ends with selected showcase examples, highlighting the capabilities of machine learning potentials and how they can be applied to push the boundaries in atomistic simulations.
Affiliation(s)
- Fabian L Thiemann
- IBM Research Europe, Daresbury, Warrington WA4 4AD, United Kingdom
- Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge CB3 0HE, United Kingdom
| | - Niamh O'Neill
- Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge CB3 0HE, United Kingdom
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
| | - Venkat Kapil
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
- Department of Physics and Astronomy, University College London, London, United Kingdom
- Thomas Young Centre and London Centre for Nanotechnology, London, United Kingdom
| | - Angelos Michaelides
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
| | - Christoph Schran
- Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge CB3 0HE, United Kingdom
- Lennard-Jones Centre, University of Cambridge, Trinity Ln, Cambridge CB2 1TN, United Kingdom
6
Sharma P, Chowdhury PR, Jain A, Patwari GN. Machine Learned Potential Enables Molecular Dynamics Simulation to Predict the Experimental Branching Ratios in the NO Release Channel of Nitroaromatic Compounds. J Phys Chem A 2024; 128:10137-10142. [PMID: 39550764 DOI: 10.1021/acs.jpca.4c04703] [Indexed: 11/19/2024]
Abstract
This study employs a machine learning (ML) model using the Gaussian process regression algorithm to generate potential energy surfaces (PES) from density functional theory calculations, facilitating the investigation of photodissociation dynamics of nitroaromatic compounds, resulting in NO release. The experimentally observed trends in the slow-to-fast branching ratios of the NO moiety were captured by estimating the branching ratio between the two distinct reaction pathways, viz., roaming and oxaziridine mechanisms, calculated from molecular dynamics simulations performed on a reduced two-dimensional T1 surface. The qualitative agreement between the calculated and experimental results suggests that the mechanism dictating NO release is primarily governed by the dynamics on the T1 surface.
Affiliation(s)
- Pooja Sharma
- Department of Chemistry, Indian Institute of Technology Bombay, Mumbai 400076, India
| | - Prahlad Roy Chowdhury
- Department of Chemistry, Indian Institute of Technology Bombay, Mumbai 400076, India
| | - Amber Jain
- Department of Chemistry, Indian Institute of Technology Bombay, Mumbai 400076, India
| | - G Naresh Patwari
- Department of Chemistry, Indian Institute of Technology Bombay, Mumbai 400076, India
7
Sowa JK, Rossky PJ. A Bond-Based Machine Learning Model for Molecular Polarizabilities and A Priori Raman Spectra. J Chem Theory Comput 2024; 20:10071-10079. [PMID: 39499197 DOI: 10.1021/acs.jctc.4c01086] [Indexed: 11/07/2024]
Abstract
The use of machine learning (ML) algorithms in molecular simulations has become commonplace in recent years. There now exists, for instance, a multitude of ML force field algorithms that have enabled simulations approaching ab initio level accuracy at time scales and system sizes that significantly exceed what is otherwise possible with traditional methods. Far fewer algorithms exist for predicting rotationally equivariant, tensorial properties such as the electric polarizability. Here, we introduce a kernel ridge regression algorithm for machine learning of the polarizability tensor. This algorithm is based on the bond polarizability model and allows prediction of the tensor components at the cost similar to that of scalar quantities. We subsequently show the utility of this algorithm by simulating gas phase Raman spectra of biphenyl and malonaldehyde using classical molecular dynamics simulations of these systems performed with the recently developed MACE-OFF23 potential. The calculated spectra are shown to agree very well with the experiments and thus confirm the expediency of our algorithm as well as the accuracy of the used force field. More generally, this work demonstrates the potential of physics-informed approaches to yield simple yet effective machine learning algorithms for molecular properties.
Affiliation(s)
- Jakub K Sowa
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
| | - Peter J Rossky
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
8
Shaban Tameh M, Coropceanu V, Purcell TAR, Brédas JL. Prediction of the Infrared Absorbance Intensities and Frequencies of Hydrocarbons: A Message Passing Neural Network Approach. J Phys Chem A 2024; 128:9695-9706. [PMID: 39466724 DOI: 10.1021/acs.jpca.4c06745] [Indexed: 10/30/2024]
Abstract
Accurately and efficiently predicting the infrared (IR) spectra of a molecule can provide insights into the structure-properties relationships of molecular species, which has led to a proliferation of machine learning tools designed for this purpose. However, earlier studies have focused primarily on obtaining normalized IR spectra, which limits their potential for a comprehensive analysis of molecular behavior in the IR range. For instance, to fully understand and predict the optical properties, such as the transparency characteristics, it is necessary to predict the molar absorptivity IR spectra instead. Here, we propose a graph-based communicative message passing neural network algorithm that can predict both the peak positions and absolute intensities corresponding to density functional theory calculated molar absorptivities in the IR domain. By modifying existing spectral loss functions, we show that our method is able to predict with DFT-accuracy level the IR molar absorptivities of a series of hydrocarbons containing up to ten carbon atoms and apply the model to a set of larger molecules. We also compare the predicted spectra with those generated by the direct message passing neural network. The results suggest that both algorithms demonstrate similar predictive capabilities for hydrocarbons, indicating that either model could be effectively used in future research on spectral prediction for such systems.
Affiliation(s)
- Maliheh Shaban Tameh
- Department of Chemistry and Biochemistry, The University of Arizona, Tucson, Arizona 85721-0041, United States
| | - Veaceslav Coropceanu
- Department of Chemistry and Biochemistry, The University of Arizona, Tucson, Arizona 85721-0041, United States
| | - Thomas A R Purcell
- Department of Chemistry and Biochemistry, The University of Arizona, Tucson, Arizona 85721-0041, United States
| | - Jean-Luc Brédas
- Department of Chemistry and Biochemistry, The University of Arizona, Tucson, Arizona 85721-0041, United States
9
Chen W, Louaas D, Brigiano FS, Pezzotti S, Gaigeot MP. A simplified method for theoretical sum frequency generation spectroscopy calculation and interpretation: The "pop model". J Chem Phys 2024; 161:144115. [PMID: 39392142 DOI: 10.1063/5.0231540] [Received: 07/31/2024] [Accepted: 09/21/2024] [Indexed: 10/12/2024]
Abstract
Existing methods to compute theoretical spectra are restricted to the use of time-correlation functions evaluated from accurate atomistic molecular dynamics simulations, often at the ab initio level. The molecular interpretation of the computed spectra requires additional steps to deconvolve the spectroscopic contributions from local water and surface structural populations at the interface. The lack of a standard procedure to do this often hampers rationalization. To overcome these challenges, we rewrite the equations for spectra calculation into a sum of partial contributions from interfacial populations, weighted by their abundance at the interface. We show that SFG signatures from each population can be parameterized into a minimum dataset of reference partial spectra. Accurate spectra can then be predicted by just evaluating the statistics of interfacial populations, which can be done even with force field simulations as well as with analytic models. This approach broadens the range of simulation techniques from which theoretical spectra can be calculated, opening toward non-atomistic and Monte Carlo simulation approaches. Most notably, it allows constructing accurate theoretical spectra for interfacial conditions that cannot even be simulated, as we demonstrate for the pH-dependent SFG spectra of silica/water interfaces.
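The central step described in this abstract is a sum of reference partial spectra weighted by the abundance of each interfacial population. A minimal sketch of that weighted sum follows. The function name, the dictionary layout, and the population labels used in the test data are hypothetical, and the reference partial spectra themselves would have to come from the parametrization the authors describe.

```python
def composite_spectrum(partial_spectra, abundances):
    """Weighted sum of partial spectra, one per interfacial population.

    partial_spectra: dict mapping population name -> list of spectral
                     values sampled on a shared frequency grid.
    abundances:      dict mapping population name -> weight at the interface.
    """
    if set(partial_spectra) != set(abundances):
        raise ValueError("populations and abundances must have the same keys")
    lengths = {len(values) for values in partial_spectra.values()}
    if len(lengths) != 1:
        raise ValueError("all partial spectra must share one frequency grid")
    total = [0.0] * lengths.pop()
    for name, values in partial_spectra.items():
        weight = abundances[name]
        for i, value in enumerate(values):
            total[i] += weight * value
    return total
```

Because only the abundances change between interfacial conditions, the same reference spectra can be reused with population statistics from a cheaper simulation or an analytic model, which is the point the abstract makes.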
Affiliation(s)
- Wanlin Chen
- Université Paris-Saclay, University Evry, CY Cergy Paris Université, CNRS, LAMBE UMR8587, 91025 Evry-Courcouronnes, France
- Department of Physical Chemistry II, Ruhr University Bochum, D-44801 Bochum, Germany
| | - Dorian Louaas
- Université Paris-Saclay, University Evry, CY Cergy Paris Université, CNRS, LAMBE UMR8587, 91025 Evry-Courcouronnes, France
| | - Flavio Siro Brigiano
- Laboratoire de Chimie Théorique, Sorbonne Université, UMR 7616 CNRS, 4 Place Jussieu, 75005 Paris, France
| | - Simone Pezzotti
- PASTEUR, Département de Chimie, Ecole Normale Supérieure, PSL University, Sorbonne University, CNRS, 75005 Paris, France
| | - Marie-Pierre Gaigeot
- Université Paris-Saclay, University Evry, CY Cergy Paris Université, CNRS, LAMBE UMR8587, 91025 Evry-Courcouronnes, France
- Institut Universitaire de France (IUF), 75005 Paris, France
10
Yang Z, Shi A, Zhang R, Ji Z, Li J, Lyu J, Qian J, Chen T, Wang X, You F, Xie J. When Metal Nanoclusters Meet Smart Synthesis. ACS Nano 2024; 18:27138-27166. [PMID: 39316700 DOI: 10.1021/acsnano.4c09597] [Indexed: 09/26/2024]
Abstract
Atomically precise metal nanoclusters (MNCs) represent a fascinating class of ultrasmall nanoparticles with molecule-like properties, bridging conventional metal-ligand complexes and nanocrystals. Despite their potential for various applications, synthesis challenges such as a precise understanding of varied synthetic parameters and property-driven synthesis persist, hindering their full exploitation and wider application. Incorporating smart synthesis methodologies, including a closed-loop framework of automation, data interpretation, and feedback from AI, offers promising solutions to address these challenges. In this perspective, we summarize the closed-loop smart synthesis that has been demonstrated in various nanomaterials and explore the research frontiers of smart synthesis for MNCs. We also discuss the inherent challenges and opportunities of smart synthesis for MNCs, aiming to provide insights and directions for future advancements in this emerging field of AI for Science. The integration of deep learning algorithms stands to substantially enrich research in smart synthesis by offering enhanced predictive capabilities, optimization strategies, and control mechanisms, thereby extending the potential of MNC synthesis.
Affiliation(s)
- Zhucheng Yang
- Joint School of National University of Singapore and Tianjin University, International Campus of Tianjin University, Fuzhou 350207, P. R. China
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117585, Singapore
| | - Anye Shi
- Systems Engineering, College of Engineering, Cornell University, Ithaca, New York 14853, United States
| | - Ruixuan Zhang
- Joint School of National University of Singapore and Tianjin University, International Campus of Tianjin University, Fuzhou 350207, P. R. China
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117585, Singapore
| | - Zuowei Ji
- School of Humanities and Social Sciences, The Chinese University of Hong Kong, Shenzhen, Shenzhen 518172, P. R. China
| | - Jiali Li
- Department of Chemistry, National University of Singapore, Singapore 117543, Singapore
| | - Jingkuan Lyu
- Joint School of National University of Singapore and Tianjin University, International Campus of Tianjin University, Fuzhou 350207, P. R. China
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117585, Singapore
| | - Jing Qian
- Joint School of National University of Singapore and Tianjin University, International Campus of Tianjin University, Fuzhou 350207, P. R. China
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117585, Singapore
| | - Tiankai Chen
- School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, Shenzhen 518172, P. R. China
| | - Xiaonan Wang
- Department of Chemical Engineering, Tsinghua University, Beijing 100084, P. R. China
| | - Fengqi You
- Systems Engineering, College of Engineering, Cornell University, Ithaca, New York 14853, United States
- Smith School of Chemical and Biomolecular Engineering, Cornell University, Ithaca, New York 14853, United States
- Cornell University AI for Science Institute (CUAISci), Cornell University, Ithaca, New York 14853, United States
| | - Jianping Xie
- Joint School of National University of Singapore and Tianjin University, International Campus of Tianjin University, Fuzhou 350207, P. R. China
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117585, Singapore
11
Liu C, Zou R, Mo F. Infrared Spectra Prediction for Functional Group Region Utilizing a Machine Learning Approach with Structural Neighboring Mechanism. Anal Chem 2024; 96:15550-15562. [PMID: 39298179 DOI: 10.1021/acs.analchem.4c01972] [Indexed: 10/02/2024]
Abstract
Infrared (IR) spectroscopy is a pivotal technique in chemical research for elucidating molecular structures and dynamics through vibrational and rotational transitions. However, the intricate molecular fingerprints characterized by unique vibrational and rotational patterns present substantial analytical challenges. Here, we present a machine learning approach employing a structural neighboring mechanism tailored to enhance the prediction and interpretation of infrared spectra. Our model distinguishes itself by honing in on chemical information proximal to functional groups, thereby significantly bolstering the accuracy, robustness, and interpretability of spectral predictions. This method not only demystifies the correlations between infrared spectral features and molecular structures but also offers a scalable and efficient paradigm for dissecting complex molecular interactions.
Affiliation(s)
- Chengchun Liu
- School of Materials Science and Engineering, Peking University, Beijing 100871, China
| | - Ruqiang Zou
- School of Materials Science and Engineering, Peking University, Beijing 100871, China
- AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, Shenzhen 518055, China
- School of Advanced Materials, Peking University Shenzhen Graduate School, Shenzhen 518055, China
| | - Fanyang Mo
- School of Materials Science and Engineering, Peking University, Beijing 100871, China
- AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, Shenzhen 518055, China
- School of Advanced Materials, Peking University Shenzhen Graduate School, Shenzhen 518055, China
- Guangdong Provincial Key Laboratory of Nano-Micro Materials Research, Peking University Shenzhen Graduate School, Shenzhen 518055, China
12
Hou YF, Zhang L, Zhang Q, Ge F, Dral PO. Physics-Informed Active Learning for Accelerating Quantum Chemical Simulations. J Chem Theory Comput 2024. [PMID: 39264419 DOI: 10.1021/acs.jctc.4c00821] [Indexed: 09/13/2024]
Abstract
Quantum chemical simulations can be greatly accelerated by constructing machine learning potentials, which is often done using active learning (AL). The usefulness of the constructed potentials is often limited by the high effort required and their insufficient robustness in the simulations. Here, we introduce the end-to-end AL for constructing robust data-efficient potentials with affordable investment of time and resources and minimum human interference. Our AL protocol is based on the physics-informed sampling of training points, automatic selection of initial data, uncertainty quantification, and convergence monitoring. The versatility of this protocol is shown in our implementation of quasi-classical molecular dynamics for simulating vibrational spectra, conformer search of a key biochemical molecule, and time-resolved mechanism of the Diels-Alder reaction. These investigations took us days instead of weeks of pure quantum chemical calculations on a high-performance computing cluster.
Affiliation(s)
- Yi-Fan Hou
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
| | - Lina Zhang
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
| | - Quanhao Zhang
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
| | - Fuchun Ge
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
| | - Pavlo O Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Institute of Physics, Faculty of Physics, Astronomy, and Informatics, Nicolaus Copernicus University in Toruń, ul. Grudziądzka 5, Toruń 87-100, Poland
13
Mausenberger S, Müller C, Tkatchenko A, Marquetand P, González L, Westermayr J. SpaiNN: equivariant message passing for excited-state nonadiabatic molecular dynamics. Chem Sci 2024:d4sc04164j. [PMID: 39282652 PMCID: PMC11391904 DOI: 10.1039/d4sc04164j] [Received: 06/24/2024] [Accepted: 09/01/2024] [Indexed: 09/19/2024]
Abstract
Excited-state molecular dynamics simulations are crucial for understanding processes like photosynthesis, vision, and radiation damage. However, the computational complexity of quantum chemical calculations restricts their scope. Machine learning offers a solution by delivering high-accuracy properties at lower computational costs. We present SpaiNN, an open-source Python software for ML-driven surface hopping nonadiabatic molecular dynamics simulations. SpaiNN combines the invariant and equivariant neural network architectures of SchNetPack with SHARC for surface hopping dynamics. Its modular design allows users to implement and adapt modules easily. We compare rotationally-invariant and equivariant representations in fitting potential energy surfaces of multiple electronic states and properties arising from the interaction of two electronic states. Simulations of the methyleneimmonium cation and various alkenes demonstrate the superior performance of equivariant SpaiNN models, improving accuracy, generalization, and efficiency in both training and inference.
Affiliation(s)
- Sascha Mausenberger
  - Faculty of Chemistry, Institute of Theoretical Chemistry, University of Vienna, Währinger Str. 17, 1090 Vienna, Austria
  - Vienna Doctoral School in Chemistry (DosChem), University of Vienna, Währinger Straße 42, 1090 Vienna, Austria
- Carolin Müller
  - Department Chemistry and Pharmacy, Computer-Chemistry-Center, Friedrich-Alexander-Universität Erlangen-Nürnberg, Nägelsbachstraße 25, 91052 Erlangen, Germany
  - Department of Physics and Materials Science, University of Luxembourg, 162 A, Avenue de la Faïencerie, L-1511 Luxembourg, Luxembourg
- Alexandre Tkatchenko
  - Department of Physics and Materials Science, University of Luxembourg, 162 A, Avenue de la Faïencerie, L-1511 Luxembourg, Luxembourg
- Philipp Marquetand
  - Faculty of Chemistry, Institute of Theoretical Chemistry, University of Vienna, Währinger Str. 17, 1090 Vienna, Austria
- Leticia González
  - Faculty of Chemistry, Institute of Theoretical Chemistry, University of Vienna, Währinger Str. 17, 1090 Vienna, Austria
- Julia Westermayr
  - Wilhelm Ostwald Institute for Physical and Theoretical Chemistry, Leipzig University, Linnéstraße 2, 04103 Leipzig, Germany
  - Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI) Dresden/Leipzig, Germany
14
Schmiedmayer B, Kresse G. Derivative learning of tensorial quantities-Predicting finite temperature infrared spectra from first principles. J Chem Phys 2024; 161:084703. [PMID: 39171710 DOI: 10.1063/5.0217243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Accepted: 08/05/2024] [Indexed: 08/23/2024] Open
Abstract
We develop a strategy that integrates machine learning and first-principles calculations to achieve technically accurate predictions of infrared spectra. In particular, the methodology allows one to predict infrared spectra for complex systems at finite temperatures. The method's effectiveness is demonstrated in challenging scenarios, such as the analysis of water and the organic-inorganic halide perovskite MAPbI₃, where our results consistently align with experimental data. A distinctive feature of the methodology is the incorporation of derivative learning, which proves indispensable for obtaining accurate polarization data in bulk materials and facilitates the training of a machine learning surrogate model of the polarization adapted to rotational and translational symmetries. We achieve polarization prediction accuracies of about 1% for the water dimer by training only on the predicted Born effective charges.
Affiliation(s)
- Bernhard Schmiedmayer
  - Faculty of Physics and Center for Computational Materials Science, University of Vienna, Kolingasse 14-16, A-1090 Vienna, Austria
- Georg Kresse
  - Faculty of Physics and Center for Computational Materials Science, University of Vienna, Kolingasse 14-16, A-1090 Vienna, Austria
  - VASP Software GmbH, Sensengasse 8, A-1090 Vienna, Austria
15
Kebabsa A, Maurel F, Brémond É. Boosting the Modeling of Infrared and Raman Spectra of Bulk Phase Chromophores with Machine Learning. J Chem Theory Comput 2024. [PMID: 39145741 DOI: 10.1021/acs.jctc.4c00630] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/16/2024]
Abstract
In the field of vibrational spectroscopy simulation, hybrid approximations to Kohn-Sham density-functional theory (KS-DFT) are often considered computationally prohibitive due to the significant effort required to evaluate the exchange-correlation potential in planewave codes. In this Letter, we show that by leveraging the porting of KS-DFT on GPU and incorporating machine-learning techniques, simulating IR and Raman spectra of real-life chromophores in bulk aqueous solution becomes a routine application at this level of theory.
Affiliation(s)
- Abir Kebabsa
  - Université Paris Cité, ITODYS, CNRS, F-75013 Paris, France
- Éric Brémond
  - Université Paris Cité, ITODYS, CNRS, F-75013 Paris, France
16
Paul A, Rubenstein M, Ruffino A, Masiuk S, Spanier JE, Grinberg I. Accuracy and limitations of the bond polarizability model in modeling of Raman scattering from molecular dynamics simulations. J Chem Phys 2024; 161:064305. [PMID: 39132793 DOI: 10.1063/5.0217227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Accepted: 07/22/2024] [Indexed: 08/13/2024] Open
Abstract
Calculation of Raman scattering from molecular dynamics (MD) simulations requires accurate modeling of the evolution of the electronic polarizability of the system along its MD trajectory. For large systems, this necessitates the use of atomistic models to represent the dependence of electronic polarizability on atomic coordinates. The bond polarizability model (BPM) is the simplest such model and has been used for modeling the Raman spectra of molecular systems but has not been applied to solid-state systems. Here, we systematically investigate the accuracy and limitations of the BPM parameterized from the density functional theory results for a series of simple molecules, such as CO₂, SO₂, H₂S, H₂O, NH₃, and CH₄; the more complex CH₂O, CH₃OH, CH₃CH₂OH, and thiophene molecules; and the BaTiO₃ and CsPbBr₃ perovskite solids. We find that BPM can reliably reproduce the overall features of the Raman spectra, such as shifts of peak positions. However, with the exception of highly symmetric systems, the assumption of non-interacting bonds limits the quantitative accuracy of the BPM; this assumption also leads to qualitatively inaccurate polarizability evolution and Raman spectra for systems where large deviations from the ground state structure are present.
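For orientation, the bond polarizability model is commonly written as a sum of independent, cylindrically symmetric bond contributions. The expression below is the standard textbook form and is not quoted from this abstract; the symbols (parallel and perpendicular bond polarizabilities $\alpha_\parallel^{b}$, $\alpha_\perp^{b}$ and the bond unit vector $\hat{R}^{b}$) are notation assumed here.

```latex
\alpha_{ij} \;=\; \sum_{b \,\in\, \text{bonds}}
\left[
  \frac{\alpha_\parallel^{b} + 2\,\alpha_\perp^{b}}{3}\,\delta_{ij}
  \;+\;
  \left(\alpha_\parallel^{b} - \alpha_\perp^{b}\right)
  \left(\hat{R}^{b}_{i}\,\hat{R}^{b}_{j} - \frac{\delta_{ij}}{3}\right)
\right]
```

In this form $\alpha_\parallel^{b}$ and $\alpha_\perp^{b}$ depend only on the length of bond $b$, so there are no cross terms between bonds; that is the non-interacting-bond assumption whose limits the abstract describes.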
Affiliation(s)
- Atanu Paul
  - Department of Chemistry, Bar-Ilan University, Ramat Gan 5290002, Israel
- Maya Rubenstein
  - Department of Chemistry, Bar-Ilan University, Ramat Gan 5290002, Israel
- Anthony Ruffino
  - Department of Physics, Drexel University, Philadelphia, Pennsylvania 19104, USA
- Stefan Masiuk
  - Department of Mechanical Engineering and Mechanics, Drexel University, Philadelphia, Pennsylvania 19104, USA
- Jonathan E Spanier
  - Department of Physics, Drexel University, Philadelphia, Pennsylvania 19104, USA
  - Department of Mechanical Engineering and Mechanics, Drexel University, Philadelphia, Pennsylvania 19104, USA
- Ilya Grinberg
  - Department of Chemistry, Bar-Ilan University, Ramat Gan 5290002, Israel
17
Williams CD, Kalayan J, Burton NA, Bryce RA. Stable and accurate atomistic simulations of flexible molecules using conformationally generalisable machine learned potentials. Chem Sci 2024; 15:12780-12795. [PMID: 39148799 PMCID: PMC11323334 DOI: 10.1039/d4sc01109k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Accepted: 07/07/2024] [Indexed: 08/17/2024] Open
Abstract
Computational simulation methods based on machine learned potentials (MLPs) promise to revolutionise shape prediction of flexible molecules in solution, but their widespread adoption has been limited by the way in which training data is generated. Here, we present an approach which allows the key conformational degrees of freedom to be properly represented in reference molecular datasets. MLPs trained on these datasets using a global descriptor scheme are generalisable in conformational space, providing quantum chemical accuracy for all conformers. These MLPs are capable of propagating long, stable molecular dynamics trajectories, an attribute that has remained a challenge. We deploy the MLPs in obtaining converged conformational free energy surfaces for flexible molecules via well-tempered metadynamics simulations; this approach provides a hitherto inaccessible route to accurately computing the structural, dynamical and thermodynamical properties of a wide variety of flexible molecular systems. It is further demonstrated that MLPs must be trained on reference datasets with complete coverage of conformational space, including in barrier regions, to achieve stable molecular dynamics trajectories.
Affiliation(s)
- Christopher D Williams
  - Division of Pharmacy and Optometry, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Oxford Road, Manchester M13 9PL, UK
- Jas Kalayan
  - Science and Technologies Facilities Council (STFC), Daresbury Laboratory, Keckwick Lane, Daresbury, Warrington WA4 4AD, UK
- Neil A Burton
  - Department of Chemistry, School of Natural Sciences, Faculty of Science and Engineering, The University of Manchester, Oxford Road, Manchester M13 9PL, UK
- Richard A Bryce
  - Division of Pharmacy and Optometry, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Oxford Road, Manchester M13 9PL, UK
18
Frank JT, Unke OT, Müller KR, Chmiela S. A Euclidean transformer for fast and stable machine learned force fields. Nat Commun 2024; 15:6539. [PMID: 39107296 PMCID: PMC11303804 DOI: 10.1038/s41467-024-50620-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/25/2023] [Accepted: 07/10/2024] [Indexed: 08/10/2024] Open
Abstract
Recent years have seen vast progress in the development of machine learned force fields (MLFFs) based on ab-initio reference calculations. Despite achieving low test errors, the reliability of MLFFs in molecular dynamics (MD) simulations is facing growing scrutiny due to concerns about instability over extended simulation timescales. Our findings suggest a potential connection between robustness to cumulative inaccuracies and the use of equivariant representations in MLFFs, but the computational cost associated with these representations can limit this advantage in practice. To address this, we propose a transformer architecture called SO3KRATES that combines sparse equivariant representations (Euclidean variables) with a self-attention mechanism that separates invariant and equivariant information, eliminating the need for expensive tensor products. SO3KRATES achieves a unique combination of accuracy, stability, and speed that enables insightful analysis of quantum properties of matter on extended time and system size scales. To showcase this capability, we generate stable MD trajectories for flexible peptides and supra-molecular structures with hundreds of atoms. Furthermore, we investigate the PES topology for medium-sized chainlike molecules (e.g., small peptides) by exploring thousands of minima. Remarkably, SO3KRATES demonstrates the ability to strike a balance between the conflicting demands of stability and the emergence of new minimum-energy conformations beyond the training data, which is crucial for realistic exploration tasks in the field of biochemistry.
Affiliation(s)
- J Thorben Frank
  - Machine Learning Group, TU Berlin, Berlin, Germany
  - BIFOLD, Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
- Klaus-Robert Müller
  - Machine Learning Group, TU Berlin, Berlin, Germany
  - BIFOLD, Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
  - Google DeepMind, Berlin, Germany
  - Department of Artificial Intelligence, Korea University, Seoul, Korea
  - Max Planck Institut für Informatik, Saarbrücken, Germany
- Stefan Chmiela
  - Machine Learning Group, TU Berlin, Berlin, Germany
  - BIFOLD, Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
19
Bi S, Knijff L, Lian X, van Hees A, Zhang C, Salanne M. Modeling of Nanomaterials for Supercapacitors: Beyond Carbon Electrodes. ACS Nano 2024; 18:19931-19949. [PMID: 39053903 PMCID: PMC11308780 DOI: 10.1021/acsnano.4c01787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 04/08/2024] [Accepted: 04/23/2024] [Indexed: 07/27/2024]
Abstract
Capacitive storage devices allow for fast charge and discharge cycles, making them the perfect complements to batteries for high power applications. Many materials display interesting capacitive properties when they are put in contact with ionic solutions despite their very different structures and (surface) reactivity. Among them, nanocarbons are the most important for practical applications, but many nanomaterials have recently emerged, such as conductive metal-organic frameworks, 2D materials, and a wide variety of metal oxides. These heterogeneous and complex electrode materials are difficult to model with conventional approaches. However, the development of computational methods, the incorporation of machine learning techniques, and the increasing power in high performance computing now allow us to tackle these types of systems. In this Review, we summarize the current efforts in this direction. We show that depending on the nature of the materials and of the charging mechanisms, different methods, or combinations of them, can provide desirable atomic-scale insight on the interactions at play. We mainly focus on two important aspects: (i) the study of ion adsorption in complex nanoporous materials, which require the extension of constant potential molecular dynamics to multicomponent systems, and (ii) the characterization of Faradaic processes in pseudocapacitors, that involves the use of electronic structure-based methods. We also discuss how recently developed simulation methods will allow bridges to be made between double-layer capacitors and pseudocapacitors for future high power electricity storage devices.
Affiliation(s)
- Sheng Bi
  - Physicochimie des Électrolytes et Nanosystèmes Interfaciaux, Sorbonne Université, CNRS, F-75005 Paris, France
  - Réseau sur le Stockage Electrochimique de l’Energie (RS2E), FR CNRS 3459, 80039 Amiens Cedex, France
- Lisanne Knijff
  - Department of Chemistry - Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, BOX 538, Uppsala 75121, Sweden
- Xiliang Lian
  - Physicochimie des Électrolytes et Nanosystèmes Interfaciaux, Sorbonne Université, CNRS, F-75005 Paris, France
  - Réseau sur le Stockage Electrochimique de l’Energie (RS2E), FR CNRS 3459, 80039 Amiens Cedex, France
- Alicia van Hees
  - Department of Chemistry - Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, BOX 538, Uppsala 75121, Sweden
- Chao Zhang
  - Department of Chemistry - Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, BOX 538, Uppsala 75121, Sweden
  - Wallenberg Initiative Materials Science for Sustainability, Uppsala University, 75121 Uppsala, Sweden
- Mathieu Salanne
  - Réseau sur le Stockage Electrochimique de l’Energie (RS2E), FR CNRS 3459, 80039 Amiens Cedex, France
  - Institut Universitaire de France (IUF), 75231 Paris, France
20
Li S, Xie BB, Yin BW, Liu L, Shen L, Fang WH. Construction of Highly Accurate Machine Learning Potential Energy Surfaces for Excited-State Dynamics Simulations Based on Low-Level Data Sets. J Phys Chem A 2024; 128:5516-5524. [PMID: 38954640 DOI: 10.1021/acs.jpca.4c02028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2024]
Abstract
Machine learning is capable of effectively predicting the potential energies of molecules in the presence of high-quality data sets. Its application in the construction of ground- and excited-state potential energy surfaces is attractive to accelerate nonadiabatic molecular dynamics simulations of photochemical reactions. Because of the huge computational cost of excited-state electronic structure calculations, the construction of a high-quality data set becomes a bottleneck. In the present work, we first built two data sets. One was obtained from surface hopping dynamics simulations at the semiempirical OM2/MRCI level. Another was extracted from the dynamics trajectories at the CASSCF level, which was reported previously. The ground- and excited-state potential energy surfaces of ethylene-bridged azobenzene at the CASSCF computational level were constructed based on the former low-level data set. Although non-neural network machine learning methods can achieve good or modest performance during the training process, only neural network models provide reliable predictions on the latter external test data set. The BPNN and SchNet combined with the Δ-ML scheme and the force term in the loss functions are recommended for dynamics simulations. Then, we performed excited-state dynamics simulations of the photoisomerization of ethylene-bridged azobenzene on machine learning potential energy surfaces. Compared with the lifetimes of the first excited state (S1) estimated at different computational levels, our results on the E isomer are in good agreement with the high-level estimation. However, the overestimation of the Z isomer is unimproved. It suggests that smaller errors during the training process do not necessarily translate to more accurate predictions on high-level potential energies or better performance on nonadiabatic dynamics simulations, at least in the present case.
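The Δ-ML scheme named in this abstract can be illustrated with a toy sketch: a model is fitted to the difference between high-level and low-level energies, and a prediction adds that learned correction to a cheap low-level energy. Everything below is hypothetical and not taken from the paper: the one-dimensional coordinate, the two analytic "levels of theory", and the polynomial fit standing in for a neural network such as BPNN or SchNet.

```python
import numpy as np

def low_level(x):
    # Hypothetical cheap method: a purely harmonic potential.
    return 0.5 * x**2

def high_level(x):
    # Hypothetical expensive reference: harmonic plus anharmonic terms.
    return 0.5 * x**2 + 0.1 * x**3 + 0.02 * x**4

# Train on the *difference* between the two levels, not on the total energy.
x_train = np.linspace(-1.0, 1.0, 11)
delta_train = high_level(x_train) - low_level(x_train)
coeffs = np.polyfit(x_train, delta_train, deg=4)  # stand-in for the ML model

def delta_ml_predict(x):
    # Delta-ML prediction: low-level energy plus the learned correction.
    return low_level(x) + np.polyval(coeffs, x)

x_test = np.linspace(-0.9, 0.9, 7)
max_err = np.max(np.abs(delta_ml_predict(x_test) - high_level(x_test)))
```

The appeal of the scheme is that the correction is usually smoother and smaller than the total energy, so it can be learned from fewer high-level points; in this toy case the correction is itself a quartic, so the fit recovers it essentially exactly.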
Affiliation(s)
- Shuai Li
  - Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, P. R. China
- Bin-Bin Xie
  - Hangzhou Institute of Advanced Studies, Zhejiang Normal University, Hangzhou 311231, Zhejiang, P. R. China
- Bo-Wen Yin
  - Hangzhou Institute of Advanced Studies, Zhejiang Normal University, Hangzhou 311231, Zhejiang, P. R. China
- Lihong Liu
  - Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, P. R. China
- Lin Shen
  - Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, P. R. China
  - Yantai-Jingshi Institute of Material Genome Engineering, Yantai 265505, Shandong, P. R. China
- Wei-Hai Fang
  - Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, P. R. China
  - Shandong Laboratory of Yantai Advanced Materials and Green Manufacturing, Yantai 264006, Shandong, P. R. China
21
Chen Y, Pios SV, Gelin MF, Chen L. Accelerating Molecular Vibrational Spectra Simulations with a Physically Informed Deep Learning Model. J Chem Theory Comput 2024; 20:4703-4710. [PMID: 38825857 DOI: 10.1021/acs.jctc.4c00173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
In recent years, machine learning (ML) surrogate models have emerged as an indispensable tool to accelerate simulations of physical and chemical processes. However, there is still a lack of ML models that can accurately predict molecular vibrational spectra. Here, we present a highly efficient multitask ML surrogate model termed Vibrational Spectra Neural Network (VSpecNN), to accurately calculate infrared (IR) and Raman spectra based on dipole moments and polarizabilities obtained on-the-fly via ML-enhanced molecular dynamics simulations. The methodology is applied to pyrazine, a prototypical polyatomic chromophore. The VSpecNN-predicted energies are well within the chemical accuracy (1 kcal/mol), and the errors for VSpecNN-predicted forces are only half of those obtained from a popular high-performance ML model. Compared to the ab initio reference, the VSpecNN-predicted frequencies of IR and Raman spectra differ only by less than 5.87 cm⁻¹, and the intensities of IR spectra and the depolarization ratios of Raman spectra are well reproduced. The VSpecNN model developed in this work highlights the importance of constructing highly accurate neural network potentials for predicting molecular vibrational spectra.
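One common way to turn a dipole-moment time series from molecular dynamics into an IR line shape is to Fourier transform the dipole autocorrelation function. The abstract does not say that VSpecNN's workflow does exactly this, so the sketch below is a generic illustration; the function name `ir_lineshape`, the toy trajectory, and its parameters are assumptions, and frequency-dependent prefactors and quantum corrections are left out.

```python
import numpy as np

def ir_lineshape(dipole, dt):
    """Unnormalized IR line shape from a dipole trajectory.

    dipole: array of shape (n_steps, 3); dt: time step between frames.
    Returns (frequencies in 1/time units, intensity in arbitrary units).
    """
    mu = dipole - dipole.mean(axis=0)      # remove the static dipole
    n = mu.shape[0]
    window = np.hanning(n)                 # taper to reduce spectral leakage
    # Wiener-Khinchin: the power spectrum of mu(t) is the Fourier transform
    # of its autocorrelation, so no explicit correlation loop is needed.
    power = np.zeros(n // 2 + 1)
    for axis in range(3):
        power += np.abs(np.fft.rfft(mu[:, axis] * window)) ** 2
    freqs = np.fft.rfftfreq(n, d=dt)
    return freqs, power

# Toy check: a dipole oscillating along x at a known frequency f0.
dt, n_steps, f0 = 0.5, 4096, 0.05
t = np.arange(n_steps) * dt
dipole = np.zeros((n_steps, 3))
dipole[:, 0] = np.cos(2 * np.pi * f0 * t)
freqs, power = ir_lineshape(dipole, dt)
peak = freqs[np.argmax(power)]             # lands within one frequency bin of f0
```

A Raman line shape can be sketched the same way by replacing the dipole components with polarizability-tensor components.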
Affiliation(s)
- Maxim F Gelin
  - School of Science, Hangzhou Dianzi University, Hangzhou 310018, China
22
Lei YK, Yagi K, Sugita Y. Learning QM/MM potential using equivariant multiscale model. J Chem Phys 2024; 160:214109. [PMID: 38828815 DOI: 10.1063/5.0205123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2024] [Accepted: 05/09/2024] [Indexed: 06/05/2024] Open
Abstract
The machine learning (ML) method emerges as an efficient and precise surrogate model for high-level electronic structure theory. Its application has been limited to closed chemical systems without considering external potentials from the surrounding environment. To address this limitation and incorporate the influence of external potentials, polarization effects, and long-range interactions between a chemical system and its environment, the first two terms of the Taylor expansion of an electrostatic operator have been used as extra input to the existing ML model to represent the electrostatic environments. However, high-order electrostatic interaction is often essential to account for external potentials from the environment. The existing models based only on invariant features cannot capture significant distribution patterns of the external potentials. Here, we propose a novel ML model that includes high-order terms of the Taylor expansion of an electrostatic operator and uses an equivariant model, which can generate a high-order tensor covariant with rotations as a base model. Therefore, we can use the multipole-expansion equation to derive a useful representation by accounting for polarization and intermolecular interaction. Moreover, to deal with long-range interactions, we follow the same strategy adopted to derive long-range interactions between a target system and its environment media. Our model achieves higher prediction accuracy and transferability among various environment media with these modifications.
Affiliation(s)
- Yao-Kun Lei
  - Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
  - Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
  - RIKEN Interdisciplinary Theoretical and Mathematical Sciences Program (iTHEMS), Wako, Saitama 351-0198, Japan
- Kiyoshi Yagi
  - Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
  - Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
- Yuji Sugita
  - Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
  - Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
  - RIKEN Interdisciplinary Theoretical and Mathematical Sciences Program (iTHEMS), Wako, Saitama 351-0198, Japan
  - Laboratory for Biomolecular Function Simulation, RIKEN Center for Biosystems Dynamics Research, Kobe, Hyogo 650-0047, Japan
23
Shanks BL, Sullivan HW, Shazed AR, Hoepfner MP. Accelerated Bayesian Inference for Molecular Simulations using Local Gaussian Process Surrogate Models. J Chem Theory Comput 2024; 20:3798-3808. [PMID: 38551198 DOI: 10.1021/acs.jctc.3c01358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/19/2024]
Abstract
While Bayesian inference is the gold standard for uncertainty quantification and propagation, its use within physical chemistry encounters formidable computational barriers. These bottlenecks are magnified for modeling data with many independent variables, such as X-ray/neutron scattering patterns and electromagnetic spectra. To address this challenge, we employ local Gaussian process (LGP) surrogate models to accelerate Bayesian optimization over these complex thermophysical properties. The time-complexity of the LGPs scales linearly in the number of independent variables, in stark contrast to the computationally expensive cubic scaling of conventional Gaussian processes. To illustrate the method, we trained a LGP surrogate model on the radial distribution function of liquid neon and observed a 1,760,000-fold speed-up compared to molecular dynamics simulation, beating a conventional GP by three orders-of-magnitude. We conclude that LGPs are robust and efficient surrogate models poised to expand the application of Bayesian inference in molecular simulations to a broad spectrum of experimental data.
Affiliation(s)
- Brennon L Shanks
  - Department of Chemical Engineering, University of Utah, Salt Lake City, UT 84112-9202, United States
- Harry W Sullivan
  - Department of Chemical Engineering, University of Utah, Salt Lake City, UT 84112-9202, United States
- Abdur R Shazed
  - Department of Chemical Engineering, University of Utah, Salt Lake City, UT 84112-9202, United States
- Michael P Hoepfner
  - Department of Chemical Engineering, University of Utah, Salt Lake City, UT 84112-9202, United States
24
Ge F, Wang R, Qu C, Zheng P, Nandi A, Conte R, Houston PL, Bowman JM, Dral PO. Tell Machine Learning Potentials What They Are Needed For: Simulation-Oriented Training Exemplified for Glycine. J Phys Chem Lett 2024; 15:4451-4460. [PMID: 38626460 DOI: 10.1021/acs.jpclett.4c00746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2024]
Abstract
Machine learning potentials (MLPs) are widely applied as an efficient alternative way to represent potential energy surfaces (PESs) in many chemical simulations. The MLPs are often evaluated with the root-mean-square errors on the test set drawn from the same distribution as the training data. Here, we systematically investigate the relationship between such test errors and the simulation accuracy with MLPs on an example of a full-dimensional, global PES for the glycine amino acid. Our results show that the errors in the test set do not unambiguously reflect the MLP performance in different simulation tasks, such as relative conformer energies, barriers, vibrational levels, and zero-point vibrational energies. We also offer an easily accessible solution for improving the MLP quality in a simulation-oriented manner, yielding the most precise relative conformer energies and barriers. This solution also passed the stringent test by diffusion Monte Carlo simulations.
Affiliation(s)
- Fuchun Ge
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Ran Wang
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Chen Qu
  - Independent Researcher, Toronto, Ontario M9B0E3, Canada
- Peikun Zheng
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Apurba Nandi
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
  - Department of Physics and Materials Science, University of Luxembourg, Luxembourg City L-1511, Luxembourg
- Riccardo Conte
  - Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
- Paul L Houston
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States
- Joel M Bowman
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
- Pavlo O Dral
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
25
Xu N, Rosander P, Schäfer C, Lindgren E, Österbacka N, Fang M, Chen W, He Y, Fan Z, Erhart P. Tensorial Properties via the Neuroevolution Potential Framework: Fast Simulation of Infrared and Raman Spectra. J Chem Theory Comput 2024; 20:3273-3284. [PMID: 38572734 PMCID: PMC11044275 DOI: 10.1021/acs.jctc.3c01343] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 03/19/2024] [Accepted: 04/01/2024] [Indexed: 04/05/2024]
Abstract
Infrared and Raman spectroscopy are widely used for the characterization of gases, liquids, and solids, as the spectra contain a wealth of information concerning, in particular, the dynamics of these systems. Atomic scale simulations can be used to predict such spectra but are often severely limited due to high computational cost or the need for strong approximations that limit the application range and reliability. Here, we introduce a machine learning (ML) accelerated approach that addresses these shortcomings and provides a significant performance boost in terms of data and computational efficiency compared with earlier ML schemes. To this end, we generalize the neuroevolution potential approach to enable the prediction of rank one and two tensors to obtain the tensorial neuroevolution potential (TNEP) scheme. We apply the resulting framework to construct models for the dipole moment, polarizability, and susceptibility of molecules, liquids, and solids and show that our approach compares favorably with several ML models from the literature with respect to accuracy and computational efficiency. Finally, we demonstrate the application of the TNEP approach to the prediction of infrared and Raman spectra of liquid water, a molecule (PTAF⁻), and a prototypical perovskite with strong anharmonicity (BaZrO₃). The TNEP approach is implemented in the free and open source software package GPUMD, which makes this methodology readily available to the scientific community.
Affiliation(s)
- Nan Xu
- Institute of Zhejiang University-Quzhou, Quzhou 324000, P. R. China
- College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310058, P. R. China
| | - Petter Rosander
- Department of Physics, Chalmers University of Technology, SE-41296 Gothenburg, Sweden
| | - Christian Schäfer
- Department of Physics, Chalmers University of Technology, SE-41296 Gothenburg, Sweden
| | - Eric Lindgren
- Department of Physics, Chalmers University of Technology, SE-41296 Gothenburg, Sweden
| | - Nicklas Österbacka
- Department of Physics, Chalmers University of Technology, SE-41296 Gothenburg, Sweden
| | - Mandi Fang
- Institute of Zhejiang University-Quzhou, Quzhou 324000, P. R. China
- College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310058, P. R. China
| | - Wei Chen
- State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing 100190, P. R. China
| | - Yi He
- Institute of Zhejiang University-Quzhou, Quzhou 324000, P. R. China
- College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310058, P. R. China
- Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States
| | - Zheyong Fan
- College of Physical Science and Technology, Bohai University, Jinzhou 121013, P. R. China
| | - Paul Erhart
- Department of Physics, Chalmers University of Technology, SE-41296 Gothenburg, Sweden
| |
|
26
|
Lin L, Li C, Zhang T, Xia C, Bai Q, Jin L, Shen Y. An in silico scheme for optimizing the enzymatic acquisition of natural biologically active peptides based on machine learning and virtual digestion. Anal Chim Acta 2024; 1298:342419. [PMID: 38462343 DOI: 10.1016/j.aca.2024.342419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2023] [Revised: 12/23/2023] [Accepted: 02/26/2024] [Indexed: 03/12/2024]
Abstract
BACKGROUND Natural biologically active peptides (NBAPs), as potential natural active substances, have recently attracted increasing attention. Traditional proteolysis methods for obtaining effective NBAPs are cumbersome, especially because multiple proteases can be used, which hinders the exploration of available NBAPs. Although virtual digestion brings some convenience, the activity of the resulting peptides remains unclear, so it still does not give efficient access to NBAPs. An efficient and accurate strategy for acquiring NBAPs is therefore needed. RESULTS A new in silico scheme named SSA-LSTM-VD, which combines a sparrow search algorithm-long short-term memory (SSA-LSTM) deep learning model with virtual digestion, is presented to optimize the proteolytic acquisition of NBAPs. SSA-LSTM reached the highest Efficiency value (98.00 %) compared with traditional machine learning algorithms and the basic LSTM algorithm. SSA-LSTM was trained to predict the activity of peptides in the virtual digestion results of proteins, obtain the percentage of target active peptides, and select the appropriate protease for the actual experiment. As an application, SSA-LSTM was employed to predict the percentage of neuroprotective peptides in the virtual digestion results of walnut protein, and trypsin was found to give the highest value (85.29 %). The walnut protein was then digested with trypsin (WPTrH), and the peptide sequences obtained closely matched the theoretical neuroprotective peptides. More importantly, the neuroprotective effects of WPTrH were demonstrated in nerve damage mouse models. SIGNIFICANCE The proposed SSA-LSTM-VD scheme makes the acquisition of NBAPs efficient and accurate by combining deep learning with virtual digestion. The SSA-LSTM-VD based strategy holds promise for discovering and developing peptides with neuroprotective properties or other desired biological activities.
Affiliation(s)
- Like Lin
- Key Laboratory of Synthetic and Natural Functional Molecule of Ministry of Education, College of Chemistry and Materials Science, National Demonstration Center for Experimental Chemistry Education, Northwest University, Xi'an, Shaanxi, 710127, People's Republic of China
| | - Cong Li
- Key Laboratory of Synthetic and Natural Functional Molecule of Ministry of Education, College of Chemistry and Materials Science, National Demonstration Center for Experimental Chemistry Education, Northwest University, Xi'an, Shaanxi, 710127, People's Republic of China.
| | - Tianlong Zhang
- Key Laboratory of Synthetic and Natural Functional Molecule of Ministry of Education, College of Chemistry and Materials Science, National Demonstration Center for Experimental Chemistry Education, Northwest University, Xi'an, Shaanxi, 710127, People's Republic of China
| | - Chaoshuang Xia
- Center for Biomedical Mass Spectrometry, Boston University Chobanian and Avedisian School of Medicine, Boston, MA, 02118, United States
| | - Qiuhong Bai
- Key Laboratory of Synthetic and Natural Functional Molecule of Ministry of Education, College of Chemistry and Materials Science, National Demonstration Center for Experimental Chemistry Education, Northwest University, Xi'an, Shaanxi, 710127, People's Republic of China
| | - Lihua Jin
- Key Laboratory of Synthetic and Natural Functional Molecule of Ministry of Education, College of Chemistry and Materials Science, National Demonstration Center for Experimental Chemistry Education, Northwest University, Xi'an, Shaanxi, 710127, People's Republic of China
| | - Yehua Shen
- Key Laboratory of Synthetic and Natural Functional Molecule of Ministry of Education, College of Chemistry and Materials Science, National Demonstration Center for Experimental Chemistry Education, Northwest University, Xi'an, Shaanxi, 710127, People's Republic of China.
| |
|
27
|
Unke OT, Stöhr M, Ganscha S, Unterthiner T, Maennel H, Kashubin S, Ahlin D, Gastegger M, Medrano Sandonas L, Berryman JT, Tkatchenko A, Müller KR. Biomolecular dynamics with machine-learned quantum-mechanical force fields trained on diverse chemical fragments. Sci Adv 2024; 10:eadn4397. [PMID: 38579003 PMCID: PMC11809612 DOI: 10.1126/sciadv.adn4397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Accepted: 02/29/2024] [Indexed: 04/07/2024]
Abstract
The GEMS method enables molecular dynamics simulations of large heterogeneous systems at ab initio quality.
Affiliation(s)
- Oliver T. Unke
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- DFG Cluster of Excellence “Unifying Systems in Catalysis” (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
| | - Martin Stöhr
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Stefan Ganscha
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
| | - Thomas Unterthiner
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
| | - Hartmut Maennel
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
| | - Sergii Kashubin
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
| | - Daniel Ahlin
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
| | - Michael Gastegger
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- DFG Cluster of Excellence “Unifying Systems in Catalysis” (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
- BASLEARN — TU Berlin/BASF Joint Lab for Machine Learning, Technische Universität Berlin, 10587 Berlin, Germany
| | - Leonardo Medrano Sandonas
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Joshua T. Berryman
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Klaus-Robert Müller
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul 02841, Korea
- Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123 Saarbrücken, Germany
- BIFOLD — Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
| |
|
28
|
Dral PO. AI in computational chemistry through the lens of a decade-long journey. Chem Commun (Camb) 2024; 60:3240-3258. [PMID: 38444290 DOI: 10.1039/d4cc00010b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2024]
Abstract
This article gives a perspective on the progress of AI tools in computational chemistry through the lens of the author's decade-long contributions put in the wider context of the trends in this rapidly expanding field. This progress over the last decade is tremendous: while a decade ago we had a glimpse of what was to come through many proof-of-concept studies, now we witness the emergence of many AI-based computational chemistry tools that are mature enough to make faster and more accurate simulations increasingly routine. Such simulations in turn allow us to validate and even revise experimental results, deepen our understanding of the physicochemical processes in nature, and design better materials, devices, and drugs. The rapid introduction of powerful AI tools gives rise to unique challenges and opportunities that are discussed in this article too.
Affiliation(s)
- Pavlo O Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China.
| |
|
29
|
Martí C, Devereux C, Najm HN, Zádor J. Evaluation of Rate Coefficients in the Gas Phase Using Machine-Learned Potentials. J Phys Chem A 2024. [PMID: 38427974 DOI: 10.1021/acs.jpca.3c07872] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/03/2024]
Abstract
We assess the capability of machine-learned potentials to compute rate coefficients by training a neural network (NN) model and applying it to describe the chemical landscape on the C5H5 potential energy surface, which is relevant to molecular weight growth in combustion and interstellar media. We coupled the resulting NN with an automated kinetics workflow code, KinBot, to perform all necessary calculations to compute the rate coefficients. The NN is benchmarked exhaustively by evaluating its performance at the various stages of the kinetics calculations: from the electronic energy through the computation of zero point energy, barrier heights, entropic contributions, the portion of the PES explored, and finally the overall rate coefficients as formulated by transition state theory.
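As a rough illustration of the last stage named in this abstract, the sketch below evaluates a canonical transition state theory rate coefficient from a free-energy barrier with the Eyring equation, k = (k_B T / h) exp(-ΔG‡ / RT). It is not taken from the paper or from KinBot: the function name `tst_rate` and the 100 kJ/mol barrier are assumptions chosen only for illustration, and tunneling and recrossing corrections are ignored.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J*s
R = 8.314462618       # molar gas constant, J/(mol*K)


def tst_rate(delta_g_kj_mol: float, temperature: float) -> float:
    """Unimolecular Eyring rate coefficient in s^-1.

    delta_g_kj_mol: activation free energy in kJ/mol (hypothetical input).
    temperature: temperature in K.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    prefactor = K_B * temperature / H  # attempt frequency k_B*T/h
    return prefactor * math.exp(-delta_g_kj_mol * 1000.0 / (R * temperature))


if __name__ == "__main__":
    # Illustrative barrier only; not a value from the paper.
    for t in (500.0, 1000.0, 1500.0):
        print(f"T = {t:6.0f} K  k = {tst_rate(100.0, t):.3e} s^-1")
```

Because the barrier enters exponentially, small errors in a machine-learned barrier height translate into large errors in the rate, which is one reason barrier heights are a natural quantity to benchmark separately.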
Affiliation(s)
- Carles Martí
- Combustion Research Facility, Sandia National Laboratories, Livermore, California 94551, United States
| | - Christian Devereux
- Combustion Research Facility, Sandia National Laboratories, Livermore, California 94551, United States
| | - Habib N Najm
- Combustion Research Facility, Sandia National Laboratories, Livermore, California 94551, United States
| | - Judit Zádor
- Combustion Research Facility, Sandia National Laboratories, Livermore, California 94551, United States
| |
|
30
|
Ye S, Zhong K, Huang Y, Zhang G, Sun C, Jiang J. Artificial Intelligence-based Amide-II Infrared Spectroscopy Simulation for Monitoring Protein Hydrogen Bonding Dynamics. J Am Chem Soc 2024; 146:2663-2672. [PMID: 38240637 DOI: 10.1021/jacs.3c12258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
The structurally sensitive amide II infrared (IR) bands of proteins provide valuable information about the hydrogen bonding of protein secondary structures, which is crucial for understanding protein dynamics and associated functions. However, deciphering protein structures from experimental amide II spectra relies on time-consuming quantum chemical calculations on tens of thousands of representative configurations in solvent water. Currently, the accurate simulation of amide II spectra for whole proteins remains a challenge. Here, we present a machine learning (ML)-based protocol designed to efficiently simulate the amide II IR spectra of various proteins with an accuracy comparable to experimental results. This protocol stands out as a cost-effective and efficient alternative for studying protein dynamics, including the identification of secondary structures and the monitoring of protein hydrogen bonding dynamics under different pH conditions and during the protein folding process. Our method provides a valuable tool for protein research, focusing on the dynamic properties of proteins, especially those related to hydrogen bonding, using amide II IR spectroscopy.
Affiliation(s)
- Sheng Ye
- School of Artificial Intelligence, Anhui University, Hefei, Anhui 230601, People's Republic of China
| | - Kai Zhong
- Zernike Institute for Advanced Materials, University of Groningen, Nijenborgh 4, Groningen 9747AG, Netherlands
| | - Yan Huang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Guozhen Zhang
- Hefei National Research Center of Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
- Hefei National Laboratory, University of Science and Technology of China, Hefei 230088, China
| | - Changyin Sun
- School of Artificial Intelligence, Anhui University, Hefei, Anhui 230601, People's Republic of China
| | - Jun Jiang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
- Hefei National Laboratory, University of Science and Technology of China, Hefei 230088, China
| |
|
31
|
Singh S, Sahani H. Current Advancement and Future Prospects: Biomedical Nanoengineering. Curr Radiopharm 2024; 17:120-137. [PMID: 38058099 DOI: 10.2174/0118744710274376231123063135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 10/19/2023] [Accepted: 10/27/2023] [Indexed: 12/08/2023]
Abstract
Recent advancements in biomedicine have seen a significant reliance on nanoengineering, as traditional methods often fall short in harnessing the unique attributes of biomaterials. Nanoengineering has emerged as a valuable approach to enhance and enrich the performance and functionalities of biomaterials, driving research and development in the field. This review emphasizes the most prevalent biomaterials used in biomedicine, including polymers, nanocomposites, and metallic materials, and explores the pivotal role of nanoengineering in developing biomedical treatments and processes. Particularly, the review highlights research focused on gaining an in-depth understanding of material properties and effectively enhancing material performance through molecular dynamics simulations, all from a nanoengineering perspective.
Affiliation(s)
- Sonia Singh
- Institute of Pharmaceutical Research, GLA University, 17 km Stone, NH-2, Mathura-Delhi Road Mathura, Chaumuhan, Uttar Pradesh, 281406, India
| | - Hrishika Sahani
- Lifecell International Pvt. Ltd., NSP Office, Pearls Business Park, 8th Floor Office No-804, Netaji Subhash Palace Delhi, 110034, India
| |
|
32
|
Stark W, Westermayr J, Douglas-Gallardo OA, Gardner J, Habershon S, Maurer RJ. Machine Learning Interatomic Potentials for Reactive Hydrogen Dynamics at Metal Surfaces Based on Iterative Refinement of Reaction Probabilities. J Phys Chem C Nanomater Interfaces 2023; 127:24168-24182. [PMID: 38148847 PMCID: PMC10749455 DOI: 10.1021/acs.jpcc.3c06648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 11/12/2023] [Accepted: 11/15/2023] [Indexed: 12/28/2023]
Abstract
The reactive chemistry of molecular hydrogen at surfaces, notably dissociative sticking and hydrogen evolution, plays a crucial role in energy storage and fuel cells. Theoretical studies can help to decipher underlying mechanisms and reaction design, but studying dynamics at surfaces is computationally challenging due to the complex electronic structure at interfaces and the high sensitivity of dynamics to reaction barriers. In addition, ab initio molecular dynamics, based on density functional theory, is too computationally demanding to accurately predict reactive sticking or desorption probabilities, as it requires averaging over tens of thousands of initial conditions. High-dimensional machine learning-based interatomic potentials are starting to be more commonly used in gas-surface dynamics, yet robust approaches to generate reliable training data and assess how model uncertainty affects the prediction of dynamic observables are not well established. Here, we employ ensemble learning to adaptively generate training data while assessing model performance with full uncertainty quantification (UQ) for reaction probabilities of hydrogen scattering on different copper facets. We use this approach to investigate the performance of two message-passing neural networks, SchNet and PaiNN. Ensemble-based UQ and iterative refinement allow us to expose the shortcomings of the invariant pairwise-distance-based feature representation in the SchNet model for gas-surface dynamics.
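The ensemble idea in this abstract can be sketched in a few lines: several models predict the same quantity, their spread serves as an uncertainty estimate, and inputs with a large spread are flagged as candidates for new training data. This is a generic illustration and not the authors' workflow; the toy models, the threshold, and the names `ensemble_predict` and `select_uncertain` are all assumptions.

```python
from statistics import mean, stdev
from typing import Callable, List, Sequence, Tuple

Model = Callable[[float], float]  # stand-in for one trained potential

# Toy ensemble (hypothetical): members agree near x = 0 and diverge far from it.
TOY_MODELS: List[Model] = [
    lambda x, c=c: x**2 + c * x**3 for c in (-0.1, 0.0, 0.1)
]


def ensemble_predict(models: Sequence[Model], x: float) -> Tuple[float, float]:
    """Return the ensemble mean and sample standard deviation at input x."""
    preds = [m(x) for m in models]
    return mean(preds), stdev(preds)


def select_uncertain(
    models: Sequence[Model], candidates: Sequence[float], threshold: float
) -> List[float]:
    """Keep the candidates whose ensemble spread exceeds the threshold."""
    return [x for x in candidates if ensemble_predict(models, x)[1] > threshold]


if __name__ == "__main__":
    pool = [0.5, 1.0, 2.0, 3.0]
    for x in pool:
        mu, sigma = ensemble_predict(TOY_MODELS, x)
        print(f"x = {x:.1f}  mean = {mu:.3f}  std = {sigma:.3f}")
    print("add to training set:", select_uncertain(TOY_MODELS, pool, 0.5))
```

In an iterative refinement loop, the flagged inputs would be labeled with the reference method, added to the training set, and the ensemble retrained until the spread on the observable of interest is acceptably small.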
Affiliation(s)
- Wojciech G. Stark
- Department of Chemistry, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, U.K.
| | - Julia Westermayr
- Department of Chemistry, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, U.K.
| | | | - James Gardner
- Department of Chemistry, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, U.K.
| | - Scott Habershon
- Department of Chemistry, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, U.K.
| | - Reinhard J. Maurer
- Department of Chemistry, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, U.K.
- Department of Physics, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, U.K.
| |
|
33
|
Mignon P, Allouche AR, Innis NR, Bousige C. Neural Network Approach for a Rapid Prediction of Metal-Supported Borophene Properties. J Am Chem Soc 2023; 145:27857-27866. [PMID: 38063165 DOI: 10.1021/jacs.3c11549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/21/2023]
Abstract
We developed a high-dimensional neural network potential (NNP) to describe the structural and energetic properties of borophene deposited on silver. This NNP has the accuracy of density functional theory (DFT) calculations while achieving computational speedups of several orders of magnitude, allowing the study of extensive structures that may reveal intriguing moiré patterns or surface corrugations. We describe an efficient approach to constructing the training data set using an iterative technique known as the "adaptive learning approach". The developed NNP is able to produce, with excellent agreement, the structure, energy, and forces obtained at the DFT level. Finally, the calculated stability of various borophene polymorphs, including those not initially included in the training data set, shows better stabilization for a hole density ν ∼ 0.1, and in particular for the allotrope α (ν = 1/9). The stability of borophene on the metal surface is shown to depend on its orientation, implying structural corrugation patterns that can be observed only from long-time simulations on extended systems. The NNP also demonstrates its ability to simulate vibrational densities of states and produce realistic structures with simulated STM images closely matching the experimental ones.
Affiliation(s)
- Pierre Mignon
- Institut Lumière Matière, UMR CNRS 5306, Univ Lyon, Université Claude Bernard Lyon 1, F-69622 Villeurbanne, France
| | - Abdul-Rahman Allouche
- Institut Lumière Matière, UMR CNRS 5306, Univ Lyon, Université Claude Bernard Lyon 1, F-69622 Villeurbanne, France
| | - Neil Richard Innis
- Laboratoire des Multimatériaux et Interfaces, UMR CNRS 5615, Univ. Lyon, Université Claude Bernard Lyon 1, F-69622 Villeurbanne, France
| | - Colin Bousige
- Laboratoire des Multimatériaux et Interfaces, UMR CNRS 5615, Univ. Lyon, Université Claude Bernard Lyon 1, F-69622 Villeurbanne, France
| |
|
34
|
Karandashev K, Weinreich J, Heinen S, Arismendi Arrieta DJ, von Rudorff GF, Hermansson K, von Lilienfeld OA. Evolutionary Monte Carlo of QM Properties in Chemical Space: Electrolyte Design. J Chem Theory Comput 2023; 19:8861-8870. [PMID: 38009856 PMCID: PMC10720348 DOI: 10.1021/acs.jctc.3c00822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Revised: 10/29/2023] [Accepted: 10/30/2023] [Indexed: 11/29/2023]
Abstract
Optimizing a target function over the space of organic molecules is an important problem appearing in many fields of applied science but also a very difficult one due to the vast number of possible molecular systems. We propose an evolutionary Monte Carlo algorithm for solving such problems which is capable of straightforwardly tuning both exploration and exploitation characteristics of an optimization procedure while retaining favorable properties of genetic algorithms. The method, dubbed MOSAiCS (Metropolis Optimization by Sampling Adaptively in Chemical Space), is tested on problems related to optimizing components of battery electrolytes, namely, minimizing solvation energy in water or maximizing dipole moment while enforcing a lower bound on the HOMO-LUMO gap; optimization was carried out over sets of molecular graphs inspired by QM9 and Electrolyte Genome Project (EGP) data sets. MOSAiCS reliably generated molecular candidates with good target quantity values, which were in most cases better than the ones found in QM9 or EGP. While the optimization results presented in this work sometimes required up to 10⁶ QM calculations and were thus feasible only thanks to computationally efficient ab initio approximations of properties of interest, we discuss possible strategies for accelerating MOSAiCS using machine learning approaches.
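The Metropolis criterion that gives this kind of method its name can be illustrated on a toy problem. The sketch below is not MOSAiCS: it minimizes a hypothetical one-dimensional function over the integers, with steps of ±1 standing in for moves between molecular graphs, and the names `metropolis_accept` and `minimize` are assumptions. The inverse temperature `beta` is the knob that trades exploration (small `beta`) against exploitation (large `beta`).

```python
import math
import random
from typing import Callable


def metropolis_accept(f_old: float, f_new: float, beta: float,
                      rng: random.Random) -> bool:
    """Metropolis rule for minimization: always accept downhill moves,
    accept uphill moves with probability exp(-beta * (f_new - f_old))."""
    delta = f_new - f_old
    if delta <= 0:
        return True
    return rng.random() < math.exp(-beta * delta)


def minimize(f: Callable[[int], float], start: int, steps: int,
             beta: float, seed: int = 0) -> int:
    """Random walk on the integers; returns the best point visited."""
    rng = random.Random(seed)
    current = best = start
    for _ in range(steps):
        proposal = current + rng.choice((-1, 1))  # toy stand-in for a mutation
        if metropolis_accept(f(current), f(proposal), beta, rng):
            current = proposal
        if f(current) < f(best):
            best = current
    return best


if __name__ == "__main__":
    # Hypothetical target with its minimum at x = 3.
    print(minimize(lambda x: (x - 3) ** 2, start=0, steps=2000, beta=5.0))
```

In a chemical-space setting each call to `f` would be a property evaluation, so the number of steps directly sets the number of QM calculations, which is the cost the abstract refers to.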
Affiliation(s)
| | - Jan Weinreich
- Faculty of Physics, University of Vienna, Kolingasse 14-16, AT-1090 Wien, Austria
| | - Stefan Heinen
- Vector Institute for Artificial Intelligence, Toronto, M5S 1M1 Ontario, Canada
| | | | - Guido Falk von Rudorff
- Department of Chemistry, University Kassel, Heinrich-Plett-Str.40, 34132 Kassel, Germany
- Center for Interdisciplinary Nanostructure Science and Technology (CINSaT), Heinrich-Plett-Straße 40, 34132 Kassel, Germany
| | - Kersti Hermansson
- Department of Chemistry-Ångström Laboratory, Uppsala University, Box 538, SE-75121 Uppsala, Sweden
| | - O. Anatole von Lilienfeld
- Vector Institute for Artificial Intelligence, Toronto, M5S 1M1 Ontario, Canada
- Departments of Chemistry, Materials Science and Engineering, and Physics, University of Toronto, St. George Campus, Toronto, M5S 1A1 Ontario, Canada
- Machine Learning Group, Technische Universität Berlin and Institute for the Foundations of Learning and Data, 10587 Berlin, Germany
| |
|
35
|
Kotobi A, Singh K, Höche D, Bari S, Meißner RH, Bande A. Integrating Explainability into Graph Neural Network Models for the Prediction of X-ray Absorption Spectra. J Am Chem Soc 2023; 145:22584-22598. [PMID: 37807700 PMCID: PMC10591337 DOI: 10.1021/jacs.3c07513] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/17/2023] [Indexed: 10/10/2023]
Abstract
The use of sophisticated machine learning (ML) models, such as graph neural networks (GNNs), to predict complex molecular properties or all kinds of spectra has grown rapidly. However, ensuring the interpretability of these models' predictions remains a challenge. For example, a rigorous understanding of the predicted X-ray absorption spectrum (XAS) generated by such ML models requires an in-depth investigation of the respective black-box ML model used. Here, this is done for different GNNs based on a comprehensive, custom-generated XAS data set for small organic molecules. We show that a thorough analysis of the different ML models with respect to the local and global environments considered in each ML model is essential for the selection of an appropriate ML model that allows a robust XAS prediction. Moreover, we employ feature attribution to determine the respective contributions of various atoms in the molecules to the peaks observed in the XAS spectrum. By comparing this peak assignment to the core and virtual orbitals from the quantum chemical calculations underlying our data set, we demonstrate that it is possible to relate the atomic contributions via these orbitals to the XAS spectrum.
Affiliation(s)
- Amir Kotobi
- Helmholtz-Zentrum Hereon, Institute of Surface Science, Geesthacht, DE 21502, Germany
| | - Kanishka Singh
- Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Berlin, DE 10409, Germany
- Institute of Chemistry and Biochemistry, Freie Universität Berlin, Berlin, DE 14195, Germany
| | - Daniel Höche
- Helmholtz-Zentrum Hereon, Institute of Surface Science, Geesthacht, DE 21502, Germany
| | - Sadia Bari
- Deutsches Elektronen-Synchrotron DESY, Hamburg, DE 22607, Germany
- Zernike Institute for Advanced Materials, University of Groningen, Groningen 9712, Netherlands
| | - Robert H. Meißner
- Helmholtz-Zentrum Hereon, Institute of Surface Science, Geesthacht, DE 21502, Germany
- Hamburg University of Technology, Institute of Polymers and Composites, Hamburg, DE 21073, Germany
| | - Annika Bande
- Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Berlin, DE 10409, Germany
- Leibniz University Hannover, Institute of Inorganic Chemistry, Hannover, DE 30167, Germany
| |
|
36
|
Zhang Y, Jiang B. Universal machine learning for the response of atomistic systems to external fields. Nat Commun 2023; 14:6424. [PMID: 37827998 PMCID: PMC10570356 DOI: 10.1038/s41467-023-42148-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 04/16/2023] [Accepted: 10/01/2023] [Indexed: 10/14/2023]
Abstract
Machine learned interatomic interaction potentials have enabled efficient and accurate molecular simulations of closed systems. However, external fields, which can greatly change the chemical structure and/or reactivity, have been seldom included in current machine learning models. This work proposes a universal field-induced recursively embedded atom neural network (FIREANN) model, which integrates a pseudo field vector-dependent feature into atomic descriptors to represent system-field interactions with rigorous rotational equivariance. This "all-in-one" approach correlates various response properties like dipole moment and polarizability with the field-dependent potential energy in a single model, very suitable for spectroscopic and dynamics simulations in molecular and periodic systems in the presence of electric fields. Especially for periodic systems, we find that FIREANN can overcome the intrinsic multiple-value issue of the polarization by training atomic forces only. These results validate the universality and capability of the FIREANN method for efficient first-principles modeling of complicated systems in strong external fields.
Affiliation(s)
- Yaolong Zhang
- Key Laboratory of Precision and Intelligent Chemistry, Department of Chemical Physics, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, 230026, China
- École Polytechnique Fédérale de Lausanne, 1015, Lausanne, Switzerland
| | - Bin Jiang
- Key Laboratory of Precision and Intelligent Chemistry, Department of Chemical Physics, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, 230026, China.
- Hefei National Laboratory, University of Science and Technology of China, Hefei, 230088, China.
| |
|
37
|
Mohanty S, Stevenson J, Browning AR, Jacobson L, Leswing K, Halls MD, Afzal MAF. Development of scalable and generalizable machine learned force field for polymers. Sci Rep 2023; 13:17251. [PMID: 37821501 PMCID: PMC10567837 DOI: 10.1038/s41598-023-43804-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/22/2023] [Accepted: 09/28/2023] [Indexed: 10/13/2023]
Abstract
Understanding and predicting the properties of polymers is vital to developing tailored polymer molecules for desired applications. Classical force fields may fail to capture key properties, for example, the transport properties of certain polymer systems such as polyethylene glycol. As a solution, we present an alternative potential energy surface, a charge recursive neural network (QRNN) model trained on DFT calculations made on smaller atomic clusters that generalizes well to oligomers comprising larger atomic clusters or longer chains. We demonstrate the validity of the polymer QRNN workflow by modeling the oligomers of ethylene glycol. We apply two rounds of active learning (addition of new training clusters based on current model performance) and implement a novel model training approach that uses partial charges from a semi-empirical method. Our QRNN model for polymers produces stable molecular dynamics (MD) simulation trajectories and captures the dynamics of polymer chains, as indicated by the striking agreement with experimental values. Our model allows working on much larger systems than DFT simulations permit while providing a more accurate force field than classical force fields, which offers a promising avenue for large-scale molecular simulations of polymeric systems.
Collapse
|
38
|
Brezina K, Beck H, Marsalek O. Reducing the Cost of Neural Network Potential Generation for Reactive Molecular Systems. J Chem Theory Comput 2023; 19:6589-6604. [PMID: 37747971 PMCID: PMC10569056 DOI: 10.1021/acs.jctc.3c00391] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/06/2023] [Indexed: 09/27/2023]
Abstract
Although machine learning potentials have recently had a substantial impact on molecular simulations, the construction of a robust training set can still become a limiting factor, especially due to the requirement of a reference ab initio simulation that covers all the relevant geometries of the system. Recognizing that this can be prohibitive for certain systems, we develop the method of transition tube sampling that mitigates the computational cost of training set and model generation. In this approach, we generate classical or quantum thermal geometries around a transition path describing a conformational change or a chemical reaction using only a sparse set of local normal mode expansions along this path and select from these geometries by an active learning protocol. This yields a training set with geometries that characterize the whole transition without the need for a costly reference trajectory. The performance of the method is evaluated on different molecular systems with the complexity of the potential energy landscape increasing from a single minimum to a double proton-transfer reaction with high barriers. Our results show that the method leads to training sets that give rise to models applicable in classical and path integral simulations alike that are on par with those based directly on ab initio calculations while providing the computational speedup we have come to expect from machine learning potentials.
Affiliation(s)
- Krystof Brezina
- Charles University, Faculty of Mathematics and Physics, Ke Karlovu 3, 121 16, Prague 2, Czech Republic
| | - Hubert Beck
- Charles University, Faculty of Mathematics and Physics, Ke Karlovu 3, 121 16, Prague 2, Czech Republic
| | - Ondrej Marsalek
- Charles University, Faculty of Mathematics and Physics, Ke Karlovu 3, 121 16, Prague 2, Czech Republic
| |
|
39
|
Xue X, Sun H, Yang M, Liu X, Hu HY, Deng Y, Wang X. Advances in the Application of Artificial Intelligence-Based Spectral Data Interpretation: A Perspective. Anal Chem 2023; 95:13733-13745. [PMID: 37688541 DOI: 10.1021/acs.analchem.3c02540] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 09/11/2023]
Abstract
The interpretation of spectral data, including mass, nuclear magnetic resonance, infrared, and ultraviolet-visible spectra, is critical for obtaining molecular structural information. The development of advanced sensing technology has multiplied the amount of available spectral data. Chemical experts must use basic principles corresponding to the spectral information generated by molecular fragments and functional groups. This is a time-consuming process that requires a solid professional knowledge base. In recent years, the rapid development of computer science and its applications in cheminformatics and the emergence of computer-aided expert systems have greatly reduced the difficulty in analyzing large quantities of data. For expert systems, however, the problem-solving strategy must be known in advance or extracted by human experts and translated into algorithms. Gratifyingly, the development of artificial intelligence (AI) methods has shown great promise for solving such problems. Traditional algorithms, including the latest neural network algorithms, have shown great potential for both extracting useful information and processing massive quantities of data. This Perspective highlights recent innovations covering all of the emerging AI-based spectral interpretation techniques. In addition, the main limitations and current obstacles are presented, and the corresponding directions for further research are proposed. Moreover, this Perspective gives the authors' personal outlook on the development and future applications of spectral interpretation.
Affiliation(s)
- Xi Xue
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Beijing Key Laboratory of Active Substances Discovery and Drugability Evaluation, Department of Medicinal Chemistry, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P. R. China
| | - Hanyu Sun
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Beijing Key Laboratory of Active Substances Discovery and Drugability Evaluation, Department of Medicinal Chemistry, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P. R. China
| | - Minjian Yang
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Beijing Key Laboratory of Active Substances Discovery and Drugability Evaluation, Department of Medicinal Chemistry, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P. R. China
| | - Xue Liu
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
| | - Hai-Yu Hu
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
| | - Yafeng Deng
- CarbonSilicon AI Technology Co., Ltd. Beijing 100080, China
- Department of Automation, Tsinghua University, Beijing 100084, China
| | - Xiaojian Wang
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- CarbonSilicon AI Technology Co., Ltd. Beijing 100080, China
| |
|
40
|
Litman Y, Lan J, Nagata Y, Wilkins DM. Fully First-Principles Surface Spectroscopy with Machine Learning. J Phys Chem Lett 2023; 14:8175-8182. [PMID: 37671886 PMCID: PMC10510433 DOI: 10.1021/acs.jpclett.3c01989] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/18/2023] [Accepted: 08/29/2023] [Indexed: 09/07/2023]
Abstract
Our current understanding of the structure and dynamics of aqueous interfaces at the molecular level has grown substantially due to the continuous development of surface-specific spectroscopies, such as vibrational sum-frequency generation (VSFG). As in other vibrational spectroscopies, we must turn to atomistic simulations to extract all of the information encoded in the VSFG spectra. The high computational cost associated with existing methods means that they have limitations in representing systems with complex electronic structure or in achieving statistical convergence. In this work, we combine high-dimensional neural network interatomic potentials and symmetry-adapted Gaussian process regression to overcome these constraints. We show that it is possible to model VSFG signals with fully ab initio accuracy using machine learning and illustrate the versatility of our approach on the water/air interface. Our strategy allows us to identify the main sources of theoretical inaccuracy and establish a clear pathway toward the modeling of surface-sensitive spectroscopy of complex interfaces.
Affiliation(s)
- Yair Litman
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
- Max Planck Institute for Polymer Research, Ackermannweg 10, 55128 Mainz, Germany
| | - Jinggang Lan
- Department of Chemistry, New York University, New York, New York 10003, United States
- Simons Center for Computational Physical Chemistry at New York University, New York, New York 10003, United States
| | - Yuki Nagata
- Max Planck Institute for Polymer Research, Ackermannweg 10, 55128 Mainz, Germany
| | - David M. Wilkins
- Centre for Quantum Materials and Technologies, School of Mathematics and Physics, Queen’s University Belfast, Belfast BT7 1NN, Northern Ireland, United Kingdom
| |
|
41
|
Ge F, Zhang L, Hou YF, Chen Y, Ullah A, Dral PO. Four-Dimensional-Spacetime Atomistic Artificial Intelligence Models. J Phys Chem Lett 2023; 14:7732-7743. [PMID: 37606602 DOI: 10.1021/acs.jpclett.3c01592] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 08/23/2023]
Abstract
We demonstrate that AI can learn atomistic systems in the four-dimensional (4D) spacetime. For this, we introduce the 4D-spacetime GICnet model, which for the given initial conditions (nuclear positions and velocities at time zero) can predict nuclear positions and velocities as a continuous function of time up to the distant future. Such models of molecules can be unrolled in the time dimension to yield long-time high-resolution molecular dynamics trajectories with high efficiency and accuracy. 4D-spacetime models can make predictions for different times in any order and do not need a stepwise evaluation of forces and integration of the equations of motions at discretized time steps, which is a major advance over traditional, cost-inefficient molecular dynamics. These models can be used to speed up dynamics, simulate vibrational spectra, and obtain deeper insight into nuclear motions, as we demonstrate for a series of organic molecules.
Affiliation(s)
- Fuchun Ge
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
| | - Lina Zhang
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
| | - Yi-Fan Hou
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
| | - Yuxinxin Chen
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
| | - Arif Ullah
- School of Physics and Optoelectronic Engineering, Anhui University, Hefei 230601, China
| | - Pavlo O Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
| |
|
42
|
Sowa JK, Roberts ST, Rossky PJ. Exploring Configurations of Nanocrystal Ligands Using Machine-Learned Force Fields. J Phys Chem Lett 2023; 14:7215-7222. [PMID: 37552568 DOI: 10.1021/acs.jpclett.3c01618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/10/2023]
Abstract
Semiconducting nanocrystals passivated with organic ligands have emerged as a powerful platform for light harvesting, light-driven chemical reactions, and sensing. Due to their complexity and size, little structural information is available from experiments, making these systems challenging to model computationally. Here, we develop a machine-learned force field trained on DFT data and use it to investigate the surface chemistry of a PbS nanocrystal interfaced with acetate ligands. In doing so, we go beyond considering individual local minimum energy geometries and, importantly, circumvent a precarious issue associated with the assumption of a single assigned atomic partial charge for each element in a nanocrystal, independent of its structural position. We demonstrate that the carboxylate ligands passivate the metal-rich surfaces by adopting a very wide range of "tilted-bridge" and "bridge" geometries and investigate the corresponding ligand IR spectrum. This work illustrates the potential of machine-learned force fields to transform computational modeling of these materials.
Affiliation(s)
- Jakub K Sowa
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Center for Adapting Flaws into Features, Rice University, Houston, Texas 77005, United States
| | - Sean T Roberts
- Department of Chemistry, The University of Texas at Austin, Austin, Texas 78712, United States
- Center for Adapting Flaws into Features, Rice University, Houston, Texas 77005, United States
| | - Peter J Rossky
- Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Center for Adapting Flaws into Features, Rice University, Houston, Texas 77005, United States
| |
|
43
|
Karunaratne E, Hill DW, Dührkop K, Böcker S, Grant DF. Combining Experimental with Computational Infrared and Mass Spectra for High-Throughput Nontargeted Chemical Structure Identification. Anal Chem 2023; 95:11901-11907. [PMID: 37540774 DOI: 10.1021/acs.analchem.3c00937] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 08/06/2023]
Abstract
The inability to identify the structures of most metabolites detected in environmental or biological samples limits the utility of nontargeted metabolomics. The most widely used analytical approaches combine mass spectrometry and machine learning methods to rank candidate structures contained in large chemical databases. Given the large chemical space typically searched, the use of additional orthogonal data may improve the identification rates and reliability. Here, we present results of combining experimental and computational mass and IR spectral data for high-throughput nontargeted chemical structure identification. Experimental MS/MS and gas-phase IR data for 148 test compounds were obtained from NIST. Candidate structures for each of the test compounds were obtained from PubChem (mean = 4444 candidate structures per test compound). Our workflow used CSI:FingerID to initially score and rank the candidate structures. The top 1000 ranked candidates were subsequently used for IR spectra prediction, scoring, and ranking using density functional theory (DFT-IR). Final ranking of the candidates was based on a composite score calculated as the average of the CSI:FingerID and DFT-IR rankings. This approach resulted in the correct identification of 88 of the 148 test compounds (59%). 129 of the 148 test compounds (87%) were ranked within the top 20 candidates. These identification rates are the highest yet reported when candidate structures are used from PubChem. Combining experimental and computational MS/MS and IR spectral data is a potentially powerful option for prioritizing candidates for final structure verification.
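The composite scoring step described in this abstract, averaging each candidate's rank from two independent methods, can be sketched in a few lines. This is only an illustrative sketch: the candidate names and rank values are invented, `composite_ranking` is a hypothetical helper, and the tie-breaking rule is an assumption; it does not reproduce the CSI:FingerID or DFT-IR scoring itself.

```python
def composite_ranking(rank_a, rank_b):
    """Order candidates by the mean of two rank positions (1 = best).

    rank_a, rank_b: dicts mapping candidate name -> rank from each method.
    Only candidates ranked by both methods are considered.
    """
    shared = rank_a.keys() & rank_b.keys()
    composite = {c: (rank_a[c] + rank_b[c]) / 2 for c in shared}
    # Lower composite score is better; ties are broken by name (an assumption).
    return sorted(shared, key=lambda c: (composite[c], c))


# Hypothetical ranks from a mass-spectrum method and an IR method.
ms_ranks = {"cand_A": 1, "cand_B": 2, "cand_C": 3}
ir_ranks = {"cand_A": 3, "cand_B": 1, "cand_C": 2}

print(composite_ranking(ms_ranks, ir_ranks))
```

Here `cand_B` moves to the top because its mean rank (1.5) beats that of `cand_A` (2.0), which shows how a second, orthogonal ranking can reorder candidates.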
Affiliation(s)
- Erandika Karunaratne
- Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut 06269, United States
| | - Dennis W Hill
- Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut 06269, United States
| | - Kai Dührkop
- Chair for Bioinformatics, Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena, Jena 07743, Germany
| | - Sebastian Böcker
- Chair for Bioinformatics, Faculty of Mathematics and Computer Science, Friedrich Schiller University Jena, Jena 07743, Germany
| | - David F Grant
- Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut 06269, United States
| |
|
44
|
Guan J, Lu Y, Sen K, Abdul Nasir J, Desmoutier AW, Hou Q, Zhang X, Logsdail AJ, Dutta G, Beale AM, Strange RW, Yong C, Sherwood P, Senn HM, Catlow CRA, Keal TW, Sokol AA. Computational infrared and Raman spectra by hybrid QM/MM techniques: a study on molecular and catalytic material systems. Philos Trans A Math Phys Eng Sci 2023; 381:20220234. [PMID: 37211033 PMCID: PMC10200352 DOI: 10.1098/rsta.2022.0234] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/23/2022] [Accepted: 04/04/2023] [Indexed: 05/23/2023]
Abstract
Vibrational spectroscopy is one of the most well-established and important techniques for characterizing chemical systems. To aid the interpretation of experimental infrared and Raman spectra, we report on recent theoretical developments in the ChemShell computational chemistry environment for modelling vibrational signatures. The hybrid quantum mechanical and molecular mechanical approach is employed, using density functional theory for the electronic structure calculations and classical forcefields for the environment. Computational vibrational intensities at chemical active sites are reported using electrostatic and fully polarizable embedding environments to achieve more realistic vibrational signatures for materials and molecular systems, including solvated molecules, proteins, zeolites and metal oxide surfaces, providing useful insight into the effect of the chemical environment on the signatures obtained from experiment. This work has been enabled by the efficient task-farming parallelism implemented in ChemShell for high-performance computing platforms. This article is part of a discussion meeting issue 'Supercomputing simulations of advanced materials'.
Affiliation(s)
- Jingcheng Guan
- Department of Chemistry, University College London, London WC1H 0AJ, UK
| | - You Lu
- STFC Scientific Computing, Daresbury Laboratory, Keckwick Lane, Daresbury, Warrington WA4 4AD, UK
| | - Kakali Sen
- STFC Scientific Computing, Daresbury Laboratory, Keckwick Lane, Daresbury, Warrington WA4 4AD, UK
| | - Jamal Abdul Nasir
- Department of Chemistry, University College London, London WC1H 0AJ, UK
| | | | - Qing Hou
- Department of Chemistry, University College London, London WC1H 0AJ, UK
- Institute of Photonic Chips, University of Shanghai for Science and Technology, Shanghai 201512, People’s Republic of China
| | - Xingfan Zhang
- Department of Chemistry, University College London, London WC1H 0AJ, UK
| | - Andrew J. Logsdail
- Cardiff Catalysis Institute, School of Chemistry, Cardiff University, Park Place, Cardiff CF10 3AT, UK
| | - Gargi Dutta
- Department of Chemistry, University College London, London WC1H 0AJ, UK
- Department of Physics, Balurghat College, Balurghat 733101, West Bengal, India
| | - Andrew M. Beale
- Department of Chemistry, University College London, London WC1H 0AJ, UK
- Research Complex at Harwell, Rutherford Appleton Laboratory, Harwell Oxford, Didcot OX11 0FA, UK
| | - Richard W. Strange
- School of Life Sciences, University of Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ, UK
| | - Chin Yong
- STFC Scientific Computing, Daresbury Laboratory, Keckwick Lane, Daresbury, Warrington WA4 4AD, UK
| | - Paul Sherwood
- Department of Chemistry, Lancaster University, Lancaster LA1 4YB, UK
| | - Hans M. Senn
- School of Chemistry, University of Glasgow, Joseph Black Building, Glasgow G12 8QQ, UK
| | - C. Richard A. Catlow
- Department of Chemistry, University College London, London WC1H 0AJ, UK
- Cardiff Catalysis Institute, School of Chemistry, Cardiff University, Park Place, Cardiff CF10 3AT, UK
- Research Complex at Harwell, Rutherford Appleton Laboratory, Harwell Oxford, Didcot OX11 0FA, UK
| | - Thomas W. Keal
- STFC Scientific Computing, Daresbury Laboratory, Keckwick Lane, Daresbury, Warrington WA4 4AD, UK
| | - Alexey A. Sokol
- Department of Chemistry, University College London, London WC1H 0AJ, UK
| |
|
45
|
Ruth M, Gerbig D, Schreiner PR. Machine Learning for Bridging the Gap between Density Functional Theory and Coupled Cluster Energies. J Chem Theory Comput 2023. [PMID: 37418619 DOI: 10.1021/acs.jctc.3c00274] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 07/09/2023]
Abstract
Accurate electronic energies and properties are crucial for successful reaction design and mechanistic investigations. Computing energies and properties of molecular structures has proven extremely useful, and, with increasing computational power, the limits of high-level approaches (such as coupled cluster theory) are expanding to ever larger systems. However, because scaling is highly unfavorable, these methods are still not universally applicable to larger systems. To address the need for fast and accurate electronic energies of larger systems, we created a database of around 8000 small organic monomers (2000 dimers) optimized at the B3LYP-D3(BJ)/cc-pVTZ level of theory. This database also includes single-point energies computed at various levels of theory, including PBE1PBE, ωB97X, M06-2X, revTPSS, B3LYP, and BP86, for density functional theory as well as DLPNO-CCSD(T) and CCSD(T) for coupled cluster theory, all in conjunction with a cc-pVTZ basis. We used this database to train machine learning models based on graph neural networks using two different graph representations. Our models are able to make energy predictions from B3LYP-D3(BJ)/cc-pVTZ inputs to CCSD(T)/cc-pVTZ outputs with a mean absolute error of 0.78 and to DLPNO-CCSD(T)/cc-pVTZ with a mean absolute error of 0.50 and 0.18 kcal mol⁻¹ for monomers and dimers, respectively. The model for dimers was further validated on the S22 database, and the monomer model was tested on challenging systems, including those with highly conjugated or functionally complex molecules.
Affiliation(s)
- Marcel Ruth
- Institute of Organic Chemistry, Justus Liebig University, Heinrich-Buff-Ring 17, 35392 Giessen, Germany
| | - Dennis Gerbig
- Institute of Organic Chemistry, Justus Liebig University, Heinrich-Buff-Ring 17, 35392 Giessen, Germany
| | - Peter R Schreiner
- Institute of Organic Chemistry, Justus Liebig University, Heinrich-Buff-Ring 17, 35392 Giessen, Germany
| |
|
46
|
Kabylda A, Vassilev-Galindo V, Chmiela S, Poltavsky I, Tkatchenko A. Efficient interatomic descriptors for accurate machine learning force fields of extended molecules. Nat Commun 2023; 14:3562. [PMID: 37322039 PMCID: PMC10272221 DOI: 10.1038/s41467-023-39214-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Accepted: 05/17/2023] [Indexed: 06/17/2023] Open
Abstract
Machine learning force fields (MLFFs) are gradually evolving towards enabling molecular dynamics simulations of molecules and materials with ab initio accuracy but at a small fraction of the computational cost. However, several challenges remain to be addressed to enable predictive MLFF simulations of realistic molecules, including: (1) developing efficient descriptors for non-local interatomic interactions, which are essential to capture long-range molecular fluctuations, and (2) reducing the dimensionality of the descriptors to enhance the applicability and interpretability of MLFFs. Here we propose an automatized approach to substantially reduce the number of interatomic descriptor features while preserving the accuracy and increasing the efficiency of MLFFs. To simultaneously address the two stated challenges, we illustrate our approach on the example of the global GDML MLFF. We found that non-local features (atoms separated by as far as 15 Å in studied systems) are crucial to retain the overall accuracy of the MLFF for peptides, DNA base pairs, fatty acids, and supramolecular complexes. Interestingly, the number of required non-local features in the reduced descriptors becomes comparable to the number of local interatomic features (those below 5 Å). These results pave the way to constructing global molecular MLFFs whose cost increases linearly, instead of quadratically, with system size.
Affiliation(s)
- Adil Kabylda
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
| | - Valentin Vassilev-Galindo
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
| | - Stefan Chmiela
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
- BIFOLD - Berlin Institute for the Foundations of Learning and Data, 10587, Berlin, Germany
| | - Igor Poltavsky
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
| |
|
47
|
Tang Z, Bromley ST, Hammer B. A machine learning potential for simulating infrared spectra of nanosilicate clusters. J Chem Phys 2023; 158:2895243. [PMID: 37290080 DOI: 10.1063/5.0150379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 05/23/2023] [Indexed: 06/10/2023] Open
Abstract
The use of machine learning (ML) in chemical physics has enabled the construction of interatomic potentials having the accuracy of ab initio methods and a computational cost comparable to that of classical force fields. Training an ML model requires an efficient method for the generation of training data. Here, we apply an accurate and efficient protocol to collect training data for constructing a neural network-based ML interatomic potential for nanosilicate clusters. Initial training data are taken from normal modes and farthest point sampling. Later on, the set of training data is extended via an active learning strategy in which new data are identified by the disagreement between an ensemble of ML models. The whole process is further accelerated by parallel sampling over structures. We use the ML model to run molecular dynamics simulations of nanosilicate clusters with various sizes, from which infrared spectra with anharmonicity included can be extracted. Such spectroscopic data are needed for understanding the properties of silicate dust grains in the interstellar medium and in circumstellar environments.
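The active learning criterion mentioned in this abstract, picking new training data where an ensemble of models disagrees, can be illustrated with a minimal sketch. The energies, the threshold, and the helper name `select_by_disagreement` are all hypothetical; the abstract does not state which disagreement measure was used, so the population standard deviation here is an assumption.

```python
from statistics import pstdev


def select_by_disagreement(predictions, threshold):
    """Return indices of structures on which the ensemble disagrees.

    predictions: list with one entry per model, each a list of predicted
                 energies (one per candidate structure).
    threshold:   structures whose ensemble standard deviation exceeds this
                 value are flagged for new reference calculations.
    """
    n_structures = len(predictions[0])
    selected = []
    for i in range(n_structures):
        # Spread of the ensemble predictions for structure i.
        spread = pstdev([model[i] for model in predictions])
        if spread > threshold:
            selected.append(i)
    return selected


# Hypothetical energies (arbitrary units) from three models on four structures.
ensemble = [
    [-1.00, -2.00, -3.00, -4.00],
    [-1.01, -2.30, -3.00, -4.02],
    [-0.99, -1.70, -3.01, -3.50],
]

print(select_by_disagreement(ensemble, threshold=0.1))
```

Structures 1 and 3 are flagged because the three models differ by a few tenths of a unit there, while they agree closely on structures 0 and 2.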
Affiliation(s)
- Zeyuan Tang
- Center for Interstellar Catalysis, Department of Physics and Astronomy, Aarhus University, Ny Munkegade 120, Aarhus C 8000, Denmark
| | - Stefan T Bromley
- Departament de Ciència de Materials i Química Física and Institut de Química Teòrica i Computacional (IQTCUB), Universitat de Barcelona, c/Martí i Franquès 1-11, 08028 Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010 Barcelona, Spain
| | - Bjørk Hammer
- Center for Interstellar Catalysis, Department of Physics and Astronomy, Aarhus University, Ny Munkegade 120, Aarhus C 8000, Denmark
| |
|
48
|
Chan K, Ta LT, Huang Y, Su H, Lin Z. Incorporating Domain Knowledge and Structure-Based Descriptors for Machine Learning: A Case Study of Pd-Catalyzed Sonogashira Reactions. Molecules 2023; 28:4730. [PMID: 37375286 DOI: 10.3390/molecules28124730] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Revised: 06/10/2023] [Accepted: 06/10/2023] [Indexed: 06/29/2023] Open
Abstract
Machine learning has revolutionized information processing for large datasets across various fields. However, its limited interpretability poses a significant challenge when applied to chemistry. In this study, we developed a set of simple molecular representations to capture the structural information of ligands in palladium-catalyzed Sonogashira coupling reactions of aryl bromides. Drawing inspiration from human understanding of catalytic cycles, we used a graph neural network to extract structural details of the phosphine ligand, a major contributor to the overall activation energy. We combined these simple molecular representations with an electronic descriptor of aryl bromide as inputs for a fully connected neural network unit. The results allowed us to predict rate constants and gain mechanistic insights into the rate-limiting oxidative addition process using a relatively small dataset. This study highlights the importance of incorporating domain knowledge in machine learning and presents an alternative approach to data analysis.
Affiliation(s)
- Kalok Chan
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR, China
| | - Long Thanh Ta
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR, China
| | - Yong Huang
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR, China
| | - Haibin Su
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR, China
| | - Zhenyang Lin
- Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR, China
| |
|
49
|
Wang L, Zhang P, Geng Y, Zhu Z, Yuan S. Harmonic Vibrational Frequency Simulation of Pharmaceutical Molecules via a Novel Multi-Molecular Fragment Interception Method. Molecules 2023; 28:4638. [PMID: 37375193 DOI: 10.3390/molecules28124638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 06/05/2023] [Accepted: 06/05/2023] [Indexed: 06/29/2023] Open
Abstract
By means of a computational method based on Density Functional Theory (DFT), using commercially available software, a novel method for simulating equilibrium geometry harmonic vibrational frequencies is proposed. Finasteride, Lamivudine, and Repaglinide were selected as model molecules to study the adaptability of the new method. Three molecular models, namely the single-molecular, central-molecular, and multi-molecular fragment models, were constructed and calculated by Generalized Gradient Approximations (GGAs) with the PBE functional via the Material Studio 8.0 program. Theoretical vibrational frequencies were assigned and compared to the corresponding experimental data. The results indicated that the traditional single-molecular calculation and scaled spectra with scale factor exhibited the worst similarity for all three pharmaceutical molecules among the three models. Furthermore, the central-molecular model with a configuration closer to the empirical structure resulted in a reduction of mean absolute error (MAE) and root mean squared error (RMSE) in all three pharmaceutics, including the hydrogen-bonded functional groups. However, the improvement in computational accuracy for different drug molecules using the central-molecular model for vibrational frequency calculation was unstable, whereas the new multi-molecular fragment interception method showed the best agreement with experimental results, exhibiting MAE and RMSE values of 8.21 cm⁻¹ and 18.35 cm⁻¹ for Finasteride, 15.95 cm⁻¹ and 26.46 cm⁻¹ for Lamivudine, and 12.10 cm⁻¹ and 25.82 cm⁻¹ for Repaglinide. Additionally, this work provides comprehensive vibrational frequency calculations and assignments for Finasteride, Lamivudine, and Repaglinide, which have never been thoroughly investigated in previous research.
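For readers unfamiliar with the two error metrics quoted in this abstract, the following sketch shows how MAE and RMSE between calculated and experimental wavenumbers might be computed. The wavenumber values are made up for illustration and are not taken from the paper; `mae` and `rmse` are hypothetical helper names.

```python
import math


def mae(calculated, experimental):
    """Mean absolute error between paired values."""
    errors = [abs(c - e) for c, e in zip(calculated, experimental)]
    return sum(errors) / len(errors)


def rmse(calculated, experimental):
    """Root mean squared error between paired values."""
    squared = [(c - e) ** 2 for c, e in zip(calculated, experimental)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up wavenumbers in cm^-1, used only to exercise the two functions.
calc = [1001.0, 1707.0]
expt = [1000.0, 1700.0]

print(mae(calc, expt), rmse(calc, expt))
```

Because RMSE squares each deviation before averaging, it weights large outliers more heavily than MAE and is never smaller than it, which is consistent with the RMSE values in the abstract exceeding the corresponding MAE values.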
Affiliation(s)
- Linjie Wang
  - School of Chemical Engineering, Shandong Institute of Petroleum and Chemical Technology, Dongying 257061, China
- Pengtu Zhang
  - School of Chemical Engineering, Shandong Institute of Petroleum and Chemical Technology, Dongying 257061, China
- Yali Geng
  - School of Chemical Engineering, Shandong Institute of Petroleum and Chemical Technology, Dongying 257061, China
- Zaisheng Zhu
  - School of Chemical Engineering, Shandong Institute of Petroleum and Chemical Technology, Dongying 257061, China
- Shiling Yuan
  - School of Chemical Engineering, Shandong Institute of Petroleum and Chemical Technology, Dongying 257061, China
  - School of Chemistry and Chemical Engineering, Shandong University, Jinan 250199, China
|
50
|
Mortazavi B, Zhuang X, Rabczuk T, Shapeev AV. Atomistic modeling of the mechanical properties: the rise of machine learning interatomic potentials. Materials Horizons 2023; 10:1956-1968. [PMID: 37014053 DOI: 10.1039/d3mh00125c] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/07/2023]
Abstract
Since the concept of machine learning interatomic potentials (MLIPs) emerged in 2007, interest has grown in replacing empirical interatomic potentials (EIPs) with MLIPs in order to conduct more accurate and reliable molecular dynamics calculations. In the last couple of years, the applications of MLIPs have been extended to the analysis of mechanical and failure responses, providing opportunities that were not previously achievable efficiently with either EIPs or density functional theory (DFT) calculations. In this minireview, we first briefly discuss the basic concepts of MLIPs and outline popular strategies for developing an MLIP. Next, drawing on several examples from recent studies, we highlight the robustness of MLIPs in the analysis of mechanical properties and emphasize their advantages over EIP and DFT methods. MLIPs furthermore make it possible to combine the robustness of the DFT method with continuum mechanics, enabling first-principles multiscale modeling of the mechanical properties of nanostructures at the continuum level. Finally, we outline the common challenges of MLIP-based molecular dynamics simulations of mechanical properties and propose directions for future investigations.
Affiliation(s)
- Bohayra Mortazavi
  - Chair of Computational Science and Simulation Technology, Department of Mathematics and Physics, Leibniz Universität Hannover, Appelstraße 11, 30167 Hannover, Germany
  - Cluster of Excellence PhoenixD (Photonics, Optics, And Engineering-Innovation Across Disciplines), Gottfried Wilhelm Leibniz Universität Hannover, Hannover, Germany
- Xiaoying Zhuang
  - Chair of Computational Science and Simulation Technology, Department of Mathematics and Physics, Leibniz Universität Hannover, Appelstraße 11, 30167 Hannover, Germany
  - College of Civil Engineering, Department of Geotechnical Engineering, Tongji University, 1239 Siping Road, Shanghai, China
- Timon Rabczuk
  - Institute of Structural Mechanics, Bauhaus-Universität Weimar, Marienstr. 15, 99423 Weimar, Germany
- Alexander V Shapeev
  - Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Bolshoy Bulvar 30, Moscow, 143026, Russia
|