1
Berger E, Niemelä J, Lampela O, Juffer AH, Komsa HP. Raman Spectra of Amino Acids and Peptides from Machine Learning Polarizabilities. J Chem Inf Model 2024; 64:4601-4612. [PMID: 38829726 DOI: 10.1021/acs.jcim.4c00077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2024]
Abstract
Raman spectroscopy is an important tool in the study of vibrational properties and composition of molecules, peptides, and even proteins. Raman spectra can be simulated based on the change of the electronic polarizability with vibrations, which can nowadays be efficiently obtained via machine learning models trained on first-principles data. However, the transferability of the models trained on small molecules to larger structures is unclear, and direct training on large structures is prohibitively expensive. In this work, we first train two machine learning models to predict the polarizabilities of all 20 amino acids. Both models are carefully benchmarked and compared to density functional theory (DFT) calculations, with the neural network method being found to offer better transferability. By combination of machine learning models with classical force field molecular dynamics, Raman spectra of all amino acids are also obtained and investigated, showing good agreement with experiments. The models are further extended to small peptides. We find that adding structures containing peptide bonds to the training set greatly improves predictions, even for peptides not included in training sets.
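As a rough illustration of how a Raman line emerges from a polarizability that fluctuates along a trajectory, the sketch below takes a synthetic polarizability trace and computes its power spectrum, which by the Wiener-Khinchin theorem equals the Fourier transform of the autocorrelation function. It is a toy example and not the authors' workflow: the trace is a single cosine rather than a machine learning prediction along a molecular dynamics run, and the names `power_spectrum`, `alpha_trace`, and `mode_bin` are hypothetical. Prefactors, the isotropic/anisotropic decomposition of the tensor, and quantum corrections are omitted.

```python
import cmath
import math


def power_spectrum(signal):
    """Return |DFT|^2 of the mean-subtracted signal for bins 0..N/2."""
    n = len(signal)
    mean = sum(signal) / n
    fluct = [s - mean for s in signal]  # only fluctuations contribute to the lines
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            f * cmath.exp(-2j * math.pi * k * j / n) for j, f in enumerate(fluct)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


n_steps = 256   # hypothetical number of trajectory frames
mode_bin = 20   # hypothetical vibrational mode, placed exactly on a DFT bin

# Synthetic isotropic polarizability: constant part plus one oscillating mode.
alpha_trace = [
    10.0 + 0.1 * math.cos(2.0 * math.pi * mode_bin * j / n_steps)
    for j in range(n_steps)
]

spectrum = power_spectrum(alpha_trace)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
```

With a time step `dt`, bin `k` corresponds to the frequency `k / (n_steps * dt)`, so the position of `peak_bin` gives the frequency of the mode that modulates the polarizability.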
Affiliation(s)
- Ethan Berger
  - Microelectronics Research Unit, Faculty of Information Technology and Electrical Engineering, University of Oulu, P.O. Box 4500, Oulu FIN-90014, Finland
- Juha Niemelä
  - Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu FIN-90014, Finland
- Outi Lampela
  - Biocenter Oulu and Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu FIN-90014, Finland
- André H Juffer
  - Biocenter Oulu and Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu FIN-90014, Finland
- Hannu-Pekka Komsa
  - Microelectronics Research Unit, Faculty of Information Technology and Electrical Engineering, University of Oulu, P.O. Box 4500, Oulu FIN-90014, Finland
2
Jana A, Shepherd S, Litman Y, Wilkins DM. Learning Electronic Polarizations in Aqueous Systems. J Chem Inf Model 2024; 64:4426-4435. [PMID: 38804973 PMCID: PMC11167596 DOI: 10.1021/acs.jcim.4c00421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Revised: 05/10/2024] [Accepted: 05/14/2024] [Indexed: 05/29/2024]
Abstract
The polarization of periodically repeating systems is a discontinuous function of the atomic positions, a fact which seems at first to stymie attempts at their statistical learning. Two approaches to build models for bulk polarizations are compared: one in which a simple point charge model is used to preprocess the raw polarization to give a learning target that is a smooth function of atomic positions, and the total polarization is learned as a sum of atom-centered dipoles, and one in which instead the average position of Wannier centers around atoms is predicted. For a range of bulk aqueous systems, both of these methods perform comparatively well, with the former being slightly better but often requiring an extra effort to find a suitable point charge model. As a challenging test, we also analyze the performance of the models at the air-water interface. In this case, while the Wannier center approach delivers accurate predictions without further modifications, the preprocessing method requires augmentation with information from isolated water molecules to reach similar accuracy. Finally, we present a simple protocol to preprocess the polarizations in a data-driven way using a small number of derivatives calculated at a much lower level of theory, thus overcoming the need to find point charge models without appreciably increasing the computation cost. We believe that the training strategies presented here help the construction of accurate polarization models required for the study of the dielectric properties of realistic complex bulk systems and interfaces with ab initio accuracy.
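The discontinuity and the point-charge preprocessing described in this abstract can be illustrated with a one-dimensional toy model. The sketch below is an assumption-laden illustration and not the authors' protocol: a single ion with a hypothetical charge moves through a hypothetical periodic cell, the "raw" dipole is reported folded onto one branch (so it jumps by one quantum when the ion passes a branch boundary), and subtracting a point-charge estimate before folding again gives a smooth learning target. The names `fold`, `max_jump`, `raw`, and `target` are made up for this example.

```python
import math


def fold(value, quantum):
    """Map value onto the branch [-quantum/2, quantum/2)."""
    return value - quantum * math.floor(value / quantum + 0.5)


def max_jump(series):
    """Largest absolute change between consecutive samples."""
    return max(abs(b - a) for a, b in zip(series, series[1:]))


cell_length = 1.0               # hypothetical 1D cell length
charge = 1.0                    # hypothetical ion charge
quantum = charge * cell_length  # the dipole is only defined modulo this value

# The ion displacement sweeps smoothly across two cell lengths.
positions = [2.0 * cell_length * i / 199 for i in range(200)]

# "Raw" dipole as a periodic calculation would report it: a point-charge part
# plus a small smooth electronic correction, folded onto a single branch.
raw = [
    fold(charge * x + 0.05 * math.sin(2.0 * math.pi * x / cell_length), quantum)
    for x in positions
]

# Preprocessing: subtract the point-charge estimate and fold the difference.
# What remains is the small smooth correction, free of branch jumps.
target = [fold(p - charge * x, quantum) for p, x in zip(raw, positions)]
```

In this toy case `max_jump(raw)` is close to one full quantum, while `max_jump(target)` stays at the scale of the smooth correction. The scheme only works while the residual is smaller than half a quantum, which is why a poorly chosen point charge model would fail to remove the jumps.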
Affiliation(s)
- Arnab Jana
  - Centre for Quantum Materials and Technologies, School of Mathematics and Physics, Queen’s University Belfast, Belfast BT7 1NN, U.K.
- Sam Shepherd
  - Centre for Quantum Materials and Technologies, School of Mathematics and Physics, Queen’s University Belfast, Belfast BT7 1NN, U.K.
- Yair Litman
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
- David M. Wilkins
  - Centre for Quantum Materials and Technologies, School of Mathematics and Physics, Queen’s University Belfast, Belfast BT7 1NN, U.K.
3
Althorpe SC. Path Integral Simulations of Condensed-Phase Vibrational Spectroscopy. Annu Rev Phys Chem 2024; 75:397-420. [PMID: 38941531 DOI: 10.1146/annurev-physchem-090722-124705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/30/2024]
Abstract
Recent theoretical and algorithmic developments have improved the accuracy with which path integral dynamics methods can include nuclear quantum effects in simulations of condensed-phase vibrational spectra. Such methods are now understood to be approximations to the delocalized classical Matsubara dynamics of smooth Feynman paths, which dominate the dynamics of systems such as liquid water at room temperature. Focusing mainly on simulations of liquid water and hexagonal ice, we explain how the recently developed quasicentroid molecular dynamics (QCMD), fast-QCMD, and temperature-elevated path integral coarse-graining simulations (Te PIGS) methods generate classical dynamics on potentials of mean force obtained by averaging over quantum thermal fluctuations. These new methods give very close agreement with one another, and the Te PIGS method has recently yielded excellent agreement with experimentally measured vibrational spectra for liquid water, ice, and the liquid-air interface. We also discuss the limitations of such methods.
Affiliation(s)
- Stuart C Althorpe
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
4
Xu N, Rosander P, Schäfer C, Lindgren E, Österbacka N, Fang M, Chen W, He Y, Fan Z, Erhart P. Tensorial Properties via the Neuroevolution Potential Framework: Fast Simulation of Infrared and Raman Spectra. J Chem Theory Comput 2024; 20:3273-3284. [PMID: 38572734 PMCID: PMC11044275 DOI: 10.1021/acs.jctc.3c01343] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 03/19/2024] [Accepted: 04/01/2024] [Indexed: 04/05/2024]
Abstract
Infrared and Raman spectroscopy are widely used for the characterization of gases, liquids, and solids, as the spectra contain a wealth of information concerning, in particular, the dynamics of these systems. Atomic scale simulations can be used to predict such spectra but are often severely limited due to high computational cost or the need for strong approximations that limit the application range and reliability. Here, we introduce a machine learning (ML) accelerated approach that addresses these shortcomings and provides a significant performance boost in terms of data and computational efficiency compared with earlier ML schemes. To this end, we generalize the neuroevolution potential approach to enable the prediction of rank one and two tensors to obtain the tensorial neuroevolution potential (TNEP) scheme. We apply the resulting framework to construct models for the dipole moment, polarizability, and susceptibility of molecules, liquids, and solids and show that our approach compares favorably with several ML models from the literature with respect to accuracy and computational efficiency. Finally, we demonstrate the application of the TNEP approach to the prediction of infrared and Raman spectra of liquid water, a molecule (PTAF-), and a prototypical perovskite with strong anharmonicity (BaZrO3). The TNEP approach is implemented in the free and open source software package gpumd, which makes this methodology readily available to the scientific community.
Affiliation(s)
- Nan Xu
  - Institute of Zhejiang University-Quzhou, Quzhou 324000, P. R. China
  - College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310058, P. R. China
- Petter Rosander
  - Department of Physics, Chalmers University of Technology, SE-41296 Gothenburg, Sweden
- Christian Schäfer
  - Department of Physics, Chalmers University of Technology, SE-41296 Gothenburg, Sweden
- Eric Lindgren
  - Department of Physics, Chalmers University of Technology, SE-41296 Gothenburg, Sweden
- Nicklas Österbacka
  - Department of Physics, Chalmers University of Technology, SE-41296 Gothenburg, Sweden
- Mandi Fang
  - Institute of Zhejiang University-Quzhou, Quzhou 324000, P. R. China
  - College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310058, P. R. China
- Wei Chen
  - State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing 100190, P. R. China
- Yi He
  - Institute of Zhejiang University-Quzhou, Quzhou 324000, P. R. China
  - College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310058, P. R. China
  - Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States
- Zheyong Fan
  - College of Physical Science and Technology, Bohai University, Jinzhou 121013, P. R. China
- Paul Erhart
  - Department of Physics, Chalmers University of Technology, SE-41296 Gothenburg, Sweden
5
de la Puente M, Gomez A, Laage D. Neural Network-Based Sum-Frequency Generation Spectra of Pure and Acidified Water Interfaces with Air. J Phys Chem Lett 2024; 15:3096-3102. [PMID: 38470065 DOI: 10.1021/acs.jpclett.4c00113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024]
Abstract
The affinity of hydronium ions (H3O+) for the air-water interface is a crucial question in environmental chemistry. While sum-frequency generation (SFG) spectroscopy has been instrumental in indicating the preference of H3O+ for the interface, key questions persist regarding the molecular origin of the SFG spectral changes in acidified water. Here we combine nanosecond long neural network (NN) reactive simulations of pure and acidified water slabs with NN predictions of molecular dipoles and polarizabilities to calculate SFG spectra of long reactive trajectories including proton transfer events. Our simulations show that H3O+ ions cause two distinct changes in phase-resolved SFG spectra: first, a low-frequency tail due to the vibrations of H3O+ and its first hydration shell, analogous to the bulk proton continuum, and second, an enhanced hydrogen-bonded band due to the ion-induced static field polarizing molecules in deeper layers. Our calculations confirm that changes in the SFG spectra of acidic solutions are caused by hydronium ions preferentially residing at the interface.
Affiliation(s)
- Miguel de la Puente
  - PASTEUR, Department of Chemistry, École Normale Supérieure, PSL University, Sorbonne Université, CNRS, 75005 Paris, France
- Axel Gomez
  - PASTEUR, Department of Chemistry, École Normale Supérieure, PSL University, Sorbonne Université, CNRS, 75005 Paris, France
- Damien Laage
  - PASTEUR, Department of Chemistry, École Normale Supérieure, PSL University, Sorbonne Université, CNRS, 75005 Paris, France
6
Kapil V, Kovács DP, Csányi G, Michaelides A. First-principles spectroscopy of aqueous interfaces using machine-learned electronic and quantum nuclear effects. Faraday Discuss 2024; 249:50-68. [PMID: 37799072 PMCID: PMC10845015 DOI: 10.1039/d3fd00113j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Accepted: 07/18/2023] [Indexed: 10/07/2023]
Abstract
Vibrational spectroscopy is a powerful approach to visualising interfacial phenomena. However, extracting structural and dynamical information from vibrational spectra is a challenge that requires first-principles simulations, including non-Condon and quantum nuclear effects. We address this challenge by developing a machine-learning enhanced first-principles framework to speed up predictive modelling of infrared, Raman, and sum-frequency generation spectra. Our approach uses machine learning potentials that encode quantum nuclear effects to generate quantum trajectories using simple molecular dynamics efficiently. In addition, we reformulate bulk and interfacial selection rules to express them unambiguously in terms of the derivatives of polarisation and polarisabilities of the whole system and predict these derivatives efficiently using fully-differentiable machine learning models of dielectric response tensors. We demonstrate our framework's performance by predicting the IR, Raman, and sum-frequency generation spectra of liquid water, ice and the water-air interface by achieving near quantitative agreement with experiments at nearly the same computational efficiency as pure classical methods. Finally, to aid the experimental discovery of new phases of nanoconfined water, we predict the temperature-dependent vibrational spectra of monolayer water across the solid-hexatic-liquid phase transition.
Affiliation(s)
- Venkat Kapil
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
- Gábor Csányi
  - Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, UK
- Angelos Michaelides
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
7
Yu X, Chiang KY, Yu CC, Bonn M, Nagata Y. On the Fresnel factor correction of sum-frequency generation spectra of interfacial water. J Chem Phys 2023; 158:044701. [PMID: 36725499 DOI: 10.1063/5.0133428] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 01/05/2023]
Abstract
Insights into the microscopic structure of aqueous interfaces are essential for understanding the chemical and physical processes on the water surface, including chemical synthesis, atmospheric chemistry, and events in biomolecular systems. These aqueous interfaces have been probed by heterodyne-detected sum-frequency generation (HD-SFG) spectroscopy. To obtain the molecular response from the measured HD-SFG spectra, one needs to correct the measured ssp spectra for local electromagnetic field effects at the interface due to a spatially varying dielectric function. This so-called Fresnel factor correction can change the inferred response substantially, and different ways of performing this correction lead to different conclusions about the interfacial water response. Here, we compare the simulated and experimental spectra at the air/water interface. We use three previously developed models to compare the experiment with theory: an advanced approach taking into account the detailed inhomogeneous interfacial dielectric profile and the Lorentz and slab models to approximate the interfacial dielectric function. Using the advanced model, we obtain an excellent quantitative agreement between theory and experiment, in both spectral shape and amplitude. Remarkably, we find that for the Fresnel factor correction of the ssp spectra, the Lorentz model for the interfacial dielectric function is equally accurate in the hydrogen (H)-bonded region of the response, while the slab model underestimates this response significantly. The Lorentz model, thus, provides a straightforward method to obtain the molecular response from the measured spectra of aqueous interfaces in the H-bonded region.
Affiliation(s)
- Xiaoqing Yu
  - Max Planck Institute for Polymer Research, Ackermannweg 10, 55128 Mainz, Germany
- Kuo-Yang Chiang
  - Max Planck Institute for Polymer Research, Ackermannweg 10, 55128 Mainz, Germany
- Chun-Chieh Yu
  - Max Planck Institute for Polymer Research, Ackermannweg 10, 55128 Mainz, Germany
- Mischa Bonn
  - Max Planck Institute for Polymer Research, Ackermannweg 10, 55128 Mainz, Germany
- Yuki Nagata
  - Max Planck Institute for Polymer Research, Ackermannweg 10, 55128 Mainz, Germany
8
Schienbein P. Spectroscopy from Machine Learning by Accurately Representing the Atomic Polar Tensor. J Chem Theory Comput 2023; 19:705-712. [PMID: 36695707 PMCID: PMC9933433 DOI: 10.1021/acs.jctc.2c00788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
Abstract
Vibrational spectroscopy is a key technique to elucidate microscopic structure and dynamics. Without the aid of theoretical approaches, it is, however, often difficult to understand such spectra at a microscopic level. Ab initio molecular dynamics has repeatedly proved to be suitable for this purpose; however, the computational cost can be daunting. Here, the E(3)-equivariant neural network e3nn is used to fit the atomic polar tensor of liquid water a posteriori on top of existing molecular dynamics simulations. Notably, the introduced methodology is general and thus transferable to any other system as well. The target property is most fundamental and gives access to the IR spectrum, and more importantly, it is a highly powerful tool to directly assign IR spectral features to nuclear motion, a connection which has been pursued in the past but only using severe approximations due to the prohibitive computational cost. The herein introduced methodology overcomes this bottleneck. To benchmark the machine learning model, the IR spectrum of liquid water is calculated, indeed showing excellent agreement with the explicit reference calculation. In conclusion, the presented methodology gives a new route to calculate accurate IR spectra from molecular dynamics simulations and will facilitate the understanding of such spectra on a microscopic level.
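The link between the atomic polar tensor (APT) and the IR spectrum mentioned in this abstract can be written compactly. The block below states the standard textbook relations rather than the paper's own notation: the symbols $\mathbf{M}$ (total dipole moment), $\mathbf{r}_i$ and $\mathbf{v}_i$ (position and velocity of atom $i$), and $\mathbf{P}_i$ (its APT) are chosen here for illustration, and line-shape prefactors are omitted.

```latex
\begin{align}
  P_{i,\alpha\beta} &= \frac{\partial M_\alpha}{\partial r_{i,\beta}}, \\
  \dot{M}_\alpha(t) &= \sum_{i} \sum_{\beta} P_{i,\alpha\beta}(t)\, v_{i,\beta}(t), \\
  I_{\mathrm{IR}}(\omega) &\propto \int_{-\infty}^{\infty} \mathrm{d}t\; e^{-\mathrm{i}\omega t}\,
      \big\langle \dot{\mathbf{M}}(0) \cdot \dot{\mathbf{M}}(t) \big\rangle .
\end{align}
```

Because the dipole derivative is a sum over atoms, each term $\mathbf{P}_i \mathbf{v}_i$ can be inspected on its own, which is what allows spectral features to be assigned to the motion of specific nuclei.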
9
Shanavas Rasheeda D, Martín Santa Daría A, Schröder B, Mátyus E, Behler J. High-dimensional neural network potentials for accurate vibrational frequencies: the formic acid dimer benchmark. Phys Chem Chem Phys 2022; 24:29381-29392. [PMID: 36459127 DOI: 10.1039/d2cp03893e] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/25/2022]
Abstract
In recent years, machine learning potentials (MLP) for atomistic simulations have attracted a lot of attention in chemistry and materials science. Many new approaches have been developed with the primary aim to transfer the accuracy of electronic structure calculations to large condensed systems containing thousands of atoms. In spite of these advances, the reliability of modern MLPs in reproducing the subtle details of the multi-dimensional potential-energy surface is still difficult to assess for such systems. On the other hand, moderately sized systems enabling the application of tools for thorough and systematic quality-control are nowadays rarely investigated. In this work we use benchmark-quality harmonic and anharmonic vibrational frequencies as a sensitive probe for the validation of high-dimensional neural network potentials. For the case of the formic acid dimer, a frequently studied model system for which stringent spectroscopic data became recently available, we show that high-quality frequencies can be obtained from state-of-the-art calculations in excellent agreement with coupled cluster theory and experimental data.
Affiliation(s)
- Dilshana Shanavas Rasheeda
  - Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, 37077 Göttingen, Germany
- Alberto Martín Santa Daría
  - ELTE, Eötvös Loránd University, Institute of Chemistry, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
- Benjamin Schröder
  - Universität Göttingen, Institut für Physikalische Chemie, Tammannstraße 6, 37077 Göttingen, Germany
- Edit Mátyus
  - ELTE, Eötvös Loránd University, Institute of Chemistry, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
- Jörg Behler
  - Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, 37077 Göttingen, Germany
10
11
A complete description of thermodynamic stabilities of molecular crystals. Proc Natl Acad Sci U S A 2022; 119:2111769119. [PMID: 35131847 PMCID: PMC8832981 DOI: 10.1073/pnas.2111769119] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Accepted: 12/23/2021] [Indexed: 12/27/2022]
Abstract
Predicting stable polymorphs of molecular crystals remains one of the grand challenges of computational science. Current methods invoke approximations to electronic structure and statistical mechanics and thus fail to consistently reproduce the delicate balance of physical effects determining thermodynamic stability. We compute the rigorous ab initio Gibbs free energies for competing polymorphs of paradigmatic compounds, using machine learning to mitigate costs. The accurate description of electronic structure and full treatment of quantum statistical mechanics allow us to predict the experimentally observed phase behavior. This constitutes a key step toward the first-principles design of functional materials for applications from photovoltaics to pharmaceuticals.
Predictions of relative stabilities of (competing) molecular crystals are of great technological relevance, most notably for the pharmaceutical industry. However, they present a long-standing challenge for modeling, as often minuscule free energy differences are sensitively affected by the description of electronic structure, the statistical mechanics of the nuclei and the cell, and thermal expansion. The importance of these effects has been individually established, but rigorous free energy calculations for general molecular compounds, which simultaneously account for all effects, have hitherto not been computationally viable. Here we present an efficient “end to end” framework that seamlessly combines state-of-the-art electronic structure calculations, machine-learning potentials, and advanced free energy methods to calculate ab initio Gibbs free energies for general organic molecular materials. The facile generation of machine-learning potentials for a diverse set of polymorphic compounds (benzene, glycine, and succinic acid) and predictions of thermodynamic stabilities in qualitative and quantitative agreement with experiments highlight that predictive thermodynamic studies of industrially relevant molecular materials are no longer a daunting task.