1
Barrios Herrera L, Lourenço MP, Hostaš J, Calaminici P, Köster AM, Tchagang A, Salahub DR. Active-learning for global optimization of Ni-Ceria nanoparticles: The case of Ce(4-x)NixO(8-x) (x = 1, 2, 3). J Comput Chem 2024; 45:1643-1656. [PMID: 38551129 DOI: 10.1002/jcc.27346] [Received: 10/19/2023] [Revised: 02/15/2024] [Accepted: 03/05/2024] [Indexed: 06/04/2024]
Abstract
Ni-CeO2 nanoparticles (NPs) are promising nanocatalysts for water splitting and water gas shift reactions due to the ability of ceria to temporarily donate oxygen to the catalytic reaction and accept oxygen after the reaction is completed. Therefore, elucidating how different properties of the Ni-Ceria NPs relate to the activity and selectivity of the catalytic reaction, is of crucial importance for the development of novel catalysts. In this work the active learning (AL) method based on machine learning regression and its uncertainty is used for the global optimization of Ce(4-x)NixO(8-x) (x = 1, 2, 3) nanoparticles, employing density functional theory calculations. Additionally, further investigation of the NPs by mass-scaled parallel-tempering Born-Oppenheimer molecular dynamics resulted in the same putative global minimum structures found by AL, demonstrating the robustness of our AL search to learn from small datasets and assist in the global optimization of complex electronic structure systems.
Affiliation(s)
- Lizandra Barrios Herrera
  - Department of Chemistry, Department of Physics and Astronomy, CMS Centre for Molecular Simulation, IQST Institute for Quantum Science and Technology, Quantum Alberta, University of Calgary, Calgary, Canada
- Maicon Pierre Lourenço
  - Departamento de Química e Física, Centro de Ciências Exatas, Naturais e da Saúde (CCENS), Universidade Federal do Espírito Santo, Espírito Santo, Brasil
- Jiří Hostaš
  - Department of Chemistry, Department of Physics and Astronomy, CMS Centre for Molecular Simulation, IQST Institute for Quantum Science and Technology, Quantum Alberta, University of Calgary, Calgary, Canada
  - Digital Technologies Research Centre, National Research Council of Canada, Ottawa, Canada
- Alain Tchagang
  - Digital Technologies Research Centre, National Research Council of Canada, Ottawa, Canada
- Dennis R Salahub
  - Department of Chemistry, Department of Physics and Astronomy, CMS Centre for Molecular Simulation, IQST Institute for Quantum Science and Technology, Quantum Alberta, University of Calgary, Calgary, Canada
2
Deo S, Kreider ME, Kamat G, Hubert M, Zamora Zeledón JA, Wei L, Matthews J, Keyes N, Singh I, Jaramillo TF, Abild-Pedersen F, Burke Stevens M, Winther K, Voss J. Interpretable Machine Learning Models for Practical Antimonate Electrocatalyst Performance. Chemphyschem 2024; 25:e202400010. [PMID: 38547332 DOI: 10.1002/cphc.202400010] [Received: 01/03/2024] [Revised: 02/27/2024] [Indexed: 07/03/2024]
Abstract
Computationally predicting the performance of catalysts under reaction conditions is a challenging task due to the complexity of catalytic surfaces and their evolution in situ, different reaction paths, and the presence of solid-liquid interfaces in the case of electrochemistry. We demonstrate here how relatively simple machine learning models can be found that enable prediction of experimentally observed onset potentials. Inputs to our model are comprised of data from the oxygen reduction reaction on non-precious transition-metal antimony oxide nanoparticulate catalysts with a combination of experimental conditions and computationally affordable bulk atomic and electronic structural descriptors from density functional theory simulations. From human-interpretable genetic programming models, we identify key experimental descriptors and key supplemental bulk electronic and atomic structural descriptors that govern trends in onset potentials for these oxides and deduce how these descriptors should be tuned to increase onset potentials. We finally validate these machine learning predictions by experimentally confirming that scandium as a dopant in nickel antimony oxide leads to a desired onset potential increase. Macroscopic experimental factors are found to be crucially important descriptors to be considered for models of catalytic performance, highlighting the important role machine learning can play here even in the presence of small datasets.
Affiliation(s)
- Shyam Deo
  - Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
  - SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
- Melissa E Kreider
  - Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
  - SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
- Gaurav Kamat
  - Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
  - SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
- McKenzie Hubert
  - Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
  - SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
- José A Zamora Zeledón
  - Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
  - SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
- Lingze Wei
  - Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
  - SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
- Jesse Matthews
  - Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
  - SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
- Nathaniel Keyes
  - Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
  - SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
- Ishaan Singh
  - Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
  - SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
- Thomas F Jaramillo
  - Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
  - SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
- Frank Abild-Pedersen
  - SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
- Michaela Burke Stevens
  - SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
- Kirsten Winther
  - SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
- Johannes Voss
  - SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
3
Yang J, Li J, Li J, Li J. Gaussian Process Regression for State-to-State Integral Cross Sections: The Case of the O + O2 Collision Dissociation Reactions. J Phys Chem A 2024; 128:4966-4975. [PMID: 38869143 DOI: 10.1021/acs.jpca.4c01445] [Indexed: 06/14/2024]
Abstract
Research on hypersonic vehicles has become increasingly important worldwide in recent years. However, accurately simulating the dynamics of the nonequilibrium high-temperature reactions that are in the hypersonic flow around the vehicles presents a significant challenge as a large number of states and transitions are accessible even for the smallest atom-diatom reaction systems. It is quite difficult, sometimes even impossible, to exhaustively investigate all relevant combinations or determine high-dimensional analytical representations for the state-to-state reaction probabilities. In this study, we used Gaussian process regression (GPR) to fit a model based on only 807 QCT data for training. The confidence interval of the GPR prediction and the Kullback-Leibler (KL) divergence were used to help minimize the sampling amount of data for fitting the converged GPR model. The model aims to predict the state-to-state integral cross section (ICS) of the O + O2 → 3O dissociation reaction under random initial conditions (Et, v, j). In total, it took almost a month to obtain this converged GPR model, but it took only a few seconds to predict the ICS value for any initial condition. For 330 initial conditions not included in the training set, the mean-square error (MSE) between the QCT-calculated ICSs and the GPR-predicted ones is only 0.08 Å² and the R² is 0.9986, indicating that the GPR model can replace the direct expensive QCT calculation with high accuracy. Finally, we calculated the equilibrium dissociation rate coefficients based on the StS ICS values predicted by the GPR model, and the results were in good agreement with available experimental and theoretical results. Thus, this study provides an effective and accurate approach to the extensive direct state-to-state reaction dynamic calculations.
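The uncertainty-guided sampling idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes a one-dimensional input, a fixed RBF kernel with unit length scale, and a toy target function standing in for the QCT calculations. The next training point is simply the candidate with the largest GP predictive variance. All function names (`rbf`, `gp_predict`, `next_sample`, `active_fit`) are hypothetical, and the KL-divergence convergence check mentioned in the abstract is not reproduced.

```python
import math


def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential kernel between two scalar inputs (assumed form)."""
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = [0.0] * len(b)
    for i in range(len(b)):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    return y


def solve_upper_t(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_predict(xs, ys, x_new, noise=1e-6):
    """Return the GP posterior mean and variance at x_new."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, ys))
    k_star = [rbf(x, x_new) for x in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    variance = rbf(x_new, x_new) - sum(vi * vi for vi in v)
    return mean, max(variance, 0.0)


def next_sample(xs, ys, candidates):
    """Pick the candidate where the model is least certain."""
    return max(candidates, key=lambda c: gp_predict(xs, ys, c)[1])


def active_fit(f, xs, candidates, n_steps):
    """Grow the training set one point at a time, guided by predictive variance."""
    xs = list(xs)
    ys = [f(x) for x in xs]
    pool = [c for c in candidates if c not in xs]
    for _ in range(n_steps):
        if not pool:
            break
        x_next = next_sample(xs, ys, pool)
        pool.remove(x_next)
        xs.append(x_next)
        ys.append(f(x_next))  # stands in for one expensive reference calculation
    return xs, ys
```

In this toy setting each call to `f` plays the role of one expensive reference calculation, so the loop spends its budget where the surrogate is least certain rather than on a uniform grid.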
Affiliation(s)
- Jiawei Yang
  - School of Chemistry and Chemical Engineering & Chongqing Key Laboratory of Chemical Theory and Mechanism, Chongqing University, Chongqing 401331, China
- Jia Li
  - School of Chemistry and Chemical Engineering & Chongqing Key Laboratory of Chemical Theory and Mechanism, Chongqing University, Chongqing 401331, China
- Junhong Li
  - School of Chemistry and Chemical Engineering & Chongqing Key Laboratory of Chemical Theory and Mechanism, Chongqing University, Chongqing 401331, China
- Jun Li
  - School of Chemistry and Chemical Engineering & Chongqing Key Laboratory of Chemical Theory and Mechanism, Chongqing University, Chongqing 401331, China
4
Sose AT, Gustke T, Wang F, Anand G, Pasupuleti S, Savara A, Deshmukh SA. Evaluation of Sampling Algorithms Used for Bayesian Uncertainty Quantification of Molecular Dynamics Force Fields. J Chem Theory Comput 2024. [PMID: 38924093 DOI: 10.1021/acs.jctc.4c00130] [Indexed: 06/28/2024]
Abstract
New Bayesian parameter estimation methods have the capability to enable more physically realistic and reliable molecular dynamics (MD) simulations by providing accurate estimates of uncertainties of force-field (FF) parameters and associated properties. However, the choice of which Bayesian parameter estimation algorithm to use has not been widely investigated, despite its impact on the effective exploration of parameter space. Here, using a case example of the Embedded Atom Method (EAM) FF parameters, we investigated the ramifications of several of the algorithm choices. We found that Ensemble Slice Sampling (ESS) and Affine-Invariant Ensemble Sampling (AIES) demonstrate a new level of superior performance, culminating in more accurate parameter and property estimations with tighter uncertainty bounds, compared to traditional methods such as Metropolis-Hastings (MH), Gradient Search (GS), and Uniform Random Sampler (URS). We demonstrate that Bayesian Uncertainty Quantification with ESS and AIES leads to significantly more accurate and reliable predictions of the FF parameters and properties. The results suggest that ESS and AIES should be used to obtain more accurate parameter and uncertainty estimations while providing deeper physical insights.
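For orientation, the Metropolis-Hastings baseline named in this abstract can be sketched in a few lines. This is an illustration under loud assumptions, not the authors' setup: the single parameter `theta`, the Gaussian likelihood and prior in `log_posterior`, and the toy data are all hypothetical stand-ins for force-field parameters and reference properties, and the ensemble samplers (ESS, AIES) that the study favors are not implemented here.

```python
import math
import random


def log_posterior(theta, data, sigma=1.0, prior_sd=10.0):
    """Log posterior (up to a constant) for a toy model: Gaussian prior and likelihood."""
    log_prior = -0.5 * (theta / prior_sd) ** 2
    log_like = sum(-0.5 * ((d - theta) / sigma) ** 2 for d in data)
    return log_prior + log_like


def metropolis_hastings(log_p, theta0, n_steps, step=0.5, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    theta, lp = theta0, log_p(theta0)
    samples = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)
        lp_prop = log_p(proposal)
        # Accept uphill moves always, downhill moves with probability exp(delta).
        if lp_prop >= lp or rng.random() < math.exp(lp_prop - lp):
            theta, lp = proposal, lp_prop
        samples.append(theta)
    return samples


def summarize(samples, burn_in):
    """Posterior mean and standard deviation after discarding burn-in."""
    kept = samples[burn_in:]
    mean = sum(kept) / len(kept)
    sd = math.sqrt(sum((s - mean) ** 2 for s in kept) / len(kept))
    return mean, sd
```

The spread returned by `summarize` is the kind of parameter uncertainty the abstract refers to; a single random-walk chain like this one depends on a hand-tuned `step`, which is one practical reason to compare it against ensemble-based samplers.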
Affiliation(s)
- Abhishek T Sose
  - Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
- Troy Gustke
  - Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
- Fangxi Wang
  - Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
- Gaurav Anand
  - Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
- Sanjana Pasupuleti
  - Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
- Aditya Savara
  - Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830, United States
- Sanket A Deshmukh
  - Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
5
Berger E, Niemelä J, Lampela O, Juffer AH, Komsa HP. Raman Spectra of Amino Acids and Peptides from Machine Learning Polarizabilities. J Chem Inf Model 2024; 64:4601-4612. [PMID: 38829726 DOI: 10.1021/acs.jcim.4c00077] [Indexed: 06/05/2024]
Abstract
Raman spectroscopy is an important tool in the study of vibrational properties and composition of molecules, peptides, and even proteins. Raman spectra can be simulated based on the change of the electronic polarizability with vibrations, which can nowadays be efficiently obtained via machine learning models trained on first-principles data. However, the transferability of the models trained on small molecules to larger structures is unclear, and direct training on large structures is prohibitively expensive. In this work, we first train two machine learning models to predict the polarizabilities of all 20 amino acids. Both models are carefully benchmarked and compared to density functional theory (DFT) calculations, with the neural network method being found to offer better transferability. By combination of machine learning models with classical force field molecular dynamics, Raman spectra of all amino acids are also obtained and investigated, showing good agreement with experiments. The models are further extended to small peptides. We find that adding structures containing peptide bonds to the training set greatly improves predictions, even for peptides not included in training sets.
Affiliation(s)
- Ethan Berger
  - Microelectronics Research Unit, Faculty of Information Technology and Electrical Engineering, University of Oulu, P.O. Box 4500, Oulu FIN-90014, Finland
- Juha Niemelä
  - Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu FIN-90014, Finland
- Outi Lampela
  - Biocenter Oulu and Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu FIN-90014, Finland
- André H Juffer
  - Biocenter Oulu and Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu FIN-90014, Finland
- Hannu-Pekka Komsa
  - Microelectronics Research Unit, Faculty of Information Technology and Electrical Engineering, University of Oulu, P.O. Box 4500, Oulu FIN-90014, Finland
6
Singh S, Hernández-Lobato JM. Deep Kernel learning for reaction outcome prediction and optimization. Commun Chem 2024; 7:136. [PMID: 38877182 PMCID: PMC11178803 DOI: 10.1038/s42004-024-01219-x] [Received: 02/10/2024] [Accepted: 06/05/2024] [Indexed: 06/16/2024] Open
Abstract
Recent years have seen a rapid growth in the application of various machine learning methods for reaction outcome prediction. Deep learning models have gained popularity due to their ability to learn representations directly from the molecular structure. Gaussian processes (GPs), on the other hand, provide reliable uncertainty estimates but are unable to learn representations from the data. We combine the feature learning ability of neural networks (NNs) with uncertainty quantification of GPs in a deep kernel learning (DKL) framework to predict the reaction outcome. The DKL model is observed to obtain very good predictive performance across different input representations. It significantly outperforms standard GPs and provides comparable performance to graph neural networks, but with uncertainty estimation. Additionally, the uncertainty estimates on predictions provided by the DKL model facilitated its incorporation as a surrogate model for Bayesian optimization (BO). The proposed method, therefore, has a great potential towards accelerating reaction discovery by integrating accurate predictive models that provide reliable uncertainty estimates with BO.
Affiliation(s)
- Sukriti Singh
  - Department of Engineering, University of Cambridge, Cambridge, UK.
7
Weymuth T, Unsleber JP, Türtscher PL, Steiner M, Sobez JG, Müller CH, Mörchen M, Klasovita V, Grimmel SA, Eckhoff M, Csizi KS, Bosia F, Bensberg M, Reiher M. SCINE-Software for chemical interaction networks. J Chem Phys 2024; 160:222501. [PMID: 38857173 DOI: 10.1063/5.0206974] [Received: 03/05/2024] [Accepted: 05/09/2024] [Indexed: 06/12/2024] Open
Abstract
The software for chemical interaction networks (SCINE) project aims at pushing the frontier of quantum chemical calculations on molecular structures to a new level. While calculations on individual structures as well as on simple relations between them have become routine in chemistry, new developments have pushed the frontier in the field to high-throughput calculations. Chemical relations may be created by a search for specific molecular properties in a molecular design attempt, or they can be defined by a set of elementary reaction steps that form a chemical reaction network. The software modules of SCINE have been designed to facilitate such studies. The features of the modules are (i) general applicability of the applied methodologies ranging from electronic structure (no restriction to specific elements of the periodic table) to microkinetic modeling (with little restrictions on molecularity), full modularity so that SCINE modules can also be applied as stand-alone programs or be exchanged for external software packages that fulfill a similar purpose (to increase options for computational campaigns and to provide alternatives in case of tasks that are hard or impossible to accomplish with certain programs), (ii) high stability and autonomous operations so that control and steering by an operator are as easy as possible, and (iii) easy embedding into complex heterogeneous environments for molecular structures taken individually or in the context of a reaction network. A graphical user interface unites all modules and ensures interoperability. All components of the software have been made available as open source and free of charge.
Affiliation(s)
- Thomas Weymuth
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Jan P Unsleber
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Paul L Türtscher
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Miguel Steiner
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Jan-Grimo Sobez
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Charlotte H Müller
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Maximilian Mörchen
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Veronika Klasovita
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Stephanie A Grimmel
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Marco Eckhoff
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Katja-Sophia Csizi
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Francesco Bosia
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Moritz Bensberg
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Markus Reiher
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
8
Zinovjev K, Hedges L, Montagud Andreu R, Woods C, Tuñón I, van der Kamp MW. emle-engine: A Flexible Electrostatic Machine Learning Embedding Package for Multiscale Molecular Dynamics Simulations. J Chem Theory Comput 2024; 20:4514-4522. [PMID: 38804055 PMCID: PMC11171281 DOI: 10.1021/acs.jctc.4c00248] [Received: 02/26/2024] [Revised: 05/17/2024] [Accepted: 05/20/2024] [Indexed: 05/29/2024]
Abstract
We present in this work the emle-engine package (https://github.com/chemle/emle-engine), the implementation of a new machine learning embedding scheme for hybrid machine learning potential/molecular-mechanics (ML/MM) dynamics simulations. The package is based on an embedding scheme that uses a physics-based model of the electronic density and induction with a handful of tunable parameters derived from in vacuo properties of the subsystem to be embedded. This scheme is completely independent of the in vacuo potential and requires only the positions of the atoms of the machine learning subsystem and the positions and partial charges of the molecular mechanics environment. These characteristics allow emle-engine to be employed in existing QM/MM software. We demonstrate that the implemented electrostatic machine learning embedding scheme (named EMLE) is stable in enhanced sampling molecular dynamics simulations. Through the calculation of free energy surfaces of alanine dipeptide in water with two different ML options for the in vacuo potential and three embedding models, we test the performance of EMLE. When compared to the reference DFT/MM surface, the EMLE embedding is clearly superior to the MM one based on fixed partial charges. The configurational dependence of the electronic density and the inclusion of the induction energy introduced by the EMLE model leads to a systematic reduction in the average error of the free energy surface when compared to MM embedding. By enabling the usage of EMLE embedding in practical ML/MM simulations, emle-engine will make it possible to accurately model systems and processes that feature significant variations in the charge distribution of the ML subsystem and/or the interacting environment.
Affiliation(s)
- Kirill Zinovjev
  - Departamento de Química Física, Universidad de Valencia, 46100 Burjassot, Spain
- Lester Hedges
  - School of Biochemistry, University of Bristol, Biomedical Sciences Building, University Walk, Bristol BS8 1TD, U.K.
  - Research Software Engineering, Advanced Computing Research Centre, 31 Great George Street, Bristol BS1 5QD, U.K.
- Christopher Woods
  - Research Software Engineering, Advanced Computing Research Centre, 31 Great George Street, Bristol BS1 5QD, U.K.
- Iñaki Tuñón
  - Departamento de Química Física, Universidad de Valencia, 46100 Burjassot, Spain
- Marc W. van der Kamp
  - School of Biochemistry, University of Bristol, Biomedical Sciences Building, University Walk, Bristol BS8 1TD, U.K.
9
Jana A, Shepherd S, Litman Y, Wilkins DM. Learning Electronic Polarizations in Aqueous Systems. J Chem Inf Model 2024; 64:4426-4435. [PMID: 38804973 PMCID: PMC11167596 DOI: 10.1021/acs.jcim.4c00421] [Received: 03/12/2024] [Revised: 05/10/2024] [Accepted: 05/14/2024] [Indexed: 05/29/2024]
Abstract
The polarization of periodically repeating systems is a discontinuous function of the atomic positions, a fact which seems at first to stymie attempts at their statistical learning. Two approaches to build models for bulk polarizations are compared: one in which a simple point charge model is used to preprocess the raw polarization to give a learning target that is a smooth function of atomic positions and the total polarization is learned as a sum of atom-centered dipoles and one in which instead the average position of Wannier centers around atoms is predicted. For a range of bulk aqueous systems, both of these methods perform comparatively well, with the former being slightly better but often requiring an extra effort to find a suitable point charge model. As a challenging test, we also analyze the performance of the models at the air-water interface. In this case, while the Wannier center approach delivers accurate predictions without further modifications, the preprocessing method requires augmentation with information from isolated water molecules to reach similar accuracy. Finally, we present a simple protocol to preprocess the polarizations in a data-driven way using a small number of derivatives calculated at a much lower level of theory, thus overcoming the need to find point charge models without appreciably increasing the computation cost. We believe that the training strategies presented here help the construction of accurate polarization models required for the study of the dielectric properties of realistic complex bulk systems and interfaces with ab initio accuracy.
Affiliation(s)
- Arnab Jana
  - Centre for Quantum Materials and Technologies, School of Mathematics and Physics, Queen’s University Belfast, Belfast BT7 1NN, U.K.
- Sam Shepherd
  - Centre for Quantum Materials and Technologies, School of Mathematics and Physics, Queen’s University Belfast, Belfast BT7 1NN, U.K.
- Yair Litman
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
- David M. Wilkins
  - Centre for Quantum Materials and Technologies, School of Mathematics and Physics, Queen’s University Belfast, Belfast BT7 1NN, U.K.
10
Hahn AW, Zsombor-Pindera J, Kennepohl P, DeBeer S. Introducing SpectraFit: An Open-Source Tool for Interactive Spectral Analysis. ACS OMEGA 2024; 9:23252-23265. [PMID: 38854548 PMCID: PMC11155667 DOI: 10.1021/acsomega.3c09262] [Received: 11/20/2023] [Revised: 05/05/2024] [Accepted: 05/10/2024] [Indexed: 06/11/2024]
Abstract
In chemistry, analyzing spectra through peak fitting is a crucial task that helps scientists extract useful quantitative information about a sample's chemical composition or electronic structure. To make this process more efficient, we have developed a new open-source software tool called SpectraFit. This tool allows users to perform quick data fitting using expressions of distribution and linear functions through the command line interface (CLI) or Jupyter Notebook, which can run on Linux, Windows, and MacOS, as well as in a Docker container. As part of our commitment to good scientific practice, we have introduced an output file-locking system to ensure the accuracy and consistency of information. This system collects input data, results data, and the initial fitting model in a single file, promoting transparency, reproducibility, collaboration, and innovation. To demonstrate SpectraFit's user-friendly interface and the advantages of its output file-locking system, we are focusing on a series of previously published iron-sulfur dimers and their XAS spectra. We will show how to analyze the XAS spectra via CLI and in a Jupyter Notebook by simultaneously fitting multiple data sets using SpectraFit. Additionally, we will demonstrate how SpectraFit can be used as a black box and white box solution, allowing users to apply their own algorithms to engineer the data further. This publication, along with its Supporting Information and the Jupyter Notebook, serves as a tutorial to guide users through each step of the process. SpectraFit will streamline the peak fitting process and provide a convenient, standardized platform for users to share fitting models, which we hope will improve transparency and reproducibility in the field of spectroscopy.
Affiliation(s)
- Anselm W. Hahn
  - Max Planck Institute for Chemical Energy Conversion, Stiftstraße 34-36, Mülheim an der Ruhr 45470, Germany
- Joseph Zsombor-Pindera
  - Department of Chemistry, University of Calgary, Calgary, AB T2N 1N4, Canada
  - Department of Chemistry, The University of British Columbia, Vancouver, BC V6T 1Z1, Canada
- Pierre Kennepohl
  - Department of Chemistry, University of Calgary, Calgary, AB T2N 1N4, Canada
- Serena DeBeer
  - Max Planck Institute for Chemical Energy Conversion, Stiftstraße 34-36, Mülheim an der Ruhr 45470, Germany
11
Ben Mahmoud C, Gardner JLA, Deringer VL. Data as the next challenge in atomistic machine learning. NATURE COMPUTATIONAL SCIENCE 2024; 4:384-387. [PMID: 38866969 DOI: 10.1038/s43588-024-00636-1] [Indexed: 06/14/2024]
12
Selloni A. Aqueous Titania Interfaces. Annu Rev Phys Chem 2024; 75:47-65. [PMID: 38271659 DOI: 10.1146/annurev-physchem-090722-015957] [Indexed: 01/27/2024]
Abstract
Water-metal oxide interfaces are central to many phenomena and applications, ranging from material corrosion and dissolution to photoelectrochemistry and bioengineering. In particular, the discovery of photocatalytic water splitting on TiO2 has motivated intensive studies of water-TiO2 interfaces for decades. So far, a broad understanding of the interaction of water vapor with several TiO2 surfaces has been obtained. However, much less is known about liquid water-TiO2 interfaces, which are more relevant to many practical applications. Probing these complex systems at the molecular level is experimentally challenging and is sometimes possible only through computational studies. This review summarizes recent advances in the atomistic understanding, mostly through computational simulations, of the structure and dynamics of interfacial water on TiO2 surfaces. The main focus is on the nature, molecular or dissociated, of water in direct contact with low-index defect-free crystalline surfaces. The hydroxyls resulting from water dissociation are essential in the photooxidation of water and critically affect the surface chemistry of TiO2.
Affiliation(s)
- Annabella Selloni
- Department of Chemistry, Princeton University, Princeton, New Jersey, USA
| |
|
13
|
Yang Y, Zhang S, Ranasinghe KD, Isayev O, Roitberg AE. Machine Learning of Reactive Potentials. Annu Rev Phys Chem 2024; 75:371-395. [PMID: 38941524 DOI: 10.1146/annurev-physchem-062123-024417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/30/2024]
Abstract
In the past two decades, machine learning potentials (MLPs) have driven significant developments in chemical, biological, and material sciences. The construction and training of MLPs enable fast and accurate simulations and analysis of thermodynamic and kinetic properties. This review focuses on the application of MLPs to reaction systems with consideration of bond breaking and formation. We review the development of MLP models, primarily with neural network and kernel-based algorithms, and recent applications of reactive MLPs (RMLPs) to systems at different scales. We show how RMLPs are constructed, how they speed up the calculation of reactive dynamics, and how they facilitate the study of reaction trajectories, reaction rates, free energy calculations, and many other calculations. Different data sampling strategies applied in building RMLPs are also discussed with a focus on how to collect structures for rare events and how to further improve their performance with active learning.
Affiliation(s)
- Yinuo Yang
- Department of Chemistry, University of Florida, Gainesville, Florida
| | - Shuhao Zhang
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
| | | | - Olexandr Isayev
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
| | - Adrian E Roitberg
- Department of Chemistry, University of Florida, Gainesville, Florida
| |
|
14
|
Fan M, Wen T, Chen S, Dong Y, Wang C. Perspectives Toward Damage-Tolerant Nanostructure Ceramics. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2024; 11:e2309834. [PMID: 38582503 PMCID: PMC11199990 DOI: 10.1002/advs.202309834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 03/13/2024] [Indexed: 04/08/2024]
Abstract
Advanced ceramic materials and devices call for better reliability and damage tolerance. In addition to their strong bonding nature, there are examples demonstrating superior mechanical properties of nanostructure ceramics, such as damage-tolerant ceramic aerogels that can withstand high deformation without cracking and local plasticity in dense nanocrystalline ceramics. Recent progress is reviewed in this perspective article. Three topics are discussed: highly elastic nano-fibrous ceramic aerogels, load-bearing nanoceramics with improved mechanical properties, and the use of a machine learning-assisted simulation toolbox to understand the relationships among structure, deformation mechanisms, and microstructure-properties. It is hoped that the perspectives presented here can help the discovery, synthesis, and processing of future structural ceramic materials that are insensitive to processing flaws and local damage in service.
Affiliation(s)
- Meicen Fan
- State Key Lab of New Ceramics and Fine Processing, School of Materials Science and Engineering, Tsinghua University, Beijing 100084, China
| | - Tongqi Wen
- Department of Mechanical Engineering, The University of Hong Kong, Hong Kong SAR, China
| | - Shile Chen
- State Key Lab of New Ceramics and Fine Processing, School of Materials Science and Engineering, Tsinghua University, Beijing 100084, China
| | - Yanhao Dong
- State Key Lab of New Ceramics and Fine Processing, School of Materials Science and Engineering, Tsinghua University, Beijing 100084, China
| | - Chang‐An Wang
- State Key Lab of New Ceramics and Fine Processing, School of Materials Science and Engineering, Tsinghua University, Beijing 100084, China
| |
|
15
|
Zarrouk T, Ibragimova R, Bartók AP, Caro MA. Experiment-Driven Atomistic Materials Modeling: A Case Study Combining X-Ray Photoelectron Spectroscopy and Machine Learning Potentials to Infer the Structure of Oxygen-Rich Amorphous Carbon. J Am Chem Soc 2024; 146:14645-14659. [PMID: 38749497 PMCID: PMC11140750 DOI: 10.1021/jacs.4c01897] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Revised: 05/02/2024] [Accepted: 05/03/2024] [Indexed: 05/30/2024]
Abstract
An important yet challenging aspect of atomistic materials modeling is reconciling experimental and computational results. Conventional approaches involve generating numerous configurations through molecular dynamics or Monte Carlo structure optimization and selecting the one with the closest match to experiment. However, this inefficient process is not guaranteed to succeed. We introduce a general method to combine atomistic machine learning (ML) with experimental observables that produces atomistic structures compatible with experiment by design. We use this approach in combination with grand-canonical Monte Carlo within a modified Hamiltonian formalism, to generate configurations that agree with experimental data and are chemically sound (low in energy). We apply our approach to understand the atomistic structure of oxygenated amorphous carbon (a-COx), an intriguing carbon-based material, to answer the question of how much oxygen can be added to carbon before it fully decomposes into CO and CO2. Utilizing an ML-based X-ray photoelectron spectroscopy (XPS) model trained from GW and density functional theory (DFT) data, in conjunction with an ML interatomic potential, we identify a-COx structures compliant with experimental XPS predictions that are also energetically favorable with respect to DFT. Employing a network analysis, we accurately deconvolve the XPS spectrum into motif contributions, both revealing the inaccuracies inherent to experimental XPS interpretation and granting us atomistic insight into the structure of a-COx. This method generalizes to multiple experimental observables and allows for the elucidation of the atomistic structure of materials directly from experimental data, thereby enabling experiment-driven materials modeling with a degree of realism previously out of reach.
Affiliation(s)
- Tigany Zarrouk
- Department of Chemistry and Materials Science, Aalto University, Espoo 02150, Finland
| | - Rina Ibragimova
- Department of Chemistry and Materials Science, Aalto University, Espoo 02150, Finland
| | - Albert P. Bartók
- Department of Physics, University of Warwick, Coventry CV4 7AL, U.K.
- Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, U.K.
| | - Miguel A. Caro
- Department of Chemistry and Materials Science, Aalto University, Espoo 02150, Finland
| |
|
16
|
Morrow JD, Ugwumadu C, Drabold DA, Elliott SR, Goodwin AL, Deringer VL. Understanding Defects in Amorphous Silicon with Million-Atom Simulations and Machine Learning. Angew Chem Int Ed Engl 2024; 63:e202403842. [PMID: 38517212 DOI: 10.1002/anie.202403842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Revised: 03/14/2024] [Accepted: 03/18/2024] [Indexed: 03/23/2024]
Abstract
The structure of amorphous silicon (a-Si) is widely thought of as a fourfold-connected random network, and yet it is defective atoms, with fewer or more than four bonds, that make it particularly interesting. Despite many attempts to explain such "dangling-bond" and "floating-bond" defects, respectively, a unified understanding is still missing. Here, we use advanced computational chemistry methods to reveal the complex structural and energetic landscape of defects in a-Si. We study an ultra-large-scale, quantum-accurate structural model containing a million atoms, and thousands of individual defects, allowing reliable defect-related statistics to be obtained. We combine structural descriptors and machine-learned atomic energies to develop a classification of the different types of defects in a-Si. The results suggest a revision of the established floating-bond model by showing that fivefold-bonded atoms in a-Si exhibit a wide range of local environments, analogous to fivefold centers in coordination chemistry. Furthermore, it is shown that fivefold (but not threefold) coordination defects tend to cluster together. Our study provides new insights into one of the most widely studied amorphous solids, and has general implications for understanding defects in disordered materials beyond silicon alone.
Affiliation(s)
- Joe D Morrow
- Inorganic Chemistry Laboratory, Department of Chemistry, University of Oxford, Oxford, OX1 3QR, United Kingdom
| | - Chinonso Ugwumadu
- Department of Physics and Astronomy, Nanoscale and Quantum Phenomena Institute (NQPI), Ohio University, Athens, Ohio, 45701, United States
| | - David A Drabold
- Department of Physics and Astronomy, Nanoscale and Quantum Phenomena Institute (NQPI), Ohio University, Athens, Ohio, 45701, United States
| | - Stephen R Elliott
- Physical and Theoretical Chemistry Laboratory, Department of Chemistry, University of Oxford, Oxford, OX1 3QZ, United Kingdom
| | - Andrew L Goodwin
- Inorganic Chemistry Laboratory, Department of Chemistry, University of Oxford, Oxford, OX1 3QR, United Kingdom
| | - Volker L Deringer
- Inorganic Chemistry Laboratory, Department of Chemistry, University of Oxford, Oxford, OX1 3QR, United Kingdom
| |
|
17
|
Shi J, Pršlja P, Jin B, Suominen M, Sainio J, Jiang H, Han N, Robertson D, Košir J, Caro M, Kallio T. Experimental and Computational Study Toward Identifying Active Sites of Supported SnOx Nanoparticles for Electrochemical CO2 Reduction Using Machine-Learned Interatomic Potentials. Small (Weinheim an der Bergstrasse, Germany) 2024:e2402190. [PMID: 38794869 DOI: 10.1002/smll.202402190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Indexed: 05/26/2024]
Abstract
SnOx has received great attention as an electrocatalyst for the CO2 reduction reaction (CO2RR); however, it still suffers from low activity. Moreover, the atomic-level SnOx structure and the nature of the active sites are still ambiguous due to the dynamism of surface structure and difficulty in structure characterization under electrochemical conditions. Herein, CO2RR performance is enhanced by supporting SnO2 nanoparticles on two common supports, Vulcan carbon and TiO2. Then, electrolysis of CO2 at various temperatures in a neutral electrolyte reveals that the application window for this catalyst is between 12 and 30 °C. Furthermore, this study introduces a machine learning interatomic potential method for the atomistic simulation to investigate SnO2 reduction and establish a correlation between SnOx structures and their CO2RR performance. In addition, selectivity is analyzed computationally with density functional theory simulations to identify the key differences between the binding energies of *H and *CO2−, where both are correlated with the presence of oxygen on the nanoparticle surface. This study offers in-depth insights into the rational design and application of SnOx-based electrocatalysts for CO2RR.
Affiliation(s)
- Junjie Shi
- Department of Chemistry and Materials Science, School of Chemical Engineering, Aalto University, Espoo, Finland
| | - Paulina Pršlja
- Department of Chemistry and Materials Science, School of Chemical Engineering, Aalto University, Espoo, Finland
| | - Benjin Jin
- Department of Chemistry and Materials Science, School of Chemical Engineering, Aalto University, Espoo, Finland
| | - Milla Suominen
- Department of Chemistry and Materials Science, School of Chemical Engineering, Aalto University, Espoo, Finland
| | - Jani Sainio
- Department of Applied Physics, School of Science, Aalto University, Espoo, Finland
| | - Hua Jiang
- Department of Applied Physics, School of Science, Aalto University, Espoo, Finland
| | - Nana Han
- Department of Chemistry and Materials Science, School of Chemical Engineering, Aalto University, Espoo, Finland
| | - Daria Robertson
- Department of Bioproducts and Biosystems, School of Chemical Engineering, Aalto University, Espoo, Finland
| | - Janez Košir
- Department of Chemistry and Materials Science, School of Chemical Engineering, Aalto University, Espoo, Finland
| | - Miguel Caro
- Department of Chemistry and Materials Science, School of Chemical Engineering, Aalto University, Espoo, Finland
| | - Tanja Kallio
- Department of Chemistry and Materials Science, School of Chemical Engineering, Aalto University, Espoo, Finland
| |
|
18
|
Wang G, Wang C, Zhang X, Li Z, Zhou J, Sun Z. Machine learning interatomic potential: Bridge the gap between small-scale models and realistic device-scale simulations. iScience 2024; 27:109673. [PMID: 38646181 PMCID: PMC11033164 DOI: 10.1016/j.isci.2024.109673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024] Open
Abstract
Machine learning interatomic potentials (MLIPs) overcome the high computational cost of density-functional theory and the relatively low accuracy of classical large-scale molecular dynamics, facilitating more efficient and precise simulations in materials research and design. In this review, the current state of the four essential stages of MLIP development is discussed, including data generation methods, material structure descriptors, six unique machine learning algorithms, and available software. Furthermore, the applications of MLIPs in various fields are investigated, notably in phase-change memory materials, structure searching, material property prediction, and pre-trained universal models. Finally, future perspectives, consisting of standard datasets, transferability, generalization, and the trade-off between accuracy and complexity in MLIPs, are reported.
Affiliation(s)
- Guanjie Wang
- School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- School of Integrated Circuit Science and Engineering, Beihang University, Beijing 100191, China
| | - Changrui Wang
- School of Materials Science and Engineering, Beihang University, Beijing 100191, China
| | - Xuanguang Zhang
- School of Materials Science and Engineering, Beihang University, Beijing 100191, China
| | - Zefeng Li
- School of Materials Science and Engineering, Beihang University, Beijing 100191, China
| | - Jian Zhou
- School of Materials Science and Engineering, Beihang University, Beijing 100191, China
| | - Zhimei Sun
- School of Materials Science and Engineering, Beihang University, Beijing 100191, China
| |
|
19
|
Guibourg P, Dontot L, Anglade PM, Gervais B. DFTB Simulation of Charged Clusters Using Machine Learning Charge Inference. J Chem Theory Comput 2024; 20:4007-4018. [PMID: 38690586 DOI: 10.1021/acs.jctc.4c00107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/02/2024]
Abstract
We present a modification to self-consistent charge density functional-based tight binding (SCC-DFTB), which allows computation based on approximate atomic charges. We obtain these charges by means of a machine learning (ML) process that combines a Coulomb model with a neural network. This allows us to avoid the SCC cycles in the SCC-DFTB calculation while maintaining its accuracy. The main input of the model is the atomic positions characterized by a set of atom-centered symmetry functions. The charge inference from our ML algorithm is as close as 10⁻² units of charge from the exact SCC solution. Our ML-DFTB approach provides a good approximation of the density matrix and of the energy and forces with only a single diagonalization. This is a significant computational saving with respect to the complete SCC algorithm, which allows us to investigate a bigger ensemble of atoms. We show the quality of our approach in the case of charged silicon carbide (SiC) clusters. The ML-DFTB potential energy surface (PES) mimics the SCC-DFTB PES rather well, despite its simplicity. This allows us to obtain the same geometric structure ordering with respect to energy for small clusters. The dissociation barriers for ion emission are well-reproduced, which opens the way to investigating ion field emission and charged cluster stability. The ML-DFTB approach is obviously not limited to charged clusters or SiC materials. It opens a new route to investigate larger clusters than those investigated by standard SCC-DFTB, as well as surface and solid-state chemistry at the atomic level.
Affiliation(s)
- Paul Guibourg
- Laboratoire Cimap, UMR6252, Université de Caen Normandie, École Nationale Supérieure d'Ingénieurs de Caen, Commissariat à l'Énergie Atomique, Centre National de la Recherche Scientifique, 6 Boulevard Du Maréchal Juin, 14050 Caen Cedex, France
| | - Léo Dontot
- Laboratoire Cimap, UMR6252, Université de Caen Normandie, École Nationale Supérieure d'Ingénieurs de Caen, Commissariat à l'Énergie Atomique, Centre National de la Recherche Scientifique, 6 Boulevard Du Maréchal Juin, 14050 Caen Cedex, France
| | - Pierre-Matthieu Anglade
- Laboratoire Cimap, UMR6252, Université de Caen Normandie, École Nationale Supérieure d'Ingénieurs de Caen, Commissariat à l'Énergie Atomique, Centre National de la Recherche Scientifique, 6 Boulevard Du Maréchal Juin, 14050 Caen Cedex, France
| | - Benoit Gervais
- Laboratoire Cimap, UMR6252, Université de Caen Normandie, École Nationale Supérieure d'Ingénieurs de Caen, Commissariat à l'Énergie Atomique, Centre National de la Recherche Scientifique, 6 Boulevard Du Maréchal Juin, 14050 Caen Cedex, France
| |
|
20
|
Shanks BL, Sullivan HW, Shazed AR, Hoepfner MP. Accelerated Bayesian Inference for Molecular Simulations using Local Gaussian Process Surrogate Models. J Chem Theory Comput 2024; 20:3798-3808. [PMID: 38551198 DOI: 10.1021/acs.jctc.3c01358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/19/2024]
Abstract
While Bayesian inference is the gold standard for uncertainty quantification and propagation, its use within physical chemistry encounters formidable computational barriers. These bottlenecks are magnified for modeling data with many independent variables, such as X-ray/neutron scattering patterns and electromagnetic spectra. To address this challenge, we employ local Gaussian process (LGP) surrogate models to accelerate Bayesian optimization over these complex thermophysical properties. The time-complexity of the LGPs scales linearly in the number of independent variables, in stark contrast to the computationally expensive cubic scaling of conventional Gaussian processes. To illustrate the method, we trained an LGP surrogate model on the radial distribution function of liquid neon and observed a 1,760,000-fold speed-up compared to molecular dynamics simulation, beating a conventional GP by three orders of magnitude. We conclude that LGPs are robust and efficient surrogate models poised to expand the application of Bayesian inference in molecular simulations to a broad spectrum of experimental data.
Affiliation(s)
- Brennon L Shanks
- Department of Chemical Engineering, University of Utah, Salt Lake City, UT 84112-9202, United States
| | - Harry W Sullivan
- Department of Chemical Engineering, University of Utah, Salt Lake City, UT 84112-9202, United States
| | - Abdur R Shazed
- Department of Chemical Engineering, University of Utah, Salt Lake City, UT 84112-9202, United States
| | - Michael P Hoepfner
- Department of Chemical Engineering, University of Utah, Salt Lake City, UT 84112-9202, United States
| |
|
21
|
Arcidiacono A, Cignoni E, Mazzeo P, Cupellini L, Mennucci B. Predicting Solvatochromism of Chromophores in Proteins through QM/MM and Machine Learning. J Phys Chem A 2024; 128:3646-3658. [PMID: 38683801 PMCID: PMC11089512 DOI: 10.1021/acs.jpca.4c00249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 03/03/2024] [Accepted: 04/01/2024] [Indexed: 05/02/2024]
Abstract
Solvatochromism occurs in both homogeneous solvents and more complex biological environments, such as proteins. While in both cases the solvatochromic effects report on the surroundings of the chromophore, their interpretation in proteins becomes more complicated not only because of structural effects induced by the protein pocket but also because the protein environment is highly anisotropic. This is particularly evident for highly conjugated and flexible molecules such as carotenoids, whose excitation energy is strongly dependent on both the geometry and the electrostatics of the environment. Here, we introduce a machine learning (ML) strategy trained on quantum mechanics/molecular mechanics calculations of geometrical and electrochromic contributions to carotenoids' excitation energies. We employ this strategy to compare solvatochromism in protein and solvent environments. Despite the important specificities of the protein, ML models trained on solvents can faithfully predict excitation energies in the protein environment, demonstrating the robustness of the chosen descriptors.
Affiliation(s)
- Amanda Arcidiacono
- Department of Chemistry and Industrial Chemistry, University of Pisa, Via G. Moruzzi 13, 56124 Pisa, Italy
| | - Edoardo Cignoni
- Department of Chemistry and Industrial Chemistry, University of Pisa, Via G. Moruzzi 13, 56124 Pisa, Italy
| | - Patrizia Mazzeo
- Department of Chemistry and Industrial Chemistry, University of Pisa, Via G. Moruzzi 13, 56124 Pisa, Italy
| | - Lorenzo Cupellini
- Department of Chemistry and Industrial Chemistry, University of Pisa, Via G. Moruzzi 13, 56124 Pisa, Italy
| | - Benedetta Mennucci
- Department of Chemistry and Industrial Chemistry, University of Pisa, Via G. Moruzzi 13, 56124 Pisa, Italy
| |
|
22
|
Croy A. From Local Atomic Environments to Molecular Information Entropy. ACS Omega 2024; 9:20616-20622. [PMID: 38737089 PMCID: PMC11080039 DOI: 10.1021/acsomega.4c02770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Revised: 04/01/2024] [Accepted: 04/05/2024] [Indexed: 05/14/2024]
Abstract
The similarity of local atomic environments is an important concept in many machine learning techniques, which find applications in computational chemistry and material science. Here, we present and discuss a connection between the information entropy and the similarity matrix of a molecule. The resulting entropy can be used as a measure of the complexity of a molecule. Exemplarily, we introduce and evaluate two specific choices for defining the similarity: one is based on a SMILES representation of local substructures, and the other is based on the SOAP kernel. By tuning the sensitivity of the latter, we can achieve good agreement between the respective entropies. Finally, we consider the entropy of two molecules in a mixture. The gain of entropy due to the mixing can be used as a similarity measure of the molecules. We compare this measure to the average and best-match kernel. The results indicate a connection between the different approaches and demonstrate the usefulness and broad applicability of the similarity-based entropy approach.
Affiliation(s)
- Alexander Croy
- Institute of Physical Chemistry, Friedrich Schiller University Jena, 07737 Jena, Germany
| |
|
23
|
Santos-Ceballos JC, Salehnia F, Romero A, Vilanova X. Application of digital twins for simulation based tailoring of laser induced graphene. Sci Rep 2024; 14:10363. [PMID: 38710895 DOI: 10.1038/s41598-024-61237-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Accepted: 05/02/2024] [Indexed: 05/08/2024] Open
Abstract
In the era of man-machine interfaces, digital twins stand as a key technology, offering virtual representations of real-world objects, processes, and systems through computational models. They enable novel ways of interacting with, comprehending, and manipulating real-world entities within a virtual realm. The real implementation of graphene-based sensors and electronic devices remains challenging due to the integration complexities of high-quality graphene materials with existing manufacturing processes. To address this, scalable techniques for the in-situ fabrication of graphene-like materials are essential. One promising method involves using a CO2 laser to convert polyimide into graphene. Optimizing this graphitization process is hindered by complex parameter interactions and nonlinear terms. This article explores how these digital replicas can enhance the fabrication of laser-induced graphene (LIG) through laser simulation and machine learning methods to enable rapid single-step LIG patterning. This approach aims to create a universal simulation for all CO2 lasers, calculating optical energy flux and utilizing machine learning to control and predict LIG conductivity (ability to conduct current), morphology, and electrical resistance. The proposed procedure, integrating digital twins in the LIG production process, will avoid or reduce the preliminary tests required to determine the proper laser parameters to reach the desired LIG characteristics. Accordingly, this approach will reduce the time and costs associated with these tests and thus increase the efficiency and optimize the procedure.
Affiliation(s)
- José Carlos Santos-Ceballos
- Universitat Rovira i Virgili, Microsystems and Nanotechnologies for Chemical Analysis (MINOS), Tarragona, Spain
| | - Foad Salehnia
- Universitat Rovira i Virgili, Microsystems and Nanotechnologies for Chemical Analysis (MINOS), Tarragona, Spain
| | - Alfonso Romero
- Universitat Rovira i Virgili, Microsystems and Nanotechnologies for Chemical Analysis (MINOS), Tarragona, Spain
| | - Xavier Vilanova
- Universitat Rovira i Virgili, Microsystems and Nanotechnologies for Chemical Analysis (MINOS), Tarragona, Spain
| |
|
24
|
Steiner M, Reiher M. A human-machine interface for automatic exploration of chemical reaction networks. Nat Commun 2024; 15:3680. [PMID: 38693117 PMCID: PMC11063077 DOI: 10.1038/s41467-024-47997-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Accepted: 04/15/2024] [Indexed: 05/03/2024] Open
Abstract
Autonomous reaction network exploration algorithms offer a systematic approach to explore mechanisms of complex chemical processes. However, the resulting reaction networks are so vast that an exploration of all potentially accessible intermediates is computationally too demanding. This renders brute-force explorations unfeasible, while explorations with completely pre-defined intermediates or hard-wired chemical constraints, such as element-specific coordination numbers, are not flexible enough for complex chemical systems. Here, we introduce a STEERING WHEEL to guide an otherwise unbiased automated exploration. The STEERING WHEEL algorithm is intuitive, generally applicable, and enables one to focus on specific regions of an emerging network. It also allows for guiding automated data generation in the context of mechanism exploration, catalyst design, and other chemical optimization challenges. The algorithm is demonstrated for reaction mechanism elucidation of transition metal catalysts. We highlight how to explore catalytic cycles in a systematic and reproducible way. The exploration objectives are fully adjustable, allowing one to harness the STEERING WHEEL for both structure-specific (accurate) calculations as well as for broad high-throughput screening of possible reaction intermediates.
Affiliation(s)
- Miguel Steiner
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093, Zurich, Switzerland
- ETH Zurich, NCCR Catalysis, Vladimir-Prelog-Weg 2, 8093, Zurich, Switzerland
| | - Markus Reiher
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093, Zurich, Switzerland
- ETH Zurich, NCCR Catalysis, Vladimir-Prelog-Weg 2, 8093, Zurich, Switzerland
| |
|
25
|
Singh P, Gopi P, Rani MSS, Singh S, Pandya P. Biophysical and structural characterization of tetramethrin serum protein complex and its toxicological implications. J Mol Recognit 2024; 37:e3076. [PMID: 38366770 DOI: 10.1002/jmr.3076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 01/31/2024] [Accepted: 02/02/2024] [Indexed: 02/18/2024]
Abstract
Tetramethrin (TMT) is a commonly used insecticide and has carcinogenic and neurodegenerative effects on humans. The binding mechanism and toxicological implications of TMT binding to human serum albumin (HSA) were examined in this study using a combination of biophysical and computational methods, which indicated moderate binding affinity and potential hepatic and renal toxicity. Fluorescence quenching experiments showed that TMT binds to HSA with a moderate affinity, and the binding process was spontaneous and predominantly enthalpy-driven. Circular dichroism spectroscopy revealed that TMT binding did not induce any significant conformational changes in HSA, resulting in no changes in its alpha-helix content. The binding site and modalities of TMT interactions with HSA as computed by molecular docking and molecular dynamics simulations revealed that it binds to Sudlow site II of HSA via hydrophobic interactions through its dimethylcyclopropane carboxylate methyl propanyl group. The structural dynamics of TMT induce a proper fit into the binding site, creating additional stabilizing interactions. Additionally, molecular mechanics-Poisson Boltzmann surface area calculations indicated that non-polar and van der Waals interactions were the major contributors to the high binding free energy of the complex. Quantum mechanics (QM) calculations revealed the conformational energies of the binding conformation and the degree of deviation from the global minimum energy conformation of TMT. The results of this study provide a comprehensive understanding of the binding mechanism of TMT with HSA, which is important for evaluating the toxicity of this insecticide in humans.
Affiliation(s)
- Pratik Singh
- Amity Institute of Forensic Sciences, Amity University, Noida, India
| | - Priyanka Gopi
- Amity Institute of Forensic Sciences, Amity University, Noida, India
| | | | - Shweta Singh
- Amity Institute of Forensic Sciences, Amity University, Noida, India
| | - Prateek Pandya
- Amity Institute of Forensic Sciences, Amity University, Noida, India
|
26
|
Xie J, Zhou Y, Faizan M, Li Z, Li T, Fu Y, Wang X, Zhang L. Designing semiconductor materials and devices in the post-Moore era by tackling computational challenges with data-driven strategies. Nat Comput Sci 2024; 4:322-333. [PMID: 38783137 DOI: 10.1038/s43588-024-00632-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 04/18/2024] [Indexed: 05/25/2024]
Abstract
In the post-Moore's law era, the progress of electronics relies on discovering superior semiconductor materials and optimizing device fabrication. Computational methods, augmented by emerging data-driven strategies, offer a promising alternative to the traditional trial-and-error approach. In this Perspective, we highlight data-driven computational frameworks for enhancing semiconductor discovery and device development by elaborating on their advances in exploring the materials design space, predicting semiconductor properties and optimizing device fabrication, with a concluding discussion on the challenges and opportunities in these areas.
Affiliation(s)
- Jiahao Xie
- State Key Laboratory of Integrated Optoelectronics, Key Laboratory of Automobile Materials of MOE, Key Laboratory of Material Simulation Methods & Software of MOE, and School of Materials Science and Engineering, Jilin University, Changchun, China
| | - Yansong Zhou
- State Key Laboratory of Superhard Materials, International Center of Computational Method and Software, School of Physics, Jilin University, Changchun, China
| | - Muhammad Faizan
- State Key Laboratory of Integrated Optoelectronics, Key Laboratory of Automobile Materials of MOE, Key Laboratory of Material Simulation Methods & Software of MOE, and School of Materials Science and Engineering, Jilin University, Changchun, China
| | - Zewei Li
- State Key Laboratory of Integrated Optoelectronics, Key Laboratory of Automobile Materials of MOE, Key Laboratory of Material Simulation Methods & Software of MOE, and School of Materials Science and Engineering, Jilin University, Changchun, China
| | - Tianshu Li
- State Key Laboratory of Integrated Optoelectronics, Key Laboratory of Automobile Materials of MOE, Key Laboratory of Material Simulation Methods & Software of MOE, and School of Materials Science and Engineering, Jilin University, Changchun, China
| | - Yuhao Fu
- State Key Laboratory of Superhard Materials, International Center of Computational Method and Software, School of Physics, Jilin University, Changchun, China
| | - Xinjiang Wang
- State Key Laboratory of Integrated Optoelectronics, Key Laboratory of Automobile Materials of MOE, Key Laboratory of Material Simulation Methods & Software of MOE, and School of Materials Science and Engineering, Jilin University, Changchun, China.
| | - Lijun Zhang
- State Key Laboratory of Integrated Optoelectronics, Key Laboratory of Automobile Materials of MOE, Key Laboratory of Material Simulation Methods & Software of MOE, and School of Materials Science and Engineering, Jilin University, Changchun, China.
|
27
|
Csizi KS, Reiher M. Automated preparation of nanoscopic structures: Graph-based sequence analysis, mismatch detection, and pH-consistent protonation with uncertainty estimates. J Comput Chem 2024; 45:761-776. [PMID: 38124290 DOI: 10.1002/jcc.27276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2023] [Accepted: 11/14/2023] [Indexed: 12/23/2023]
Abstract
Structure and function in nanoscale atomistic assemblies are tightly coupled, and every atom with its specific position and even every electron will have a decisive effect on the electronic structure, and hence, on the molecular properties. Molecular simulations of nanoscopic atomistic structures therefore require accurately resolved three-dimensional input structures. If extracted from experiment, these structures often suffer from severe uncertainties, of which the lack of information on hydrogen atoms is a prominent example. Hence, experimental structures require careful review and curation, which is a time-consuming and error-prone process. Here, we present a fast and robust protocol for the automated structure analysis and pH-consistent protonation, in short, ASAP. For biomolecules as a target, the ASAP protocol integrates sequence analysis and error assessment of a given input structure. ASAP allows for pKa prediction from reference data through Gaussian process regression including uncertainty estimation and connects to system-focused atomistic modeling described in Brunken and Reiher (J. Chem. Theory Comput. 16, 2020, 1646). Although focused on biomolecules, ASAP can be extended to other nanoscopic objects, because most of its design elements rely on a general graph-based foundation guaranteeing transferability. The modular character of the underlying pipeline supports different degrees of automation, which allows for (i) efficient feedback loops for human-machine interaction with a low entrance barrier and for (ii) integration into autonomous procedures such as automated force field parametrizations. This facilitates fast switching of the pH-state through on-the-fly system-focused reparametrization during a molecular simulation at virtually no extra computational cost.
Affiliation(s)
- Katja-Sophia Csizi
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
| | - Markus Reiher
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
|
28
|
Tepper L, Dalton B, Netz RR. Accurate Memory Kernel Extraction from Discretized Time-Series Data. J Chem Theory Comput 2024; 20:3061-3068. [PMID: 38603471 PMCID: PMC11044577 DOI: 10.1021/acs.jctc.3c01289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 03/25/2024] [Accepted: 04/01/2024] [Indexed: 04/13/2024]
Abstract
Memory effects emerge as a fundamental consequence of dimensionality reduction when low-dimensional observables are used to describe the dynamics of complex many-body systems. In the context of molecular dynamics (MD) data analysis, accounting for memory effects using the framework of the generalized Langevin equation (GLE) has proven efficient, accurate, and insightful, particularly when working with high-resolution time series data. However, in experimental systems, high-resolution data are often unavailable, raising questions about the impact of the data resolution on the estimated GLE parameters. This study demonstrates that direct memory extraction from time series data remains accurate when the discretization time is below the memory time. To obtain memory functions reliably, even when the discretization time exceeds the memory time, we introduce a Gaussian Process Optimization (GPO) scheme. This scheme minimizes the deviation of discretized two-point correlation functions between time series data and GLE simulations and is able to estimate accurate memory kernels as long as the discretization time stays below the longest time scale in the data, typically the barrier crossing time.
Affiliation(s)
- Lucas Tepper
- Department of Physics, Freie Universität Berlin, 14195 Berlin, Germany
| | - Benjamin Dalton
- Department of Physics, Freie Universität Berlin, 14195 Berlin, Germany
| | - Roland R. Netz
- Department of Physics, Freie Universität Berlin, 14195 Berlin, Germany
|
29
|
Pan X, Snyder R, Wang JN, Lander C, Wickizer C, Van R, Chesney A, Xue Y, Mao Y, Mei Y, Pu J, Shao Y. Training machine learning potentials for reactive systems: A Colab tutorial on basic models. J Comput Chem 2024; 45:638-647. [PMID: 38082539 PMCID: PMC10923003 DOI: 10.1002/jcc.27269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 11/10/2023] [Accepted: 11/11/2023] [Indexed: 01/18/2024]
Abstract
In the last several years, there has been a surge in the development of machine learning potential (MLP) models for describing molecular systems. We are interested in a particular area of this field - the training of system-specific MLPs for reactive systems - with the goal of using these MLPs to accelerate free energy simulations of chemical and enzyme reactions. To help new members in our labs become familiar with the basic techniques, we have put together a self-guided Colab tutorial (https://cc-ats.github.io/mlp_tutorial/), which we expect to be also useful to other young researchers in the community. Our tutorial begins with the introduction of simple feedforward neural network (FNN) and kernel-based (using Gaussian process regression, GPR) models by fitting the two-dimensional Müller-Brown potential. Subsequently, two simple descriptors are presented for extracting features of molecular systems: symmetry functions (including the ANI variant) and embedding neural networks (such as DeepPot-SE). Lastly, these features will be fed into FNN and GPR models to reproduce the energies and forces for the molecular configurations in a Claisen rearrangement reaction.
Affiliation(s)
- Xiaoliang Pan
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
| | - Ryan Snyder
- Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, USA
| | - Jia-Ning Wang
- State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
| | - Chance Lander
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
| | - Carly Wickizer
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
| | - Richard Van
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD 20824, USA
| | - Andrew Chesney
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
| | - Yuanfei Xue
- State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
| | - Yuezhi Mao
- Department of Chemistry and Biochemistry, San Diego State University, San Diego, CA 92182, USA
| | - Ye Mei
- State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
- Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi 030006, China
| | - Jingzhi Pu
- Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, USA
| | - Yihan Shao
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
|
30
|
Liu M, Wang J, Hu J, Liu P, Niu H, Yan X, Li J, Yan H, Yang B, Sun Y, Chen C, Kresse G, Zuo L, Chen XQ. Layer-by-layer phase transformation in Ti3O5 revealed by machine-learning molecular dynamics simulations. Nat Commun 2024; 15:3079. [PMID: 38594273 PMCID: PMC11004112 DOI: 10.1038/s41467-024-47422-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 03/28/2024] [Indexed: 04/11/2024] Open
Abstract
Reconstructive phase transitions involving breaking and reconstruction of primary chemical bonds are ubiquitous and important for many technological applications. In contrast to displacive phase transitions, the dynamics of reconstructive phase transitions are usually slow due to the large energy barrier. Nevertheless, the reconstructive phase transformation from β- to λ-Ti3O5 exhibits an ultrafast and reversible behavior. Despite extensive studies, the underlying microscopic mechanism remains unclear. Here, we discover a kinetically favorable in-plane nucleated layer-by-layer transformation mechanism through metadynamics and large-scale molecular dynamics simulations. This is enabled by developing an efficient machine learning potential with near first-principles accuracy through an on-the-fly active learning method and an advanced sampling technique. Our results reveal that the β-λ phase transformation initiates with the formation of two-dimensional nuclei in the ab-plane and then proceeds layer-by-layer through a multistep barrier-lowering kinetic process via intermediate metastable phases. Our work not only provides important insight into the ultrafast and reversible nature of the β-λ transition, but also presents useful strategies and methods for tackling other complex structural phase transitions.
Affiliation(s)
- Mingfeng Liu
- Shenyang National Laboratory for Materials Science, Institute of Metal Research, Chinese Academy of Sciences, Shenyang, 110016, China
- School of Materials Science and Engineering, University of Science and Technology of China, Shenyang, 110016, China
| | - Jiantao Wang
- Shenyang National Laboratory for Materials Science, Institute of Metal Research, Chinese Academy of Sciences, Shenyang, 110016, China
- School of Materials Science and Engineering, University of Science and Technology of China, Shenyang, 110016, China
| | - Junwei Hu
- State Key Laboratory of Solidification Processing, International Center for Materials Discovery, School of Materials Science and Engineering, Northwestern Polytechnical University, Xi'an, 710072, China
| | - Peitao Liu
- Shenyang National Laboratory for Materials Science, Institute of Metal Research, Chinese Academy of Sciences, Shenyang, 110016, China.
| | - Haiyang Niu
- State Key Laboratory of Solidification Processing, International Center for Materials Discovery, School of Materials Science and Engineering, Northwestern Polytechnical University, Xi'an, 710072, China.
| | - Xuexi Yan
- Shenyang National Laboratory for Materials Science, Institute of Metal Research, Chinese Academy of Sciences, Shenyang, 110016, China
| | - Jiangxu Li
- Shenyang National Laboratory for Materials Science, Institute of Metal Research, Chinese Academy of Sciences, Shenyang, 110016, China
| | - Haile Yan
- Key Laboratory for Anisotropy and Texture of Materials (Ministry of Education), School of Materials Science and Engineering, Northeastern University, Shenyang, 110819, China
| | - Bo Yang
- Key Laboratory for Anisotropy and Texture of Materials (Ministry of Education), School of Materials Science and Engineering, Northeastern University, Shenyang, 110819, China
| | - Yan Sun
- Shenyang National Laboratory for Materials Science, Institute of Metal Research, Chinese Academy of Sciences, Shenyang, 110016, China
| | - Chunlin Chen
- Shenyang National Laboratory for Materials Science, Institute of Metal Research, Chinese Academy of Sciences, Shenyang, 110016, China
| | - Georg Kresse
- University of Vienna, Faculty of Physics and Center for Computational Materials Science, Kolingasse 14-16, A-1090, Vienna, Austria
| | - Liang Zuo
- Key Laboratory for Anisotropy and Texture of Materials (Ministry of Education), School of Materials Science and Engineering, Northeastern University, Shenyang, 110819, China
| | - Xing-Qiu Chen
- Shenyang National Laboratory for Materials Science, Institute of Metal Research, Chinese Academy of Sciences, Shenyang, 110016, China
|
31
|
Du T, Li S, Ganisetti S, Bauchy M, Yue Y, Smedskjaer MM. Deciphering the controlling factors for phase transitions in zeolitic imidazolate frameworks. Natl Sci Rev 2024; 11:nwae023. [PMID: 38560493 PMCID: PMC10980346 DOI: 10.1093/nsr/nwae023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Revised: 01/04/2024] [Accepted: 01/08/2024] [Indexed: 04/04/2024] Open
Abstract
Zeolitic imidazolate frameworks (ZIFs) feature complex phase transitions, including polymorphism, melting, vitrification, and polyamorphism. Experimentally probing their structural evolution during transitions involving amorphous phases is a significant challenge, especially at the medium-range length scale. To overcome this challenge, here we first train a deep learning-based force field to identify the structural characteristics of both crystalline and non-crystalline ZIF phases. This allows us to reproduce the structural evolution trend during the melting of crystals and formation of ZIF glasses at various length scales with an accuracy comparable to that of ab initio molecular dynamics, yet at a much lower computational cost. Based on this approach, we propose a new structural descriptor, namely, the ring orientation index, to capture the propensity for crystallization of ZIF-4 (Zn(Im)2, Im = C3H3N2-) glasses, as well as for the formation of ZIF-zni (Zn(Im)2) out of the high-density amorphous phase. This crystal formation process is a result of the reorientation of imidazole rings by sacrificing the order of the structure around the zinc-centered tetrahedra. The outcomes of this work are useful for studying phase transitions in other metal-organic frameworks (MOFs) and may thus guide the development of MOF glasses.
Affiliation(s)
- Tao Du
- Department of Chemistry and Bioscience, Aalborg University, Aalborg 9220, Denmark
| | - Shanwu Li
- Department of Mechanical Engineering-Engineering Mechanics, Michigan Technological University, Houghton MI 49931, USA
| | - Sudheer Ganisetti
- Department of Chemistry and Bioscience, Aalborg University, Aalborg 9220, Denmark
| | - Mathieu Bauchy
- Physics of AmoRphous and Inorganic Solids Laboratory (PARISlab), Department of Civil and Environmental Engineering, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Yuanzheng Yue
- Department of Chemistry and Bioscience, Aalborg University, Aalborg 9220, Denmark
| | - Morten M Smedskjaer
- Department of Chemistry and Bioscience, Aalborg University, Aalborg 9220, Denmark
|
32
|
Ahmad I, Rabbi F, Nisar A, Ul-Haq Z, Khan A. In vitro-in silico pharmacology and chemistry of Stercularin, isolated from Sterculia diversifolia. Comput Biol Chem 2024; 109:108008. [PMID: 38198964 DOI: 10.1016/j.compbiolchem.2023.108008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2023] [Revised: 12/19/2023] [Accepted: 12/20/2023] [Indexed: 01/12/2024]
Abstract
Stercularin is a coumarin isolated from the ethyl acetate fraction of the stem bark and leaves of S. diversifolia. Pharmacologically, it is active against cancer, diabetes, and inflammation. Here, the molecule is further screened for in vitro pharmacological activities. In addition, a detailed description of its drug likeness and pharmacokinetic profile has been established to further explore its fate as a drug candidate. Stercularin exhibited antiglycation, immunomodulatory, and leishmanicidal activity in three different in vitro models. The IC50 values obtained in these three assays were 80.22 ± 0.46 mg/ml, 12.8 ± 1.6 μg/ml, and 8.32 ± 0.42 μg/ml, respectively. In the drug likeness evaluation, Stercularin has acceptable physicochemical properties and is compliant with the major drug likeness descriptors, i.e., the Lipinski rule, Pfizer rule, GSK rule, and "golden triangle". Compliance with the Lipinski rule supports the development of Stercularin as an oral drug. Pharmacokinetically, Stercularin is permeable to Caco-2 and MDCK cell lines. The 'Boiled-egg' plot suggests an intestinal route of absorption, no blood-brain barrier permeation, and no effect of P-glycoprotein. Stercularin has high plasma protein binding, with a low free fraction circulating in the plasma. Stercularin proved to be a substrate and/or inhibitor of the CYP450 system, with a moderate half-life and clearance rate that allow a flexible dosing regimen. Finally, a slight risk of toxicity exists for Stercularin, but it is not a limiting factor that would rule it out as a drug candidate. Naturally isolated Stercularin possesses pharmacological activities and is predicted to have an acceptable pharmacokinetic profile. Further drug development and in vivo studies are desirable for optimization.
Affiliation(s)
- Imad Ahmad
- Department of Pharmacy, The Professional Institute of Health Sciences, Mardan, Khyber Pakhtunkhwa, Pakistan; Department of Pharmacy, Abdul Wali Khan University Mardan, Khyber Pakhtunkhwa, Pakistan
| | - Fazle Rabbi
- Department of Pharmacy, Abasyn University Peshawar, Peshawar, Khyber Pakhtunkhwa 25000, Pakistan.
| | - Amna Nisar
- Department of Pharmacy, University of Peshawar, Peshawar, Khyber Pakhtunkhwa 25120, Pakistan
| | - Zaheer Ul-Haq
- H.E.J. Research Institute of Chemistry, International Center for Chemical and Biological Sciences, University of Karachi, Karachi 75270, Pakistan; Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi 75270, Pakistan
| | - Alamgir Khan
- H.E.J. Research Institute of Chemistry, International Center for Chemical and Biological Sciences, University of Karachi, Karachi 75270, Pakistan
|
33
|
Daniel DT, Mitra S, Eichel RA, Diddens D, Granwehr J. Machine Learning Isotropic g Values of Radical Polymers. J Chem Theory Comput 2024; 20:2592-2604. [PMID: 38456629 DOI: 10.1021/acs.jctc.3c01252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2024]
Abstract
Methods for electronic structure computations, such as density functional theory (DFT), are routinely used for the calculation of spectroscopic parameters to establish and validate structure-parameter correlations. DFT calculations, however, are computationally expensive for large systems such as polymers. This work explores the machine learning (ML) of isotropic g values, g_iso, obtained from electron paramagnetic resonance (EPR) experiments of an organic radical polymer. An ML model based on regression trees is trained on DFT-calculated g values of poly(2,2,6,6-tetramethylpiperidinyloxy-4-yl methacrylate) (PTMA) polymer structures extracted from different time frames of a molecular dynamics trajectory. The DFT-derived g values, g_iso^calc, for different radical densities of PTMA, are compared against experimentally derived g values obtained from in operando EPR measurements of a PTMA-based organic radical battery. The ML-predicted g_iso values, g_iso^pred, were compared with g_iso^calc to evaluate the performance of the model. Mean deviations of g_iso^pred from g_iso^calc were found to be on the order of 0.0001. Furthermore, a performance evaluation on test structures from a separate MD trajectory indicated that the model is sensitive to the radical density and efficiently learns to predict g_iso values even for radical densities that were not part of the training data set. Since our trained model can reproduce the changes in g_iso along the MD trajectory and is sensitive to the extent of equilibration of the polymer structure, it is a promising alternative to computationally more expensive DFT methods, particularly for large systems that cannot be easily represented by a smaller model system.
Affiliation(s)
- Davis Thomas Daniel
- Institute of Energy and Climate Research (IEK-9), Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
- Institute of Technical and Macromolecular Chemistry, RWTH Aachen University, 52056 Aachen, Germany
| | - Souvik Mitra
- Institute of Physical Chemistry, University of Münster, 48149 Münster, Germany
| | - Rüdiger-A Eichel
- Institute of Energy and Climate Research (IEK-9), Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
- Institute of Physical Chemistry, RWTH Aachen University, Aachen 52056, Germany
| | - Diddo Diddens
- Helmholtz Institute Münster (IEK-12), Forschungszentrum Jülich GmbH, 48149 Münster, Germany
| | - Josef Granwehr
- Institute of Energy and Climate Research (IEK-9), Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
- Institute of Technical and Macromolecular Chemistry, RWTH Aachen University, 52056 Aachen, Germany
|
34
|
Yu Z, Batista ER, Yang P, Perez D. Acceleration of Solvation Free Energy Calculation via Thermodynamic Integration Coupled with Gaussian Process Regression and Improved Gelman-Rubin Convergence Diagnostics. J Chem Theory Comput 2024; 20:2570-2581. [PMID: 38470415 DOI: 10.1021/acs.jctc.3c01381] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024]
Abstract
The determination of the solvation free energy of ions and molecules holds profound importance across a spectrum of applications spanning chemistry, biology, energy storage, and the environment. Molecular dynamics simulations are powerful tools for computing this critical parameter. Nevertheless, the accurate and efficient calculation of the solvation free energy becomes a formidable endeavor when dealing with complex systems characterized by potent Coulombic interactions and sluggish ion dynamics and, consequently, slow transition across various metastable states. In the present study, we expose limitations stemming from the conventional calculation of the statistical inefficiency g in the thermodynamic integration method, a factor that can hinder the determination of convergence of the solvation free energy and its associated uncertainty. Instead, we propose a robust scheme based on Gelman-Rubin convergence diagnostics. We leverage this improved estimation of uncertainties to introduce an innovative accelerated thermodynamic integration method based on the Gaussian Process regression. This methodology is applied to the calculation of the solvation free energy of trivalent rare-earth elements immersed in ionic liquids, a scenario in which the aforementioned challenges render standard approaches ineffective. The proposed method proves to be effective in computing solvation free energy in situations where traditional thermodynamic integration methods fall short.
Affiliation(s)
- Zhou Yu
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Enrique R Batista
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Ping Yang
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Danny Perez
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
|
35
|
Rossignol H, Minotakis M, Cobelli M, Sanvito S. Machine-Learning-Assisted Construction of Ternary Convex Hull Diagrams. J Chem Inf Model 2024; 64:1828-1840. [PMID: 38271693 DOI: 10.1021/acs.jcim.3c01391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2024]
Abstract
In the search for novel intermetallic ternary alloys, much of the effort goes into performing a large number of ab initio calculations covering a wide range of compositions and structures. These are essential to building a reliable convex hull diagram. While density functional theory (DFT) provides accurate predictions for many systems, its computational overheads set a throughput limit on the number of hypothetical phases that can be probed. Here, we demonstrate how an ensemble of machine-learning (ML) spectral neighbor-analysis potentials (SNAPs) can be integrated into a workflow for the construction of accurate ternary convex hull diagrams, highlighting regions that are fertile for materials discovery. Our workflow relies on using available binary-alloy data both to train the SNAP models and to create prototypes for ternary phases. From the prototype structures, all unique ternary decorations are created and used to form a pool of candidate compounds. The SNAPs ensemble is then used to prerelax the structures and screen the most favorable prototypes before using DFT to build the final phase diagram. As constructed, the proposed workflow relies on no extra first-principles data to train the ML surrogate model and yields a DFT-level accurate convex hull. We demonstrate its efficacy by investigating the Cu-Ag-Au and Mo-Ta-W ternary systems.
Affiliation(s)
- Hugo Rossignol
- School of Physics and CRANN Institute, Trinity College Dublin, College Green, Dublin 2, Ireland
| | - Michail Minotakis
- School of Physics and CRANN Institute, Trinity College Dublin, College Green, Dublin 2, Ireland
| | - Matteo Cobelli
- School of Physics and CRANN Institute, Trinity College Dublin, College Green, Dublin 2, Ireland
| | - Stefano Sanvito
- School of Physics and CRANN Institute, Trinity College Dublin, College Green, Dublin 2, Ireland
|
36
|
Dral PO. AI in computational chemistry through the lens of a decade-long journey. Chem Commun (Camb) 2024; 60:3240-3258. [PMID: 38444290 DOI: 10.1039/d4cc00010b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2024]
Abstract
This article gives a perspective on the progress of AI tools in computational chemistry through the lens of the author's decade-long contributions put in the wider context of the trends in this rapidly expanding field. This progress over the last decade is tremendous: while a decade ago we had a glimpse of what was to come through many proof-of-concept studies, now we witness the emergence of many AI-based computational chemistry tools that are mature enough to make faster and more accurate simulations increasingly routine. Such simulations in turn allow us to validate and even revise experimental results, deepen our understanding of the physicochemical processes in nature, and design better materials, devices, and drugs. The rapid introduction of powerful AI tools gives rise to unique challenges and opportunities that are discussed in this article too.
Affiliation(s)
- Pavlo O Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China.
|
37
|
Panchagnula K, Graf D, Albertani FEA, Thom AJW. Translational eigenstates of He@C60 from four-dimensional ab initio potential energy surfaces interpolated using Gaussian process regression. J Chem Phys 2024; 160:104303. [PMID: 38465682 DOI: 10.1063/5.0197903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 02/22/2024] [Indexed: 03/12/2024] Open
Abstract
We investigate the endofullerene system 3He@C60 with a four-dimensional potential energy surface (PES) to include the three He translational degrees of freedom and C60 cage radius. We compare second order Møller-Plesset perturbation theory (MP2), spin component scaled-MP2, scaled opposite spin-MP2, random phase approximation (RPA)@Perdew, Burke, and Ernzerhof (PBE), and corrected Hartree-Fock-RPA to calibrate and gain confidence in the choice of electronic structure method. Due to the high cost of these calculations, the PES is interpolated using Gaussian Process Regression (GPR), owing to its effectiveness with sparse training data. The PES is split into a two-dimensional radial surface, to which corrections are applied to achieve an overall four-dimensional surface. The nuclear Hamiltonian is diagonalized to generate the in-cage translational/vibrational eigenstates. The degeneracy of the three-dimensional harmonic oscillator energies with principal quantum number n is lifted due to the anharmonicity in the radial potential. The (2l + 1)-fold degeneracy of the angular momentum states is also weakly lifted, due to the angular dependence in the potential. We calculate the fundamental frequency to range between 96 and 110 cm-1 depending on the electronic structure method used. Error bars of the eigenstate energies were calculated from the GPR and are on the order of ∼±1.5 cm-1. Wavefunctions are also compared by considering their overlap and Hellinger distance to the one-dimensional empirical potential. As with the energies, the two ab initio methods MP2 and RPA@PBE show the best agreement. While MP2 has better agreement than RPA@PBE, due to its higher computational efficiency and comparable performance, we recommend RPA as an alternative electronic structure method of choice to MP2 for these systems.
Affiliation(s)
- K Panchagnula
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
| | - D Graf
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
| | - F E A Albertani
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
| | - A J W Thom
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
|
38
|
Ruiz-Pernía JJ, Świderek K, Bertran J, Moliner V, Tuñón I. Electrostatics as a Guiding Principle in Understanding and Designing Enzymes. J Chem Theory Comput 2024; 20:1783-1795. [PMID: 38410913 PMCID: PMC10938506 DOI: 10.1021/acs.jctc.3c01395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 02/14/2024] [Accepted: 02/14/2024] [Indexed: 02/28/2024]
Abstract
Enzyme design faces challenges related to the implementation of the basic principles that govern catalytic activity in natural enzymes. In this work, we revisit basic electrostatic concepts, such as preorganization and reorganization, that have been shown to explain the origin of enzymatic efficiency. Using quantities such as the electrostatic potential and the electric field generated by the protein, we explain how these concepts work in different enzymes and how they can be used to rationalize the consequences of point mutations. We also discuss examples of protein design in which electrostatic effects have been implemented. In the near future, molecular simulations, coupled with machine learning methods, can be used to implement electrostatics as a guiding principle for enzyme design.
Affiliation(s)
| | - Katarzyna Świderek
- Biocomp group, Institute of Advanced Materials (INAM), Universitat Jaume I, 12071 Castellón, Spain
| | - Joan Bertran
- Departament de Química, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
| | - Vicent Moliner
- Biocomp group, Institute of Advanced Materials (INAM), Universitat Jaume I, 12071 Castellón, Spain
| | - Iñaki Tuñón
- Departament de Química Física, Universitat de València, 46100 Burjassot, Spain
|
39
|
Noordhoek K, Bartel CJ. Accelerating the prediction of inorganic surfaces with machine learning interatomic potentials. Nanoscale 2024. [PMID: 38470833 DOI: 10.1039/d3nr06468a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/14/2024]
Abstract
The surface properties of solid-state materials often dictate their functionality, especially for applications where nanoscale effects become important. The relevant surface(s) and their properties are determined, in large part, by the material's synthesis or operating conditions. These conditions dictate thermodynamic driving forces and kinetic rates responsible for yielding the observed surface structure and morphology. Computational surface science methods have long been applied to connect thermochemical conditions to surface phase stability, particularly in the heterogeneous catalysis and thin film growth communities. This review provides a brief introduction to first-principles approaches to compute surface phase diagrams before introducing emerging data-driven approaches. The remainder of the review focuses on the application of machine learning, predominantly in the form of learned interatomic potentials, to study complex surfaces. As machine learning algorithms and large datasets on which to train them become more commonplace in materials science, computational methods are poised to become even more predictive and powerful for modeling the complexities of inorganic surfaces at the nanoscale.
Affiliation(s)
- Kyle Noordhoek
- Department of Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, MN, 55455, USA.
| | - Christopher J Bartel
- Department of Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, MN, 55455, USA.
|
40
|
Moon SW, Min SK. Gaussian Process Regression-Based Near-Infrared d-Luciferin Analogue Design Using Mutation-Controlled Graph-Based Genetic Algorithm. J Chem Inf Model 2024; 64:1522-1532. [PMID: 38365605 DOI: 10.1021/acs.jcim.3c00870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2024]
Abstract
Molecular discovery is central to the field of chemical informatics. Although optimization approaches that target specific molecular properties have been developed in combination with machine learning techniques, efficient molecular design remains challenging when the available databases are of limited size. We present a molecular design method that combines a Gaussian process regression model with a graph-based genetic algorithm (GB-GA) and works from a data set comprising a small number of compounds. Mutation probability control is introduced in the genetic algorithm to enhance the optimization capability and speed up convergence to the optimal solution. In addition, we propose reducing the number of parameters in the conventional GB-GA, focusing on efficient molecular design from a small database. We generated a target-specific database by combining active learning and iterative design in the evolutionary methodologies and chose Gaussian process regression as the prediction model for molecular properties. We show that, on goal-directed benchmarks with several drug-like molecules, the proposed scheme optimizes toward the target properties more efficiently than the conventional GB-GA method. Finally, we provide a demonstration in which we designed D-luciferin analogues with near-infrared fluorescence for bioimaging, which is desirable for effective in vivo light sources, from a small data set.
Affiliation(s)
- Sung Wook Moon
- Department of Chemistry, School of Natural Science, Ulsan National Institute of Science and Technology (UNIST), 50 UNIST-gil, Ulju-gun, Ulsan 44919, South Korea
| | - Seung Kyu Min
- Department of Chemistry, School of Natural Science, Ulsan National Institute of Science and Technology (UNIST), 50 UNIST-gil, Ulju-gun, Ulsan 44919, South Korea
|
41
|
Fang L, Laakso J, Rinke P, Chen X. Machine-learning accelerated structure search for ligand-protected clusters. J Chem Phys 2024; 160:094106. [PMID: 38426517 DOI: 10.1063/5.0180529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 02/09/2024] [Indexed: 03/02/2024]
Abstract
Finding low-energy structures of ligand-protected clusters is challenging due to the enormous conformational space and the high computational cost of accurate quantum chemical methods for determining the structures and energies of conformers. Here, we adopted a machine learning method based on kernel ridge regression to accelerate the search for low-energy structures of ligand-protected clusters. We chose the Au25(Cys)18 (Cys: cysteine) cluster as a model system to test and demonstrate our method. We found that the low-energy structures of the cluster are characterized by a specific hydrogen bond type in the cysteine. The different configurations of the ligand layer influence the structural and electronic properties of clusters.
Affiliation(s)
- Lincan Fang
- Department of Applied Physics, Aalto University, 00076 AALTO, Espoo, Finland
| | - Jarno Laakso
- Department of Applied Physics, Aalto University, 00076 AALTO, Espoo, Finland
| | - Patrick Rinke
- Department of Applied Physics, Aalto University, 00076 AALTO, Espoo, Finland
| | - Xi Chen
- Department of Applied Physics, Aalto University, 00076 AALTO, Espoo, Finland
- School of Physical Science and Technology, Lanzhou University, Lanzhou, Gansu 730000, China
- Lanzhou Center for Theoretical Physics and Key Laboratory for Quantum Theory and Applications of the Ministry of Education, Lanzhou University, Lanzhou, Gansu 730000, China
|
42
|
Mahato KD, Kumar U. Optimized Machine learning techniques Enable prediction of organic dyes photophysical Properties: Absorption Wavelengths, emission Wavelengths, and quantum yields. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy 2024; 308:123768. [PMID: 38134661 DOI: 10.1016/j.saa.2023.123768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 12/05/2023] [Accepted: 12/12/2023] [Indexed: 12/24/2023]
Abstract
Applications of organic dyes, ranging from basic research to industry, are functions of their photophysical properties. Two important aspects need to be addressed: (1) knowledge of the photophysical properties of existing dyes long before real applications and (2) discovery of new organic dyes with desired photophysical properties, either to upgrade existing applications or to develop new ones. These two cases share the common goal of estimating photophysical properties with high accuracy at minimum cost in time and money, long before laboratory experiments are carried out. For this purpose, machine learning-based techniques are the most suitable approach. In this study, we used optimized machine-learning techniques to assess a dataset of 3066 organic dyes, which were evaluated using three evaluation parameters: Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and the coefficient of determination (R2). The Quadratic Support Vector Machine (QSVM) was the best predictive model, with RMSE = 16.614, MAE = 10.837, and R2 = 0.961 for absorption wavelengths and RMSE = 23.636, MAE = 16.278, and R2 = 0.929 for emission wavelengths. These R2 values are 0.7% and 0.4% greater than the Gradient Boost Regression Tree (GBRT) model's recently reported values of 0.954 and 0.925 for absorption and emission wavelengths, respectively. Furthermore, we estimated the quantum yield and found that the Coarse Gaussian Support Vector Machine (CGSVM) outperformed all examined models. To further validate these models, we compared the predicted results with the experimental results of selected dyes. The proposed automated approach can be used for predicting photophysical properties without much computer programming knowledge.
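For readers unfamiliar with the three evaluation parameters named in this abstract, the following minimal sketch computes them from their standard definitions. It is not the authors' code; the function name and the toy numbers in the usage note are illustrative assumptions.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and the coefficient of determination R^2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    mae = float(np.mean(np.abs(residuals)))
    # R^2 = 1 - (residual sum of squares) / (total sum of squares about the mean)
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "r2": r2}
```

Calling `regression_metrics([1, 2, 3, 4], [1, 2, 3, 5])` gives an RMSE of 0.5, an MAE of 0.25, and an R2 of 0.8. RMSE and MAE carry the units of the predicted quantity, whereas R2 is dimensionless.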
Affiliation(s)
- Kapil Dev Mahato
- Department of Physics, National Institute of Technology Jamshedpur, Jharkhand 831014, India.
| | - Uday Kumar
- Department of Physics, National Institute of Technology Jamshedpur, Jharkhand 831014, India
|
43
|
Célerse F, Wodrich MD, Vela S, Gallarati S, Fabregat R, Juraskova V, Corminboeuf C. From Organic Fragments to Photoswitchable Catalysts: The OFF-ON Structural Repository for Transferable Kernel-Based Potentials. J Chem Inf Model 2024; 64:1201-1212. [PMID: 38319296 PMCID: PMC10900300 DOI: 10.1021/acs.jcim.3c01953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 01/18/2024] [Accepted: 01/22/2024] [Indexed: 02/07/2024]
Abstract
Structurally and conformationally diverse databases are needed to train accurate neural networks or kernel-based potentials capable of exploring the complex free energy landscape of flexible functional organic molecules. Curating such databases for species beyond "simple" drug-like compounds or molecules composed of well-defined building blocks (e.g., peptides) is challenging as it requires thorough chemical space mapping and evaluation of both chemical and conformational diversities. Here, we introduce the OFF-ON (organic fragments from organocatalysts that are non-modular) database, a repository of 7869 equilibrium and 67,457 nonequilibrium geometries of organic compounds and dimers aimed at describing conformationally flexible functional organic molecules, with an emphasis on photoswitchable organocatalysts. The relevance of this database is then demonstrated by training a local kernel regression model on a low-cost semiempirical baseline and comparing it with a PBE0-D3 reference for several known catalysts, notably the free energy surfaces of exemplary photoswitchable organocatalysts. Our results demonstrate that the OFF-ON data set offers reliable predictions for simulating the conformational behavior of virtually any (photoswitchable) organocatalyst or organic compound composed of H, C, N, O, F, and S atoms, thereby opening a computationally feasible route to explore complex free energy surfaces in order to rationalize and predict catalytic behavior.
Affiliation(s)
- Frédéric Célerse
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
| | - Matthew D. Wodrich
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- National Center for Competence in Research-Catalysis (NCCR-Catalysis), Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
| | - Sergi Vela
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
| | - Simone Gallarati
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
| | - Raimon Fabregat
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
| | - Veronika Juraskova
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
| | - Clémence Corminboeuf
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- National Center for Competence in Research-Catalysis (NCCR-Catalysis), Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
- National Centre for Computational Design and Discovery of Novel Materials (MARVEL), Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
|
44
|
Li R, Zhou C, Singh A, Pei Y, Henkelman G, Li L. Local-environment-guided selection of atomic structures for the development of machine-learning potentials. J Chem Phys 2024; 160:074109. [PMID: 38380745 DOI: 10.1063/5.0187892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 01/26/2024] [Indexed: 02/22/2024]
Abstract
Machine learning potentials (MLPs) have attracted significant attention in computational chemistry and materials science due to their high accuracy and computational efficiency. The proper selection of atomic structures is crucial for developing reliable MLPs. Insufficient or redundant atomic structures can impede the training process and potentially result in a poor quality MLP. Here, we propose a local-environment-guided screening algorithm for efficient dataset selection in MLP development. The algorithm utilizes a local environment bank to store unique local environments of atoms. The dissimilarity between a particular local environment and those stored in the bank is evaluated using the Euclidean distance. A new structure is selected only if its local environment is significantly different from those already present in the bank. Consequently, the bank is then updated with all the new local environments found in the selected structure. To demonstrate the effectiveness of our algorithm, we applied it to select structures for a Ge system and a Pd13H2 particle system. The algorithm reduced the training data size by around 80% for both without compromising the performance of the MLP models. We verified that the results were independent of the selection and ordering of the initial structures. We also compared the performance of our method with the farthest point sampling algorithm, and the results show that our algorithm is superior in both robustness and computational efficiency. Furthermore, the generated local environment bank can be continuously updated and can potentially serve as a growing database of feature local environments, aiding in efficient dataset maintenance for constructing accurate MLPs.
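The screening loop described in this abstract can be sketched roughly as follows. This is only an illustration of the stated idea (a bank of unique local environments, Euclidean distance as the dissimilarity measure, selection when a sufficiently different environment appears), not the authors' implementation. The per-atom descriptor vectors, the `threshold` parameter, and the function names are assumptions introduced here.

```python
import numpy as np


def min_distance_to_bank(env, bank):
    """Smallest Euclidean distance from one environment vector to the bank."""
    if not bank:
        return np.inf  # an empty bank makes every environment novel
    return float(np.min(np.linalg.norm(np.asarray(bank) - env, axis=1)))


def select_structures(structures, threshold):
    """Pick structures that contain at least one local environment not yet in the bank.

    structures: iterable of (n_atoms, n_features) arrays of per-atom descriptors
                (the descriptor itself is assumed to be computed elsewhere).
    threshold:  hypothetical distance above which an environment counts as new.
    """
    bank = []      # unique local environments seen so far
    selected = []  # indices of the structures kept for training
    for index, envs in enumerate(structures):
        envs = np.atleast_2d(np.asarray(envs, dtype=float))
        has_novel = any(min_distance_to_bank(e, bank) > threshold for e in envs)
        if not has_novel:
            continue
        selected.append(index)
        # Add only the environments that are still new, re-checking against the
        # growing bank so duplicates within one structure are stored once.
        for e in envs:
            if min_distance_to_bank(e, bank) > threshold:
                bank.append(e)
    return selected, bank
```

For example, with three toy structures `[[0, 0], [1, 0]]`, `[[0, 0.05], [1, 0.05]]`, and `[[5, 5], [0, 0]]` and a threshold of 0.5, the second structure is skipped because both of its environments lie within 0.05 of stored ones, while the third is kept for its single new environment.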
Affiliation(s)
- Renzhe Li
- Shenzhen Key Laboratory of Micro/Nano-Porous Functional Materials (SKLPM), Department of Materials Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
- College of Chemistry, Xiangtan University, Xiangtan 411105, Hunan Province, People's Republic of China
| | - Chuan Zhou
- Shenzhen Key Laboratory of Micro/Nano-Porous Functional Materials (SKLPM), Department of Materials Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
| | - Akksay Singh
- Shenzhen Key Laboratory of Micro/Nano-Porous Functional Materials (SKLPM), Department of Materials Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
- Department of Chemistry, The University of Texas at Austin, Austin, Texas 78712, USA
- Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, Texas 78712, USA
| | - Yong Pei
- College of Chemistry, Xiangtan University, Xiangtan 411105, Hunan Province, People's Republic of China
| | - Graeme Henkelman
- Department of Chemistry, The University of Texas at Austin, Austin, Texas 78712, USA
- Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, Texas 78712, USA
| | - Lei Li
- Shenzhen Key Laboratory of Micro/Nano-Porous Functional Materials (SKLPM), Department of Materials Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
|
45
|
Gigli L, Tisi D, Grasselli F, Ceriotti M. Mechanism of Charge Transport in Lithium Thiophosphate. Chemistry of Materials 2024; 36:1482-1496. [PMID: 38370276 PMCID: PMC10870718 DOI: 10.1021/acs.chemmater.3c02726] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 01/10/2024] [Accepted: 01/10/2024] [Indexed: 02/20/2024]
Abstract
Lithium ortho-thiophosphate (Li3PS4) has emerged as a promising candidate for solid-state electrolyte batteries, thanks to its highly conductive phases, cheap components, and large electrochemical stability range. Nonetheless, the microscopic mechanisms of Li-ion transport in Li3PS4 are far from being fully understood, the role of PS4 dynamics in charge transport still being controversial. In this work, we build machine learning potentials targeting state-of-the-art DFT references (PBEsol, r2SCAN, and PBE0) to tackle this problem in all known phases of Li3PS4 (α, β, and γ), for large system sizes and time scales. We discuss the physical origin of the observed superionic behavior of Li3PS4: the activation of PS4 flipping drives a structural transition to a highly conductive phase, characterized by an increase in Li-site availability and by a drastic reduction in the activation energy of Li-ion diffusion. We also rule out any paddle-wheel effects of PS4 tetrahedra in the superionic phases (previously claimed to enhance Li-ion diffusion) due to the orders-of-magnitude difference between the rate of PS4 flips and Li-ion hops at all temperatures below melting. We finally elucidate the role of interionic dynamical correlations in charge transport, by highlighting the failure of the Nernst-Einstein approximation to estimate the electrical conductivity. Our results show a strong dependence on the target DFT reference, with PBE0 yielding the best quantitative agreement with experimental measurements not only for the electronic band gap but also for the electrical conductivity of β- and α-Li3PS4.
Affiliation(s)
| | | | - Federico Grasselli
- Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
| | - Michele Ceriotti
- Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
|
46
|
Kim HJ, Lee G, Oh SHV, Stampfl C, Soon A. Recalibrating the Experimentally Derived Structure of the Metastable Surface Oxide on Copper via Machine Learning-Accelerated In Silico Global Optimization. ACS Nano 2024; 18:4559-4569. [PMID: 38264984 DOI: 10.1021/acsnano.3c12249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2024]
Abstract
The oxidation of copper and its surface oxides are gaining increasing attention due to the enhanced CO2 reduction reaction (CO2RR) activity exhibited by partially oxidized copper among the copper-based catalysts. The "8" surface oxide on Cu(111) is seen as a promising structure for further study due to its resemblance to the highly active Cu2O(110) surface in the C-C coupling of the CO2RR, setting it apart from other O/Cu(111) surface oxides resembling Cu2O(111). However, recent X-ray photoelectron spectroscopy analysis challenges the currently accepted atomic structure of the "8" surface oxide, prompting a need for reevaluation. This study highlights the limitations of conventional methods when addressing such challenges, leading us to adopt global optimization search techniques. After a rigorous process to ensure robustness, the unbiased global minimum of the "8" surface oxide is identified. Interestingly, this configuration differs significantly from other surface oxides and also from previous "8" models while retaining similarities to the Cu2O(110) surface.
Affiliation(s)
- Hyun Jun Kim
- Department of Materials Science & Engineering, Yonsei University, Seoul 03722, Republic of Korea
| | - Giyeok Lee
- Department of Materials Science & Engineering, Yonsei University, Seoul 03722, Republic of Korea
| | - Seung-Hyun Victor Oh
- Department of Materials Science & Engineering, Yonsei University, Seoul 03722, Republic of Korea
| | - Catherine Stampfl
- School of Physics, The University of Sydney, Sydney, New South Wales 2006, Australia
- The University of Sydney Nano Institute, The University of Sydney, Sydney, New South Wales 2006, Australia
| | - Aloysius Soon
- Department of Materials Science & Engineering, Yonsei University, Seoul 03722, Republic of Korea
- School of Physics, The University of Sydney, Sydney, New South Wales 2006, Australia
|
47
|
Kapil V, Kovács DP, Csányi G, Michaelides A. First-principles spectroscopy of aqueous interfaces using machine-learned electronic and quantum nuclear effects. Faraday Discuss 2024; 249:50-68. [PMID: 37799072 PMCID: PMC10845015 DOI: 10.1039/d3fd00113j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Accepted: 07/18/2023] [Indexed: 10/07/2023]
Abstract
Vibrational spectroscopy is a powerful approach to visualising interfacial phenomena. However, extracting structural and dynamical information from vibrational spectra is a challenge that requires first-principles simulations, including non-Condon and quantum nuclear effects. We address this challenge by developing a machine-learning enhanced first-principles framework to speed up predictive modelling of infrared, Raman, and sum-frequency generation spectra. Our approach uses machine learning potentials that encode quantum nuclear effects to generate quantum trajectories efficiently using simple molecular dynamics. In addition, we reformulate bulk and interfacial selection rules to express them unambiguously in terms of the derivatives of polarisation and polarisabilities of the whole system and predict these derivatives efficiently using fully-differentiable machine learning models of dielectric response tensors. We demonstrate our framework's performance by predicting the IR, Raman, and sum-frequency generation spectra of liquid water, ice and the water-air interface, achieving near quantitative agreement with experiments at nearly the same computational efficiency as pure classical methods. Finally, to aid the experimental discovery of new phases of nanoconfined water, we predict the temperature-dependent vibrational spectra of monolayer water across the solid-hexatic-liquid phase transition.
Affiliation(s)
- Venkat Kapil
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
| | | | - Gábor Csányi
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, UK
| | - Angelos Michaelides
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
|
48
|
Matin S, Allen AEA, Smith J, Lubbers N, Jadrich RB, Messerly R, Nebgen B, Li YW, Tretiak S, Barros K. Machine Learning Potentials with the Iterative Boltzmann Inversion: Training to Experiment. J Chem Theory Comput 2024. [PMID: 38307009 DOI: 10.1021/acs.jctc.3c01051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2024]
Abstract
Methodologies for training machine learning potentials (MLPs) with quantum-mechanical simulation data have recently seen tremendous progress. Experimental data have a very different character than simulated data, and most MLP training procedures cannot be easily adapted to incorporate both types of data into the training process. We investigate a training procedure based on iterative Boltzmann inversion that produces a pair potential correction to an existing MLP using equilibrium radial distribution function data. By applying these corrections to an MLP for pure aluminum based on density functional theory, we observe that the resulting model largely addresses previous overstructuring in the melt phase. Interestingly, the corrected MLP also exhibits improved performance in predicting experimental diffusion constants, which are not included in the training procedure. The presented method does not require autodifferentiating through a molecular dynamics solver and does not make assumptions about the MLP architecture. Our results suggest a practical framework for incorporating experimental data into machine learning models to improve the accuracy of molecular dynamics simulations.
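As a rough illustration of the kind of update that iterative Boltzmann inversion performs, the sketch below applies the textbook rule ΔV(r) ← ΔV(r) + α k_B T ln[g_sim(r) / g_target(r)] to a tabulated pair correction. It is not taken from the paper: the damping factor, the floor used to skip empty bins, the choice of eV units, and all names are assumptions.

```python
import numpy as np

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def ibi_update(delta_v, g_sim, g_target, temperature, damping=1.0, g_floor=1e-8):
    """One iterative-Boltzmann-inversion step for a tabulated pair correction.

    delta_v:  current pair-potential correction on a radial grid (eV)
    g_sim:    radial distribution function from the current model
    g_target: reference (for example experimental) radial distribution function
    """
    delta_v = np.asarray(delta_v, dtype=float)
    g_sim = np.asarray(g_sim, dtype=float)
    g_target = np.asarray(g_target, dtype=float)
    # Skip bins where either g(r) is essentially zero; the logarithm is undefined there.
    valid = (g_sim > g_floor) & (g_target > g_floor)
    step = np.zeros_like(delta_v)
    step[valid] = damping * K_B_EV * temperature * np.log(g_sim[valid] / g_target[valid])
    return delta_v + step
```

Where the simulated g(r) exceeds the target (overstructuring), the correction becomes more repulsive at that distance; where it falls short, the correction becomes more attractive. In practice the step is repeated, with a new simulation between updates, until the two distributions agree.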
Affiliation(s)
- Sakib Matin
- Department of Physics, Boston University, Boston, Massachusetts 02215, United States
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
| | - Alice E A Allen
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
| | - Justin Smith
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
- NVIDIA Corp., Santa Clara, California 95051, United States
| | - Nicholas Lubbers
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Ryan B Jadrich
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
| | - Richard Messerly
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
| | - Benjamin Nebgen
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
| | - Ying Wai Li
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Sergei Tretiak
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
- Center for Integrated Nanotechnologies, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
| | - Kipton Barros
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
|
49
|
Jin J, Reichman DR. Perturbative Expansion in Reciprocal Space: Bridging Microscopic and Mesoscopic Descriptions of Molecular Interactions. J Phys Chem B 2024; 128:1061-1078. [PMID: 38232134 DOI: 10.1021/acs.jpcb.3c06048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2024]
Abstract
Determining the Fourier representation of various molecular interactions is important for constructing density-based field theories from a microscopic point of view, enabling a multiscale bridge between microscopic and mesoscopic descriptions. However, due to the strongly repulsive nature of short-ranged interactions, interparticle interactions cannot be formally defined in Fourier space, which renders coarse-grained (CG) approaches in k-space somewhat ambiguous. In this paper, we address this issue by designing a perturbative expansion of pair interactions in reciprocal space. Our perturbation theory, starting from reciprocal space, elucidates the microscopic origins underlying zeroth-order (long-range attractions) and divergent repulsive interactions from higher order contributions. We propose a systematic framework for constructing a faithful Fourier-space representation of molecular interactions, capturing key structural correlations in various systems, including simple model systems and molecular CG models of liquids. Building upon the Ornstein-Zernike equation, our approach can be combined with appropriate closure relations, and to further improve the closure approximations, we develop a bottom-up parameterization strategy for inferring the bridge function from microscopic statistics. By incorporating the bridge function into the Fourier representation, our findings suggest a systematic, bottom-up approach to performing coarse-graining in reciprocal space, leading to the systematic construction of a bottom-up classical field theory of complex aqueous systems.
Affiliation(s)
- Jaehyeok Jin
- Department of Chemistry, Columbia University, 3000 Broadway, New York, New York 10027, United States
| | - David R Reichman
- Department of Chemistry, Columbia University, 3000 Broadway, New York, New York 10027, United States
|
50
|
Zhang L, Ye L, Wang F, Gao W, Yu J, Zhang L. Prediction of Hydrogen Abstraction Rate Constants at the Allylic Site between Alkenes and OH with Multiple Machine Learning Models. J Phys Chem A 2024; 128:761-772. [PMID: 38237153 DOI: 10.1021/acs.jpca.3c06917] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2024]
Abstract
Hydrogen abstraction reactions between hydrocarbons and hydroxyl radicals are important propagation steps in radical chain reactions, playing a crucial role in atmospheric and combustion chemistry. This study focuses on predicting the rate constants of the prototype of this reaction class, i.e., primary allylic hydrogen abstraction from alkenes by the OH radical, using machine learning (ML) methods. Specifically, three distinct models, namely a feedforward neural network (FNN), support vector regression (SVR), and Gaussian process regression (GPR), were employed to construct robust predictive ML models. We propose a novel strategy that integrates descriptor preprocessing, a pairwise linear correlation analysis, and a model-specific wrapper method to make the feature selection procedure more effective. The selected feature subset was then evaluated with two cross-validation techniques, leave-one-group-out (LOGO) and K-fold cross-validation, for each of the three ML models (FNN, SVR, and GPR) to assess their predictive performance and stability. The results demonstrate that the FNN model, trained with seven representative descriptors, outperforms the other two methods. For the FNN model, the average percentage deviation on the test set is 39.06% under LOGO cross-validation, while repeated 10-fold cross-validation gives a percentage prediction deviation of 19.1%. Two larger alkenes with 10 carbons were selected to test the trained FNN model on primary allylic hydrogen abstraction. The kinetic predictions closely follow the modified three-parameter Arrhenius equation, indicating that the FNN reliably predicts hydrogen abstraction rate constants, especially for the primary allylic site.
We hope this work sheds useful light on the application of ML to generating chemical kinetic parameters for hydrocarbon combustion chemistry.
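The two validation schemes named in the abstract differ only in how samples are partitioned. The following is a minimal, dependency-free sketch of that partitioning, not the authors' implementation: the function names, the contiguous unshuffled folds, and the definition of percentage deviation as a mean absolute relative error are all assumptions made for illustration.

```python
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out_group, train_idx, test_idx), holding out one group per split."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    for g, test_idx in by_group.items():
        train_idx = [i for i in range(len(groups)) if groups[i] != g]
        yield g, train_idx, test_idx


def k_fold(n_samples, k):
    """Yield (train_idx, test_idx) for k contiguous folds (no shuffling)."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if f < n_samples % k else 0) for f in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


def mean_percentage_deviation(predicted, reference):
    """Assumed metric: mean of |predicted - reference| / |reference|, in percent."""
    total = sum(abs(p - r) / abs(r) for p, r in zip(predicted, reference))
    return 100.0 * total / len(reference)


if __name__ == "__main__":
    # Hypothetical grouping: each label stands for one molecule or reaction family.
    groups = ["a", "a", "b", "c", "c", "c"]
    for held_out, train, test in leave_one_group_out(groups):
        print(held_out, train, test)
```

Because LOGO withholds every sample of a group at once, it tests extrapolation to unseen groups, which is consistent with the larger deviation the abstract reports for LOGO than for repeated 10-fold cross-validation.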
Affiliation(s)
- Lei Zhang
- School of Chemical Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China
- Lili Ye
- School of Chemical Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China
- Fan Wang
- School of Chemical Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China
- Wei Gao
- State Key Laboratory of Fire Science, University of Science and Technology of China, Hefei, Anhui 230026, China
- Jinhui Yu
- Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, Hubei 430074, China
- Lidong Zhang
- National Synchrotron Radiation Laboratory, University of Science and Technology of China, Hefei, Anhui 230026, China
|