1
Park J, Sorourifar F, Muthyala MR, Houser AM, Tuttle M, Paulson JA, Zhang S. Zero-Shot Discovery of High-Performance, Low-Cost Organic Battery Materials Using Machine Learning. J Am Chem Soc 2024. [PMID: 39484799 DOI: 10.1021/jacs.4c11663] [Indexed: 11/03/2024]
Abstract
Organic electrode materials (OEMs), composed of abundant elements such as carbon, nitrogen, and oxygen, offer sustainable alternatives to conventional electrode materials that depend on finite metal resources. The vast structural diversity of organic compounds provides a virtually unlimited design space; however, exploring this space through Edisonian trial-and-error approaches is costly and time-consuming. In this work, we develop a new framework, SPARKLE, that combines computational chemistry, molecular generation, and machine learning to achieve zero-shot predictions of OEMs that simultaneously balance reward (specific energy), risk (solubility), and cost (synthesizability). We demonstrate that SPARKLE significantly outperforms alternative black-box machine learning algorithms on interpolation and extrapolation tasks. By deploying SPARKLE over a design space of more than 670,000 organic compounds, we identified ≈5000 novel OEM candidates. Twenty-seven of them were synthesized and fabricated into coin-cell batteries for experimental testing. Among SPARKLE-discovered OEMs, 62.9% exceeded benchmark performance metrics, representing a 3-fold improvement over OEMs selected by human intuition alone (20.8% based on six years of prior lab experience). The top-performing OEMs among the 27 candidates exhibit specific energy and cycling stability that surpass the state-of-the-art while being synthesizable at a fraction of the cost.
Affiliation(s)
- Jaehyun Park
  - Department of Chemistry & Biochemistry, The Ohio State University, 100 West 18th Avenue, Columbus, Ohio 43210, United States
- Farshud Sorourifar
  - Department of Chemical and Biomolecular Engineering, The Ohio State University, 151 W. Woodruff Avenue, Columbus, Ohio 43210, United States
- Madhav R Muthyala
  - Department of Chemical and Biomolecular Engineering, The Ohio State University, 151 W. Woodruff Avenue, Columbus, Ohio 43210, United States
- Abigail M Houser
  - Department of Chemistry & Biochemistry, The Ohio State University, 100 West 18th Avenue, Columbus, Ohio 43210, United States
- Madison Tuttle
  - Department of Chemistry & Biochemistry, The Ohio State University, 100 West 18th Avenue, Columbus, Ohio 43210, United States
- Joel A Paulson
  - Department of Chemical and Biomolecular Engineering, The Ohio State University, 151 W. Woodruff Avenue, Columbus, Ohio 43210, United States
- Shiyu Zhang
  - Department of Chemistry & Biochemistry, The Ohio State University, 100 West 18th Avenue, Columbus, Ohio 43210, United States
2
Novick A, Cai D, Nguyen Q, Garnett R, Adams R, Toberer E. Probabilistic prediction of material stability: integrating convex hulls into active learning. MATERIALS HORIZONS 2024; 11:5381-5393. [PMID: 39158003 DOI: 10.1039/d4mh00432a] [Indexed: 08/20/2024]
Abstract
Active learning is a valuable tool for efficiently exploring complex spaces, finding a variety of uses in materials science. However, the determination of convex hulls for phase diagrams does not neatly fit into traditional active learning approaches due to their global nature. Specifically, the thermodynamic stability of a material is not simply a function of its own energy, but rather requires energetic information from all other competing compositions and phases. Here we present convex hull-aware active learning (CAL), a novel Bayesian algorithm that chooses experiments to minimize the uncertainty in the convex hull. CAL prioritizes compositions that are close to or on the hull, leaving significant uncertainty in other compositions that are quickly determined to be irrelevant to the convex hull. The convex hull can thus be predicted with significantly fewer observations than approaches that focus solely on energy. Intrinsic to this Bayesian approach is uncertainty quantification in both the convex hull and all subsequent predictions (e.g., stability and chemical potential). By providing increased search efficiency and uncertainty quantification, CAL can be readily incorporated into the emerging paradigm of uncertainty-based workflows for thermodynamic prediction.
Affiliation(s)
- Andrew Novick
  - Department of Physics, Colorado School of Mines, Golden, Colorado, USA.
- Diana Cai
  - Center for Computational Mathematics, Flatiron Institute Address, New York, New York, USA
- Quan Nguyen
  - Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, Missouri, USA
- Roman Garnett
  - Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, Missouri, USA
- Ryan Adams
  - Department of Computer Science, Princeton University, New Jersey, USA
- Eric Toberer
  - Department of Physics, Colorado School of Mines, Golden, Colorado, USA.
3
Ibrahim E, Lysogorskiy Y, Drautz R. Efficient Parametrization of Transferable Atomic Cluster Expansion for Water. J Chem Theory Comput 2024. [PMID: 39431422 DOI: 10.1021/acs.jctc.4c00802] [Indexed: 10/22/2024]
Abstract
We present a highly accurate and transferable parametrization of water using the atomic cluster expansion (ACE). To efficiently sample liquid water, we propose a novel approach that involves sampling static calculations of various ice phases and utilizing the active learning (AL) feature of the ACE-based D-optimality algorithm to select relevant liquid water configurations, bypassing computationally intensive ab initio molecular dynamics simulations. Our results demonstrate that the ACE descriptors enable a potential that is initially fitted solely on ice structures, and later upfitted with a few liquid configurations identified with AL, to provide an excellent description of liquid water. The developed potential exhibits remarkable agreement with the first-principles reference, accurately capturing various properties of liquid water, including structural characteristics such as pair correlation functions, covalent bonding profiles, and hydrogen bonding profiles, dynamic properties such as the vibrational density of states and diffusion coefficient, and thermodynamic properties such as the melting point of ice Ih. Our research introduces a new and efficient sampling technique for machine learning potentials in water simulations while also presenting a transferable interatomic potential for water that reveals the accuracy of the first-principles reference. This advancement not only enhances our understanding of the relationship between ice and liquid water at the atomic level but also opens up new avenues for studying complex aqueous systems.
Affiliation(s)
- Eslam Ibrahim
  - ICAMS, Ruhr Universität Bochum, 44780 Bochum, Germany
- Ralf Drautz
  - ICAMS, Ruhr Universität Bochum, 44780 Bochum, Germany
4
Hwang W, Austin SL, Blondel A, Boittier ED, Boresch S, Buck M, Buckner J, Caflisch A, Chang HT, Cheng X, Choi YK, Chu JW, Crowley MF, Cui Q, Damjanovic A, Deng Y, Devereux M, Ding X, Feig MF, Gao J, Glowacki DR, Gonzales JE, Hamaneh MB, Harder ED, Hayes RL, Huang J, Huang Y, Hudson PS, Im W, Islam SM, Jiang W, Jones MR, Käser S, Kearns FL, Kern NR, Klauda JB, Lazaridis T, Lee J, Lemkul JA, Liu X, Luo Y, MacKerell AD, Major DT, Meuwly M, Nam K, Nilsson L, Ovchinnikov V, Paci E, Park S, Pastor RW, Pittman AR, Post CB, Prasad S, Pu J, Qi Y, Rathinavelan T, Roe DR, Roux B, Rowley CN, Shen J, Simmonett AC, Sodt AJ, Töpfer K, Upadhyay M, van der Vaart A, Vazquez-Salazar LI, Venable RM, Warrensford LC, Woodcock HL, Wu Y, Brooks CL, Brooks BR, Karplus M. CHARMM at 45: Enhancements in Accessibility, Functionality, and Speed. J Phys Chem B 2024; 128:9976-10042. [PMID: 39303207 PMCID: PMC11492285 DOI: 10.1021/acs.jpcb.4c04100] [Received: 06/20/2024] [Revised: 08/15/2024] [Accepted: 08/22/2024] [Indexed: 09/22/2024]
Abstract
Since its inception nearly a half century ago, CHARMM has been playing a central role in computational biochemistry and biophysics. Commensurate with the developments in experimental research and advances in computer hardware, the range of methods and applicability of CHARMM have also grown. This review summarizes major developments that occurred after 2009 when the last review of CHARMM was published. They include the following: new faster simulation engines, accessible user interfaces for convenient workflows, and a vast array of simulation and analysis methods that encompass quantum mechanical, atomistic, and coarse-grained levels, as well as extensive coverage of force fields. In addition to providing the current snapshot of the CHARMM development, this review may serve as a starting point for exploring relevant theories and computational methods for tackling contemporary and emerging problems in biomolecular systems. CHARMM is freely available for academic and nonprofit research at https://academiccharmm.org/program.
Affiliation(s)
- Wonmuk Hwang
  - Department of Biomedical Engineering, Texas A&M University, College Station, Texas 77843, United States
  - Department of Materials Science and Engineering, Texas A&M University, College Station, Texas 77843, United States
  - Department of Physics and Astronomy, Texas A&M University, College Station, Texas 77843, United States
  - Center for AI and Natural Sciences, Korea Institute for Advanced Study, Seoul 02455, Republic of Korea
- Steven L. Austin
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- Arnaud Blondel
  - Institut Pasteur, Université Paris Cité, CNRS UMR3825, Structural Bioinformatics Unit, 28 rue du Dr. Roux F-75015 Paris, France
- Eric D. Boittier
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Stefan Boresch
  - Faculty of Chemistry, Department of Computational Biological Chemistry, University of Vienna, Wahringerstrasse 17, 1090 Vienna, Austria
- Matthias Buck
  - Department of Physiology and Biophysics, Case Western Reserve University, School of Medicine, Cleveland, Ohio 44106, United States
- Joshua Buckner
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Amedeo Caflisch
  - Department of Biochemistry, University of Zürich, CH-8057 Zürich, Switzerland
- Hao-Ting Chang
  - Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan, ROC
- Xi Cheng
  - Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Yeol Kyo Choi
  - Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Jhih-Wei Chu
  - Institute of Bioinformatics and Systems Biology, Department of Biological Science and Technology, Institute of Molecular Medicine and Bioengineering, and Center for Intelligent Drug Systems and Smart Bio-devices (IDSB), National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan, ROC
- Michael F. Crowley
  - Renewable Resources and Enabling Sciences Center, National Renewable Energy Laboratory, Golden, Colorado 80401, United States
- Qiang Cui
  - Department of Chemistry, Boston University, 590 Commonwealth Avenue, Boston, Massachusetts 02215, United States
  - Department of Physics, Boston University, 590 Commonwealth Avenue, Boston, Massachusetts 02215, United States
  - Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, Massachusetts 02215, United States
- Ana Damjanovic
  - Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, United States
  - Department of Physics and Astronomy, Johns Hopkins University, Baltimore, Maryland 21218, United States
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Yuqing Deng
  - Shanghai R&D Center, DP Technology, Ltd., Shanghai 201210, China
- Mike Devereux
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Xinqiang Ding
  - Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States
- Michael F. Feig
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- Jiali Gao
  - School of Chemical Biology & Biotechnology, Peking University Shenzhen Graduate School, Shenzhen, Guangdong 518055, China
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, Guangdong 518055, China
  - Department of Chemistry and Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455, United States
- David R. Glowacki
  - CiTIUS Centro Singular de Investigación en Tecnoloxías Intelixentes da USC, 15705 Santiago de Compostela, Spain
- James E. Gonzales
  - Department of Biomedical Engineering, Texas A&M University, College Station, Texas 77843, United States
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Mehdi Bagerhi Hamaneh
  - Department of Physiology and Biophysics, Case Western Reserve University, School of Medicine, Cleveland, Ohio 44106, United States
- Ryan L. Hayes
  - Department of Chemical and Biomolecular Engineering, University of California, Irvine, Irvine, California 92697, United States
  - Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, United States
- Jing Huang
  - Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang 310024, China
- Yandong Huang
  - College of Computer Engineering, Jimei University, Xiamen 361021, China
- Phillip S. Hudson
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
  - Medicine Design, Pfizer Inc., Cambridge, Massachusetts 02139, United States
- Wonpil Im
  - Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Shahidul M. Islam
  - Department of Chemistry, Delaware State University, Dover, Delaware 19901, United States
- Wei Jiang
  - Computational Science Division, Argonne National Laboratory, Argonne, Illinois 60439, United States
- Michael R. Jones
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Silvan Käser
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Fiona L. Kearns
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- Nathan R. Kern
  - Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Jeffery B. Klauda
  - Department of Chemical and Biomolecular Engineering, Institute for Physical Science and Technology, Biophysics Program, University of Maryland, College Park, Maryland 20742, United States
- Themis Lazaridis
  - Department of Chemistry, City College of New York, New York, New York 10031, United States
- Jinhyuk Lee
  - Disease Target Structure Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon 34141, Republic of Korea
  - Department of Bioinformatics, KRIBB School of Bioscience, University of Science and Technology, Daejeon 34141, Republic of Korea
- Justin A. Lemkul
  - Department of Biochemistry, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061, United States
- Xiaorong Liu
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Yun Luo
  - Department of Biotechnology and Pharmaceutical Sciences, College of Pharmacy, Western University of Health Sciences, Pomona, California 91766, United States
- Alexander D. MacKerell
  - Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States
- Dan T. Major
  - Department of Chemistry and Institute for Nanotechnology & Advanced Materials, Bar-Ilan University, Ramat-Gan 52900, Israel
- Markus Meuwly
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
  - Department of Chemistry, Brown University, Providence, Rhode Island 02912, United States
- Kwangho Nam
  - Department of Chemistry and Biochemistry, University of Texas at Arlington, Arlington, Texas 76019, United States
- Lennart Nilsson
  - Karolinska Institutet, Department of Biosciences and Nutrition, SE-14183 Huddinge, Sweden
- Victor Ovchinnikov
  - Harvard University, Department of Chemistry and Chemical Biology, Cambridge, Massachusetts 02138, United States
- Emanuele Paci
  - Dipartimento di Fisica e Astronomia, Universitá di Bologna, Bologna 40127, Italy
- Soohyung Park
  - Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Richard W. Pastor
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Amanda R. Pittman
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- Carol Beth Post
  - Borch Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana 47907, United States
- Samarjeet Prasad
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Jingzhi Pu
  - Department of Chemistry and Chemical Biology, Indiana University Indianapolis, Indianapolis, Indiana 46202, United States
- Yifei Qi
  - School of Pharmacy, Fudan University, Shanghai 201203, China
- Daniel R. Roe
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Benoit Roux
  - Department of Chemistry, University of Chicago, Chicago, Illinois 60637, United States
- Jana Shen
  - Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States
- Andrew C. Simmonett
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Alexander J. Sodt
  - Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland 20892, United States
- Kai Töpfer
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Meenu Upadhyay
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Arjan van der Vaart
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- Richard M. Venable
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Luke C. Warrensford
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- H. Lee Woodcock
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- Yujin Wu
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Charles L. Brooks
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Bernard R. Brooks
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Martin Karplus
  - Harvard University, Department of Chemistry and Chemical Biology, Cambridge, Massachusetts 02138, United States
  - Laboratoire de Chimie Biophysique, ISIS, Université de Strasbourg, 67000 Strasbourg, France
5
Einabadi E, Mashkoori M. Predicting refractive index of inorganic compounds using machine learning. Sci Rep 2024; 14:24204. [PMID: 39406781 PMCID: PMC11480490 DOI: 10.1038/s41598-024-73551-0] [Received: 04/15/2024] [Accepted: 09/18/2024] [Indexed: 10/19/2024]
Abstract
Refractive index (RI) is one of the most important optical properties of materials. Due to the high importance of this physical parameter, there has always been a demand for a method that provides the best possible estimate. In this research, we utilize experimentally measured RI values of 272 inorganic compounds to build a machine learning model capable of predicting the RI of materials with low computational cost. Considering the significant relationship between the band gap and RI, we select this parameter as a predictor. In addition to the band gap, the atomic properties related to the building elements of the compounds form our data set in this work. To find the best model and set of suitable predictors, we examine our data in four categories with 1, 5, 10, and 21 features. In addition, we compare the predicted RIs of six independent regression methods, namely, ordinary least squares (OLSR), Gaussian process (GPR), support vector (SVR), random forest (RFR), gradient boosted trees (GBTR), and extremely randomized trees regression (ERTR). We notice that ERTR predicts RI with the highest accuracy compared to the other regression methods. The prediction strength of our model exceeds that of empirical relations and provides accurate results for a wide range of RIs. Thus, we demonstrate the high potential of machine learning methods for evaluating the RI, especially when it comes to providing an estimation of a desired physical quantity.
Affiliation(s)
- Elham Einabadi
  - Department of Physics, K.N. Toosi University of Technology, P. O. Box 15875-4416, Tehran, Iran
- Mahdi Mashkoori
  - Department of Physics, K.N. Toosi University of Technology, P. O. Box 15875-4416, Tehran, Iran.
  - School of Physics, Institute for Research in Fundamental Sciences (IPM), P.O. Box 19395-5531, Tehran, Iran.
6
Karimitari N, Baldwin WJ, Muller EW, Bare ZJL, Kennedy WJ, Csányi G, Sutton C. Accurate Crystal Structure Prediction of New 2D Hybrid Organic-Inorganic Perovskites. J Am Chem Soc 2024; 146:27392-27404. [PMID: 39344597 PMCID: PMC11468779 DOI: 10.1021/jacs.4c06549] [Received: 05/14/2024] [Revised: 07/15/2024] [Accepted: 08/19/2024] [Indexed: 10/01/2024]
Abstract
Low-dimensional hybrid organic-inorganic perovskites (HOIPs) are promising electronically active materials for light absorption and emission. The design space of HOIPs is extremely large, as a variety of organic cations can be combined with different inorganic frameworks. This not only allows for tunable electronic and mechanical properties but also necessitates the development of new tools for in silico high-throughput analysis of candidate materials. In this work, we present an accurate, efficient, and widely applicable machine learning interatomic potential (MLIP) trained on 86 diverse experimentally reported HOIP materials. This MLIP was tested on 73 experimentally reported perovskite compositions and achieves high accuracy relative to density functional theory (DFT). We also introduce a novel random structure search algorithm designed for the crystal structure prediction of 2D HOIPs. The combination of the MLIP and the structure search algorithm reliably recovers the crystal structure of 14 known 2D perovskites by specifying only the organic molecule and inorganic cation/halide. Performing this crystal structure search with ab initio methods would be computationally prohibitive but is relatively inexpensive with the MLIP. Finally, the developed procedure is used to predict the structure of a totally new HOIP with the cation cis-1,3-cyclohexanediamine. Subsequently, the new compound was synthesized and characterized, and its structure matches the prediction, confirming the accuracy of our method. This capability will enable the efficient and accurate screening of thousands of combinations of organic cations and inorganic layers for further investigation.
Affiliation(s)
- Nima Karimitari
  - Department of Chemistry and Biochemistry, University of South Carolina, Columbia, South Carolina 29208, United States
- William J. Baldwin
  - Department of Engineering, University of Cambridge, Cambridge CB2 1PZ, U.K.
- Zachary J. L. Bare
  - Department of Chemistry and Biochemistry, University of South Carolina, Columbia, South Carolina 29208, United States
- W. Joshua Kennedy
  - Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson AFB, Dayton, Ohio 45433, United States
- Gábor Csányi
  - Department of Engineering, University of Cambridge, Cambridge CB2 1PZ, U.K.
- Christopher Sutton
  - Department of Chemistry and Biochemistry, University of South Carolina, Columbia, South Carolina 29208, United States
7
Osaro E, Fajardo-Rojas F, Cooper GM, Gómez-Gualdrón D, Colón YJ. Active learning of alchemical adsorption simulations; towards a universal adsorption model. Chem Sci 2024:d4sc02156h. [PMID: 39391382 PMCID: PMC11459438 DOI: 10.1039/d4sc02156h] [Received: 04/01/2024] [Accepted: 09/27/2024] [Indexed: 10/12/2024]
Abstract
Adsorption is a fundamental process studied in materials science and engineering because it plays a critical role in various applications, including gas storage and separation. Understanding and predicting gas adsorption within porous materials demands comprehensive computational simulations that are often resource-intensive, limiting the identification of promising materials. Active learning (AL) methods offer an effective strategy to reduce the computational burden by selectively acquiring critical data for model training. Metal-organic frameworks (MOFs) exhibit immense potential across various adsorption applications due to their porous structure and their modular nature, leading to diverse pore sizes and chemistry that serve as an ideal platform to develop adsorption models. Here, we demonstrate the efficacy of AL in predicting gas adsorption within MOFs using "alchemical" molecules and their interactions as surrogates for real molecules. We first applied AL separately to each MOF, reducing the training dataset size by 57.5% while retaining predictive accuracy. Subsequently, we amalgamated the refined datasets across 1800 MOFs to train a multilayer perceptron (MLP) model, successfully predicting adsorption of real molecules. Furthermore, by integrating MOF features into the AL framework using principal component analysis (PCA), we navigated MOF space effectively, achieving high predictive accuracy with only a subset of MOFs. Our results highlight AL's efficiency in reducing dataset size, enhancing model performance, and offering insights into adsorption phenomena in large datasets of MOFs. This study underscores AL's crucial role in advancing computational materials science and developing more accurate and less data-intensive models for gas adsorption in porous materials.
Affiliation(s)
- Etinosa Osaro
  - Department of Chemical and Biomolecular Engineering, University of Notre Dame IN 46556 USA
- Fernando Fajardo-Rojas
  - Department of Chemical and Biological Engineering, Colorado School of Mines 1500 Illinois St Golden CO 80401 USA
- Gregory M Cooper
  - Department of Chemical and Biomolecular Engineering, University of Notre Dame IN 46556 USA
- Diego Gómez-Gualdrón
  - Department of Chemical and Biological Engineering, Colorado School of Mines 1500 Illinois St Golden CO 80401 USA
- Yamil J Colón
  - Department of Chemical and Biomolecular Engineering, University of Notre Dame IN 46556 USA
8
Rahimi-Soujeh Z, Safaie N, Moradi S, Abbod M, Sharifi R, Mojerlou S, Mokhtassi-Bidgoli A. New binary mixtures of fungicides against Macrophomina phaseolina: Machine learning-driven QSAR, read-across prediction, and molecular dynamics simulation. CHEMOSPHERE 2024; 366:143533. [PMID: 39419329 DOI: 10.1016/j.chemosphere.2024.143533] [Received: 08/03/2024] [Revised: 10/10/2024] [Accepted: 10/12/2024] [Indexed: 10/19/2024]
Abstract
Quantitative Structure-Activity Relationship (QSAR) analysis greatly enhances the development and research of pesticides. This study employed Multiple Linear Regression (MLR), machine learning (ML), and read-across (RA) approaches to investigate the combined effects of binary mixtures of fungicides on Macrophomina phaseolina. Using the Fixed Ratio Ray Design (FRRD) method, 75 binary mixtures of six frequently used fungicides were generated, with many exhibiting additive interactions as indicated by the Concentration Addition (CA) and Independent Action (IA) models. The QSAR analysis revealed that Support Vector Regression (SVR) and Gaussian Process Regression (GPR) models were the most effective, outperforming the Least Squares Kernel (LSK), MLR, and RA methods. SVR achieved an outstanding R² of 0.95 and Q²_LMO of 0.81, whereas GPR demonstrated values of 0.93 and 0.81 for the same metrics. Internal and external validation confirmed the reliability and generalizability of these models, suggesting they could be applied to a wider array of data. Moreover, Molecular Dynamics (MD) simulations showed that the effects of the fungicides are linked to physiological mechanisms rather than intermolecular interactions within their formulations. This study establishes a robust framework for creating potent fungicide combinations that improve disease management efficacy while promoting environmental sustainability and reducing the chemical load to mitigate negative impacts.
Affiliation(s)
- Zaniar Rahimi-Soujeh
- Department of Plant Pathology, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran
| | - Naser Safaie
- Department of Plant Pathology, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran.
| | - Sajad Moradi
- Nano Drug Delivery Research Center, Health Technology Institute, Kermanshah University of Medical, Kermanshah, Iran
| | - Mohsen Abbod
- Department of Plant Protection, Faculty of Agriculture, Al-Baath University, Homs, Syria
| | - Rouhalah Sharifi
- Department of Plant Protection, Faculty of Agricultural Engineering, Razi University, Kermanshah, Iran
| | - Shideh Mojerlou
- Department of Horticulture and Plant Protection, Faculty of Agriculture, Shahrood University of Technology, Shahrood, Iran
| | - Ali Mokhtassi-Bidgoli
- Department of Agronomy, Faculty of Agriculture, Tarbiat Modares University, Tehran, Iran
| |
|
9
|
Shi J, Pršlja P, Jin B, Suominen M, Sainio J, Jiang H, Han N, Robertson D, Košir J, Caro M, Kallio T. Experimental and Computational Study Toward Identifying Active Sites of Supported SnOx Nanoparticles for Electrochemical CO2 Reduction Using Machine-Learned Interatomic Potentials. Small (Weinheim an der Bergstrasse, Germany) 2024; 20:e2402190. [PMID: 38794869 DOI: 10.1002/smll.202402190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Indexed: 05/26/2024]
Abstract
SnOx has received great attention as an electrocatalyst for the CO2 reduction reaction (CO2RR); however, it still suffers from low activity. Moreover, the atomic-level SnOx structure and the nature of the active sites are still ambiguous due to the dynamism of the surface structure and the difficulty of structure characterization under electrochemical conditions. Herein, CO2RR performance is enhanced by supporting SnO2 nanoparticles on two common supports, Vulcan carbon and TiO2. Then, electrolysis of CO2 at various temperatures in a neutral electrolyte reveals that the application window for this catalyst is between 12 and 30 °C. Furthermore, this study introduces a machine learning interatomic potential method for atomistic simulation to investigate SnO2 reduction and establish a correlation between SnOx structures and their CO2RR performance. In addition, selectivity is analyzed computationally with density functional theory simulations to identify the key differences between the binding energies of *H and *CO2−, both of which are correlated with the presence of oxygen on the nanoparticle surface. This study offers in-depth insights into the rational design and application of SnOx-based electrocatalysts for CO2RR.
Affiliation(s)
- Junjie Shi
- Department of Chemistry and Materials Science, School of Chemical Engineering, Aalto University, Espoo, Finland
| | - Paulina Pršlja
- Department of Chemistry and Materials Science, School of Chemical Engineering, Aalto University, Espoo, Finland
| | - Benjin Jin
- Department of Chemistry and Materials Science, School of Chemical Engineering, Aalto University, Espoo, Finland
| | - Milla Suominen
- Department of Chemistry and Materials Science, School of Chemical Engineering, Aalto University, Espoo, Finland
| | - Jani Sainio
- Department of Applied Physics, School of Science, Aalto University, Espoo, Finland
| | - Hua Jiang
- Department of Applied Physics, School of Science, Aalto University, Espoo, Finland
| | - Nana Han
- Department of Chemistry and Materials Science, School of Chemical Engineering, Aalto University, Espoo, Finland
| | - Daria Robertson
- Department of Bioproducts and Biosystems, School of Chemical Engineering, Aalto University, Espoo, Finland
| | - Janez Košir
- Department of Chemistry and Materials Science, School of Chemical Engineering, Aalto University, Espoo, Finland
| | - Miguel Caro
- Department of Chemistry and Materials Science, School of Chemical Engineering, Aalto University, Espoo, Finland
| | - Tanja Kallio
- Department of Chemistry and Materials Science, School of Chemical Engineering, Aalto University, Espoo, Finland
| |
|
10
|
Mroz AM, Toka PN, Del Río Chanona EA, Jelfs KE. Web-BO: towards increased accessibility of Bayesian optimisation (BO) for chemistry. Faraday Discuss 2024. [PMID: 39344946 DOI: 10.1039/d4fd00109e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/01/2024]
Abstract
Historically, the chemical discovery process has predominantly been a matter of trial-and-improvement, where small modifications are made to a chemical system, guided by chemical knowledge, with the aim of optimising towards a target property or combination of properties. While a trial-and-improvement approach is frequently successful, especially when assisted by the help of serendipity, the approach is incredibly time- and resource-intensive. Complicating this further, the available chemical space that could, in theory, be explored is remarkably vast. As we are faced with near infinite possibilities and limited resources, we require improved search methods to effectively move towards desired optima, e.g. chemical systems exhibiting a target property, or several desired properties. Bayesian optimisation (BO) has recently gained significant traction in chemistry, where within the BO framework, prior knowledge is used to inform and guide the search process to optimise towards desired chemical targets, e.g. optimal reaction conditions to maximise yield, or optimal catalyst exhibiting improved catalytic activity. While powerful, implementing BO algorithms in practice is largely limited to interfacing via various APIs - requiring advanced coding experience and bespoke scripts for each optimisation task. Further, it is challenging to seamlessly link these with electronic lab notebooks via a graphical user interface (GUI). Ultimately, this limits the accessibility of BO algorithms. Here, we present Web-BO, a GUI to support BO for chemical optimisation tasks. We demonstrate its performance using an open source dataset and associated emulator, and link the platform with an existing electronic lab notebook, datalab. By providing a GUI-based BO service, we hope to improve the accessibility of data-driven optimisation tools in chemistry; https://suprashare.rcs.ic.ac.uk/web-bo/.
Affiliation(s)
- Austin M Mroz
- Department of Chemistry, Imperial College London, White City Campus, W12 0BZ, UK.
- I-X Centre for AI in Science, Imperial College London, White City Campus, W12 0BZ, UK
| | - Piotr N Toka
- Department of Chemistry, Imperial College London, White City Campus, W12 0BZ, UK.
| | | | - Kim E Jelfs
- Department of Chemistry, Imperial College London, White City Campus, W12 0BZ, UK.
| |
|
11
|
Kaur H, Della Pia F, Batatia I, Advincula XR, Shi BX, Lan J, Csányi G, Michaelides A, Kapil V. Data-efficient fine-tuning of foundational models for first-principles quality sublimation enthalpies. Faraday Discuss 2024. [PMID: 39329168 PMCID: PMC11428088 DOI: 10.1039/d4fd00107a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/28/2024]
Abstract
Calculating sublimation enthalpies of molecular crystal polymorphs is relevant to a wide range of technological applications. However, predicting these quantities at first-principles accuracy, even with the aid of machine learning potentials, is a challenge that requires sub-kJ mol⁻¹ accuracy in the potential energy surface and finite-temperature sampling. We present an accurate and data-efficient protocol for training machine learning interatomic potentials by fine-tuning the foundational MACE-MP-0 model and showcase its capabilities on sublimation enthalpies and physical properties of ice polymorphs. Our approach requires only a few tens of training structures to achieve sub-kJ mol⁻¹ accuracy in the sublimation enthalpies and sub-1% error in densities at finite temperature and pressure. Exploiting this data efficiency, we perform preliminary NPT simulations of hexagonal ice at the random phase approximation level and demonstrate a good agreement with experiments. Our results show promise for finite-temperature modelling of molecular crystals with the accuracy of correlated electronic structure theory methods.
Affiliation(s)
- Harveen Kaur
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
| | - Flaviano Della Pia
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
| | - Ilyes Batatia
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, UK
| | - Xavier R Advincula
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
- Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge, CB3 0HE, UK
| | - Benjamin X Shi
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
| | - Jinggang Lan
- Department of Chemistry, New York University, New York, NY, 10003, USA
- Simons Center for Computational Physical Chemistry at New York University, New York, New York 10003, USA
| | - Gábor Csányi
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, UK
| | - Angelos Michaelides
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
| | - Venkat Kapil
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
- Department of Physics and Astronomy, University College, London WC1E 6BT, UK
- Thomas Young Centre and London Centre for Nanotechnology, London WC1E 6BT, UK.
| |
|
12
|
Abranches DO, Dean W, Muñoz M, Wang W, Liang Y, Gurkan B, Maginn EJ, Colón YJ. Combining High-Throughput Experiments and Active Learning to Characterize Deep Eutectic Solvents. ACS Sustainable Chemistry & Engineering 2024; 12:14218-14229. [PMID: 39329020 PMCID: PMC11423404 DOI: 10.1021/acssuschemeng.4c04507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2024] [Revised: 08/26/2024] [Accepted: 08/27/2024] [Indexed: 09/28/2024]
Abstract
The high tunability of deep eutectic solvents (DESs) stems from the ease of changing their precursors and relative compositions. However, measuring the physicochemical properties across large composition and temperature ranges, necessary to properly design target-specific DESs, is tedious and error-prone and represents a bottleneck in the advancement and scalability of DES-based applications. As such, active learning (AL) methodologies based on Gaussian processes (GPs) were developed in this work to minimize the experimental effort necessary to characterize DESs. Owing to its importance for large-scale applications, the reduction of DES viscosity through the addition of a low-molecular-weight solvent was explored as a case study. A high-throughput experimental screening was initially performed on nine different ternary DESs. Then, GPs were successfully trained to predict DES viscosity from its composition and temperature, showcasing the ability of these stochastic, nonparametric models to accurately describe the physicochemical properties of complex mixtures. Finally, the ability of GPs to provide estimates of their own uncertainty was leveraged through an AL framework to minimize the number of data points necessary to obtain accurate viscosity models. This led to a significant reduction in data requirements, with many systems requiring only five independent viscosity data points to be properly described.
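The idea of using a GP's own uncertainty to pick the next measurement can be illustrated with a minimal uncertainty-sampling loop. This is a generic sketch, not the authors' workflow: the RBF kernel, its length scale, the one-dimensional candidate grid, and the stand-in measurement function `toy_property` are all assumptions made for illustration.

```python
import numpy as np


def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)


def posterior(x_train, y_train, x_query, noise=1e-6):
    """GP posterior mean and variance (zero prior mean, unit prior variance)."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_query, x_train)
    mean = Ks @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, np.maximum(var, 0.0)


def active_learning(measure, candidates, n_queries, start_index=0):
    """Repeatedly measure the candidate where the GP is most uncertain."""
    chosen = [start_index]
    for _ in range(n_queries - 1):
        x = candidates[chosen]
        y = measure(x)  # in a real campaign this would be a new experiment
        _, var = posterior(x, y, candidates)
        var[chosen] = -1.0  # never re-select an already measured point
        chosen.append(int(np.argmax(var)))
    return chosen


def toy_property(x):
    """Hypothetical smooth property curve standing in for a measurement."""
    return np.exp(-3.0 * x)


grid = np.linspace(0.0, 1.0, 21)  # hypothetical composition grid
picked = active_learning(toy_property, grid, n_queries=5)
```

With a fixed kernel the predictive variance depends only on where measurements were taken, so this loop spreads points across the grid; in practice the kernel hyperparameters would be refit after each measurement.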
Affiliation(s)
- Dinis O Abranches
- Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
| | - William Dean
- Chemical and Biomolecular Engineering Department, Case Western Reserve University, Cleveland, Ohio 44106, United States
| | - Miguel Muñoz
- Chemical and Biomolecular Engineering Department, Case Western Reserve University, Cleveland, Ohio 44106, United States
| | - Wei Wang
- Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99354, United States
| | - Yangang Liang
- Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99354, United States
| | - Burcu Gurkan
- Chemical and Biomolecular Engineering Department, Case Western Reserve University, Cleveland, Ohio 44106, United States
| | - Edward J Maginn
- Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
| | - Yamil J Colón
- Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
| |
|
13
|
Kearney GM, Fardad M. Optimization based data enrichment using stochastic dynamical system models. PLoS One 2024; 19:e0310504. [PMID: 39302954 DOI: 10.1371/journal.pone.0310504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Accepted: 09/02/2024] [Indexed: 09/22/2024] Open
Abstract
We develop a general framework for state estimation in systems modeled with noise-polluted continuous time dynamics and discrete time noisy measurements. Our approach is based on maximum likelihood estimation and employs the calculus of variations to derive optimality conditions for continuous time functions. We make no prior assumptions on the form of the mapping from measurements to state-estimate or on the distributions of the noise terms, making the framework more general than Kalman filtering/smoothing where this mapping is assumed to be linear and the noises Gaussian. The optimal solution that arises is interpreted as a continuous time spline, the structure and temporal dependency of which is determined by the system dynamics and the distributions of the process and measurement noise. Similar to Kalman smoothing, the optimal spline yields increased data accuracy at instants when measurements are taken, in addition to providing continuous time estimates outside the measurement instances. We demonstrate the utility and generality of our approach via illustrative examples that render both linear and nonlinear data filters depending on the particular system. Application of the proposed approach to a Monte Carlo simulation exhibits significant performance improvement in comparison to a common existing method.
Affiliation(s)
- Griffin M Kearney
- OpB Data Insights LLC, Syracuse, NY, United States of America
- Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY, United States of America
| | - Makan Fardad
- Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY, United States of America
| |
|
14
|
Schmid SP, Schlosser L, Glorius F, Jorner K. Catalysing (organo-)catalysis: Trends in the application of machine learning to enantioselective organocatalysis. Beilstein J Org Chem 2024; 20:2280-2304. [PMID: 39290209 PMCID: PMC11406055 DOI: 10.3762/bjoc.20.196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2024] [Accepted: 08/09/2024] [Indexed: 09/19/2024] Open
Abstract
Organocatalysis has established itself as a third pillar of homogeneous catalysis, besides transition metal catalysis and biocatalysis, as its use for enantioselective reactions has gathered significant interest over the last decades. Concurrent with this development, machine learning (ML) has been increasingly applied in the chemical domain to efficiently uncover hidden patterns in data and accelerate scientific discovery. While the uptake of ML in organocatalysis has been comparatively slow, the last two decades have shown an increased interest from the community. This review gives an overview of the work in the field of ML in organocatalysis. The review starts by giving a short primer on ML for experimental chemists, before discussing its application for predicting the selectivity of organocatalytic transformations. Subsequently, we review ML employed for privileged catalysts, before focusing on its application for catalyst and reaction design. In conclusion, we give our view on current challenges and future directions for this field, drawing inspiration from the application of ML to other scientific domains.
Affiliation(s)
- Stefan P Schmid
- Institute of Chemical and Bioengineering, Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich CH-8093, Switzerland
| | - Leon Schlosser
- Organisch-Chemisches Institut, Universität Münster, 48149 Münster, Germany
| | - Frank Glorius
- Organisch-Chemisches Institut, Universität Münster, 48149 Münster, Germany
| | - Kjell Jorner
- Institute of Chemical and Bioengineering, Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich CH-8093, Switzerland
- National Centre of Competence in Research (NCCR) Catalysis, ETH Zurich, Zurich CH-8093, Switzerland
| |
|
15
|
Zare M, Sahsah D, Saleheen M, Behler J, Heyden A. Hybrid Quantum Mechanical, Molecular Mechanical, and Machine Learning Potential for Computing Aqueous-Phase Adsorption Free Energies on Metal Surfaces. J Chem Theory Comput 2024. [PMID: 39254514 DOI: 10.1021/acs.jctc.4c00869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/11/2024]
Abstract
Performing reliable computer simulations of elementary processes occurring at metal-water interfaces is pivotal for novel catalyst design in sustainable energy applications. Computational catalyst design hinges on the ability to reliably and efficiently compute the potential energy surface (PES) of the system. Due to the large system sizes needed for studying processes at liquid water-metal interfaces, these systems can currently not be described using density functional theory (DFT). In this work, we used a hybrid quantum mechanical, molecular mechanical, and machine learning potential for studying the adsorption behavior of phenol, atomic hydrogen, 2-butanol, and 2-butanone on the (0001) facet of Ru under reducing conditions when Ru is not oxidized. Specifically, we describe the adsorbate and the surrounding metal atoms at the DFT level of theory. Here, we also considered the electrostatic field effect of the water molecules on adsorbate-metal interactions. Next, for the water-water and water-adsorbate interactions, we used established classical force fields. Finally, for the water-Ru surface interaction, for which no reliable force fields have been published, we used Behler-Parrinello high-dimensional neural network potentials (HDNNPs). Employing this setup, we used our explicit solvation for metal surface (eSMS) approach to compute the aqueous-phase effect on the low-coverage adsorption of selected molecules and atoms on the (0001) facet of Ru. In agreement with previous experimental and computational studies of oxygenated molecules over transition metal facets, we found that liquid water destabilizes the tested adsorbates on Ru(0001). Interestingly, our findings indicate that adsorbates on Ru are less affected by the presence of an aqueous phase than on other transition metals (e.g., Pt), highlighting the necessity of experimental investigations of Ru-based catalytic systems in liquid water.
Affiliation(s)
- Mehdi Zare
- Department of Chemical Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
| | - Dia Sahsah
- Department of Chemical Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
| | - Mohammad Saleheen
- Department of Chemical Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
| | - Jörg Behler
- Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, Bochum 44780, Germany
- Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, Bochum 44780, Germany
| | - Andreas Heyden
- Department of Chemical Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
| |
|
16
|
Tu NTP, Williamson S, Johnson ER, Rowley CN. Modeling Intermolecular Interactions with Exchange-Hole Dipole Moment Dispersion Corrections to Neural Network Potentials. J Phys Chem B 2024; 128:8290-8302. [PMID: 39166778 DOI: 10.1021/acs.jpcb.4c02882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/23/2024]
Abstract
Neural network potentials (NNPs) are an innovative approach for calculating the potential energy and forces of a chemical system. In principle, these methods are capable of modeling large systems with an accuracy approaching that of a high-level ab initio calculation, but with a much smaller computational cost. Due to their training to density-functional theory (DFT) data and neglect of long-range interactions, some classes of NNPs require an additional term to include London dispersion physics. In this Perspective, we discuss the requirements for a dispersion model for use with an NNP, focusing on the MLXDM (Machine Learned eXchange-Hole Dipole Moment) model developed by our groups. This model is based on the DFT-based XDM dispersion correction, which calculates interatomic dispersion coefficients in terms of atomic moments and polarizabilities, both of which can be approximated effectively using neural networks.
Affiliation(s)
| | - Siri Williamson
- Department of Chemistry, Carleton University, Ottawa, Ontario K1S 5B6, Canada
| | - Erin R Johnson
- Department of Chemistry, Dalhousie University, Halifax, Nova Scotia B3H 4J3, Canada
| | | |
|
17
|
Hou P, Tian Y, Meng X. Improving Molecular-Dynamics Simulations for Solid-Liquid Interfaces with Machine-Learning Interatomic Potentials. Chemistry 2024; 30:e202401373. [PMID: 38877181 DOI: 10.1002/chem.202401373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2024] [Revised: 06/13/2024] [Accepted: 06/14/2024] [Indexed: 06/16/2024]
Abstract
Emerging developments in artificial intelligence have opened vast possibilities for material simulation. By fitting machine learning algorithms to first-principles data, machine learning interatomic potentials (MLIPs) can effectively balance accuracy and efficiency in molecular dynamics (MD) simulations, serving as powerful tools in various complex physicochemical systems. Consequently, researchers have enthusiastically applied this technology in multiple fields to revisit major scientific problems that have remained controversial owing to the limitations of previous computational methods. Herein, we introduce the evolution of MLIPs, provide application examples for solid-liquid interfaces, and present current challenges. As difficulties in the accuracy, efficiency, and versatility of MLIPs are resolved, this booming technique, combined with molecular simulation methods, will provide a fundamental understanding of interdisciplinary scientific challenges in materials, physics, and chemistry.
Affiliation(s)
- Pengfei Hou
- Key Laboratory of Physics and Technology for Advanced Batteries (Ministry of Education), College of Physics, Jilin University, Changchun, 130012, China
- Key Laboratory of Material Simulation Methods and Software of Ministry of Education, College of Physics, Jilin University, Changchun, 130012, China
| | - Yumiao Tian
- Key Laboratory of Physics and Technology for Advanced Batteries (Ministry of Education), College of Physics, Jilin University, Changchun, 130012, China
- Key Laboratory of Material Simulation Methods and Software of Ministry of Education, College of Physics, Jilin University, Changchun, 130012, China
| | - Xing Meng
- Key Laboratory of Physics and Technology for Advanced Batteries (Ministry of Education), College of Physics, Jilin University, Changchun, 130012, China
- Key Laboratory of Material Simulation Methods and Software of Ministry of Education, College of Physics, Jilin University, Changchun, 130012, China
| |
|
18
|
Willman JT, Gonzalez JM, Nguyen-Cong K, Hamel S, Lordi V, Oleynik II. Accuracy, transferability, and computational efficiency of interatomic potentials for simulations of carbon under extreme conditions. J Chem Phys 2024; 161:084709. [PMID: 39193946 DOI: 10.1063/5.0218705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Accepted: 07/14/2024] [Indexed: 08/29/2024] Open
Abstract
Large-scale atomistic molecular dynamics (MD) simulations provide an exceptional opportunity to advance the fundamental understanding of carbon under extreme conditions of high pressures and temperatures. However, the fidelity of these simulations depends heavily on the accuracy of classical interatomic potentials governing the dynamics of many-atom systems. This study critically assesses several popular empirical potentials for carbon, as well as machine learning interatomic potentials (MLIPs), in their ability to simulate a range of physical properties at high pressures and temperatures, including the diamond equation of state, its melting line, shock Hugoniot, uniaxial compressions, and the structure of liquid carbon. Empirical potentials fail to accurately predict the behavior of carbon under high pressure-temperature conditions. In contrast, MLIPs demonstrate quantum accuracy, with Spectral Neighbor Analysis Potential (SNAP) and atomic cluster expansion (ACE) being the most accurate in reproducing the density functional theory results. ACE displays remarkable transferability despite not being specifically trained for extreme conditions. Furthermore, ACE and SNAP exhibit superior computational performance on graphics processing unit-based systems in billion atom MD simulations, with SNAP emerging as the fastest. In addition to offering practical guidance in selecting an interatomic potential with a fine balance of accuracy, transferability, and computational efficiency, this work also highlights transformative opportunities for groundbreaking scientific discoveries facilitated by quantum-accurate MD simulations with MLIPs on emerging exascale supercomputers.
Affiliation(s)
| | - Joseph M Gonzalez
- Department of Physics, University of South Florida, Tampa, Florida 33620, USA
| | - Kien Nguyen-Cong
- Lawrence Livermore National Laboratory, Livermore, California 94550, USA
| | - Sebastien Hamel
- Lawrence Livermore National Laboratory, Livermore, California 94550, USA
| | - Vincenzo Lordi
- Lawrence Livermore National Laboratory, Livermore, California 94550, USA
| | - Ivan I Oleynik
- Department of Physics, University of South Florida, Tampa, Florida 33620, USA
| |
|
19
|
Aristoff D, Johnson M, Simpson G, Webber RJ. The fast committor machine: Interpretable prediction with kernels. J Chem Phys 2024; 161:084113. [PMID: 39193940 DOI: 10.1063/5.0222798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2024] [Accepted: 08/07/2024] [Indexed: 08/29/2024] Open
Abstract
In the study of stochastic systems, the committor function describes the probability that a system starting from an initial configuration x will reach a set B before a set A. This paper introduces an efficient and interpretable algorithm for approximating the committor, called the "fast committor machine" (FCM). The FCM uses simulated trajectory data to build a kernel-based model of the committor. The kernel function is constructed to emphasize low-dimensional subspaces that optimally describe the A to B transitions. The coefficients in the kernel model are determined using randomized linear algebra, leading to a runtime that scales linearly with the number of data points. In numerical experiments involving a triple-well potential and alanine dipeptide, the FCM yields higher accuracy and trains more quickly than a neural network with the same number of parameters. The FCM is also more interpretable than the neural net.
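To make the definition of the committor concrete, the sketch below computes it exactly for a small discrete Markov chain by solving a linear system. This is not the FCM algorithm described in the abstract, which builds a kernel model from trajectory data; the transition matrix and the sets A and B are illustrative assumptions.

```python
import numpy as np


def committor(P, A, B):
    """Probability of reaching set B before set A from each state.

    P: row-stochastic transition matrix of a finite Markov chain.
    A, B: disjoint sets of state indices.
    """
    n = P.shape[0]
    b_states = sorted(B)
    interior = [i for i in range(n) if i not in A and i not in B]
    q = np.zeros(n)
    q[b_states] = 1.0  # boundary conditions: q = 1 on B, q = 0 on A
    # For interior states, q_i = sum_j P_ij q_j, i.e. (I - P_II) q_I = P_IB 1.
    M = np.eye(len(interior)) - P[np.ix_(interior, interior)]
    rhs = P[np.ix_(interior, b_states)].sum(axis=1)
    q[interior] = np.linalg.solve(M, rhs)
    return q


# Hypothetical example: symmetric random walk on states 0..4.
n_states = 5
P = np.zeros((n_states, n_states))
for i in range(1, n_states - 1):
    P[i, i - 1] = 0.5
    P[i, i + 1] = 0.5
P[0, 0] = 1.0
P[n_states - 1, n_states - 1] = 1.0

q = committor(P, A={0}, B={n_states - 1})
```

For this symmetric walk the committor increases linearly from A to B, which gives a quick sanity check on the solver.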
Affiliation(s)
- David Aristoff
- Mathematics, Colorado State University, Fort Collins, Colorado 80523, USA
| | - Mats Johnson
- Mathematics, Colorado State University, Fort Collins, Colorado 80523, USA
| | - Gideon Simpson
- Mathematics, Drexel University, Philadelphia, Pennsylvania 19104, USA
| | - Robert J Webber
- Mathematics, University of California San Diego, La Jolla, California 92093, USA
| |
|
20
|
Willow SY, Kim DG, Sundheep R, Hajibabaei A, Kim KS, Myung CW. Active sparse Bayesian committee machine potential for isothermal-isobaric molecular dynamics simulations. Phys Chem Chem Phys 2024; 26:22073-22082. [PMID: 39113586 DOI: 10.1039/d4cp01801j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/23/2024]
Abstract
Recent advancements in machine learning potentials (MLPs) have significantly impacted the fields of chemistry, physics, and biology by enabling large-scale first-principles simulations. Among different machine learning approaches, kernel-based MLPs distinguish themselves through their ability to handle small datasets, quantify uncertainties, and minimize over-fitting. Nevertheless, their extensive computational requirements present considerable challenges. To alleviate these, sparsification methods have been developed, aiming to reduce computational scaling without compromising accuracy. In the context of isothermal and isobaric ML molecular dynamics (MD) simulations, achieving precise pressure estimation is crucial for reproducing reliable system behavior under constant pressure. Despite progress, sparse kernel MLPs struggle with precise pressure prediction. Here, we introduce a virial kernel function that significantly enhances the pressure estimation accuracy of MLPs. Additionally, we propose the active sparse Bayesian committee machine (BCM) potential, an on-the-fly MLP architecture that aggregates local sparse Gaussian process regression (SGPR) MLPs. The sparse BCM potential overcomes the steep computational scaling with the kernel size, and a predefined restriction on the size of kernel allows for fast and efficient on-the-fly training. Our advancements facilitate accurate and computationally efficient machine learning-enhanced MD (MLMD) simulations across diverse systems, including ice-liquid coexisting phases, Li10Ge(PS6)2 lithium solid electrolyte, and high-pressure liquid boron nitride.
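The aggregation step of a Bayesian committee machine can be sketched in its classic scalar form, where each local model reports a predictive mean and variance and the committee combines them by precision weighting with a correction for the shared prior. The abstract does not give the authors' formulas, so this is the textbook rule only; the active sparse variant in the paper may differ, and all names and numbers here are assumptions.

```python
def bcm_combine(means, variances, prior_variance):
    """Combine local GP predictions with the classic BCM rule.

    means, variances: predictive mean and variance of each local model
    at one query point. prior_variance: GP prior variance at that point.
    """
    m = len(means)
    # Sum the local precisions, then remove the prior counted (m - 1) extra times.
    precision = sum(1.0 / v for v in variances) - (m - 1) / prior_variance
    variance = 1.0 / precision
    mean = variance * sum(mu / v for mu, v in zip(means, variances))
    return mean, variance


# Hypothetical predictions from two local models at a single query point.
mean, variance = bcm_combine(means=[1.0, 3.0], variances=[0.5, 0.5], prior_variance=1.0)
```

Local models that are more certain at the query point receive more weight, and the combined variance is smaller than that of either model alone.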
Affiliation(s)
- Soohaeng Yoo Willow
- Department of Energy Science, Sungkyunkwan University, Seobu-ro 2066, Suwon 16419, Korea.
| | - Dong Geon Kim
- Department of Energy Science, Sungkyunkwan University, Seobu-ro 2066, Suwon 16419, Korea.
| | - R Sundheep
- Department of Energy Science, Sungkyunkwan University, Seobu-ro 2066, Suwon 16419, Korea.
| | - Amir Hajibabaei
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
| | - Kwang S Kim
- Center for Superfunctional Materials, Department of Chemistry, Ulsan National Institute of Science and Technology, Ulsan 44919, Korea
| | - Chang Woo Myung
- Department of Energy Science, Sungkyunkwan University, Seobu-ro 2066, Suwon 16419, Korea.
| |
|
21
|
Pal Y, Fiala TA, Swords WB, Yoon TP, Schmidt JR. Predicting Emission Spectra of Heteroleptic Iridium Complexes Using Artificial Chemical Intelligence. Chemphyschem 2024; 25:e202400176. [PMID: 38752882 DOI: 10.1002/cphc.202400176] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/26/2024] [Revised: 05/15/2024] [Indexed: 07/09/2024]
Abstract
We report a deep learning-based approach to accurately predict the emission spectra of phosphorescent heteroleptic [Ir(C^N)2(N^N)]+ complexes, enabling the rapid discovery of novel Ir(III) chromophores for diverse applications including organic light-emitting diodes and solar fuel cells. The deep learning models utilize graph neural networks and other chemical features in architectures that reflect the inherent structure of the heteroleptic complexes, composed of C^N and N^N ligands, and are thus geared towards efficient training over the dataset. By leveraging experimental emission data, our models reliably predict the full emission spectra of these complexes across various emission profiles, surpassing the accuracy of conventional DFT and correlated wavefunction methods, while simultaneously achieving robustness to the presence of imperfect (noisy, low-quality) training spectra. We showcase the potential applications for these and related models for in silico prediction of complexes with tailored emission properties, as well as in "design of experiment" contexts to reduce the synthetic burden of high-throughput screening. In the latter case, we demonstrate that the models allow us to exploit a limited amount of experimental data to explore a wide range of chemical space, thus leveraging a modest synthetic effort.
Affiliation(s)
- Yudhajit Pal
- Theoretical Chemistry Institute and Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, WI 53706, United States
- Tahoe A Fiala
- Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, WI 53706, United States
- Wesley B Swords
- Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, WI 53706, United States
- Tehshik P Yoon
- Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, WI 53706, United States
- J R Schmidt
- Theoretical Chemistry Institute and Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, WI 53706, United States
22
Williams CD, Kalayan J, Burton NA, Bryce RA. Stable and accurate atomistic simulations of flexible molecules using conformationally generalisable machine learned potentials. Chem Sci 2024; 15:12780-12795. [PMID: 39148799 PMCID: PMC11323334 DOI: 10.1039/d4sc01109k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Accepted: 07/07/2024] [Indexed: 08/17/2024] Open
Abstract
Computational simulation methods based on machine learned potentials (MLPs) promise to revolutionise shape prediction of flexible molecules in solution, but their widespread adoption has been limited by the way in which training data is generated. Here, we present an approach which allows the key conformational degrees of freedom to be properly represented in reference molecular datasets. MLPs trained on these datasets using a global descriptor scheme are generalisable in conformational space, providing quantum chemical accuracy for all conformers. These MLPs are capable of propagating long, stable molecular dynamics trajectories, an attribute that has remained a challenge. We deploy the MLPs in obtaining converged conformational free energy surfaces for flexible molecules via well-tempered metadynamics simulations; this approach provides a hitherto inaccessible route to accurately computing the structural, dynamical and thermodynamical properties of a wide variety of flexible molecular systems. It is further demonstrated that MLPs must be trained on reference datasets with complete coverage of conformational space, including in barrier regions, to achieve stable molecular dynamics trajectories.
Affiliation(s)
- Christopher D Williams
- Division of Pharmacy and Optometry, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Oxford Road, Manchester M13 9PL, UK
- Jas Kalayan
- Science and Technology Facilities Council (STFC), Daresbury Laboratory, Keckwick Lane, Daresbury, Warrington WA4 4AD, UK
- Neil A Burton
- Department of Chemistry, School of Natural Sciences, Faculty of Science and Engineering, The University of Manchester, Oxford Road, Manchester M13 9PL, UK
- Richard A Bryce
- Division of Pharmacy and Optometry, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Oxford Road, Manchester M13 9PL, UK
23
Goodwin ZAH, Wenny MB, Yang JH, Cepellotti A, Ding J, Bystrom K, Duschatko BR, Johansson A, Sun L, Batzner S, Musaelian A, Mason JA, Kozinsky B, Molinari N. Transferability and Accuracy of Ionic Liquid Simulations with Equivariant Machine Learning Interatomic Potentials. J Phys Chem Lett 2024; 15:7539-7547. [PMID: 39023916 DOI: 10.1021/acs.jpclett.4c01942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/20/2024]
Abstract
Ionic liquids (ILs) are an exciting class of electrolytes finding applications in many areas from energy storage to solvents, where they have been touted as "designer solvents" as they can be mixed to precisely tailor the physicochemical properties. As using machine learning interatomic potentials (MLIPs) to simulate ILs is still relatively unexplored, several questions need to be answered to see if MLIPs can be transformative for ILs. Since ILs are often not pure, but are either mixed together or contain additives, we first demonstrate that an MLIP can be trained to be compositionally transferable; i.e., the MLIP can be applied to mixtures of ions not directly trained on, while only being trained on a few mixtures of the same ions. We also investigated the accuracy of MLIPs for a novel IL, which we experimentally synthesize and characterize. Our MLIP trained on ∼200 DFT frames is in reasonable agreement with our experiments and DFT.
Affiliation(s)
- Zachary A H Goodwin
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- Malia B Wenny
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States
- Julia H Yang
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- Harvard University Center for the Environment, 26 Oxford St., Cambridge, Massachusetts 02138, United States
- Andrea Cepellotti
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- Jingxuan Ding
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- Kyle Bystrom
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- Blake R Duschatko
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- Anders Johansson
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- Lixin Sun
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- Simon Batzner
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- Albert Musaelian
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- Jarad A Mason
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States
- Boris Kozinsky
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- Research and Technology Center, Robert Bosch LLC, Cambridge, Massachusetts 02142, United States
- Nicola Molinari
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, United States
- Research and Technology Center, Robert Bosch LLC, Cambridge, Massachusetts 02142, United States
24
Chaudhry I, Hu G, Ye H, Jensen L. Toward Modeling the Complexity of the Chemical Mechanism in SERS. ACS Nano 2024. [PMID: 39087679 DOI: 10.1021/acsnano.4c07198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/02/2024]
Abstract
Surface-enhanced Raman scattering (SERS) provides detailed information about the binding of molecules at interfaces and their interactions with the local environment due to the large enhancement of Raman scattering. This enhancement arises from a combination of the electromagnetic mechanism (EM) and chemical mechanism (CM). While it is commonly accepted that EM gives rise to most of the enhancement, large spectral changes originate from CM. To elucidate the rich information contained in SERS spectra about molecules at interfaces, a comprehensive understanding of the enhancement mechanisms is necessary. In this Perspective, we discuss the current understanding of the enhancement mechanisms and highlight their interplay in complex local environments. We will also discuss emerging areas where the development of computational and theoretical models is needed with specific attention given to how the CM contributes to the spectral changes. Future efforts in modeling should focus on overcoming the challenges presented in this review in order to capture the complexity of CM in SERS.
Affiliation(s)
- Imran Chaudhry
- Department of Chemistry, The Pennsylvania State University, 104 Benkovic Building, University Park, Pennsylvania 16802, United States
- Gaohe Hu
- Department of Chemistry, The Pennsylvania State University, 104 Benkovic Building, University Park, Pennsylvania 16802, United States
- Hepeng Ye
- Department of Chemistry, The Pennsylvania State University, 104 Benkovic Building, University Park, Pennsylvania 16802, United States
- Lasse Jensen
- Department of Chemistry, The Pennsylvania State University, 104 Benkovic Building, University Park, Pennsylvania 16802, United States
25
Abranches DO, Maginn EJ, Colón YJ. Stochastic machine learning via sigma profiles to build a digital chemical space. Proc Natl Acad Sci U S A 2024; 121:e2404676121. [PMID: 39042681 PMCID: PMC11295021 DOI: 10.1073/pnas.2404676121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Accepted: 06/16/2024] [Indexed: 07/25/2024] Open
Abstract
This work establishes a different paradigm on digital molecular spaces and their efficient navigation by exploiting sigma profiles. To do so, the remarkable capability of Gaussian processes (GPs), a type of stochastic machine learning model, to correlate and predict physicochemical properties from sigma profiles is demonstrated, outperforming state-of-the-art neural networks previously published. The amount of chemical information encoded in sigma profiles eases the learning burden of machine learning models, permitting the training of GPs on small datasets which, due to their negligible computational cost and ease of implementation, are ideal models to be combined with optimization tools such as gradient search or Bayesian optimization (BO). Gradient search is used to efficiently navigate the sigma profile digital space, quickly converging to local extrema of target physicochemical properties. While this requires the availability of pretrained GP models on existing datasets, such limitations are eliminated with the implementation of BO, which can find global extrema with a limited number of iterations. A remarkable example of this is that of BO toward boiling temperature optimization. Holding no knowledge of chemistry except for the sigma profile and boiling temperature of carbon monoxide (the worst possible initial guess), BO finds the global maximum of the available boiling temperature dataset (over 1,000 molecules encompassing more than 40 families of organic and inorganic compounds) in just 15 iterations (i.e., 15 property measurements), cementing sigma profiles as a powerful digital chemical space for molecular optimization and discovery, particularly when little to no experimental data is initially available.
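The Bayesian optimization loop described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the `features` array is a hypothetical stand-in for sigma profiles, `measure` stands in for a property measurement, and the squared-exponential kernel and upper-confidence-bound acquisition rule are assumptions chosen for brevity.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    var = 1.0 - np.einsum("ij,ji->i", k_qt, np.linalg.solve(k_tt, k_qt.T))
    return mean, np.sqrt(np.clip(var, 0.0, None))

def bayes_opt(features, measure, start_index, n_iter=15, kappa=2.0):
    """Maximize `measure` over a finite candidate pool, one measurement per step."""
    tried = [start_index]
    values = [measure(start_index)]
    for _ in range(n_iter):
        y = np.array(values)
        y_std = y.std() or 1.0  # avoid dividing by zero while all values are equal
        mean, std = gp_posterior(features[tried], (y - y.mean()) / y_std, features)
        score = mean + kappa * std   # upper confidence bound (assumed acquisition)
        score[tried] = -np.inf       # never re-measure a candidate
        nxt = int(np.argmax(score))
        tried.append(nxt)
        values.append(measure(nxt))
    best = int(np.argmax(values))
    return tried[best], values[best], tried
```

Starting from a single known point, as in the carbon monoxide example, the first step is driven entirely by the uncertainty term; how quickly the loop reaches the best candidate depends on the kernel, the acquisition rule, and the dataset.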
Affiliation(s)
- Dinis O. Abranches
- Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, IN 46556
- Edward J. Maginn
- Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, IN 46556
- Yamil J. Colón
- Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, IN 46556
26
Bigi F, Pozdnyakov SN, Ceriotti M. Wigner kernels: Body-ordered equivariant machine learning without a basis. J Chem Phys 2024; 161:044116. [PMID: 39056390 DOI: 10.1063/5.0208746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2024] [Accepted: 06/10/2024] [Indexed: 07/28/2024] Open
Abstract
Machine-learning models based on a point-cloud representation of a physical object are ubiquitous in scientific applications and particularly well-suited to the atomic-scale description of molecules and materials. Among the many different approaches that have been pursued, the description of local atomic environments in terms of their discretized neighbor densities has been used widely and very successfully. We propose a novel density-based method, which involves computing "Wigner kernels." These are fully equivariant and body-ordered kernels that can be computed iteratively at a cost that is independent of the basis used to discretize the density and grows only linearly with the maximum body-order considered. Wigner kernels represent the infinite-width limit of feature-space models, whose dimensionality and computational cost instead scale exponentially with the increasing order of correlations. We present several examples of the accuracy of models based on Wigner kernels in chemical applications, for both scalar and tensorial targets, reaching an accuracy that is competitive with state-of-the-art deep-learning architectures. We discuss the broader relevance of these findings to equivariant geometric machine-learning.
Affiliation(s)
- Filippo Bigi
- Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Sergey N Pozdnyakov
- Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Michele Ceriotti
- Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
27
Malenfant-Thuot O, Ryczko K, Tamblyn I, Côté M. Efficient determination of Born-effective charges, LO-TO splitting, and Raman tensors of solids with a real-space atom-centered deep learning approach. Journal of Physics: Condensed Matter 2024; 36:425901. [PMID: 39019077 DOI: 10.1088/1361-648x/ad64a2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2024] [Accepted: 07/17/2024] [Indexed: 07/19/2024]
Abstract
We introduce a deep neural network (DNN) framework called the Real-space Atomic Decomposition NETwork (RADNET), which is capable of making accurate predictions of polarization and of electronic dielectric permittivity tensors in solids and aims to address limitations of previously available machine learning models for Raman predictions in periodic systems. This framework builds on previous, atom-centered approaches while utilizing deep convolutional neural networks. We report excellent accuracies on direct predictions for two prototypical examples: GaAs and BN. We then use automatic differentiation to efficiently calculate the Born-effective charges, longitudinal optical-transverse optical (LO-TO) splitting frequencies, and Raman tensors of these materials. We compute the Raman spectra, and find agreement with ab initio results. Lastly, we explore ways to generalize the predictions of polarization while taking into account periodic boundary conditions and symmetries.
Affiliation(s)
- Olivier Malenfant-Thuot
- Département de physique et Institut Courtois, Université de Montréal, Montréal, Québec, Canada
- Kevin Ryczko
- Department of Physics, University of Ottawa, Ottawa, Ontario, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
- SandboxAQ, Palo Alto, CA, United States of America
- Isaac Tamblyn
- Department of Physics, University of Ottawa, Ottawa, Ontario, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
- Michel Côté
- Département de physique et Institut Courtois, Université de Montréal, Montréal, Québec, Canada
28
Bi S, Knijff L, Lian X, van Hees A, Zhang C, Salanne M. Modeling of Nanomaterials for Supercapacitors: Beyond Carbon Electrodes. ACS Nano 2024; 18:19931-19949. [PMID: 39053903 PMCID: PMC11308780 DOI: 10.1021/acsnano.4c01787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 04/08/2024] [Accepted: 04/23/2024] [Indexed: 07/27/2024]
Abstract
Capacitive storage devices allow for fast charge and discharge cycles, making them the perfect complements to batteries for high power applications. Many materials display interesting capacitive properties when they are put in contact with ionic solutions despite their very different structures and (surface) reactivity. Among them, nanocarbons are the most important for practical applications, but many nanomaterials have recently emerged, such as conductive metal-organic frameworks, 2D materials, and a wide variety of metal oxides. These heterogeneous and complex electrode materials are difficult to model with conventional approaches. However, the development of computational methods, the incorporation of machine learning techniques, and the increasing power in high performance computing now allow us to tackle these types of systems. In this Review, we summarize the current efforts in this direction. We show that depending on the nature of the materials and of the charging mechanisms, different methods, or combinations of them, can provide desirable atomic-scale insight on the interactions at play. We mainly focus on two important aspects: (i) the study of ion adsorption in complex nanoporous materials, which requires the extension of constant potential molecular dynamics to multicomponent systems, and (ii) the characterization of Faradaic processes in pseudocapacitors, which involves the use of electronic structure-based methods. We also discuss how recently developed simulation methods will allow bridges to be made between double-layer capacitors and pseudocapacitors for future high power electricity storage devices.
Affiliation(s)
- Sheng Bi
- Physicochimie des Électrolytes et Nanosystèmes Interfaciaux, Sorbonne Université, CNRS, F-75005 Paris, France
- Réseau sur le Stockage Electrochimique de l’Energie (RS2E), FR CNRS 3459, 80039 Amiens Cedex, France
- Lisanne Knijff
- Department of Chemistry - Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, BOX 538, Uppsala 75121, Sweden
- Xiliang Lian
- Physicochimie des Électrolytes et Nanosystèmes Interfaciaux, Sorbonne Université, CNRS, F-75005 Paris, France
- Réseau sur le Stockage Electrochimique de l’Energie (RS2E), FR CNRS 3459, 80039 Amiens Cedex, France
- Alicia van Hees
- Department of Chemistry - Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, BOX 538, Uppsala 75121, Sweden
- Chao Zhang
- Department of Chemistry - Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, BOX 538, Uppsala 75121, Sweden
- Wallenberg Initiative Materials Science for Sustainability, Uppsala University, 75121 Uppsala, Sweden
- Mathieu Salanne
- Réseau sur le Stockage Electrochimique de l’Energie (RS2E), FR CNRS 3459, 80039 Amiens Cedex, France
- Institut Universitaire de France (IUF), 75231 Paris, France
29
Slootman E, Poltavsky I, Shinde R, Cocomello J, Moroni S, Tkatchenko A, Filippi C. Accurate Quantum Monte Carlo Forces for Machine-Learned Force Fields: Ethanol as a Benchmark. J Chem Theory Comput 2024; 20:6020-6027. [PMID: 39003522 PMCID: PMC11270822 DOI: 10.1021/acs.jctc.4c00498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 05/31/2024] [Accepted: 06/03/2024] [Indexed: 07/15/2024]
Abstract
Quantum Monte Carlo (QMC) is a powerful method to calculate accurate energies and forces for molecular systems. In this work, we demonstrate how we can obtain accurate QMC forces for the fluxional ethanol molecule at room temperature by using either multideterminant Jastrow-Slater wave functions in variational Monte Carlo or just a single determinant in diffusion Monte Carlo. The excellent performance of our protocols is assessed against high-level coupled cluster calculations on a diverse set of representative configurations of the system. Finally, we train machine-learning force fields on the QMC forces and compare them to models trained on coupled cluster reference data, showing that a force field based on the diffusion Monte Carlo forces with a single determinant can faithfully reproduce coupled cluster power spectra in molecular dynamics simulations.
Affiliation(s)
- E. Slootman
- MESA+ Institute for Nanotechnology, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
- I. Poltavsky
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- R. Shinde
- MESA+ Institute for Nanotechnology, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
- J. Cocomello
- MESA+ Institute for Nanotechnology, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
- S. Moroni
- CNR-IOM DEMOCRITOS, Istituto Officina dei Materiali, and SISSA Scuola Internazionale Superiore di Studi Avanzati, Via Bonomea 265, I-34136 Trieste, Italy
- A. Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- C. Filippi
- MESA+ Institute for Nanotechnology, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
30
Yang Z, Cao F, Cheng H, Liu S, Sun J. A Globally Accurate Neural Network Potential Energy Surface and Quantum Dynamics Studies on Be+(2S) + H2/D2 → BeH+/BeD+ + H/D Reactions. Molecules 2024; 29:3436. [PMID: 39065017 PMCID: PMC11487451 DOI: 10.3390/molecules29143436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2024] [Revised: 07/18/2024] [Accepted: 07/20/2024] [Indexed: 07/28/2024] Open
Abstract
Chemical reactions between Be+ ions and H2 molecules have significance in the fields of ultracold chemistry and astrophysics, but the corresponding dynamics studies on the ground-state reaction have not been reported because of the lack of a global potential energy surface (PES). Herein, a globally accurate ground-state BeH2+ PES is constructed using the neural network model based on 18,657 ab initio points calculated by the multi-reference configuration interaction method with the aug-cc-pVQZ basis set. On the newly constructed PES, the state-to-state quantum dynamics calculations of the Be+(2S) + H2(v0 = 0; j0 = 0) and Be+(2S) + D2(v0 = 0; j0 = 0) reactions are performed using the time-dependent wave packet method. The calculated results suggest that the two reactions are dominated by the complex-forming mechanism and the direct abstraction process at relatively low and high collision energies, respectively, and the isotope substitution has little effect on the reaction dynamics characteristics. The new PES can be used to further study the reaction dynamics of the BeH2+ system, such as the effects of rovibrational excitations and alignment of reactant molecules, and the present dynamics data could provide an important reference for further experimental studies at a finer level.
Affiliation(s)
- Zijiang Yang
- School of Physics and Electronic Technology, Liaoning Normal University, Dalian 116029, China
31
Barrios Herrera L, Lourenço MP, Hostaš J, Calaminici P, Köster AM, Tchagang A, Salahub DR. Active-learning for global optimization of Ni-Ceria nanoparticles: The case of Ce(4-x)NixO(8-x) (x = 1, 2, 3). J Comput Chem 2024; 45:1643-1656. [PMID: 38551129 DOI: 10.1002/jcc.27346] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 02/15/2024] [Accepted: 03/05/2024] [Indexed: 06/04/2024]
Abstract
Ni-CeO2 nanoparticles (NPs) are promising nanocatalysts for water splitting and water gas shift reactions due to the ability of ceria to temporarily donate oxygen to the catalytic reaction and accept oxygen after the reaction is completed. Therefore, elucidating how different properties of the Ni-Ceria NPs relate to the activity and selectivity of the catalytic reaction is of crucial importance for the development of novel catalysts. In this work, the active learning (AL) method based on machine learning regression and its uncertainty is used for the global optimization of Ce(4-x)NixO(8-x) (x = 1, 2, 3) nanoparticles, employing density functional theory calculations. Additionally, further investigation of the NPs by mass-scaled parallel-tempering Born-Oppenheimer molecular dynamics resulted in the same putative global minimum structures found by AL, demonstrating the robustness of our AL search to learn from small datasets and assist in the global optimization of complex electronic structure systems.
Affiliation(s)
- Lizandra Barrios Herrera
- Department of Chemistry, Department of Physics and Astronomy, CMS Centre for Molecular Simulation, IQST Institute for Quantum Science and Technology, Quantum Alberta, University of Calgary, Calgary, Canada
- Maicon Pierre Lourenço
- Departamento de Química e Física, Centro de Ciências Exatas, Naturais e da Saúde (CCENS), Universidade Federal do Espírito Santo, Espírito Santo, Brasil
- Jiří Hostaš
- Department of Chemistry, Department of Physics and Astronomy, CMS Centre for Molecular Simulation, IQST Institute for Quantum Science and Technology, Quantum Alberta, University of Calgary, Calgary, Canada
- Digital Technologies Research Centre, National Research Council of Canada, Ottawa, Canada
- Alain Tchagang
- Digital Technologies Research Centre, National Research Council of Canada, Ottawa, Canada
- Dennis R Salahub
- Department of Chemistry, Department of Physics and Astronomy, CMS Centre for Molecular Simulation, IQST Institute for Quantum Science and Technology, Quantum Alberta, University of Calgary, Calgary, Canada
32
Sose AT, Gustke T, Wang F, Anand G, Pasupuleti S, Savara A, Deshmukh SA. Evaluation of Sampling Algorithms Used for Bayesian Uncertainty Quantification of Molecular Dynamics Force Fields. J Chem Theory Comput 2024; 20:5732-5742. [PMID: 38924093 PMCID: PMC11238537 DOI: 10.1021/acs.jctc.4c00130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Revised: 06/11/2024] [Accepted: 06/12/2024] [Indexed: 06/28/2024]
Abstract
New Bayesian parameter estimation methods have the capability to enable more physically realistic and reliable molecular dynamics (MD) simulations by providing accurate estimates of uncertainties of force-field (FF) parameters and associated properties. However, the choice of which Bayesian parameter estimation algorithm to use has not been widely investigated, despite its impact on the effective exploration of parameter space. Here, using a case example of the Embedded Atom Method (EAM) FF parameters, we investigated the ramifications of several of the algorithm choices. We found that Ensemble Slice Sampling (ESS) and Affine-Invariant Ensemble Sampling (AIES) demonstrate a new level of superior performance, culminating in more accurate parameter and property estimations with tighter uncertainty bounds, compared to traditional methods such as Metropolis-Hastings (MH), Gradient Search (GS), and Uniform Random Sampler (URS). We demonstrate that Bayesian Uncertainty Quantification with ESS and AIES leads to significantly more accurate and reliable predictions of the FF parameters and properties. The results suggest that ESS and AIES should be used to obtain more accurate parameter and uncertainty estimations while providing deeper physical insights.
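For orientation, the Metropolis-Hastings baseline named in this abstract can be sketched in a few lines. This is a generic random-walk sampler under stated assumptions, not the study's code: `log_post` is a hypothetical log-posterior over force-field parameters, and the Gaussian proposal with a fixed `step` is an illustrative choice.

```python
import numpy as np

def metropolis_hastings(log_post, theta0, n_steps=5000, step=0.5, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    current = log_post(theta)
    chain = np.empty((n_steps, theta.size))
    accepted = 0
    for i in range(n_steps):
        proposal = theta + step * rng.standard_normal(theta.size)
        candidate = log_post(proposal)
        # Accept with probability min(1, posterior ratio); the proposal is
        # symmetric, so no proposal-density correction is needed.
        if np.log(rng.random()) < candidate - current:
            theta, current = proposal, candidate
            accepted += 1
        chain[i] = theta  # a rejected step repeats the current state
    return chain, accepted / n_steps
```

A single chain with a fixed step size mixes poorly when parameters are strongly correlated or differently scaled, which is the kind of situation the ensemble samplers compared in the abstract are designed to handle.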
Affiliation(s)
- Abhishek T Sose
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
- Troy Gustke
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
- Fangxi Wang
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
- Gaurav Anand
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
- Sanjana Pasupuleti
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
- Aditya Savara
- Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830, United States
- Sanket A Deshmukh
- Department of Chemical Engineering, Virginia Tech, Blacksburg, Virginia 24060, United States
33
Fisher KE, Herbst MF, Marzouk YM. Multitask methods for predicting molecular properties from heterogeneous data. J Chem Phys 2024; 161:014114. [PMID: 38958501 DOI: 10.1063/5.0201681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Accepted: 06/12/2024] [Indexed: 07/04/2024] Open
Abstract
Data generation remains a bottleneck in training surrogate models to predict molecular properties. We demonstrate that multitask Gaussian process regression overcomes this limitation by leveraging both expensive and cheap data sources. In particular, we consider training sets constructed from coupled-cluster (CC) and density functional theory (DFT) data. We report that multitask surrogates can predict at CC-level accuracy with a reduction in data generation cost by over an order of magnitude. Of note, our approach allows the training set to include DFT data generated by a heterogeneous mix of exchange-correlation functionals without imposing any artificial hierarchy on functional accuracy. More generally, the multitask framework can accommodate a wider range of training set structures-including the full disparity between the different levels of fidelity-than existing kernel approaches based on Δ-learning although we show that the accuracy of the two approaches can be similar. Consequently, multitask regression can be a tool for reducing data generation costs even further by opportunistically exploiting existing data sources.
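The Δ-learning baseline that this abstract compares against can be illustrated with a small kernel ridge sketch. It is an assumption-laden toy, not the paper's method: `cheap` and `expensive` are hypothetical analytic stand-ins for DFT and coupled-cluster values, and the kernel, length scale, and regularization are arbitrary choices.

```python
import numpy as np

def rbf(a, b, length_scale=0.5):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length_scale**2)

def fit_delta_model(x_train, y_cheap, y_expensive, reg=1e-8):
    """Kernel ridge fit of the correction (expensive - cheap) on shared inputs."""
    gram = rbf(x_train, x_train) + reg * np.eye(len(x_train))
    weights = np.linalg.solve(gram, y_expensive - y_cheap)

    def predict(x_new, y_cheap_new):
        # Cheap value plus the learned correction approximates the expensive value.
        return y_cheap_new + rbf(x_new, x_train) @ weights

    return predict

# Hypothetical stand-ins for a low-fidelity and a high-fidelity method.
def cheap(x):
    return np.sin(x)

def expensive(x):
    return np.sin(x) + 0.3 * x

x_train = np.linspace(0.0, 3.0, 6)
predict = fit_delta_model(x_train, cheap(x_train), expensive(x_train))
```

Note that this construction needs both fidelities evaluated at the same training inputs and a cheap value at every prediction point; the abstract presents the multitask framework as accommodating a wider range of training set structures than this.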
Affiliation(s)
- K E Fisher
- Department of Aeronautics and Astronautics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| | - M F Herbst
- Mathematics for Materials Modelling, Institute of Mathematics and Institute of Materials, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Y M Marzouk
- Department of Aeronautics and Astronautics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| |
|
34
|
Deo S, Kreider ME, Kamat G, Hubert M, Zamora Zeledón JA, Wei L, Matthews J, Keyes N, Singh I, Jaramillo TF, Abild-Pedersen F, Burke Stevens M, Winther K, Voss J. Interpretable Machine Learning Models for Practical Antimonate Electrocatalyst Performance. Chemphyschem 2024; 25:e202400010. [PMID: 38547332 DOI: 10.1002/cphc.202400010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2024] [Revised: 02/27/2024] [Indexed: 07/03/2024]
Abstract
Computationally predicting the performance of catalysts under reaction conditions is a challenging task due to the complexity of catalytic surfaces and their evolution in situ, different reaction paths, and the presence of solid-liquid interfaces in the case of electrochemistry. We demonstrate here how relatively simple machine learning models can be found that enable prediction of experimentally observed onset potentials. Inputs to our model are comprised of data from the oxygen reduction reaction on non-precious transition-metal antimony oxide nanoparticulate catalysts with a combination of experimental conditions and computationally affordable bulk atomic and electronic structural descriptors from density functional theory simulations. From human-interpretable genetic programming models, we identify key experimental descriptors and key supplemental bulk electronic and atomic structural descriptors that govern trends in onset potentials for these oxides and deduce how these descriptors should be tuned to increase onset potentials. We finally validate these machine learning predictions by experimentally confirming that scandium as a dopant in nickel antimony oxide leads to a desired onset potential increase. Macroscopic experimental factors are found to be crucially important descriptors to be considered for models of catalytic performance, highlighting the important role machine learning can play here even in the presence of small datasets.
Affiliation(s)
- Shyam Deo
- Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
- SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
| | - Melissa E Kreider
- Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
- SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
| | - Gaurav Kamat
- Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
- SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
| | - McKenzie Hubert
- Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
- SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
| | - José A Zamora Zeledón
- Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
- SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
| | - Lingze Wei
- Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
- SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
| | - Jesse Matthews
- Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
- SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
| | - Nathaniel Keyes
- Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
- SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
| | - Ishaan Singh
- Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
- SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
| | - Thomas F Jaramillo
- Department of Chemical Engineering, Stanford University, Stanford, CA 94305, United States
- SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
| | - Frank Abild-Pedersen
- SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
| | - Michaela Burke Stevens
- SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
| | - Kirsten Winther
- SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
| | - Johannes Voss
- SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, United States
| |
|
35
|
Yang J, Li J, Li J, Li J. Gaussian Process Regression for State-to-State Integral Cross Sections: The Case of the O + O2 Collision Dissociation Reactions. J Phys Chem A 2024; 128:4966-4975. [PMID: 38869143 DOI: 10.1021/acs.jpca.4c01445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/14/2024]
Abstract
Research on hypersonic vehicles has become increasingly important worldwide in recent years. However, accurately simulating the dynamics of the nonequilibrium high-temperature reactions that are in the hypersonic flow around the vehicles presents a significant challenge as a large number of states and transitions are accessible even for the smallest atom-diatom reaction systems. It is quite difficult, sometimes even impossible, to exhaustively investigate all relevant combinations or determine high-dimensional analytical representations for the state-to-state reaction probabilities. In this study, we used Gaussian process regression (GPR) to fit a model based on only 807 QCT data for training. The confidence interval of the GPR prediction and the Kullback-Leibler (KL) divergence were used to help minimize the sampling amount of data for fitting the converged GPR model. The model aims to predict the state-to-state integral cross section (ICS) of the O + O2 → 3O dissociation reaction under random initial conditions (Et, v, j). In total, it took almost a month to obtain this converged GPR model, but it took only a few seconds to predict the ICS value for any initial condition. For 330 initial conditions not included in the training set, the mean-square error (MSE) between the QCT-calculated ICSs and the GPR-predicted ones is only 0.08 Å² and the R² is 0.9986, indicating that the GPR model can replace the direct expensive QCT calculation with high accuracy. Finally, we calculated the equilibrium dissociation rate coefficients based on the StS ICS values predicted by the GPR model, and the results were in good agreement with available experimental and theoretical results. Thus, this study provides an effective and accurate approach to the extensive direct state-to-state reaction dynamic calculations.
Affiliation(s)
- Jiawei Yang
- School of Chemistry and Chemical Engineering & Chongqing Key Laboratory of Chemical Theory and Mechanism, Chongqing University, Chongqing 401331, China
| | - Jia Li
- School of Chemistry and Chemical Engineering & Chongqing Key Laboratory of Chemical Theory and Mechanism, Chongqing University, Chongqing 401331, China
| | - Junhong Li
- School of Chemistry and Chemical Engineering & Chongqing Key Laboratory of Chemical Theory and Mechanism, Chongqing University, Chongqing 401331, China
| | - Jun Li
- School of Chemistry and Chemical Engineering & Chongqing Key Laboratory of Chemical Theory and Mechanism, Chongqing University, Chongqing 401331, China
| |
|
36
|
Berger E, Niemelä J, Lampela O, Juffer AH, Komsa HP. Raman Spectra of Amino Acids and Peptides from Machine Learning Polarizabilities. J Chem Inf Model 2024; 64:4601-4612. [PMID: 38829726 DOI: 10.1021/acs.jcim.4c00077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/05/2024]
Abstract
Raman spectroscopy is an important tool in the study of vibrational properties and composition of molecules, peptides, and even proteins. Raman spectra can be simulated based on the change of the electronic polarizability with vibrations, which can nowadays be efficiently obtained via machine learning models trained on first-principles data. However, the transferability of the models trained on small molecules to larger structures is unclear, and direct training on large structures is prohibitively expensive. In this work, we first train two machine learning models to predict the polarizabilities of all 20 amino acids. Both models are carefully benchmarked and compared to density functional theory (DFT) calculations, with the neural network method being found to offer better transferability. By combination of machine learning models with classical force field molecular dynamics, Raman spectra of all amino acids are also obtained and investigated, showing good agreement with experiments. The models are further extended to small peptides. We find that adding structures containing peptide bonds to the training set greatly improves predictions, even for peptides not included in training sets.
Affiliation(s)
- Ethan Berger
- Microelectronics Research Unit, Faculty of Information Technology and Electrical Engineering, University of Oulu, P.O. Box 4500, Oulu FIN-90014, Finland
| | - Juha Niemelä
- Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu FIN-90014, Finland
| | - Outi Lampela
- Biocenter Oulu and Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu FIN-90014, Finland
| | - André H Juffer
- Biocenter Oulu and Faculty of Biochemistry and Molecular Medicine, University of Oulu, Oulu FIN-90014, Finland
| | - Hannu-Pekka Komsa
- Microelectronics Research Unit, Faculty of Information Technology and Electrical Engineering, University of Oulu, P.O. Box 4500, Oulu FIN-90014, Finland
| |
|
37
|
Singh S, Hernández-Lobato JM. Deep Kernel learning for reaction outcome prediction and optimization. Commun Chem 2024; 7:136. [PMID: 38877182 PMCID: PMC11178803 DOI: 10.1038/s42004-024-01219-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/10/2024] [Accepted: 06/05/2024] [Indexed: 06/16/2024] Open
Abstract
Recent years have seen a rapid growth in the application of various machine learning methods for reaction outcome prediction. Deep learning models have gained popularity due to their ability to learn representations directly from the molecular structure. Gaussian processes (GPs), on the other hand, provide reliable uncertainty estimates but are unable to learn representations from the data. We combine the feature learning ability of neural networks (NNs) with uncertainty quantification of GPs in a deep kernel learning (DKL) framework to predict the reaction outcome. The DKL model is observed to obtain very good predictive performance across different input representations. It significantly outperforms standard GPs and provides comparable performance to graph neural networks, but with uncertainty estimation. Additionally, the uncertainty estimates on predictions provided by the DKL model facilitated its incorporation as a surrogate model for Bayesian optimization (BO). The proposed method, therefore, has a great potential towards accelerating reaction discovery by integrating accurate predictive models that provide reliable uncertainty estimates with BO.
Affiliation(s)
- Sukriti Singh
- Department of Engineering, University of Cambridge, Cambridge, UK.
| | | |
|
38
|
Weymuth T, Unsleber JP, Türtscher PL, Steiner M, Sobez JG, Müller CH, Mörchen M, Klasovita V, Grimmel SA, Eckhoff M, Csizi KS, Bosia F, Bensberg M, Reiher M. SCINE-Software for chemical interaction networks. J Chem Phys 2024; 160:222501. [PMID: 38857173 DOI: 10.1063/5.0206974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/05/2024] [Accepted: 05/09/2024] [Indexed: 06/12/2024] Open
Abstract
The software for chemical interaction networks (SCINE) project aims at pushing the frontier of quantum chemical calculations on molecular structures to a new level. While calculations on individual structures as well as on simple relations between them have become routine in chemistry, new developments have pushed the frontier in the field to high-throughput calculations. Chemical relations may be created by a search for specific molecular properties in a molecular design attempt, or they can be defined by a set of elementary reaction steps that form a chemical reaction network. The software modules of SCINE have been designed to facilitate such studies. The features of the modules are (i) general applicability of the applied methodologies ranging from electronic structure (no restriction to specific elements of the periodic table) to microkinetic modeling (with little restrictions on molecularity), full modularity so that SCINE modules can also be applied as stand-alone programs or be exchanged for external software packages that fulfill a similar purpose (to increase options for computational campaigns and to provide alternatives in case of tasks that are hard or impossible to accomplish with certain programs), (ii) high stability and autonomous operations so that control and steering by an operator are as easy as possible, and (iii) easy embedding into complex heterogeneous environments for molecular structures taken individually or in the context of a reaction network. A graphical user interface unites all modules and ensures interoperability. All components of the software have been made available as open source and free of charge.
Affiliation(s)
- Thomas Weymuth
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Jan P Unsleber
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Paul L Türtscher
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Miguel Steiner
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Jan-Grimo Sobez
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Charlotte H Müller
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Maximilian Mörchen
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Veronika Klasovita
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Stephanie A Grimmel
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Marco Eckhoff
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Katja-Sophia Csizi
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Francesco Bosia
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Moritz Bensberg
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Markus Reiher
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| |
|
39
|
Zinovjev K, Hedges L, Montagud Andreu R, Woods C, Tuñón I, van der Kamp MW. emle-engine: A Flexible Electrostatic Machine Learning Embedding Package for Multiscale Molecular Dynamics Simulations. J Chem Theory Comput 2024; 20:4514-4522. [PMID: 38804055 PMCID: PMC11171281 DOI: 10.1021/acs.jctc.4c00248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2024] [Revised: 05/17/2024] [Accepted: 05/20/2024] [Indexed: 05/29/2024]
Abstract
We present in this work the emle-engine package (https://github.com/chemle/emle-engine), the implementation of a new machine learning embedding scheme for hybrid machine learning potential/molecular-mechanics (ML/MM) dynamics simulations. The package is based on an embedding scheme that uses a physics-based model of the electronic density and induction with a handful of tunable parameters derived from in vacuo properties of the subsystem to be embedded. This scheme is completely independent of the in vacuo potential and requires only the positions of the atoms of the machine learning subsystem and the positions and partial charges of the molecular mechanics environment. These characteristics allow emle-engine to be employed in existing QM/MM software. We demonstrate that the implemented electrostatic machine learning embedding scheme (named EMLE) is stable in enhanced sampling molecular dynamics simulations. Through the calculation of free energy surfaces of alanine dipeptide in water with two different ML options for the in vacuo potential and three embedding models, we test the performance of EMLE. When compared to the reference DFT/MM surface, the EMLE embedding is clearly superior to the MM one based on fixed partial charges. The configurational dependence of the electronic density and the inclusion of the induction energy introduced by the EMLE model lead to a systematic reduction in the average error of the free energy surface when compared to MM embedding. By enabling the usage of EMLE embedding in practical ML/MM simulations, emle-engine will make it possible to accurately model systems and processes that feature significant variations in the charge distribution of the ML subsystem and/or the interacting environment.
Affiliation(s)
- Kirill Zinovjev
- Departamento de Química Física, Universidad de Valencia, 46100 Burjassot, Spain
| | - Lester Hedges
- School of Biochemistry, University of Bristol, Biomedical Sciences Building, University Walk, Bristol BS8 1TD, U.K.
- Research Software Engineering, Advanced Computing Research Centre, 31 Great George Street, Bristol BS1 5QD, U.K.
| | | | - Christopher Woods
- Research Software Engineering, Advanced Computing Research Centre, 31 Great George Street, Bristol BS1 5QD, U.K.
| | - Iñaki Tuñón
- Departamento de Química Física, Universidad de Valencia, 46100 Burjassot, Spain
| | - Marc W. van der Kamp
- School of Biochemistry, University of Bristol, Biomedical Sciences Building, University Walk, Bristol BS8 1TD, U.K.
| |
|
40
|
Jana A, Shepherd S, Litman Y, Wilkins DM. Learning Electronic Polarizations in Aqueous Systems. J Chem Inf Model 2024; 64:4426-4435. [PMID: 38804973 PMCID: PMC11167596 DOI: 10.1021/acs.jcim.4c00421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2024] [Revised: 05/10/2024] [Accepted: 05/14/2024] [Indexed: 05/29/2024]
Abstract
The polarization of periodically repeating systems is a discontinuous function of the atomic positions, a fact which seems at first to stymie attempts at their statistical learning. Two approaches to build models for bulk polarizations are compared: one in which a simple point charge model is used to preprocess the raw polarization to give a learning target that is a smooth function of atomic positions and the total polarization is learned as a sum of atom-centered dipoles and one in which instead the average position of Wannier centers around atoms is predicted. For a range of bulk aqueous systems, both of these methods perform comparatively well, with the former being slightly better but often requiring an extra effort to find a suitable point charge model. As a challenging test, we also analyze the performance of the models at the air-water interface. In this case, while the Wannier center approach delivers accurate predictions without further modifications, the preprocessing method requires augmentation with information from isolated water molecules to reach similar accuracy. Finally, we present a simple protocol to preprocess the polarizations in a data-driven way using a small number of derivatives calculated at a much lower level of theory, thus overcoming the need to find point charge models without appreciably increasing the computation cost. We believe that the training strategies presented here help the construction of accurate polarization models required for the study of the dielectric properties of realistic complex bulk systems and interfaces with ab initio accuracy.
Affiliation(s)
- Arnab Jana
- Centre for Quantum Materials and Technologies, School of Mathematics and Physics, Queen’s University Belfast, Belfast BT7 1NN, U.K.
| | - Sam Shepherd
- Centre for Quantum Materials and Technologies, School of Mathematics and Physics, Queen’s University Belfast, Belfast BT7 1NN, U.K.
| | - Yair Litman
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
| | - David M. Wilkins
- Centre for Quantum Materials and Technologies, School of Mathematics and Physics, Queen’s University Belfast, Belfast BT7 1NN, U.K.
| |
|
41
|
Hahn AW, Zsombor-Pindera J, Kennepohl P, DeBeer S. Introducing SpectraFit: An Open-Source Tool for Interactive Spectral Analysis. ACS OMEGA 2024; 9:23252-23265. [PMID: 38854548 PMCID: PMC11155667 DOI: 10.1021/acsomega.3c09262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 11/20/2023] [Revised: 05/05/2024] [Accepted: 05/10/2024] [Indexed: 06/11/2024]
Abstract
In chemistry, analyzing spectra through peak fitting is a crucial task that helps scientists extract useful quantitative information about a sample's chemical composition or electronic structure. To make this process more efficient, we have developed a new open-source software tool called SpectraFit. This tool allows users to perform quick data fitting using expressions of distribution and linear functions through the command line interface (CLI) or Jupyter Notebook, which can run on Linux, Windows, and MacOS, as well as in a Docker container. As part of our commitment to good scientific practice, we have introduced an output file-locking system to ensure the accuracy and consistency of information. This system collects input data, results data, and the initial fitting model in a single file, promoting transparency, reproducibility, collaboration, and innovation. To demonstrate SpectraFit's user-friendly interface and the advantages of its output file-locking system, we are focusing on a series of previously published iron-sulfur dimers and their XAS spectra. We will show how to analyze the XAS spectra via CLI and in a Jupyter Notebook by simultaneously fitting multiple data sets using SpectraFit. Additionally, we will demonstrate how SpectraFit can be used as a black box and white box solution, allowing users to apply their own algorithms to engineer the data further. This publication, along with its Supporting Information and the Jupyter Notebook, serves as a tutorial to guide users through each step of the process. SpectraFit will streamline the peak fitting process and provide a convenient, standardized platform for users to share fitting models, which we hope will improve transparency and reproducibility in the field of spectroscopy.
Affiliation(s)
- Anselm W. Hahn
- Max Planck Institute for Chemical Energy Conversion, Stiftstraße 34-36, Mülheim an der Ruhr 45470, Germany
| | - Joseph Zsombor-Pindera
- Department of Chemistry, University of Calgary, Calgary, AB T2N 1N4, Canada
- Department of Chemistry, The University of British Columbia, Vancouver, BC V6T 1Z1, Canada
| | - Pierre Kennepohl
- Department of Chemistry, University of Calgary, Calgary, AB T2N 1N4, Canada
| | - Serena DeBeer
- Max Planck Institute for Chemical Energy Conversion, Stiftstraße 34-36, Mülheim an der Ruhr 45470, Germany
| |
|
42
|
Ben Mahmoud C, Gardner JLA, Deringer VL. Data as the next challenge in atomistic machine learning. NATURE COMPUTATIONAL SCIENCE 2024; 4:384-387. [PMID: 38866969 DOI: 10.1038/s43588-024-00636-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2024]
|
43
|
Selloni A. Aqueous Titania Interfaces. Annu Rev Phys Chem 2024; 75:47-65. [PMID: 38271659 DOI: 10.1146/annurev-physchem-090722-015957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/27/2024]
Abstract
Water-metal oxide interfaces are central to many phenomena and applications, ranging from material corrosion and dissolution to photoelectrochemistry and bioengineering. In particular, the discovery of photocatalytic water splitting on TiO2 has motivated intensive studies of water-TiO2 interfaces for decades. So far, a broad understanding of the interaction of water vapor with several TiO2 surfaces has been obtained. However, much less is known about liquid water-TiO2 interfaces, which are more relevant to many practical applications. Probing these complex systems at the molecular level is experimentally challenging and is sometimes possible only through computational studies. This review summarizes recent advances in the atomistic understanding, mostly through computational simulations, of the structure and dynamics of interfacial water on TiO2 surfaces. The main focus is on the nature, molecular or dissociated, of water in direct contact with low-index defect-free crystalline surfaces. The hydroxyls resulting from water dissociation are essential in the photooxidation of water and critically affect the surface chemistry of TiO2.
Affiliation(s)
- Annabella Selloni
- Department of Chemistry, Princeton University, Princeton, New Jersey, USA;
| |
|
44
|
Yang Y, Zhang S, Ranasinghe KD, Isayev O, Roitberg AE. Machine Learning of Reactive Potentials. Annu Rev Phys Chem 2024; 75:371-395. [PMID: 38941524 DOI: 10.1146/annurev-physchem-062123-024417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/30/2024]
Abstract
In the past two decades, machine learning potentials (MLPs) have driven significant developments in chemical, biological, and material sciences. The construction and training of MLPs enable fast and accurate simulations and analysis of thermodynamic and kinetic properties. This review focuses on the application of MLPs to reaction systems with consideration of bond breaking and formation. We review the development of MLP models, primarily with neural network and kernel-based algorithms, and recent applications of reactive MLPs (RMLPs) to systems at different scales. We show how RMLPs are constructed, how they speed up the calculation of reactive dynamics, and how they facilitate the study of reaction trajectories, reaction rates, free energy calculations, and many other calculations. Different data sampling strategies applied in building RMLPs are also discussed with a focus on how to collect structures for rare events and how to further improve their performance with active learning.
Affiliation(s)
- Yinuo Yang
- Department of Chemistry, University of Florida, Gainesville, Florida;
| | - Shuhao Zhang
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania;
| | | | - Olexandr Isayev
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania;
| | - Adrian E Roitberg
- Department of Chemistry, University of Florida, Gainesville, Florida;
| |
|
45
|
Fan M, Wen T, Chen S, Dong Y, Wang C. Perspectives Toward Damage-Tolerant Nanostructure Ceramics. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2024; 11:e2309834. [PMID: 38582503 PMCID: PMC11199990 DOI: 10.1002/advs.202309834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/14/2023] [Revised: 03/13/2024] [Indexed: 04/08/2024]
Abstract
Advanced ceramic materials and devices call for better reliability and damage tolerance. In addition to their strong bonding nature, there are examples demonstrating superior mechanical properties of nanostructure ceramics, such as damage-tolerant ceramic aerogels that can withstand high deformation without cracking and local plasticity in dense nanocrystalline ceramics. Recent progress is reviewed in this perspective article. Three topics are discussed: highly elastic nano-fibrous ceramic aerogels, load-bearing nanoceramics with improved mechanical properties, and the use of a machine learning-assisted simulation toolbox to understand the relationship among structure, deformation mechanisms, and microstructure-properties. It is hoped that the perspectives presented here can help the discovery, synthesis, and processing of future structural ceramic materials that are insensitive to processing flaws and local damage in service.
Affiliation(s)
- Meicen Fan
- State Key Lab of New Ceramics and Fine Processing, School of Materials Science and Engineering, Tsinghua University, Beijing 100084, China
| | - Tongqi Wen
- Department of Mechanical Engineering, The University of Hong Kong, Hong Kong SAR, China
| | - Shile Chen
- State Key Lab of New Ceramics and Fine Processing, School of Materials Science and Engineering, Tsinghua University, Beijing 100084, China
| | - Yanhao Dong
- State Key Lab of New Ceramics and Fine Processing, School of Materials Science and Engineering, Tsinghua University, Beijing 100084, China
| | - Chang‐An Wang
- State Key Lab of New Ceramics and Fine Processing, School of Materials Science and Engineering, Tsinghua University, Beijing 100084, China
| |
|
46
|
Zarrouk T, Ibragimova R, Bartók AP, Caro MA. Experiment-Driven Atomistic Materials Modeling: A Case Study Combining X-Ray Photoelectron Spectroscopy and Machine Learning Potentials to Infer the Structure of Oxygen-Rich Amorphous Carbon. J Am Chem Soc 2024; 146:14645-14659. [PMID: 38749497 PMCID: PMC11140750 DOI: 10.1021/jacs.4c01897] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2024] [Revised: 05/02/2024] [Accepted: 05/03/2024] [Indexed: 05/30/2024]
Abstract
An important yet challenging aspect of atomistic materials modeling is reconciling experimental and computational results. Conventional approaches involve generating numerous configurations through molecular dynamics or Monte Carlo structure optimization and selecting the one with the closest match to experiment. However, this inefficient process is not guaranteed to succeed. We introduce a general method to combine atomistic machine learning (ML) with experimental observables that produces atomistic structures compatible with experiment by design. We use this approach in combination with grand-canonical Monte Carlo within a modified Hamiltonian formalism, to generate configurations that agree with experimental data and are chemically sound (low in energy). We apply our approach to understand the atomistic structure of oxygenated amorphous carbon (a-COx), an intriguing carbon-based material, to answer the question of how much oxygen can be added to carbon before it fully decomposes into CO and CO2. Utilizing an ML-based X-ray photoelectron spectroscopy (XPS) model trained from GW and density functional theory (DFT) data, in conjunction with an ML interatomic potential, we identify a-COx structures compliant with experimental XPS predictions that are also energetically favorable with respect to DFT. Employing a network analysis, we accurately deconvolve the XPS spectrum into motif contributions, both revealing the inaccuracies inherent to experimental XPS interpretation and granting us atomistic insight into the structure of a-COx. This method generalizes to multiple experimental observables and allows for the elucidation of the atomistic structure of materials directly from experimental data, thereby enabling experiment-driven materials modeling with a degree of realism previously out of reach.
Affiliation(s)
- Tigany Zarrouk
  - Department of Chemistry and Materials Science, Aalto University, Espoo 02150, Finland
- Rina Ibragimova
  - Department of Chemistry and Materials Science, Aalto University, Espoo 02150, Finland
- Albert P. Bartók
  - Department of Physics, University of Warwick, Coventry CV4 7AL, U.K.
  - Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, U.K.
- Miguel A. Caro
  - Department of Chemistry and Materials Science, Aalto University, Espoo 02150, Finland
|
47
|
Morrow JD, Ugwumadu C, Drabold DA, Elliott SR, Goodwin AL, Deringer VL. Understanding Defects in Amorphous Silicon with Million-Atom Simulations and Machine Learning. Angew Chem Int Ed Engl 2024; 63:e202403842. [PMID: 38517212 PMCID: PMC11497335 DOI: 10.1002/anie.202403842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Revised: 03/14/2024] [Accepted: 03/18/2024] [Indexed: 03/23/2024]
Abstract
The structure of amorphous silicon (a-Si) is widely thought of as a fourfold-connected random network, and yet it is defective atoms, with fewer or more than four bonds, that make it particularly interesting. Despite many attempts to explain such "dangling-bond" and "floating-bond" defects, respectively, a unified understanding is still missing. Here, we use advanced computational chemistry methods to reveal the complex structural and energetic landscape of defects in a-Si. We study an ultra-large-scale, quantum-accurate structural model containing a million atoms, and thousands of individual defects, allowing reliable defect-related statistics to be obtained. We combine structural descriptors and machine-learned atomic energies to develop a classification of the different types of defects in a-Si. The results suggest a revision of the established floating-bond model by showing that fivefold-bonded atoms in a-Si exhibit a wide range of local environments, analogous to fivefold centers in coordination chemistry. Furthermore, it is shown that fivefold (but not threefold) coordination defects tend to cluster together. Our study provides new insights into one of the most widely studied amorphous solids, and has general implications for understanding defects in disordered materials beyond silicon alone.
Affiliation(s)
- Joe D. Morrow
  - Inorganic Chemistry Laboratory, Department of Chemistry, University of Oxford, Oxford OX1 3QR, United Kingdom
- Chinonso Ugwumadu
  - Department of Physics and Astronomy, Nanoscale and Quantum Phenomena Institute (NQPI), Ohio University, Athens, Ohio 45701, United States
- David A. Drabold
  - Department of Physics and Astronomy, Nanoscale and Quantum Phenomena Institute (NQPI), Ohio University, Athens, Ohio 45701, United States
- Stephen R. Elliott
  - Physical and Theoretical Chemistry Laboratory, Department of Chemistry, University of Oxford, Oxford OX1 3QZ, United Kingdom
- Andrew L. Goodwin
  - Inorganic Chemistry Laboratory, Department of Chemistry, University of Oxford, Oxford OX1 3QR, United Kingdom
- Volker L. Deringer
  - Inorganic Chemistry Laboratory, Department of Chemistry, University of Oxford, Oxford OX1 3QR, United Kingdom
|
48
|
Wang G, Wang C, Zhang X, Li Z, Zhou J, Sun Z. Machine learning interatomic potential: Bridge the gap between small-scale models and realistic device-scale simulations. iScience 2024; 27:109673. [PMID: 38646181 PMCID: PMC11033164 DOI: 10.1016/j.isci.2024.109673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
Machine learning interatomic potentials (MLIPs) overcome the high computational cost of density-functional theory and the relatively low accuracy of classical large-scale molecular dynamics, facilitating more efficient and precise simulations in materials research and design. In this review, the current state of the four essential stages of MLIP development is discussed: data generation methods, material structure descriptors, six unique machine learning algorithms, and available software. Furthermore, the applications of MLIPs in various fields are investigated, notably phase-change memory materials, structure searching, material property prediction, and pre-trained universal models. Finally, future perspectives are reported, covering standard datasets, transferability, generalization, and the trade-off between accuracy and complexity in MLIPs.
Affiliation(s)
- Guanjie Wang
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
  - School of Integrated Circuit Science and Engineering, Beihang University, Beijing 100191, China
- Changrui Wang
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- Xuanguang Zhang
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- Zefeng Li
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- Jian Zhou
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- Zhimei Sun
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
|
49
|
Guibourg P, Dontot L, Anglade PM, Gervais B. DFTB Simulation of Charged Clusters Using Machine Learning Charge Inference. J Chem Theory Comput 2024; 20:4007-4018. [PMID: 38690586 DOI: 10.1021/acs.jctc.4c00107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/02/2024]
Abstract
We present a modification to self-consistent charge density functional-based tight binding (SCC-DFTB), which allows computation based on approximate atomic charges. We obtain these charges by means of a machine learning (ML) process that combines a Coulomb model with a neural network. This allows us to avoid the SCC cycles in the SCC-DFTB calculation while maintaining its accuracy. The main input of the model is the atomic positions characterized by a set of atom-centered symmetry functions. The charge inference from our ML algorithm is as close as 10⁻² units of charge from the exact SCC solution. Our ML-DFTB approach provides a good approximation of the density matrix and of the energy and forces with only a single diagonalization. This is a significant computational saving with respect to the complete SCC algorithm, which allows us to investigate a bigger ensemble of atoms. We show the quality of our approach in the case of charged silicon carbide (SiC) clusters. The ML-DFTB potential energy surface (PES) mimics the SCC-DFTB PES rather well, despite its simplicity. This allows us to obtain the same geometric structure ordering with respect to energy for small clusters. The dissociation barriers for ion emission are well-reproduced, which opens the way to investigating ion field emission and charged cluster stability. The ML-DFTB approach is obviously not limited to charged clusters or SiC materials. It opens a new route to investigate larger clusters than those investigated by standard SCC-DFTB, as well as surface and solid-state chemistry at the atomic level.
Affiliation(s)
- Paul Guibourg
  - Laboratoire Cimap, UMR6252─Université de Caen Normandie, École Nationale Supérieure d'Ingénieures de Caen, Commissariat à l'Énergie Atomique, Centre National de la Recherche Scientifique, 6 Boulevard Du Maréchal Juin, 14050 Caen Cedex, France
- Léo Dontot
  - Laboratoire Cimap, UMR6252─Université de Caen Normandie, École Nationale Supérieure d'Ingénieures de Caen, Commissariat à l'Énergie Atomique, Centre National de la Recherche Scientifique, 6 Boulevard Du Maréchal Juin, 14050 Caen Cedex, France
- Pierre-Matthieu Anglade
  - Laboratoire Cimap, UMR6252─Université de Caen Normandie, École Nationale Supérieure d'Ingénieures de Caen, Commissariat à l'Énergie Atomique, Centre National de la Recherche Scientifique, 6 Boulevard Du Maréchal Juin, 14050 Caen Cedex, France
- Benoit Gervais
  - Laboratoire Cimap, UMR6252─Université de Caen Normandie, École Nationale Supérieure d'Ingénieures de Caen, Commissariat à l'Énergie Atomique, Centre National de la Recherche Scientifique, 6 Boulevard Du Maréchal Juin, 14050 Caen Cedex, France
|
50
|
Shanks BL, Sullivan HW, Shazed AR, Hoepfner MP. Accelerated Bayesian Inference for Molecular Simulations using Local Gaussian Process Surrogate Models. J Chem Theory Comput 2024; 20:3798-3808. [PMID: 38551198 DOI: 10.1021/acs.jctc.3c01358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/19/2024]
Abstract
While Bayesian inference is the gold standard for uncertainty quantification and propagation, its use within physical chemistry encounters formidable computational barriers. These bottlenecks are magnified for modeling data with many independent variables, such as X-ray/neutron scattering patterns and electromagnetic spectra. To address this challenge, we employ local Gaussian process (LGP) surrogate models to accelerate Bayesian optimization over these complex thermophysical properties. The time complexity of the LGPs scales linearly in the number of independent variables, in stark contrast to the computationally expensive cubic scaling of conventional Gaussian processes. To illustrate the method, we trained an LGP surrogate model on the radial distribution function of liquid neon and observed a 1,760,000-fold speed-up compared to molecular dynamics simulation, beating a conventional GP by three orders of magnitude. We conclude that LGPs are robust and efficient surrogate models poised to expand the application of Bayesian inference in molecular simulations to a broad spectrum of experimental data.
Affiliation(s)
- Brennon L Shanks
  - Department of Chemical Engineering, University of Utah, Salt Lake City, UT 84112-9202, United States
- Harry W Sullivan
  - Department of Chemical Engineering, University of Utah, Salt Lake City, UT 84112-9202, United States
- Abdur R Shazed
  - Department of Chemical Engineering, University of Utah, Salt Lake City, UT 84112-9202, United States
- Michael P Hoepfner
  - Department of Chemical Engineering, University of Utah, Salt Lake City, UT 84112-9202, United States
|