1
|
Li X, Luo Y, Zhou S, Wang J, Lu F, Wang S, Deng Q. Fluorescence sensing of water in various organic solvents based on a novel cyclic polymer. Spectrochim Acta A Mol Biomol Spectrosc 2024; 319:124554. [PMID: 38833888 DOI: 10.1016/j.saa.2024.124554] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Revised: 05/12/2024] [Accepted: 05/27/2024] [Indexed: 06/06/2024]
Abstract
Developing a sensor capable of sensing water in various organic solvents, ranging from water-soluble to water-miscible solvents, remains a challenging task. In this research, a cyclic polymer fluorescence chemosensor (CPFC) has been developed for sensing water in turn-on mode in 9 organic solvents and in turn-off mode in DMA; the broadest concentration range was obtained for water in DMA (10 %-90 %) and the lowest detection limit for water in dioxane (0.011 %). The sensing mechanism is explored by theoretical calculation and experimental investigation. The amphiphilic nature endows the polymer probe with great potential for measuring various contaminants in aqueous and nonaqueous media. Furthermore, the present research highlights the potential applications of cyclic polymers as fluorescence probes in the field of sensing.
Affiliation(s)
- Xiaoxia Li
- Tianjin Key Laboratory of Multiplexed Identification for Port Hazardous Chemicals, College of Science, College of Chemical Engineering and Materials Science, Tianjin University of Science and Technology, Tianjin 300457, China
| | - Yuchen Luo
- Tianjin Key Laboratory of Multiplexed Identification for Port Hazardous Chemicals, College of Science, College of Chemical Engineering and Materials Science, Tianjin University of Science and Technology, Tianjin 300457, China
| | - Shufang Zhou
- Tianjin Key Laboratory of Multiplexed Identification for Port Hazardous Chemicals, College of Science, College of Chemical Engineering and Materials Science, Tianjin University of Science and Technology, Tianjin 300457, China
| | - Jiayi Wang
- Tianjin Key Laboratory of Multiplexed Identification for Port Hazardous Chemicals, College of Science, College of Chemical Engineering and Materials Science, Tianjin University of Science and Technology, Tianjin 300457, China
| | - Futai Lu
- Tianjin Key Laboratory of Multiplexed Identification for Port Hazardous Chemicals, College of Science, College of Chemical Engineering and Materials Science, Tianjin University of Science and Technology, Tianjin 300457, China.
| | - Shuo Wang
- Tianjin Key Laboratory of Multiplexed Identification for Port Hazardous Chemicals, College of Science, College of Chemical Engineering and Materials Science, Tianjin University of Science and Technology, Tianjin 300457, China; Tianjin Key Laboratory of Food Science and Health, School of Medicine, Nankai University, Tianjin 300071, China.
| | - Qiliang Deng
- Tianjin Key Laboratory of Multiplexed Identification for Port Hazardous Chemicals, College of Science, College of Chemical Engineering and Materials Science, Tianjin University of Science and Technology, Tianjin 300457, China.
| |
|
2
|
Csizi KS, Reiher M. Automated preparation of nanoscopic structures: Graph-based sequence analysis, mismatch detection, and pH-consistent protonation with uncertainty estimates. J Comput Chem 2024; 45:761-776. [PMID: 38124290 DOI: 10.1002/jcc.27276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2023] [Accepted: 11/14/2023] [Indexed: 12/23/2023]
Abstract
Structure and function in nanoscale atomistic assemblies are tightly coupled, and every atom with its specific position and even every electron will have a decisive effect on the electronic structure, and hence, on the molecular properties. Molecular simulations of nanoscopic atomistic structures therefore require accurately resolved three-dimensional input structures. If extracted from experiment, these structures often suffer from severe uncertainties, of which the lack of information on hydrogen atoms is a prominent example. Hence, experimental structures require careful review and curation, which is a time-consuming and error-prone process. Here, we present a fast and robust protocol for the automated structure analysis and pH-consistent protonation, in short, ASAP. For biomolecules as a target, the ASAP protocol integrates sequence analysis and error assessment of a given input structure. ASAP allows for pKa prediction from reference data through Gaussian process regression including uncertainty estimation and connects to system-focused atomistic modeling described in Brunken and Reiher (J. Chem. Theory Comput. 16, 2020, 1646). Although focused on biomolecules, ASAP can be extended to other nanoscopic objects, because most of its design elements rely on a general graph-based foundation guaranteeing transferability. The modular character of the underlying pipeline supports different degrees of automation, which allows for (i) efficient feedback loops for human-machine interaction with a low entrance barrier and for (ii) integration into autonomous procedures such as automated force field parametrizations. This facilitates fast switching of the pH-state through on-the-fly system-focused reparametrization during a molecular simulation at virtually no extra computational cost.
Affiliation(s)
- Katja-Sophia Csizi
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
| | - Markus Reiher
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
| |
|
3
|
Tinzl M, Diedrich JV, Mittl PRE, Clémancey M, Reiher M, Proppe J, Latour JM, Hilvert D. Myoglobin-Catalyzed Azide Reduction Proceeds via an Anionic Metal Amide Intermediate. J Am Chem Soc 2024; 146:1957-1966. [PMID: 38264790 PMCID: PMC10811658 DOI: 10.1021/jacs.3c09279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 12/11/2023] [Accepted: 12/11/2023] [Indexed: 01/25/2024]
Abstract
Nitrene transfer reactions catalyzed by heme proteins have broad potential for the stereoselective formation of carbon-nitrogen bonds. However, competition between productive nitrene transfer and the undesirable reduction of nitrene precursors limits the broad implementation of such biocatalytic methods. Here, we investigated the reduction of azides by the model heme protein myoglobin to gain mechanistic insights into the factors that control the fate of key reaction intermediates. In this system, the reaction proceeds via a proposed nitrene intermediate that is rapidly reduced and protonated to give a reactive ferrous amide species, which we characterized by UV/vis and Mössbauer spectroscopies, quantum mechanical calculations, and X-ray crystallography. Rate-limiting protonation of the ferrous amide to produce the corresponding amine is the final step in the catalytic cycle. These findings contribute to our understanding of the heme protein-catalyzed reduction of azides and provide a guide for future enzyme engineering campaigns to create more efficient nitrene transferases. Moreover, harnessing the reduction reaction in a chemoenzymatic cascade provided a potentially practical route to substituted pyrroles.
Affiliation(s)
- Matthias Tinzl
- Laboratory of Organic Chemistry, ETH Zürich, 8093 Zürich, Switzerland
| | - Johannes V. Diedrich
- Institute of Physical and Theoretical Chemistry, TU Braunschweig, 38106 Braunschweig, Germany
| | - Peer R. E. Mittl
- Department of Biochemistry, University of Zürich, 8057 Zürich, Switzerland
| | - Martin Clémancey
- Université Grenoble Alpes, CNRS, CEA, IRIG, Laboratoire de Chimie et Biologie des Métaux, 17 Rue des Martyrs, Grenoble F-38054 Cedex, France
| | - Markus Reiher
- Institute for Molecular Physical Science, ETH Zürich, 8093 Zürich, Switzerland
| | - Jonny Proppe
- Institute of Physical and Theoretical Chemistry, TU Braunschweig, 38106 Braunschweig, Germany
| | - Jean-Marc Latour
- Université Grenoble Alpes, CNRS, CEA, IRIG, Laboratoire de Chimie et Biologie des Métaux, 17 Rue des Martyrs, Grenoble F-38054 Cedex, France
| | - Donald Hilvert
- Laboratory of Organic Chemistry, ETH Zürich, 8093 Zürich, Switzerland
| |
|
4
|
Zhan H, Zhu X, Qiao Z, Hu J. Graph Neural Tree: A novel and interpretable deep learning-based framework for accurate molecular property predictions. Anal Chim Acta 2023; 1244:340558. [PMID: 36737143 DOI: 10.1016/j.aca.2022.340558] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/10/2022] [Accepted: 10/24/2022] [Indexed: 11/06/2022]
Abstract
Determining various properties of molecules is a critical step in drug discovery. Recently, with the improvement of large heterogeneous datasets and the development of deep learning approaches, more and more scientists have turned their attention to neural network-based virtual preliminary screening to reduce the time and monetary cost of drug discovery. However, the poor interpretability of deep learning masks causality, so models' conclusions are often beyond the comprehension of human users, which reduces the credibility of the model and makes it difficult for chemists to further narrow the huge chemical space based on models' results. Thus, this study develops a novel framework consisting of Graph Neural Networks for feature extraction, Curriculum-Based Learning Strategies for optimization, and a Learning Binary Neural Tree (LBNT) for prediction, to improve the performance of neural networks and reveal their decision-making process to chemists. The framework encodes molecular graph data with graph neural networks (GNNs), then retrains the encoder with curriculum-based learning strategies to reduce uncertainty and improve accuracy, and finally uses LBNT as the predictor, which is trained independently and then jointly retrained with the encoder, for prediction and visualization. The framework is validated on public datasets and compared to single GNNs with normal training strategies as well as GNN encoders with common machine learning predictors instead of the LBNT predictor. The results reveal that the proposed framework enhances the point prediction accuracy of the completely trained GNN and reduces its uncertainty through curriculum-based learning, and further improves the accuracy by combining it with LBNT. Besides, compared with common machine learning tools, the LBNT predictor generally has the best performance because of joint retraining with the GNN encoder. The decision-making process of LBNT is also better and easier to explain than that of other models.
Affiliation(s)
- Haolin Zhan
- Guangzhou Key Laboratory for New Energy and Green Catalysis, School of Chemistry and Chemical Engineering, Guangzhou University, Guangzhou, China; College of Economics and Statistics, Guangzhou University, Guangzhou, China
| | - Xin Zhu
- Guangzhou Key Laboratory for New Energy and Green Catalysis, School of Chemistry and Chemical Engineering, Guangzhou University, Guangzhou, China.
| | - Zhiwei Qiao
- Guangzhou Key Laboratory for New Energy and Green Catalysis, School of Chemistry and Chemical Engineering, Guangzhou University, Guangzhou, China; Joint Institute of Guangzhou University & Institute of Corrosion Science and Technology, Guangzhou University, Guangzhou, 510006, China.
| | - Jianming Hu
- College of Economics and Statistics, Guangzhou University, Guangzhou, China.
| |
|
5
|
Teale AM, Helgaker T, Savin A, Adamo C, Aradi B, Arbuznikov AV, Ayers PW, Baerends EJ, Barone V, Calaminici P, Cancès E, Carter EA, Chattaraj PK, Chermette H, Ciofini I, Crawford TD, De Proft F, Dobson JF, Draxl C, Frauenheim T, Fromager E, Fuentealba P, Gagliardi L, Galli G, Gao J, Geerlings P, Gidopoulos N, Gill PMW, Gori-Giorgi P, Görling A, Gould T, Grimme S, Gritsenko O, Jensen HJA, Johnson ER, Jones RO, Kaupp M, Köster AM, Kronik L, Krylov AI, Kvaal S, Laestadius A, Levy M, Lewin M, Liu S, Loos PF, Maitra NT, Neese F, Perdew JP, Pernal K, Pernot P, Piecuch P, Rebolini E, Reining L, Romaniello P, Ruzsinszky A, Salahub DR, Scheffler M, Schwerdtfeger P, Staroverov VN, Sun J, Tellgren E, Tozer DJ, Trickey SB, Ullrich CA, Vela A, Vignale G, Wesolowski TA, Xu X, Yang W. DFT exchange: sharing perspectives on the workhorse of quantum chemistry and materials science. Phys Chem Chem Phys 2022; 24:28700-28781. [PMID: 36269074 PMCID: PMC9728646 DOI: 10.1039/d2cp02827a] [Citation(s) in RCA: 61] [Impact Index Per Article: 30.5] [Received: 06/22/2022] [Accepted: 08/09/2022] [Indexed: 12/13/2022]
Abstract
In this paper, the history, present status, and future of density-functional theory (DFT) are informally reviewed and discussed by 70 workers in the field, including molecular scientists, materials scientists, method developers and practitioners. The format of the paper is that of a roundtable discussion, in which the participants express and exchange views on DFT in the form of 302 individual contributions, formulated as responses to a preset list of 26 questions. Supported by a bibliography of 777 entries, the paper represents a broad snapshot of DFT, anno 2022.
Affiliation(s)
- Andrew M. Teale
- School of Chemistry, University of Nottingham, University Park, Nottingham NG7 2RD, UK
| | - Trygve Helgaker
- Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, University of Oslo, P.O. Box 1033 Blindern, N-0315 Oslo, Norway.
| | - Andreas Savin
- Laboratoire de Chimie Théorique, CNRS and Sorbonne University, 4 Place Jussieu, CEDEX 05, 75252 Paris, France.
| | - Carlo Adamo
- PSL University, CNRS, ChimieParisTech-PSL, Institute of Chemistry for Health and Life Sciences, i-CLeHS, 11 rue P. et M. Curie, 75005 Paris, France.
| | - Bálint Aradi
- Bremen Center for Computational Materials Science, University of Bremen, P.O. Box 330440, D-28334 Bremen, Germany.
| | - Alexei V. Arbuznikov
- Technische Universität Berlin, Institut für Chemie, Theoretische Chemie/Quantenchemie, Sekr. C7, Straße des 17. Juni 135, 10623 Berlin
| | | | - Evert Jan Baerends
- Department of Chemistry and Pharmaceutical Sciences, Faculty of Science, Vrije Universiteit, De Boelelaan 1083, 1081HV Amsterdam, The Netherlands.
| | - Vincenzo Barone
- Scuola Normale Superiore, Piazza dei Cavalieri 7, 56125 Pisa, Italy.
| | - Patrizia Calaminici
- Departamento de Química, Centro de Investigación y de Estudios Avanzados (Cinvestav), CDMX, 07360, Mexico.
| | - Eric Cancès
- CERMICS, Ecole des Ponts and Inria Paris, 6 Avenue Blaise Pascal, 77455 Marne-la-Vallée, France.
| | - Emily A. Carter
- Department of Mechanical and Aerospace Engineering and the Andlinger Center for Energy and the Environment, Princeton University, Princeton, NJ 08544-5263, USA
| | | | - Henry Chermette
- Institut Sciences Analytiques, Université Claude Bernard Lyon1, CNRS UMR 5280, 69622 Villeurbanne, France.
| | - Ilaria Ciofini
- PSL University, CNRS, ChimieParisTech-PSL, Institute of Chemistry for Health and Life Sciences, i-CLeHS, 11 rue P. et M. Curie, 75005 Paris, France.
| | - T. Daniel Crawford
- Department of Chemistry, Virginia Tech, Blacksburg, VA 24061, USA; Molecular Sciences Software Institute, Blacksburg, VA 24060, USA
| | - Frank De Proft
- Research Group of General Chemistry (ALGC), Vrije Universiteit Brussel (VUB), Pleinlaan 2, B-1050 Brussels, Belgium.
| | | | - Claudia Draxl
- Institut für Physik and IRIS Adlershof, Humboldt-Universität zu Berlin, 12489 Berlin, Germany; Fritz-Haber-Institut der Max-Planck-Gesellschaft, 14195 Berlin, Germany
| | - Thomas Frauenheim
- Bremen Center for Computational Materials Science, University of Bremen, P.O. Box 330440, D-28334 Bremen, Germany; Beijing Computational Science Research Center (CSRC), 100193 Beijing, China; Shenzhen JL Computational Science and Applied Research Institute, 518110 Shenzhen, China
| | - Emmanuel Fromager
- Laboratoire de Chimie Quantique, Institut de Chimie, CNRS/Université de Strasbourg, 4 rue Blaise Pascal, 67000 Strasbourg, France.
| | - Patricio Fuentealba
- Departamento de Física, Facultad de Ciencias, Universidad de Chile, Casilla 653, Santiago, Chile.
| | - Laura Gagliardi
- Department of Chemistry, Pritzker School of Molecular Engineering, The James Franck Institute, and Chicago Center for Theoretical Chemistry, The University of Chicago, Chicago, Illinois 60637, USA.
| | - Giulia Galli
- Pritzker School of Molecular Engineering and Department of Chemistry, The University of Chicago, Chicago, IL, USA.
| | - Jiali Gao
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China; Department of Chemistry, University of Minnesota, Minneapolis, MN 55455, USA
| | - Paul Geerlings
- Research Group of General Chemistry (ALGC), Vrije Universiteit Brussel (VUB), Pleinlaan 2, B-1050 Brussels, Belgium.
| | - Nikitas Gidopoulos
- Department of Physics, Durham University, South Road, Durham DH1 3LE, UK.
| | - Peter M. W. Gill
- School of Chemistry, University of Sydney, Camperdown NSW 2006, Australia
| | - Paola Gori-Giorgi
- Department of Chemistry and Pharmaceutical Sciences, Amsterdam Institute of Molecular and Life Sciences (AIMMS), Faculty of Science, Vrije Universiteit, De Boelelaan 1083, 1081HV Amsterdam, The Netherlands.
| | - Andreas Görling
- Chair of Theoretical Chemistry, University of Erlangen-Nuremberg, Egerlandstrasse 3, 91058 Erlangen, Germany.
| | - Tim Gould
- Qld Micro- and Nanotechnology Centre, Griffith University, Gold Coast, Qld 4222, Australia.
| | - Stefan Grimme
- Mulliken Center for Theoretical Chemistry, University of Bonn, Beringstrasse 4, 53115 Bonn, Germany.
| | - Oleg Gritsenko
- Department of Chemistry and Pharmaceutical Sciences, Amsterdam Institute of Molecular and Life Sciences (AIMMS), Faculty of Science, Vrije Universiteit, De Boelelaan 1083, 1081HV Amsterdam, The Netherlands.
| | - Hans Jørgen Aagaard Jensen
- Department of Physics, Chemistry and Pharmacy, University of Southern Denmark, DK-5230 Odense M, Denmark.
| | - Erin R. Johnson
- Department of Chemistry, Dalhousie University, Halifax, Nova Scotia B3H 4R2, Canada
| | - Robert O. Jones
- Peter Grünberg Institut PGI-1, Forschungszentrum Jülich, 52425 Jülich, Germany
| | - Martin Kaupp
- Technische Universität Berlin, Institut für Chemie, Theoretische Chemie/Quantenchemie, Sekr. C7, Straße des 17. Juni 135, 10623, Berlin.
| | - Andreas M. Köster
- Departamento de Química, Centro de Investigación y de Estudios Avanzados (Cinvestav), CDMX, 07360, Mexico
| | - Leeor Kronik
- Department of Molecular Chemistry and Materials Science, Weizmann Institute of Science, Rehovoth, 76100, Israel.
| | - Anna I. Krylov
- Department of Chemistry, University of Southern California, Los Angeles, California 90089, USA
| | - Simen Kvaal
- Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, University of Oslo, P.O. Box 1033 Blindern, N-0315 Oslo, Norway.
| | - Andre Laestadius
- Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, University of Oslo, P.O. Box 1033 Blindern, N-0315 Oslo, Norway.
| | - Mel Levy
- Department of Chemistry, Tulane University, New Orleans, Louisiana, 70118, USA.
| | - Mathieu Lewin
- CNRS & CEREMADE, Université Paris-Dauphine, PSL Research University, Place de Lattre de Tassigny, 75016 Paris, France.
| | - Shubin Liu
- Research Computing Center, University of North Carolina, Chapel Hill, NC 27599-3420, USA; Department of Chemistry, University of North Carolina, Chapel Hill, NC 27599-3290, USA
| | - Pierre-François Loos
- Laboratoire de Chimie et Physique Quantiques (UMR 5626), Université de Toulouse, CNRS, UPS, France.
| | - Neepa T. Maitra
- Department of Physics, Rutgers University at Newark, 101 Warren Street, Newark, NJ 07102, USA
| | - Frank Neese
- Max Planck Institut für Kohlenforschung, Kaiser Wilhelm Platz 1, D-45470 Mülheim an der Ruhr, Germany.
| | - John P. Perdew
- Departments of Physics and Chemistry, Temple University, Philadelphia, PA 19122, USA
| | - Katarzyna Pernal
- Institute of Physics, Lodz University of Technology, ul. Wolczanska 219, 90-924 Lodz, Poland.
| | - Pascal Pernot
- Institut de Chimie Physique, UMR8000, CNRS and Université Paris-Saclay, Bât. 349, Campus d'Orsay, 91405 Orsay, France.
| | - Piotr Piecuch
- Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, USA; Department of Physics and Astronomy, Michigan State University, East Lansing, Michigan 48824, USA
| | - Elisa Rebolini
- Institut Laue Langevin, 71 avenue des Martyrs, 38000 Grenoble, France.
| | - Lucia Reining
- Laboratoire des Solides Irradiés, CNRS, CEA/DRF/IRAMIS, École Polytechnique, Institut Polytechnique de Paris, F-91120 Palaiseau, France; European Theoretical Spectroscopy Facility
| | - Pina Romaniello
- Laboratoire de Physique Théorique (UMR 5152), Université de Toulouse, CNRS, UPS, France.
| | - Adrienn Ruzsinszky
- Department of Physics, Temple University, Philadelphia, Pennsylvania 19122, USA.
| | - Dennis R. Salahub
- Department of Chemistry, Department of Physics and Astronomy, CMS – Centre for Molecular Simulation, IQST – Institute for Quantum Science and Technology, Quantum Alberta, University of Calgary, 2500 University Drive NW, Calgary, Alberta T2N 1N4, Canada
| | - Matthias Scheffler
- The NOMAD Laboratory at the FHI of the Max-Planck-Gesellschaft and IRIS-Adlershof of the Humboldt-Universität zu Berlin, Faradayweg 4-6, D-14195, Germany.
| | - Peter Schwerdtfeger
- Centre for Theoretical Chemistry and Physics, The New Zealand Institute for Advanced Study, Massey University Auckland, 0632 Auckland, New Zealand.
| | - Viktor N. Staroverov
- Department of Chemistry, The University of Western Ontario, London, Ontario N6A 5B7, Canada
| | - Jianwei Sun
- Department of Physics and Engineering Physics, Tulane University, New Orleans, LA 70118, USA.
| | - Erik Tellgren
- Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, University of Oslo, P.O. Box 1033 Blindern, N-0315 Oslo, Norway.
| | - David J. Tozer
- Department of Chemistry, Durham University, South Road, Durham DH1 3LE, UK
| | - Samuel B. Trickey
- Quantum Theory Project, Department of Physics, University of Florida, Gainesville, FL 32611, USA
| | - Carsten A. Ullrich
- Department of Physics and Astronomy, University of Missouri, Columbia, MO 65211, USA
| | - Alberto Vela
- Departamento de Química, Centro de Investigación y de Estudios Avanzados (Cinvestav), CDMX, 07360, Mexico.
| | - Giovanni Vignale
- Department of Physics, University of Missouri, Columbia, MO 65203, USA.
| | - Tomasz A. Wesolowski
- Department of Physical Chemistry, Université de Genève, 30 Quai Ernest-Ansermet, 1211 Genève, Switzerland
| | - Xin Xu
- Shanghai Key Laboratory of Molecular Catalysis and Innovation Materials, Collaborative Innovation Centre of Chemistry for Energy Materials, MOE Laboratory for Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China.
| | - Weitao Yang
- Department of Chemistry and Physics, Duke University, Durham, NC 27516, USA.
| |
|
6
|
Pernot P. The long road to calibrated prediction uncertainty in computational chemistry. J Chem Phys 2022; 156:114109. [DOI: 10.1063/5.0084302] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/23/2023]
Abstract
Uncertainty quantification (UQ) in computational chemistry (CC) is still in its infancy. Very few CC methods are designed to provide a confidence level on their predictions, and most users still rely improperly on the mean absolute error as an accuracy metric. The development of reliable UQ methods is essential, notably for CC to be used confidently in industrial processes. A review of the CC-UQ literature shows that there is no common standard procedure to report or validate prediction uncertainty. I consider here analysis tools using concepts (calibration and sharpness) developed in meteorology and machine learning for the validation of probabilistic forecasters. These tools are adapted to CC-UQ and applied to datasets of prediction uncertainties provided by composite methods, Bayesian ensemble methods, and machine learning and a posteriori statistical methods.
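As a rough illustration of what calibration and sharpness mean in practice, the sketch below scores a set of predicted uncertainties against observed errors. It is not taken from the paper: the function name `calibration_summary`, the three summary statistics, and the synthetic data are assumptions chosen for illustration. The 95 % coverage target for |z| ≤ 1.96 holds only if the errors are normally distributed.

```python
import random

def calibration_summary(errors, uncertainties, z_crit=1.96):
    """Summarize calibration and sharpness of predicted uncertainties.

    errors:        prediction minus reference value, one per system
    uncertainties: predicted standard uncertainty, one per system (same units)
    Returns (mean squared z-score, coverage within +/- z_crit * u, mean uncertainty).
    """
    z = [e / u for e, u in zip(errors, uncertainties)]
    mean_sq_z = sum(zi * zi for zi in z) / len(z)           # close to 1 if calibrated
    coverage = sum(abs(zi) <= z_crit for zi in z) / len(z)  # close to 0.95 for normal errors
    sharpness = sum(uncertainties) / len(uncertainties)     # smaller is sharper
    return mean_sq_z, coverage, sharpness

# Synthetic, hypothetical data: errors drawn with the stated uncertainties.
rng = random.Random(0)
u_true = [rng.uniform(0.5, 2.0) for _ in range(20000)]
errors = [rng.gauss(0.0, u) for u in u_true]

calibrated = calibration_summary(errors, u_true)
# Halving the reported uncertainties makes them sharper but overconfident.
overconfident = calibration_summary(errors, [0.5 * u for u in u_true])
```

In this toy case the halved uncertainties look better on sharpness alone but fail the calibration checks, which is why the two properties are assessed together.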
Affiliation(s)
- Pascal Pernot
- Institut de Chimie Physique, UMR8000 CNRS, Université Paris-Saclay, 91405 Orsay, France
| |
|
7
|
Proppe J, Kircher J. Uncertainty Quantification of Reactivity Scales. Chemphyschem 2022; 23:e202200061. [PMID: 35189024 PMCID: PMC9314972 DOI: 10.1002/cphc.202200061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2022] [Revised: 02/16/2022] [Indexed: 11/09/2022]
Abstract
According to Mayr, polar organic synthesis can be rationalized by a simple empirical relationship linking bimolecular rate constants to as few as three reactivity parameters. Here, we propose an extension to Mayr's reactivity method that is rooted in uncertainty quantification and transforms the reactivity parameters into probability distributions. Through uncertainty propagation, these distributions can be transformed into uncertainty estimates for bimolecular rate constants. Chemists can exploit these virtual error bars to enhance synthesis planning and to decrease the ambiguity of conclusions drawn from experimental data. We demonstrate the above using the example of the reference data set released by Mayr and co-workers [J. Am. Chem. Soc. 2001, 123, 9500; J. Am. Chem. Soc. 2012, 134, 13902]. As a by-product of the new approach, we obtain revised reactivity parameters for 36 π-nucleophiles and 32 benzhydrylium ions.
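The uncertainty propagation described above can be sketched with a simple Monte Carlo scheme applied to the Mayr-Patz relationship, log k = sN (N + E). This is a minimal illustration and not the authors' procedure: it assumes independent normal distributions for the three parameters, and the function names and all numerical values are hypothetical.

```python
import random
import statistics

def mayr_log_k(s_n, n, e):
    """Mayr-Patz relationship: log10 k = sN * (N + E)."""
    return s_n * (n + e)

def propagate(n_mu, n_sd, s_mu, s_sd, e_mu, e_sd, samples=20000, seed=1):
    """Monte Carlo estimate of the mean and standard deviation of log10 k.

    Assumes N, sN and E are independent and normally distributed, which is
    a simplification made for this sketch only.
    """
    rng = random.Random(seed)
    draws = [
        mayr_log_k(rng.gauss(s_mu, s_sd), rng.gauss(n_mu, n_sd), rng.gauss(e_mu, e_sd))
        for _ in range(samples)
    ]
    return statistics.fmean(draws), statistics.stdev(draws)

# Hypothetical nucleophile (N, sN) and electrophile (E) with made-up uncertainties.
log_k_mean, log_k_sd = propagate(n_mu=5.0, n_sd=0.3, s_mu=0.9, s_sd=0.05,
                                 e_mu=-3.0, e_sd=0.2)
```

The resulting standard deviation of log k plays the role of the "virtual error bar" on a predicted rate constant; correlated parameters would need a joint distribution instead of independent draws.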
Affiliation(s)
- Jonny Proppe
- Technische Universität Braunschweig, Institute of Physical and Theoretical Chemistry, Gaußstraße 17, 38106 Braunschweig, Germany
| | - Johannes Kircher
- Georg-August-Universität Göttingen, Institute of Physical Chemistry, Tammannstraße 6, 37077 Göttingen, Germany
| |
|
8
|
Steiner M, Reiher M. Autonomous Reaction Network Exploration in Homogeneous and Heterogeneous Catalysis. Top Catal 2022; 65:6-39. [PMID: 35185305 PMCID: PMC8816766 DOI: 10.1007/s11244-021-01543-9] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Accepted: 11/17/2021] [Indexed: 12/11/2022]
Abstract
Autonomous computations that rely on automated reaction network elucidation algorithms may pave the way to put computational catalysis on a par with experimental research in the field. Several advantages of this approach are key to catalysis: (i) automation allows one to consider orders of magnitude more structures in a systematic and open-ended fashion than what would be accessible by manual inspection. Eventually, full resolution in terms of structural varieties and conformations as well as with respect to the type and number of potentially important elementary reaction steps (including decomposition reactions that determine turnover numbers) may be achieved. (ii) Fast electronic structure methods with uncertainty quantification warrant high efficiency and reliability in order to not only deliver results quickly, but also to allow for predictive work. (iii) A high degree of autonomy reduces the amount of manual human work, processing errors, and human bias. Although inherently unbiased, it is still steerable with respect to specific regions of an emerging network and with respect to the addition of new reactant species. This allows for a high fidelity of the formalization of some catalytic process and for surprising in silico discoveries. In this work, we first review the state of the art in computational catalysis to embed autonomous explorations into the general field from which it draws its ingredients. We then elaborate on the specific conceptual issues that arise in the context of autonomous computational procedures, some of which we discuss using an example catalytic system.
Affiliation(s)
- Miguel Steiner
- Laboratory of Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Markus Reiher
- Laboratory of Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| |
|
9
|
Affiliation(s)
- Markus Reiher
- ETH Zürich, Laboratorium für Physikalische Chemie, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| |
|
10
|
Krieger AM, Pidko EA. The Impact of Computational Uncertainties on the Enantioselectivity Predictions: A Microkinetic Modeling of Ketone Transfer Hydrogenation with a Noyori-type Mn-diamine Catalyst. ChemCatChem 2021; 13:3517-3524. [PMID: 34589158 PMCID: PMC8453751 DOI: 10.1002/cctc.202100341] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/05/2021] [Revised: 05/23/2021] [Indexed: 12/26/2022]
Abstract
Selectivity control is one of the most important functions of a catalyst. In asymmetric catalysis the enantiomeric excess (e.e.) is a property of major interest, with a lot of effort dedicated to developing the most enantioselective catalyst, understanding the origin of selectivity, and predicting stereoselectivity. Herein, we investigate the relationship between predicted selectivity and the uncertainties in the computed energetics of the catalytic reaction mechanism obtained by DFT calculations in a case study of catalytic asymmetric transfer hydrogenation (ATH) of ketones with an Mn-diamine catalyst. Data obtained from our analysis of DFT data by microkinetic modeling is compared to results from experiment. We discuss the limitations of the conventional reductionist approach of e.e. estimation from assessing the enantiodetermining steps only. Our analysis shows that the energetics of other reaction steps in the reaction mechanism have a substantial impact on the predicted reaction selectivity. The uncertainty of DFT calculations within the commonly accepted energy ranges of chemical accuracy may reverse the predicted e.e. with the non-enantiodetermining steps contributing to e.e. deviations of up to 25 %.
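To see why errors within "chemical accuracy" matter, the sketch below evaluates the conventional estimate that the abstract calls the reductionist approach: the e.e. is computed from the free-energy difference between the two enantiodetermining transition states alone, e.e. = tanh(ΔΔG‡ / 2RT). This is not the authors' microkinetic model. The function name, the sign convention, and the example energies are assumptions made for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def ee_from_ddg(ddg_kcal, temperature=298.15):
    """Enantiomeric excess from the enantiodetermining step only.

    ddg_kcal: G(TS leading to minor) - G(TS leading to major), in kcal/mol.
    With rate ratio r = exp(ddg / RT), e.e. = (r - 1) / (r + 1) = tanh(ddg / 2RT).
    A negative result means the other enantiomer is predicted to dominate.
    """
    return math.tanh(ddg_kcal / (2.0 * R_KCAL * temperature))

# Hypothetical computed barrier difference with a +/- 1 kcal/mol error band.
computed = 0.5
for ddg in (computed - 1.0, computed, computed + 1.0):
    print(f"ddG = {ddg:+.1f} kcal/mol -> e.e. = {100 * ee_from_ddg(ddg):+.0f} %")
```

For a small computed difference such as the 0.5 kcal/mol used here, the lower end of the error band changes the sign of the predicted e.e., which is the kind of reversal the abstract refers to.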
Affiliation(s)
- Annika M. Krieger
- Inorganic Systems Engineering, Department of Chemical Engineering, Faculty of Applied Sciences, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, The Netherlands
| | - Evgeny A. Pidko
- Inorganic Systems Engineering, Department of Chemical Engineering, Faculty of Applied Sciences, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, The Netherlands
| |
|
11
|
Wan S, Sinclair RC, Coveney PV. Uncertainty quantification in classical molecular dynamics. Philos Trans A Math Phys Eng Sci 2021; 379:20200082. [PMID: 33775140 PMCID: PMC8059622 DOI: 10.1098/rsta.2020.0082] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Accepted: 11/02/2020] [Indexed: 05/24/2023]
Abstract
Molecular dynamics simulation is now a widespread approach for understanding complex systems on the atomistic scale. It finds applications from physics and chemistry to engineering, life and medical science. In the last decade, the approach has begun to advance from being a computer-based means of rationalizing experimental observations to producing apparently credible predictions for a number of real-world applications within industrial sectors such as advanced materials and drug discovery. However, key aspects concerning the reproducibility of the method have not kept pace with the speed of its uptake in the scientific community. Here, we present a discussion of uncertainty quantification for molecular dynamics simulation designed to endow the method with better error estimates that will enable it to be used to report actionable results. The approach adopted is a standard one in the field of uncertainty quantification, namely using ensemble methods, in which a sufficiently large number of replicas are run concurrently, from which reliable statistics can be extracted. Indeed, because molecular dynamics is intrinsically chaotic, the need to use ensemble methods is fundamental and holds regardless of the duration of the simulations performed. We discuss the approach and illustrate it in a range of applications from materials science to ligand-protein binding free energy estimation. This article is part of the theme issue 'Reliability and reproducibility in computational science: implementing verification, validation and uncertainty quantification in silico'.
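The ensemble idea can be sketched as follows: each replica yields one value of the observable, and the spread across replicas gives the error estimate. This is a minimal illustration rather than the authors' workflow. The function name, the bootstrap settings, and the replica values (imagined binding free energies in kcal/mol) are hypothetical.

```python
import math
import random
import statistics

def ensemble_estimate(values, n_boot=2000, seed=0):
    """Mean, standard error and bootstrap 95 % interval from replica results.

    values: one observable value per independent replica simulation.
    """
    rng = random.Random(seed)
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    # Resample the replicas with replacement and collect the resampled means.
    boot = sorted(
        statistics.fmean(rng.choices(values, k=len(values))) for _ in range(n_boot)
    )
    lower = boot[int(0.025 * n_boot)]
    upper = boot[int(0.975 * n_boot) - 1]
    return mean, sem, (lower, upper)

# Hypothetical results from 10 replicas started with different initial velocities.
replicas = [-8.1, -7.4, -9.0, -8.6, -7.9, -8.3, -7.2, -8.8, -8.0, -8.5]
mean, sem, interval = ensemble_estimate(replicas)
```

A single long trajectory would supply only one such value, so the replica-to-replica scatter that feeds the error bar would be invisible.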
Affiliation(s)
- Shunzhou Wan
- Centre for Computational Science, University College London, Gordon Street, London WC1H 0AJ, UK
| | - Robert C. Sinclair
- Centre for Computational Science, University College London, Gordon Street, London WC1H 0AJ, UK
| | - Peter V. Coveney
- Centre for Computational Science, University College London, Gordon Street, London WC1H 0AJ, UK
- Institute for Informatics, Science Park 904, University of Amsterdam, 1098 XH Amsterdam, The Netherlands
|
12
|
Using the Gini coefficient to characterize the shape of computational chemistry error distributions. Theor Chem Acc 2021. [DOI: 10.1007/s00214-021-02725-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/27/2022]
|
13
|
Zöllner MS, Saghatchi A, Mujica V, Herrmann C. Influence of Electronic Structure Modeling and Junction Structure on First-Principles Chiral Induced Spin Selectivity. J Chem Theory Comput 2020; 16:7357-7371. [PMID: 33167619 DOI: 10.1021/acs.jctc.0c00621] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 12/12/2022]
Abstract
We have carried out a comprehensive study of the influence of electronic structure modeling and junction structure description on the first-principles calculation of the spin polarization in molecular junctions caused by the chiral induced spin selectivity (CISS) effect. We explore the limits and the sensitivity to modeling decisions of a Landauer/Green's function/two-component density functional theory approach to CISS. We find that although the CISS effect is entirely attributed in the literature to molecular spin filtering, spin-orbit coupling being partially inherited from the metal electrodes plays an important role in our calculations on ideal carbon helices, even though this effect cannot explain the experimental conductance results. Its magnitude depends considerably on the shape, size, and material of the metal clusters modeling the electrodes. Also, a pronounced dependence on the specific description of exchange interaction and spin-orbit coupling is manifest in our approach. This is important because the interplay between exchange effects and spin-orbit coupling may play an important role in the description of the junction magnetic response. Our calculations are relevant for the whole field of spin-polarized electron transport and electron transfer, because there is still an open discussion in the literature about the detailed underlying mechanism and the magnitude of physical parameters that need to be included to achieve a consistent description of the CISS effect: seemingly good quantitative agreement between simulation and the experiment can be caused by error compensation, because spin polarization as contained in a Landauer/Green's function/two-component density functional theory approach depends strongly on computational and structural parameters.
Affiliation(s)
| | - Aida Saghatchi
- Department of Chemistry, University of Hamburg, 20146 Hamburg, Germany
| | - Vladimiro Mujica
- School of Molecular Sciences, Arizona State University, Tempe, Arizona 85287-1604, United States
- Kimika Fakultatea, Euskal Herriko Unibertsitatea and Donostia International Physics Center (DIPC), Donostia, Euskadi P.K. 1072, 20080, Spain
| | - Carmen Herrmann
- Department of Chemistry, University of Hamburg, 20146 Hamburg, Germany
|
14
|
Sugisawa H, Ida T, Krems RV. Gaussian process model of 51-dimensional potential energy surface for protonated imidazole dimer. J Chem Phys 2020; 153:114101. [DOI: 10.1063/5.0023492] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 12/30/2022]
Affiliation(s)
- Hiroki Sugisawa
- Department of Chemistry, University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
- Division of Material Chemistry, Graduate School of Natural Science and Technology, Kanazawa University, Kakuma, Kanazawa 920-1192, Japan
| | - Tomonori Ida
- Division of Material Chemistry, Graduate School of Natural Science and Technology, Kanazawa University, Kakuma, Kanazawa 920-1192, Japan
| | - R. V. Krems
- Department of Chemistry, University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
- Stewart Blusson Quantum Matter Institute, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
|
15
|
Pernot P, Huang B, Savin A. Impact of non-normal error distributions on the benchmarking and ranking of quantum machine learning models. MACHINE LEARNING-SCIENCE AND TECHNOLOGY 2020. [DOI: 10.1088/2632-2153/aba184] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 01/21/2023]
|
16
|
Sršeň Š, Sita J, Slavíček P, Ladányi V, Heger D. Limits of the Nuclear Ensemble Method for Electronic Spectra Simulations: Temperature Dependence of the (E)-Azobenzene Spectrum. J Chem Theory Comput 2020; 16:6428-6438. [DOI: 10.1021/acs.jctc.0c00579] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 01/04/2023]
Affiliation(s)
- Štěpán Sršeň
- Department of Physical Chemistry, University of Chemistry and Technology, Technická 5, 16628 Prague, Czech Republic
| | - Jaroslav Sita
- Department of Physical Chemistry, University of Chemistry and Technology, Technická 5, 16628 Prague, Czech Republic
| | - Petr Slavíček
- Department of Physical Chemistry, University of Chemistry and Technology, Technická 5, 16628 Prague, Czech Republic
| | - Vít Ladányi
- Department of Chemistry, Faculty of Science, Masaryk University, Kamenice 5, 62500 Brno, Czech Republic
| | - Dominik Heger
- Department of Chemistry, Faculty of Science, Masaryk University, Kamenice 5, 62500 Brno, Czech Republic
|
17
|
Sobez JG, Reiher M. Molassembler: Molecular Graph Construction, Modification, and Conformer Generation for Inorganic and Organic Molecules. J Chem Inf Model 2020; 60:3884-3900. [DOI: 10.1021/acs.jcim.0c00503] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 11/28/2022]
Affiliation(s)
- Jan-Grimo Sobez
- Laboratory of Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Markus Reiher
- Laboratory of Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
|
18
|
Pernot P, Savin A. Probabilistic performance estimators for computational chemistry methods: Systematic improvement probability and ranking probability matrix. I. Theory. J Chem Phys 2020; 152:164108. [PMID: 32357773 DOI: 10.1063/5.0006202] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 02/04/2023]
Abstract
The comparison of benchmark error sets is an essential tool for the evaluation of theories in computational chemistry. The standard ranking of methods by their mean unsigned error is unsatisfactory for several reasons linked to the non-normality of the error distributions and the presence of underlying trends. Complementary statistics have recently been proposed to palliate such deficiencies, such as quantiles of the absolute error distribution or the mean prediction uncertainty. We introduce here a new score, the systematic improvement probability, based on the direct system-wise comparison of absolute errors. Independent of the chosen scoring rule, the uncertainty of the statistics due to the incompleteness of the benchmark datasets is also generally overlooked. However, this uncertainty is essential to appreciate the robustness of rankings. In the present article, we develop two indicators based on robust statistics to address this problem: Pinv, the inversion probability between two values of a statistic, and Pr, the ranking probability matrix. We demonstrate also the essential contribution of the correlations between error sets in these scores comparisons.
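The "direct system-wise comparison of absolute errors" mentioned in this abstract can be sketched as follows. The abstract does not give the formula, so this is one reading of it: the score is taken to be the fraction of benchmark systems on which a second method has a strictly smaller absolute error than a first. The function names and the error values are made up for illustration.

```python
def mean_unsigned_error(errors):
    """Conventional ranking statistic: mean of the absolute errors."""
    return sum(abs(e) for e in errors) / len(errors)


def systematic_improvement_probability(errors_a, errors_b):
    """Fraction of systems where method B beats method A in absolute error.

    Assumption: both lists hold signed errors for the same systems, in the
    same order, so the comparison is made system by system.
    """
    if len(errors_a) != len(errors_b) or not errors_a:
        raise ValueError("error sets must be non-empty and of equal length")
    wins = sum(1 for ea, eb in zip(errors_a, errors_b) if abs(eb) < abs(ea))
    return wins / len(errors_a)


if __name__ == "__main__":
    # Illustrative signed errors for four systems (not taken from the paper).
    errors_a = [0.5, -1.2, 0.3, 2.0]
    errors_b = [0.4, -1.5, -0.1, 1.0]
    print(mean_unsigned_error(errors_a), mean_unsigned_error(errors_b))
    print(systematic_improvement_probability(errors_a, errors_b))
```

In this toy data, method B improves on three of the four systems, so the pairwise score is 0.75 even though B is worse on one system; a comparison of mean unsigned errors alone would hide that detail.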
Affiliation(s)
- Pascal Pernot
- Institut de Chimie Physique, UMR8000, CNRS, Université Paris-Saclay, 91405 Orsay, France
| | - Andreas Savin
- Laboratoire de Chimie Théorique, CNRS and UPMC Université Paris 06, Sorbonne Universités, 75252 Paris, France
|
19
|
Pernot P, Savin A. Probabilistic performance estimators for computational chemistry methods: Systematic improvement probability and ranking probability matrix. II. Applications. J Chem Phys 2020; 152:164109. [DOI: 10.1063/5.0006204] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 02/02/2023]
Affiliation(s)
- Pascal Pernot
- Institut de Chimie Physique, UMR8000, CNRS, Université Paris-Saclay, 91405 Orsay, France
| | - Andreas Savin
- Laboratoire de Chimie Théorique, CNRS and UPMC Université Paris 06, Sorbonne Universités, 75252 Paris, France
|
20
|
Scalia G, Grambow CA, Pernici B, Li YP, Green WH. Evaluating Scalable Uncertainty Estimation Methods for Deep Learning-Based Molecular Property Prediction. J Chem Inf Model 2020; 60:2697-2717. [PMID: 32243154 DOI: 10.1021/acs.jcim.9b00975] [Citation(s) in RCA: 62] [Impact Index Per Article: 15.5] [Indexed: 12/20/2022]
Abstract
Advances in deep neural network (DNN)-based molecular property prediction have recently led to the development of models of remarkable accuracy and generalization ability, with graph convolutional neural networks (GCNNs) reporting state-of-the-art performance for this task. However, some challenges remain, and one of the most important that needs to be fully addressed concerns uncertainty quantification. DNN performance is affected by the volume and the quality of the training samples. Therefore, establishing when and to what extent a prediction can be considered reliable is just as important as outputting accurate predictions, especially when out-of-domain molecules are targeted. Recently, several methods to account for uncertainty in DNNs have been proposed, most of which are based on approximate Bayesian inference. Among these, only a few scale to the large data sets required in applications. Evaluating and comparing these methods has recently attracted great interest, but results are generally fragmented and absent for molecular property prediction. In this paper, we quantitatively compare scalable techniques for uncertainty estimation in GCNNs. We introduce a set of quantitative criteria to capture different uncertainty aspects and then use these criteria to compare MC-dropout, Deep Ensembles, and bootstrapping, both theoretically in a unified framework that separates aleatoric/epistemic uncertainty and experimentally on public data sets. Our experiments quantify the performance of the different uncertainty estimation methods and their impact on uncertainty-related error reduction. Our findings indicate that Deep Ensembles and bootstrapping consistently outperform MC-dropout, with different context-specific pros and cons. Our analysis leads to a better understanding of the role of aleatoric/epistemic uncertainty, also in relation to the target data set features, and highlights the challenge posed by out-of-domain uncertainty.
Affiliation(s)
- Gabriele Scalia
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, 20133 Milano, Italy
| | - Colin A Grambow
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Barbara Pernici
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, 20133 Milano, Italy
| | - Yi-Pei Li
- Department of Chemical Engineering, National Taiwan University, Taipei 10617, Taiwan
| | - William H Green
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
|
21
|
Abstract
Modern computational chemistry has reached a stage at which massive exploration into chemical reaction space with unprecedented resolution with respect to the number of potentially relevant molecular structures has become possible. Various algorithmic advances have shown that such structural screenings must and can be automated and routinely carried out. This will replace the standard approach of manually studying a selected and restricted number of molecular structures for a chemical mechanism. The complexity of the task has led to many different approaches. However, all of them address the same general target, namely to produce a complete atomistic picture of the kinetics of a chemical process. It is the purpose of this overview to categorize the problems that should be targeted and to identify the principal components and challenges of automated exploration machines so that the various existing approaches and future developments can be compared based on well-defined conceptual principles.
Affiliation(s)
- Jan P. Unsleber
- Laboratory for Physical Chemistry, ETH Zurich, 8093 Zurich, Switzerland
| | - Markus Reiher
- Laboratory for Physical Chemistry, ETH Zurich, 8093 Zurich, Switzerland
|
22
|
Brunken C, Reiher M. Self-Parametrizing System-Focused Atomistic Models. J Chem Theory Comput 2020; 16:1646-1665. [DOI: 10.1021/acs.jctc.9b00855] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 02/07/2023]
Affiliation(s)
- Christoph Brunken
- Laboratory for Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Markus Reiher
- Laboratory for Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
|
23
|
Bergmann TG, Welzel MO, Jacob CR. Towards theoretical spectroscopy with error bars: systematic quantification of the structural sensitivity of calculated spectra. Chem Sci 2019; 11:1862-1877. [PMID: 34123280 PMCID: PMC8148348 DOI: 10.1039/c9sc05103a] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 12/16/2022]
Abstract
Molecular spectra calculated with quantum-chemical methods are subject to a number of uncertainties (e.g., errors introduced by the computational methodology) that hamper the direct comparison of experiment and computation. Judging these uncertainties is crucial for drawing reliable conclusions from the interplay of experimental and theoretical spectroscopy, but largely relies on subjective judgment. Here, we explore the application of methods from uncertainty quantification to theoretical spectroscopy, with the ultimate goal of providing systematic error bars for calculated spectra. As a first target, we consider distortions of the underlying molecular structure as one important source of uncertainty. We show that by performing a principal component analysis, the most influential collective distortions can be identified, which allows for the construction of surrogate models that are amenable to a statistical analysis of the propagation of uncertainties in the molecular structure to uncertainties in the calculated spectrum. This is applied to the calculation of X-ray emission spectra of iron carbonyl complexes, of the electronic excitation spectrum of a coumarin dye, and of the infrared spectrum of alanine. We show that with our approach it becomes possible to obtain error bars for calculated spectra that account for uncertainties in the molecular structure. This is an important first step towards systematically quantifying other relevant sources of uncertainty in theoretical spectroscopy. Uncertainty quantification is applied in theoretical spectroscopy to obtain error bars accounting for the structural sensitivity of calculated spectra.
Affiliation(s)
- Tobias G Bergmann
- Technische Universität Braunschweig, Institute of Physical and Theoretical Chemistry Gaußstraße 17 38106 Braunschweig Germany
| | - Michael O Welzel
- Technische Universität Braunschweig, Institute of Physical and Theoretical Chemistry Gaußstraße 17 38106 Braunschweig Germany
| | - Christoph R Jacob
- Technische Universität Braunschweig, Institute of Physical and Theoretical Chemistry Gaußstraße 17 38106 Braunschweig Germany
|
24
|
Proppe J, Gugler S, Reiher M. Gaussian Process-Based Refinement of Dispersion Corrections. J Chem Theory Comput 2019; 15:6046-6060. [PMID: 31603673 DOI: 10.1021/acs.jctc.9b00627] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Indexed: 02/06/2023]
Abstract
We employ Gaussian process (GP) regression to adjust for systematic errors in D3-type dispersion corrections. We refer to the associated, statistically improved model as D3-GP. It is trained on differences between interaction energies obtained from PBE-D3(BJ)/ma-def2-QZVPP and DLPNO-CCSD(T)/CBS calculations. We generated a data set containing interaction energies for 1248 molecular dimers, which resemble the dispersion-dominated systems contained in the S66 data set. Our systems represent not only equilibrium structures but also dimers with various relative orientations and conformations at both shorter and longer distances. A reparametrization of the D3(BJ) model based on 66 of these dimers suggests that two of its three empirical parameters, a1 and s8, are zero, whereas a2 = 5.6841 bohr. For the remaining 1182 dimers, we find that this new set of parameters is superior to all previously published D3(BJ) parameter sets. To train our D3-GP model, we engineered two different vectorial representations of (supra-)molecular systems, both derived from the matrix of atom-pairwise D3(BJ) interaction terms: (a) a distance-resolved interaction energy histogram, histD3(BJ), and (b) eigenvalues of the interaction matrix ordered according to their decreasing absolute value, eigD3(BJ). Hence, the GP learns a mapping from D3(BJ) information only, which renders D3-GP-type dispersion corrections comparable to those obtained with the original D3 approach. They improve systematically if the underlying training set is selected carefully. Here, we harness the prediction variance obtained from GP regression to select optimal training sets in an automated fashion. The larger the variance, the more information the corresponding data point may add to the training set. 
For a given set of molecular systems, variance-based sampling can approximately determine the smallest subset being subjected to reference calculations such that all dispersion corrections for the remaining systems fall below a predefined accuracy threshold. To render the entire D3-GP workflow as efficient as possible, we present an improvement over our variance-based, sequential active-learning scheme [J. Chem. Theory Comput. 2018, 14, 5238]. Our refined learning algorithm selects multiple (instead of single) systems that can be subjected to reference calculations simultaneously. We refer to the underlying selection strategy as batchwise variance-based sampling (BVS). BVS-guided active learning is an essential component of our D3-GP workflow, which is implemented in a black-box fashion. Once provided with reference data for new molecular systems, the underlying GP model automatically learns to adapt to these and similar systems. This approach leads overall to a self-improving model (D3-GP) that predicts system-focused and GP-refined D3-type dispersion corrections for any given system of reference data.
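Selecting a batch of training points by GP prediction variance can be sketched in a few lines. This is not the authors' BVS algorithm, whose details the abstract does not give. It is a generic greedy scheme that relies on one known property of GP regression: the posterior variance depends only on the input locations, so a chosen point can be added to the training inputs before its reference value exists. The one-dimensional inputs, the RBF kernel, and all function names are assumptions for illustration.

```python
import math


def rbf(x1, x2, length=1.0):
    """Squared-exponential kernel with unit prior variance."""
    return math.exp(-0.5 * ((x1 - x2) / length) ** 2)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def posterior_variance(x_new, x_train, noise=1e-6):
    """GP predictive variance at x_new: k(x,x) - k*^T (K + noise I)^-1 k*."""
    if not x_train:
        return rbf(x_new, x_new)
    k_mat = [[rbf(a, b) + (noise if i == j else 0.0)
              for j, b in enumerate(x_train)]
             for i, a in enumerate(x_train)]
    k_vec = [rbf(x_new, a) for a in x_train]
    alpha = solve(k_mat, k_vec)
    return rbf(x_new, x_new) - sum(k * a for k, a in zip(k_vec, alpha))


def select_batch(candidates, x_train, batch_size):
    """Greedily pick the highest-variance candidate, then treat it as known."""
    pool = list(candidates)
    known = list(x_train)
    chosen = []
    for _ in range(min(batch_size, len(pool))):
        best = max(pool, key=lambda x: posterior_variance(x, known))
        chosen.append(best)
        pool.remove(best)
        known.append(best)  # variance update needs no reference value
    return chosen


if __name__ == "__main__":
    print(select_batch([4.9, 5.0, 2.5], x_train=[0.0], batch_size=2))
```

With one training point at 0.0, a naive "top two by variance" rule would pick the near-duplicates 5.0 and 4.9. The greedy update instead picks 5.0 and then 2.5, because 4.9 is already well described once 5.0 is treated as known.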
Affiliation(s)
- Jonny Proppe
- Department of Chemistry, and Department of Computer Science, University of Toronto, Toronto, Ontario M5S, Canada
- Laboratory of Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Stefan Gugler
- Laboratory of Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Markus Reiher
- Laboratory of Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
|
25
|
Janet JP, Liu F, Nandy A, Duan C, Yang T, Lin S, Kulik HJ. Designing in the Face of Uncertainty: Exploiting Electronic Structure and Machine Learning Models for Discovery in Inorganic Chemistry. Inorg Chem 2019; 58:10592-10606. [PMID: 30834738 DOI: 10.1021/acs.inorgchem.9b00109] [Citation(s) in RCA: 59] [Impact Index Per Article: 11.8] [Indexed: 11/30/2022]
Abstract
Recent transformative advances in computing power and algorithms have made computational chemistry central to the discovery and design of new molecules and materials. First-principles simulations are increasingly accurate and applicable to large systems with the speed needed for high-throughput computational screening. Despite these strides, the combinatorial challenges associated with the vastness of chemical space mean that more than just fast and accurate computational tools are needed for accelerated chemical discovery. In transition-metal chemistry and catalysis, unique challenges arise. The variable spin, oxidation state, and coordination environments favored by elements with well-localized d or f electrons provide great opportunity for tailoring properties in catalytic or functional (e.g., magnetic) materials but also add layers of uncertainty to any design strategy. We outline five key mandates for realizing computationally driven accelerated discovery in inorganic chemistry: (i) fully automated simulation of new compounds, (ii) knowledge of prediction sensitivity or accuracy, (iii) faster-than-fast property prediction methods, (iv) maps for rapid chemical space traversal, and (v) a means to reveal design rules on the kilocompound scale. Through case studies in open-shell transition-metal chemistry, we describe how advances in methodology and software in each of these areas bring about new chemical insights. We conclude with our outlook on the next steps in this process toward realizing fully autonomous discovery in inorganic chemistry using computational chemistry.
Affiliation(s)
- Jon Paul Janet
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Fang Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Tzuhsiung Yang
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Sean Lin
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
|
26
|
Li YP, Han K, Grambow CA, Green WH. Self-Evolving Machine: A Continuously Improving Model for Molecular Thermochemistry. J Phys Chem A 2019; 123:2142-2152. [DOI: 10.1021/acs.jpca.8b10789] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Indexed: 12/11/2022]
Affiliation(s)
- Yi-Pei Li
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Kehang Han
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Colin A. Grambow
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - William H. Green
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
|
27
|
Abstract
This article discusses applications of Bayesian machine learning for quantum molecular dynamics.
Affiliation(s)
- R. V. Krems
- Department of Chemistry, University of British Columbia, Vancouver, Canada
|
28
|
Proppe J, Reiher M. Mechanism Deduction from Noisy Chemical Reaction Networks. J Chem Theory Comput 2018; 15:357-370. [DOI: 10.1021/acs.jctc.8b00310] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 01/18/2023]
Affiliation(s)
- Jonny Proppe
- Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| | - Markus Reiher
- Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
|
29
|
Simm GN, Vaucher AC, Reiher M. Exploration of Reaction Pathways and Chemical Transformation Networks. J Phys Chem A 2018; 123:385-399. [DOI: 10.1021/acs.jpca.8b10007] [Citation(s) in RCA: 103] [Impact Index Per Article: 17.2] [Indexed: 12/15/2022]
Affiliation(s)
- Gregor N. Simm
- Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| | - Alain C. Vaucher
- Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| | - Markus Reiher
- Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
|
30
|
Pernot P, Savin A. Probabilistic performance estimators for computational chemistry methods: The empirical cumulative distribution function of absolute errors. J Chem Phys 2018; 148:241707. [DOI: 10.1063/1.5016248] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 01/04/2023]
Affiliation(s)
- Pascal Pernot
- Laboratoire de Chimie Physique, UMR8000 CNRS/Université Paris-Sud, F-91405 Orsay, France
| | - Andreas Savin
- Laboratoire de Chimie Théorique, CNRS and UPMC Université Paris 06, Sorbonne Universités, F-75252 Paris, France
|
31
|
Abstract
Statistical estimation of the prediction uncertainty of physical models is typically hindered by the inadequacy of these models due to various approximations they are built upon. The prediction errors caused by model inadequacy can be handled either by correcting the model's results or by adapting the model's parameter uncertainty to generate prediction uncertainties representative, in a way to be defined, of model inadequacy errors. The main advantage of the latter approach (thereafter called PUI, for Parameter Uncertainty Inflation) is its transferability to the prediction of other quantities of interest based on the same parameters. A critical review of implementations of PUI in several areas of computational chemistry shows that it is biased, in the sense that it does not produce prediction uncertainty bands conforming to model inadequacy errors.
Affiliation(s)
- Pascal Pernot
- Laboratoire de Chimie Physique, UMR 8000, CNRS/Université Paris-Sud, F-91405 Orsay, France
|
32
|
Weymuth T, Proppe J, Reiher M. Statistical Analysis of Semiclassical Dispersion Corrections. J Chem Theory Comput 2018; 14:2480-2494. [PMID: 29613785 DOI: 10.1021/acs.jctc.8b00078] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 01/16/2023]
Abstract
Semiclassical dispersion corrections developed by Grimme and co-workers have become indispensable in applications of Kohn-Sham density functional theory. A deeper understanding of the underlying parametrization might be crucial for well-founded further improvements of this successful approach. To this end, we present an in-depth assessment of the fit parameters present in semiclassical (D3-type) dispersion corrections by means of a statistically rigorous analysis. We find that the choice of the cost function generally has a small effect on the empirical parameters of D3-type dispersion corrections with respect to the reference set under consideration. Only in a few cases, the choice of cost function has a surprisingly large effect on the total dispersion energies. In particular, the weighting scheme in the cost function can significantly affect the reliability of predictions. In order to obtain unbiased (data-independent) uncertainty estimates for both the empirical fit parameters and the corresponding predictions, we carried out a nonparametric bootstrap analysis. This analysis reveals that the standard deviation of the mean of the empirical D3 parameters is small. Moreover, the mean prediction uncertainty obtained by bootstrapping is not much larger than previously reported error measures. On the basis of a jackknife analysis, we find that the original reference set is slightly skewed, but our results also suggest that this feature hardly affects the prediction of dispersion energies. Furthermore, we find that the introduction of small uncertainties to the reference data does not change the conclusions drawn in this work. However, a rigorous analysis of error accumulation arising from different parametrizations reveals that error cancellation does not necessarily occur, leading to a monotonically increasing deviation in the dispersion energy with increasing molecule size. We discuss this issue in detail at the prominent example of the C60 "buckycatcher". 
We find deviations between individual parametrizations of several tens of kilocalories per mole in some cases. Hence, in combination with any calculation of dispersion energies, we recommend to always determine the associated uncertainties for which we will provide a software tool.
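The nonparametric bootstrap mentioned in this abstract can be illustrated on a much simpler model than a D3 parametrization. The sketch below fits a single slope through the origin, resamples the data pairs with replacement, and refits each time to estimate the uncertainty of the parameter. The model, the data values, and the function names are assumptions chosen for brevity, not the authors' procedure or software tool.

```python
import random
import statistics


def fit_slope(xs, ys):
    """Least-squares slope for the one-parameter model y = a * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def bootstrap_slope(xs, ys, n_boot=2000, seed=0):
    """Nonparametric bootstrap: resample (x, y) pairs and refit each time."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        # Draw n indices with replacement to build one bootstrap sample.
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(fit_slope([xs[i] for i in idx], [ys[i] for i in idx]))
    # Mean and standard deviation of the parameter over all refits.
    return statistics.mean(slopes), statistics.stdev(slopes)


if __name__ == "__main__":
    # Illustrative data scattered around y = 2x (not from the paper).
    xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
    mean, spread = bootstrap_slope(xs, ys)
    print(f"slope = {mean:.3f} +/- {spread:.3f}")
```

The spread of the refitted slopes serves as the parameter uncertainty; propagating each refitted parameter through the model gives a corresponding prediction uncertainty.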
Affiliation(s)
- Thomas Weymuth
- Laboratorium für Physikalische Chemie, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| | - Jonny Proppe
- Laboratorium für Physikalische Chemie, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| | - Markus Reiher
- Laboratorium für Physikalische Chemie, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
|
33
|
Aspuru-Guzik A, Lindh R, Reiher M. The Matter Simulation (R)evolution. ACS CENTRAL SCIENCE 2018; 4:144-152. [PMID: 29532014 PMCID: PMC5832995 DOI: 10.1021/acscentsci.7b00550] [Citation(s) in RCA: 63] [Impact Index Per Article: 10.5] [Received: 11/13/2017] [Indexed: 05/26/2023]
Abstract
To date, the program for the development of methods and models for atomistic and continuum simulation directed toward chemicals and materials has reached an incredible degree of sophistication and maturity. Currently, one can witness an increasingly rapid emergence of advances in computing, artificial intelligence, and robotics. This drives us to consider the future of computer simulation of matter from the molecular to the human length and time scales in a radical way that deliberately dares to go beyond the foreseeable next steps in any given discipline. This perspective article presents a view on this future development that we believe is likely to become a reality during our lifetime.
Affiliation(s)
- Alán Aspuru-Guzik
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States
- Canadian Institute for Advanced Research (CIFAR), Toronto, Ontario M5G 1Z8, Canada
| | - Roland Lindh
- Department of Chemistry−Ångström, The Theoretical Chemistry Programme, and Uppsala Center for Computational Chemistry (UC3), Uppsala University, Box 518, 751 20 Uppsala, Sweden
| | - Markus Reiher
- Laboratory of Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
|