51
Nandy A, Terrones G, Arunachalam N, Duan C, Kastner DW, Kulik HJ. MOFSimplify, machine learning models with extracted stability data of three thousand metal-organic frameworks. Sci Data 2022; 9:74. [PMID: 35277533 PMCID: PMC8917177 DOI: 10.1038/s41597-022-01181-0] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 09/30/2021] [Accepted: 01/17/2022] [Indexed: 11/09/2022] Open
Abstract
We report a workflow and the output of a natural language processing (NLP)-based procedure to mine the extant metal–organic framework (MOF) literature describing structurally characterized MOFs and their solvent removal and thermal stabilities. We obtain over 2,000 solvent removal stability measures from text mining and 3,000 thermal decomposition temperatures from thermogravimetric analysis data. We assess the validity of our NLP methods and the accuracy of our extracted data by comparing to a hand-labeled subset. Machine learning (ML, i.e., artificial neural network) models trained on this data using graph- and pore-geometry-based representations enable prediction of stability on new MOFs with quantified uncertainty. Our web interface, MOFSimplify, provides users access to our curated data and enables them to harness that data for predictions on new MOFs. MOFSimplify also encourages community feedback on existing data and on ML model predictions for community-based active learning for improved MOF stability models. Measurement(s): thermal decomposition. Technology Type(s): thermogravimetry.
Affiliation(s)
- Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Gianmarco Terrones
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Naveen Arunachalam
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- David W Kastner
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
52
Duan C, Nandy A, Kulik HJ. Machine Learning for the Discovery, Design, and Engineering of Materials. Annu Rev Chem Biomol Eng 2022; 13:405-429. [PMID: 35320698 DOI: 10.1146/annurev-chembioeng-092320-120230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Machine learning (ML) has become a part of the fabric of high-throughput screening and computational discovery of materials. Despite its increasingly central role, challenges remain in fully realizing the promise of ML. This is especially true for the practical acceleration of the engineering of robust materials and the development of design strategies that surpass trial and error or high-throughput screening alone. Depending on the quantity being predicted and the experimental data available, ML can either outperform physics-based models, be used to accelerate such models, or be integrated with them to improve their performance. We cover recent advances in algorithms and in their application that are starting to make inroads toward (a) the discovery of new materials through large-scale enumerative screening, (b) the design of materials through identification of rules and principles that govern materials properties, and (c) the engineering of practical materials by satisfying multiple objectives. We conclude with opportunities for further advancement to realize ML as a widespread tool for practical computational materials design. Expected final online publication date for the Annual Review of Chemical and Biomolecular Engineering, Volume 13 is October 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
53
Zeng Y, Gordiichuk P, Ichihara T, Zhang G, Sandoz-Rosado E, Wetzel ED, Tresback J, Yang J, Kozawa D, Yang Z, Kuehne M, Quien M, Yuan Z, Gong X, He G, Lundberg DJ, Liu P, Liu AT, Yang JF, Kulik HJ, Strano MS. Irreversible synthesis of an ultrastrong two-dimensional polymeric material. Nature 2022; 602:91-95. [PMID: 35110762 DOI: 10.1038/s41586-021-04296-3] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 01/13/2021] [Accepted: 12/01/2021] [Indexed: 12/28/2022]
Abstract
Polymers that extend covalently in two dimensions have attracted recent attention [1,2] as a means of combining the mechanical strength and in-plane energy conduction of conventional two-dimensional (2D) materials [3,4] with the low densities, synthetic processability and organic composition of their one-dimensional counterparts. Efforts so far have proven successful in forms that do not allow full realization of these properties, such as polymerization at flat interfaces [5,6] or fixation of monomers in immobilized lattices [7-9]. Another frequently employed synthetic approach is to introduce microscopic reversibility, at the cost of bond stability, to achieve 2D crystals after extensive error correction [10,11]. Here we demonstrate a homogenous 2D irreversible polycondensation that results in a covalently bonded 2D polymeric material that is chemically stable and highly processable. Further processing yields highly oriented, free-standing films that have a 2D elastic modulus and yield strength of 12.7 ± 3.8 gigapascals and 488 ± 57 megapascals, respectively. This synthetic route provides opportunities for 2D materials in applications ranging from composite structures to barrier coating materials.
Affiliation(s)
- Yuwen Zeng
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Pavlo Gordiichuk
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Takeo Ichihara
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Ge Zhang
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Emil Sandoz-Rosado
- U.S. Army Combat Capabilities Development Command, Army Research Laboratory, Aberdeen Proving Ground, MD, USA
- Eric D Wetzel
- U.S. Army Combat Capabilities Development Command, Army Research Laboratory, Aberdeen Proving Ground, MD, USA
- Jason Tresback
- Center for Nanoscale Systems, Harvard University, Cambridge, MA, USA
- Jing Yang
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Daichi Kozawa
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Zhongyue Yang
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Matthias Kuehne
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Michelle Quien
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Zhe Yuan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Xun Gong
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Guangwei He
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Daniel James Lundberg
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Pingwei Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Albert Tianxiang Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Jing Fan Yang
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Michael S Strano
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.
54
Harper DR, Nandy A, Arunachalam N, Duan C, Janet JP, Kulik HJ. Representations and strategies for transferable machine learning improve model performance in chemical discovery. J Chem Phys 2022; 156:074101. [DOI: 10.1063/5.0082964] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/14/2022] Open
Affiliation(s)
- Daniel R Harper
- Massachusetts Institute of Technology, United States of America
- Aditya Nandy
- Massachusetts Institute of Technology, United States of America
- Chenru Duan
- Massachusetts Institute of Technology, United States of America
- Heather J. Kulik
- Dept of Chemical Engineering, Massachusetts Institute of Technology, United States of America
55
Abstract
Approximate semilocal density functional theory (DFT) is known to underestimate surface formation energies yet paradoxically overbind adsorbates on catalytic transition-metal oxide surfaces due to delocalization error. The low-cost DFT + U approach only improves surface formation energies for early transition-metal oxides or adsorption energies for late transition-metal oxides. In this work, we demonstrate that this inefficacy arises due to the conventional usage of metal-centered atomic orbitals as projectors within DFT + U. We analyze electron density rearrangement during surface formation and O atom adsorption on rutile transition-metal oxides to highlight that a standard DFT + U correction fails to tune properties when the corresponding density rearrangement is highly delocalized across both metal and oxygen sites. To improve both surface properties simultaneously while retaining the simplicity of a single-site DFT + U correction, we systematically construct multi-atom-centered molecular-orbital-like projectors for DFT + U. We demonstrate this molecular DFT + U approach for tuning adsorption energies and surface formation energies of minimal two-dimensional models of representative early (i.e., TiO2) and late (i.e., PtO2) transition-metal oxides. Molecular DFT + U simultaneously corrects adsorption energies and surface formation energies of multilayer models of rutile TiO2(110) and PtO2(110) to resolve the paradoxical description of surface stability and surface reactivity of semilocal DFT.
Affiliation(s)
- Akash Bajaj
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
56
Harper DR, Kulik HJ. Computational Scaling Relationships Predict Experimental Activity and Rate-Limiting Behavior in Homogeneous Water Oxidation. Inorg Chem 2022; 61:2186-2197. [PMID: 35037756 DOI: 10.1021/acs.inorgchem.1c03376] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
While computational screening with first-principles density functional theory (DFT) is essential for evaluating candidate catalysts, limitations in accuracy typically prevent the prediction of experimentally relevant activities. Exemplary of these challenges are homogeneous water oxidation catalysts (WOCs) where differences in experimental conditions or small changes in ligand structure can alter rate constants by over an order of magnitude. Here, we compute mechanistically relevant electronic and energetic properties for 19 mononuclear Ru transition-metal complexes (TMCs) from three experimental water oxidation catalysis studies. We discover that 15 of these TMCs have experimental activities that correlate with a single property, the ionization potential of the Ru(II)-O2 catalytic intermediate. This scaling parameter allows the quantitative understanding of activity trends and provides insight into the rate-limiting behavior. We use this approach to rationalize differences in activity with different experimental conditions, and we qualitatively analyze the source of distinct behavior for different electronic states in the other four catalysts. Comparison to closely related single-atom catalysts and modified WOCs enables rationalization of the source of rate enhancement in these WOCs.
Affiliation(s)
- Daniel R Harper
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
57
Ritt CL, Liu M, Pham TA, Epsztein R, Kulik HJ, Elimelech M. Machine learning reveals key ion selectivity mechanisms in polymeric membranes with subnanometer pores. Sci Adv 2022; 8:eabl5771. [PMID: 35030018 PMCID: PMC8759746 DOI: 10.1126/sciadv.abl5771] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Indexed: 05/12/2023]
Abstract
Designing single-species selective membranes for high-precision separations requires a fundamental understanding of the molecular interactions governing solute transport. Here, we comprehensively assess molecular-level features that influence the separation of 18 different anions by nanoporous cellulose acetate membranes. Our analysis identifies the limitations of bulk solvation characteristics to explain ion transport, highlighted by the poor correlation between hydration energy and the measured permselectivity (R² = 0.37). Entropy-enthalpy compensation, spanning 40 kilojoules per mole, leads to a free-energy barrier (∆G‡) variation of only ~8 kilojoules per mole across all anions. We apply machine learning to elucidate descriptors for energetic barriers from a set of 126 collected features. Notably, electrostatic features account for 75% of the overall features used to describe ∆G‡, despite the relatively uncharged state of cellulose acetate. Our work presents an approach for studying ion transport across nanoporous membranes that could enable the design of ion-selective membranes.
Affiliation(s)
- Cody L. Ritt
- Department of Chemical and Environmental Engineering, Yale University, New Haven, CT 06520-8286, USA
- Mingjie Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Tuan Anh Pham
- Quantum Simulations Group, Lawrence Livermore National Laboratory, Livermore, CA 94550, USA
- Razi Epsztein
- Faculty of Civil and Environmental Engineering, Technion–Israel Institute of Technology, Haifa 32000, Israel
- Heather J. Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Corresponding author. (M.E.); (H.J.K.)
- Menachem Elimelech
- Department of Chemical and Environmental Engineering, Yale University, New Haven, CT 06520-8286, USA
- Corresponding author. (M.E.); (H.J.K.)
58
Mehmood R, Kulik HJ. Quantum-Mechanical/Molecular-Mechanical (QM/MM) Simulations for Understanding Enzyme Dynamics. Methods Mol Biol 2022; 2397:227-248. [PMID: 34813067 DOI: 10.1007/978-1-0716-1826-4_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Quantum mechanics/molecular mechanics (QM/MM) methods have become widely used for computational modeling of enzyme structure and mechanism. In these approaches, a portion of the enzyme of great interest (e.g., where a chemical reaction is occurring) is treated with QM, whereas the surrounding region is treated with MM. A critical challenge with these methods is the choice of the region to partition into QM and which to treat with MM along with numerous practical choices that must be made at each step of the modeling procedure. Here, we attempt to simplify this process by describing the steps involved in preparing protein structures, choosing the appropriate QM region size and electronic structure methods, preparing all necessary input files, and troubleshooting common errors for QM/MM simulations of enzymes.
Affiliation(s)
- Rimsha Mehmood
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, USA
- Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.
59
Duan C, Chu DBK, Nandy A, Kulik HJ. Detection of multi-reference character imbalances enables a transfer learning approach for virtual high throughput screening with coupled cluster accuracy at DFT cost. Chem Sci 2022; 13:4962-4971. [PMID: 35655882 PMCID: PMC9067623 DOI: 10.1039/d2sc00393g] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/21/2022] [Accepted: 04/04/2022] [Indexed: 01/08/2023] Open
Abstract
Appropriately identifying and treating molecules and materials with significant multi-reference (MR) character is crucial for achieving high data fidelity in virtual high-throughput screening (VHTS). Despite development of numerous MR diagnostics, the extent to which a single value of such a diagnostic indicates the MR effect on a chemical property prediction is not well established. We evaluate MR diagnostics for over 10 000 transition-metal complexes (TMCs) and compare to those for organic molecules. We observe that only some MR diagnostics are transferable from one chemical space to another. By studying the influence of MR character on chemical properties (i.e., MR effect) that involve multiple potential energy surfaces (i.e., adiabatic spin splitting, ΔE_H–L, and ionization potential, IP), we show that differences in MR character are more important than the cumulative degree of MR character in predicting the magnitude of an MR effect. Motivated by this observation, we build transfer learning models to predict CCSD(T)-level adiabatic ΔE_H–L and IP from lower levels of theory. By combining these models with uncertainty quantification and multi-level modeling, we introduce a multi-pronged strategy that accelerates data acquisition by at least a factor of three while achieving coupled cluster accuracy (i.e., to within 1 kcal mol⁻¹ MAE) for robust VHTS. We demonstrate that cancellation in multi-reference effect outweighs accumulation in evaluating chemical properties. We combine transfer learning and uncertainty quantification for accelerated data acquisition with chemical accuracy.
Affiliation(s)
- Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Daniel B. K. Chu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Heather J. Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
60
Liu M, Nazemi A, Taylor MG, Nandy A, Duan C, Steeves AH, Kulik HJ. Large-Scale Screening Reveals That Geometric Structure Matters More Than Electronic Structure in the Bioinspired Catalyst Design of Formate Dehydrogenase Mimics. ACS Catal 2021. [DOI: 10.1021/acscatal.1c04624] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/16/2023]
Affiliation(s)
- Mingjie Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Azadeh Nazemi
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Michael G. Taylor
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Adam H. Steeves
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Heather J. Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
61
Smith DGA, Lolinco AT, Glick ZL, Lee J, Alenaizan A, Barnes TA, Borca CH, Di Remigio R, Dotson DL, Ehlert S, Heide AG, Herbst MF, Hermann J, Hicks CB, Horton JT, Hurtado AG, Kraus P, Kruse H, Lee SJR, Misiewicz JP, Naden LN, Ramezanghorbani F, Scheurer M, Schriber JB, Simmonett AC, Steinmetzer J, Wagner JR, Ward L, Welborn M, Altarawy D, Anwar J, Chodera JD, Dreuw A, Kulik HJ, Liu F, Martínez TJ, Matthews DA, Schaefer HF, Šponer J, Turney JM, Wang LP, De Silva N, King RA, Stanton JF, Gordon MS, Windus TL, Sherrill CD, Burns LA. Quantum Chemistry Common Driver and Databases (QCDB) and Quantum Chemistry Engine (QCEngine): Automation and interoperability among computational chemistry programs. J Chem Phys 2021; 155:204801. [PMID: 34852489 DOI: 10.1063/5.0059356] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/23/2022] Open
Abstract
Community efforts in the computational molecular sciences (CMS) are evolving toward modular, open, and interoperable interfaces that work with existing community codes to provide more functionality and composability than could be achieved with a single program. The Quantum Chemistry Common Driver and Databases (QCDB) project provides such capability through an application programming interface (API) that facilitates interoperability across multiple quantum chemistry software packages. In tandem with the Molecular Sciences Software Institute and their Quantum Chemistry Archive ecosystem, the unique functionalities of several CMS programs are integrated, including CFOUR, GAMESS, NWChem, OpenMM, Psi4, Qcore, TeraChem, and Turbomole, to provide common computational functions, i.e., energy, gradient, and Hessian computations as well as molecular properties such as atomic charges and vibrational frequency analysis. Both standard users and power users benefit from adopting these APIs as they lower the language barrier of input styles and enable a standard layout of variables and data. These designs allow end-to-end interoperable programming of complex computations and provide best practices options by default.
Affiliation(s)
- Daniel G A Smith
- Molecular Sciences Software Institute, Blacksburg, Virginia 24060, USA
- Zachary L Glick
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
- Jiyoung Lee
- Department of Chemistry, Iowa State University, Ames, Iowa 50011, USA
- Asem Alenaizan
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
- Taylor A Barnes
- Molecular Sciences Software Institute, Blacksburg, Virginia 24060, USA
- Carlos H Borca
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
- Roberto Di Remigio
- Department of Chemistry, Centre for Theoretical and Computational Chemistry, UiT, The Arctic University of Norway, N-9037 Tromsø, Norway
- David L Dotson
- Open Force Field Initiative, University of Colorado Boulder, Boulder, Colorado 80309, USA
- Sebastian Ehlert
- Mulliken Center for Theoretical Chemistry, Institut für Physikalische und Theoretische Chemie, Universität Bonn, Beringstraße 4, D-53115 Bonn, Germany
- Alexander G Heide
- Center for Computational Quantum Chemistry, University of Georgia, Athens, Georgia 30602, USA
- Michael F Herbst
- Applied and Computational Mathematics, RWTH Aachen University, Schinkelstr. 2, 52062 Aachen, Germany
- Jan Hermann
- FU Berlin, Department of Mathematics and Computer Science, 14195 Berlin, Germany
- Colton B Hicks
- Department of Chemistry, Stanford University, Stanford, California 94305, USA
- Joshua T Horton
- Department of Chemistry, Lancaster University, Lancaster LA1 4YW, United Kingdom
- Adrian G Hurtado
- Institute for Advanced Computational Science, Stony Brook University, Stony Brook, New York 11794-5250, USA
- Peter Kraus
- School of Molecular and Life Sciences, Curtin University, GPO Box U1987, Perth 6845, WA, Australia
- Holger Kruse
- Institute of Biophysics of the Czech Academy of Sciences, Královopolská 135, 612 65 Brno, Czech Republic
- Jonathon P Misiewicz
- Center for Computational Quantum Chemistry, University of Georgia, Athens, Georgia 30602, USA
- Levi N Naden
- Molecular Sciences Software Institute, Blacksburg, Virginia 24060, USA
- Maximilian Scheurer
- Interdisciplinary Center for Scientific Computing, Heidelberg University, Im Neuenheimer Feld 205, 69120 Heidelberg, Germany
- Jeffrey B Schriber
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
- Andrew C Simmonett
- Laboratory of Computational Biology, National Institutes of Health-National Heart, Lung and Blood Institute, Bethesda, Maryland 20892, USA
- Johannes Steinmetzer
- Institute of Physical Chemistry, Friedrich Schiller University Jena, Jena, Germany
- Jeffrey R Wagner
- Open Force Field Initiative, University of Colorado Boulder, Boulder, Colorado 80309, USA
- Logan Ward
- Data Science and Learning Division, Argonne National Laboratory, Lemont, Illinois 60439, USA
- Matthew Welborn
- Molecular Sciences Software Institute, Blacksburg, Virginia 24060, USA
- Doaa Altarawy
- Molecular Sciences Software Institute, Blacksburg, Virginia 24060, USA
- Jamshed Anwar
- Department of Chemistry, Lancaster University, Lancaster LA1 4YW, United Kingdom
- John D Chodera
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, USA
- Andreas Dreuw
- Interdisciplinary Center for Scientific Computing, Heidelberg University, Im Neuenheimer Feld 205, 69120 Heidelberg, Germany
- Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Fang Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Todd J Martínez
- Department of Chemistry, Stanford University, Stanford, California 94305, USA
- Devin A Matthews
- The Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, Texas 78712, USA
- Henry F Schaefer
- Center for Computational Quantum Chemistry, University of Georgia, Athens, Georgia 30602, USA
- Jiří Šponer
- Institute of Biophysics of the Czech Academy of Sciences, Královopolská 135, 612 65 Brno, Czech Republic
- Justin M Turney
- Center for Computational Quantum Chemistry, University of Georgia, Athens, Georgia 30602, USA
- Lee-Ping Wang
- Department of Chemistry, University of California Davis, Davis, California 95616, USA
- Nuwan De Silva
- Department of Chemistry, Iowa State University, Ames, Iowa 50011, USA
- Rollin A King
- Department of Chemistry, Bethel University, St. Paul, Minnesota 55112, USA
- John F Stanton
- Quantum Theory Project, The University of Florida, 2328 New Physics Building, Gainesville, Florida 32611-8435, USA
- Mark S Gordon
- Department of Chemistry and Ames Laboratory, Iowa State University, Ames, Iowa 50011, USA
- Theresa L Windus
- Department of Chemistry and Ames Laboratory, Iowa State University, Ames, Iowa 50011, USA
- C David Sherrill
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
- Lori A Burns
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
62
|
Nandy A, Duan C, Kulik HJ. Using Machine Learning and Data Mining to Leverage Community Knowledge for the Engineering of Stable Metal-Organic Frameworks. J Am Chem Soc 2021; 143:17535-17547. [PMID: 34643374 DOI: 10.1021/jacs.1c07217] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Indexed: 01/05/2023]
Abstract
Although the tailored metal active sites and porous architectures of MOFs hold great promise for engineering challenges ranging from gas separations to catalysis, a lack of understanding of how to improve their stability limits their use in practice. To overcome this limitation, we extract thousands of published reports of the key aspects of MOF stability necessary for their practical application: the ability to withstand high temperatures without degrading and the capacity to be activated by removal of solvent molecules. From nearly 4000 manuscripts, we use natural language processing and image analysis to obtain over 2000 solvent-removal stability measures and 3000 thermal degradation temperatures. We analyze the relationships between stability properties and the chemical and geometric structures in this set to identify limits of prior heuristics derived from smaller sets of MOFs. By training predictive machine learning (ML, i.e., Gaussian process and artificial neural network) models to encode the structure-property relationships with graph- and pore-structure-based representations, we are able to make predictions of stability orders of magnitude faster than conventional physics-based modeling or experiment. Interpretation of important features in ML models provides insights that we use to identify strategies to engineer increased stability into typically unstable 3d-transition-metal-containing MOFs that are frequently targeted for catalytic applications. We expect our approach to accelerate the time to discovery of stable, practical MOF materials for a wide range of applications.
Affiliation(s)
- Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.,Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.,Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
63
|
Abstract
We demonstrate an alternative, data-driven approach to uncovering structure-property relationships for the rational design of heterobimetallic transition-metal complexes that exhibit metal-metal bonding. We tailor graph-based representations of the metal-local environment for these complexes for use in multiple linear regression and kernel ridge regression (KRR) models. We curate a set of 28 experimentally characterized complexes to develop a multiple linear regression model for oxidation potentials. We achieve good accuracy (mean absolute error of 0.25 V) and preserve transferability to unseen experimental data with a new ligand structure. We also train a KRR model on a subset of 330 structurally characterized heterobimetallics to predict the degree of metal-metal bonding. This KRR model predicts relative metal-metal bond lengths in the test set to within 5%, and analysis of key features reveals the fundamental atomic contributions (e.g., the valence electron configuration) that most strongly influence the behavior of these complexes. Our work provides guidance for rational bimetallic design, suggesting that properties, including the formal shortness ratio, should be transferable from one period to another.
Affiliation(s)
- Michael G Taylor
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Connie C Lu
- Department of Chemistry, University of Minnesota, Minneapolis, Minnesota 55455, United States
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
64
|
Duan C, Chen S, Taylor MG, Liu F, Kulik HJ. Machine learning to tame divergent density functional approximations: a new path to consensus materials design principles. Chem Sci 2021; 12:13021-13036. [PMID: 34745533 PMCID: PMC8513898 DOI: 10.1039/d1sc03701c] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 07/07/2021] [Accepted: 09/01/2021] [Indexed: 01/17/2023] Open
Abstract
Virtual high-throughput screening (VHTS) with density functional theory (DFT) and machine-learning (ML)-acceleration is essential in rapid materials discovery. By necessity, efficient DFT-based workflows are carried out with a single density functional approximation (DFA). Nevertheless, properties evaluated with different DFAs can be expected to disagree for cases with challenging electronic structure (e.g., open-shell transition-metal complexes, TMCs) for which rapid screening is most needed and accurate benchmarks are often unavailable. To quantify the effect of DFA bias, we introduce an approach to rapidly obtain property predictions from 23 representative DFAs spanning multiple families, “rungs” (e.g., semi-local to double hybrid) and basis sets on over 2000 TMCs. Although computed property values (e.g., spin state splitting and frontier orbital gap) differ by DFA, high linear correlations persist across all DFAs. We train independent ML models for each DFA and observe convergent trends in feature importance, providing DFA-invariant, universal design rules. We devise a strategy to train artificial neural network (ANN) models informed by all 23 DFAs and use them to predict properties (e.g., spin-splitting energy) of over 187k TMCs. By requiring consensus of the ANN-predicted DFA properties, we improve correspondence of computational lead compounds with literature-mined, experimental compounds over the typically employed single-DFA approach. Machine learning (ML)-based feature analysis reveals universal design rules regardless of density functional choices. Using the consensus among multiple functionals, we identify robust lead complexes in ML-accelerated chemical discovery.
Affiliation(s)
- Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.,Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Shuxin Chen
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.,Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Michael G Taylor
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Fang Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| |
|
65
|
Mehmood R, Vennelakanti V, Kulik HJ. Spectroscopically Guided Simulations Reveal Distinct Strategies for Positioning Substrates to Achieve Selectivity in Nonheme Fe(II)/α-Ketoglutarate-Dependent Halogenases. ACS Catal 2021. [DOI: 10.1021/acscatal.1c03169] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/17/2022]
Affiliation(s)
- Rimsha Mehmood
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Vyshnavi Vennelakanti
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J. Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
66
|
Nandy A, Duan C, Taylor MG, Liu F, Steeves AH, Kulik HJ. Computational Discovery of Transition-metal Complexes: From High-throughput Screening to Machine Learning. Chem Rev 2021; 121:9927-10000. [PMID: 34260198 DOI: 10.1021/acs.chemrev.1c00347] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Indexed: 12/13/2022]
Abstract
Transition-metal complexes are attractive targets for the design of catalysts and functional materials. The behavior of the metal-organic bond, while very tunable for achieving target properties, is challenging to predict and necessitates searching a wide and complex space to identify needles in haystacks for target applications. This review will focus on the techniques that make high-throughput search of transition-metal chemical space feasible for the discovery of complexes with desirable properties. The review will cover the development, promise, and limitations of "traditional" computational chemistry (i.e., force field, semiempirical, and density functional theory methods) as it pertains to data generation for inorganic molecular discovery. The review will also discuss the opportunities and limitations in leveraging experimental data sources. We will focus on how advances in statistical modeling, artificial intelligence, multiobjective optimization, and automation accelerate discovery of lead compounds and design rules. The overall objective of this review is to showcase how bringing together advances from diverse areas of computational chemistry and computer science have enabled the rapid uncovering of structure-property relationships in transition-metal chemistry. We aim to highlight how unique considerations in motifs of metal-organic bonding (e.g., variable spin and oxidation state, and bonding strength/nature) set them and their discovery apart from more commonly considered organic molecules. We will also highlight how uncertainty and relative data scarcity in transition-metal chemistry motivate specific developments in machine learning representations, model training, and in computational chemistry. Finally, we will conclude with an outlook of areas of opportunity for the accelerated discovery of transition-metal complexes.
Affiliation(s)
- Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.,Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.,Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Michael G Taylor
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Fang Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Adam H Steeves
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
67
|
Vennelakanti V, Nandy A, Kulik HJ. The Effect of Hartree-Fock Exchange on Scaling Relations and Reaction Energetics for C–H Activation Catalysts. Top Catal 2021. [DOI: 10.1007/s11244-021-01482-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/20/2022]
|
68
|
Yang Z, Kulik HJ. Protein Dynamics and Substrate Protonation States Mediate the Catalytic Action of trans-4-Hydroxy-l-Proline Dehydratase. J Phys Chem B 2021; 125:7774-7784. [PMID: 34236200 DOI: 10.1021/acs.jpcb.1c05320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
The enzyme trans-4-hydroxy-l-proline (Hyp) dehydratase (HypD) is among the most abundant glycyl radical enzymes (GREs) in the healthy human gut microbiome and is considered a promising antibiotic target for the prominent antibiotic-resistant pathogen Clostridium difficile. Although an enzymatic mechanism has been proposed, the role of the greater HypD protein environment in mediating radical reactivity is not well understood. To fill this gap in understanding, we investigate HypD across multiple time- and length-scales using electronic structure modeling and classical molecular dynamics. We observe that the Hyp substrate protonation state significantly alters both its enzyme-free reactivity and its dynamics within the enzyme active site. Accurate coupled-cluster modeling suggests the deprotonated form of Hyp to be the most reactive protonation state for C5-Hpro-S activation. In the protein environment, hydrophobic interactions modulate the positioning of the Cys434 radical to enhance the reactivity of C5-Hpro-S abstraction. Long-time dynamics reveal that changing Hyp protonation states triggers the switching of a Leu643-gated water tunnel, a functional feature that has not yet been observed for members of the GRE superfamily.
Affiliation(s)
- Zhongyue Yang
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
69
|
Abstract
Accelerated discovery with machine learning (ML) has begun to provide the advances in efficiency needed to overcome the combinatorial challenge of computational materials design. Nevertheless, ML-accelerated discovery both inherits the biases of training data derived from density functional theory (DFT) and leads to many attempted calculations that are doomed to fail. Many compelling functional materials and catalytic processes involve strained chemical bonds, open-shell radicals and diradicals, or metal-organic bonds to open-shell transition-metal centers. Although promising targets, these materials present unique challenges for electronic structure methods and combinatorial challenges for their discovery. In this Perspective, we describe the advances needed in accuracy, efficiency, and approach beyond what is typical in conventional DFT-based ML workflows. These challenges have begun to be addressed through ML models trained to predict the results of multiple methods or the differences between them, enabling quantitative sensitivity analysis. For DFT to be trusted for a given data point in a high-throughput screen, it must pass a series of tests. ML models that predict the likelihood of calculation success and detect the presence of strong correlation will enable rapid diagnoses and adaptation strategies. These "decision engines" represent the first steps toward autonomous workflows that avoid the need for expert determination of the robustness of DFT-based materials discoveries.
Affiliation(s)
- Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Fang Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
70
|
Kulik HJ, Sigman MS. Advancing Discovery in Chemistry with Artificial Intelligence: From Reaction Outcomes to New Materials and Catalysts. Acc Chem Res 2021; 54:2335-2336. [PMID: 34000811 DOI: 10.1021/acs.accounts.1c00232] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 10/21/2022]
|
71
|
Abstract
While density functional theory (DFT) is widely applied for its combination of cost and accuracy, corrections (e.g., DFT+U) that improve it are often needed to tackle correlated transition-metal chemistry. In principle, the functional form of DFT+U, consisting of a set of localized atomic orbitals (AOs) and a quadratic energy penalty for deviation from integer occupations of those AOs, enables the recovery of the exact conditions of piecewise linearity and the derivative discontinuity. Nevertheless, for practical transition-metal complexes, where both atomic states and ligand orbitals participate in bonding, standard DFT+U can fail to eliminate delocalization error (DE). Here, we show that by introducing an alternative valence-state (i.e., molecular orbital or MO) basis to the DFT+U approach, we recover exact conditions in cases for which standard DFT+U corrections have no error-reducing effect. This MO-based DFT+U also eliminates DE where standard AO-based DFT+U is already successful. We demonstrate the transferability of our approach on representative transition-metal complexes with a range of ligand field strengths, electron configurations (i.e., from Sc to Zn), and spin states.
Affiliation(s)
- Akash Bajaj
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
72
|
Affiliation(s)
- Heather J. Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Rm 66–464, Cambridge, MA 02139, USA
| |
|
73
|
Dawson CD, Irwin SM, Backman LRF, Le C, Wang JX, Vennelakanti V, Yang Z, Kulik HJ, Drennan CL, Balskus EP. Molecular basis of C-S bond cleavage in the glycyl radical enzyme isethionate sulfite-lyase. Cell Chem Biol 2021; 28:1333-1346.e7. [PMID: 33773110 PMCID: PMC8473560 DOI: 10.1016/j.chembiol.2021.03.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/07/2020] [Revised: 02/04/2021] [Accepted: 03/03/2021] [Indexed: 01/07/2023]
Abstract
Desulfonation of isethionate by the bacterial glycyl radical enzyme (GRE) isethionate sulfite-lyase (IslA) generates sulfite, a substrate for respiration that in turn produces the disease-associated metabolite hydrogen sulfide. Here, we present a 2.7 Å resolution X-ray structure of wild-type IslA from Bilophila wadsworthia with isethionate bound. In comparison with other GREs, alternate positioning of the active site β strands allows for distinct residue positions to contribute to substrate binding. These structural differences, combined with sequence variations, create a highly tailored active site for the binding of the negatively charged isethionate substrate. Through the kinetic analysis of 14 IslA variants and computational analyses, we probe the mechanism by which radical chemistry is used for C-S bond cleavage. This work further elucidates the structural basis of chemistry within the GRE superfamily and will inform structure-based inhibitor design of IslA and thus of microbial hydrogen sulfide production.
Affiliation(s)
- Christopher D Dawson
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Stephania M Irwin
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA
| | - Lindsey R F Backman
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Chip Le
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA
| | - Jennifer X Wang
- Harvard Center for Mass Spectrometry, Faculty of Arts and Sciences Division of Science, Harvard University, 52 Oxford Street, Cambridge, MA 02138, USA
| | - Vyshnavi Vennelakanti
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Zhongyue Yang
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
| | - Catherine L Drennan
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
| | - Emily P Balskus
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA; Broad Institute, Cambridge, MA 02139, USA.
| |
|
74
|
Abstract
The variability of chemical bonding in open-shell transition-metal complexes not only motivates their study as functional materials and catalysts but also challenges conventional computational modeling tools. Here, tailoring ligand chemistry can alter preferred spin or oxidation states as well as electronic structure properties and reactivity, creating vast regions of chemical space to explore when designing new materials atom by atom. Although first-principles density functional theory (DFT) remains the workhorse of computational chemistry in mechanism deduction and property prediction, it is of limited use here. DFT is both far too computationally costly for widespread exploration of transition-metal chemical space and also prone to inaccuracies that limit its predictive performance for localized d electrons in transition-metal complexes. These challenges starkly contrast with the well-trodden regions of small-organic-molecule chemical space, where the analytical forms of molecular mechanics force fields and semiempirical theories have for decades accelerated the discovery of new molecules, accurate DFT functional performance has been demonstrated, and gold-standard methods from correlated wavefunction theory can predict experimental results to chemical accuracy. The combined promise of transition-metal chemical space exploration and lack of established tools has mandated a distinct approach. In this Account, we outline the path we charted in exploration of transition-metal chemical space starting from the first machine learning (ML) models (i.e., artificial neural network and kernel ridge regression) and representations for the prediction of open-shell transition-metal complex properties.
The distinct importance of the immediate coordination environment of the metal center as well as the lack of low-level methods to accurately predict structural properties in this coordination environment first motivated and then benefited from these ML models and representations. Once developed, the recipe for prediction of geometric, spin state, and redox potential properties was straightforwardly extended to a diverse range of other properties, including in catalysis, computational "feasibility", and the gas separation properties of periodic metal-organic frameworks. Interpretation of selected features most important for model prediction revealed new ways to encapsulate design rules and confirmed that models were robustly mapping essential structure-property relationships. Encountering the special challenge of ensuring that good model performance could generalize to new discovery targets motivated investigation of how to best carry out model uncertainty quantification. Distance-based approaches, whether in model latent space or in carefully engineered feature space, provided intuitive measures of the domain of applicability. With all of these pieces together, ML can be harnessed as an engine to tackle the large-scale exploration of transition-metal chemical space needed to satisfy multiple objectives using efficient global optimization methods. In practical terms, bringing these artificial intelligence tools to bear on the problems of transition-metal chemical space exploration has resulted in ML-model assessments of large, multimillion compound spaces in minutes and validated new design leads in weeks instead of decades.
Affiliation(s)
- Jon Paul Janet
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Fang Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J. Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
75
|
Vennelakanti V, Qi HW, Mehmood R, Kulik HJ. When are two hydrogen bonds better than one? Accurate first-principles models explain the balance of hydrogen bond donors and acceptors found in proteins. Chem Sci 2021; 12:1147-1162. [PMID: 35382134 PMCID: PMC8908278 DOI: 10.1039/d0sc05084a] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/14/2020] [Accepted: 11/18/2020] [Indexed: 01/02/2023] Open
Abstract
Hydrogen bonds (HBs) play an essential role in the structure and catalytic action of enzymes, but a complete understanding of HBs in proteins challenges the resolution of modern structural (i.e., X-ray diffraction) techniques and mandates computationally demanding electronic structure methods from correlated wavefunction theory for predictive accuracy. Numerous amino acid sidechains contain functional groups (e.g., hydroxyls in Ser/Thr or Tyr and amides in Asn/Gln) that can act as either HB acceptors or donors (HBA/HBD) and even form simultaneous, ambifunctional HB interactions. To understand the relative energetic benefit of each interaction, we characterize the potential energy surfaces of representative model systems with accurate coupled cluster theory calculations. To reveal the relationship of these energetics to the balance of these interactions in proteins, we curate a set of 4000 HBs, of which >500 are ambifunctional HBs, in high-resolution protein structures. We show that our model systems accurately predict the favored HB structural properties. Differences are apparent in HBA/HBD preference for aromatic Tyr versus aliphatic Ser/Thr hydroxyls because Tyr forms significantly stronger O–H⋯O HBs than N–H⋯O HBs in contrast to comparable strengths of the two for Ser/Thr. Despite this residue-specific distinction, all models of residue pairs indicate an energetic benefit for simultaneous HBA and HBD interactions in an ambifunctional HB. Although the stabilization is less than the additive maximum due both to geometric constraints and many-body electronic effects, a wide range of ambifunctional HB geometries are more favorable than any single HB interaction. Correlated wavefunction theory predicts and high-resolution crystal structure analysis confirms the important, stabilizing effect of simultaneous hydrogen bond donor and acceptor interactions in proteins.
Affiliation(s)
- Vyshnavi Vennelakanti
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, USA
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, USA
| | - Helena W. Qi
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, USA
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, USA
| | - Rimsha Mehmood
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, USA
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, USA
| | - Heather J. Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, USA
| |
|
76
|
Jonnalagadda R, Del Rio Flores A, Cai W, Mehmood R, Narayanamoorthy M, Ren C, Zaragoza JPT, Kulik HJ, Zhang W, Drennan CL. Biochemical and crystallographic investigations into isonitrile formation by a nonheme iron-dependent oxidase/decarboxylase. J Biol Chem 2021; 296:100231. [PMID: 33361191 PMCID: PMC7949033 DOI: 10.1074/jbc.ra120.015932] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/06/2020] [Revised: 12/21/2020] [Accepted: 12/27/2020] [Indexed: 11/23/2022] Open
Abstract
The isonitrile moiety is found in marine sponges and some microbes, where it plays a role in processes such as virulence and metal acquisition. Until recently only one route was known for isonitrile biosynthesis, a condensation reaction that brings together a nitrogen atom of l-Trp/l-Tyr with a carbon atom from ribulose-5-phosphate. With the discovery of ScoE, a mononuclear Fe(II) α-ketoglutarate-dependent dioxygenase from Streptomyces coeruleorubidus, a second route was identified. ScoE forms isonitrile from a glycine adduct, with both the nitrogen and carbon atoms coming from the same glycyl moiety. This reaction is part of the nonribosomal biosynthetic pathway of isonitrile lipopeptides. Here, we present structural, biochemical, and computational investigations of the mechanism of isonitrile formation by ScoE, an unprecedented reaction in the mononuclear Fe(II) α-ketoglutarate-dependent dioxygenase superfamily. The stoichiometry of this enzymatic reaction is measured, and multiple high-resolution (1.45-1.96 Å resolution) crystal structures of Fe(II)-bound ScoE are presented, providing insight into the binding of substrate, (R)-3-((carboxylmethyl)amino)butanoic acid (CABA), cosubstrate α-ketoglutarate, and an Fe(IV)=O mimic oxovanadium. Comparison to a previously published crystal structure of ScoE suggests that ScoE has an "inducible" α-ketoglutarate binding site, in which two residues arginine-157 and histidine-299 move by approximately 10 Å from the surface of the protein into the active site to create a transient α-ketoglutarate binding pocket. Together, data from structural analyses, site-directed mutagenesis, and computation provide insight into the mode of α-ketoglutarate binding, the mechanism of isonitrile formation, and how the structure of ScoE has been adapted to perform this unusual chemical reaction.
Affiliation(s)
- Rohan Jonnalagadda
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
| | - Antonio Del Rio Flores
- Department of Chemical and Biomolecular Engineering, University of California Berkeley, Berkeley, California, USA
| | - Wenlong Cai
- Department of Chemical and Biomolecular Engineering, University of California Berkeley, Berkeley, California, USA
| | - Rimsha Mehmood
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA; Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
| | | | - Chaoxiang Ren
- Department of Chemical and Biomolecular Engineering, University of California Berkeley, Berkeley, California, USA
| | - Jan Paulo T Zaragoza
- Department of Chemistry, University of California Berkeley, Berkeley, California, USA; California Institute for Quantitative Biosciences, University of California Berkeley, Berkeley, California, USA
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.
| | - Wenjun Zhang
- Department of Chemical and Biomolecular Engineering, University of California Berkeley, Berkeley, California, USA; Chan Zuckerberg Biohub, San Francisco, California, USA.
| | - Catherine L Drennan
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA; Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA; Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.
| |
|
77
|
Affiliation(s)
- Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J. Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
78
|
Liu F, Duan C, Kulik HJ. Rapid Detection of Strong Correlation with Machine Learning for Transition-Metal Complex High-Throughput Screening. J Phys Chem Lett 2020; 11:8067-8076. [PMID: 32864977 DOI: 10.1021/acs.jpclett.0c02288] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 06/11/2023]
Abstract
Despite its widespread use in chemical discovery, approximate density functional theory (DFT) is poorly suited to many targets, such as those containing open-shell, 3d transition metals that can be expected to have strong multireference (MR) character. For discovery workflows to be predictive, we need automated, low-cost methods that can distinguish the regions of chemical space where DFT should be applied from those where it should not. We curate more than 4800 open-shell transition-metal complexes up to hundreds of atoms in size from prior high-throughput DFT studies and evaluate affordable, finite-temperature DFT fractional occupation number (FON)-based MR diagnostics. We show that intuitive measures of strong correlation (i.e., the HOMO-LUMO gap) are not predictive of MR character as judged by FON-based diagnostics. Analysis of independently trained machine learning (ML) models to predict HOMO-LUMO gaps and FON-based diagnostics reveals differences in the metal and ligand sensitivity of the two quantities. We use our trained ML models to rapidly evaluate MR character over a space of ∼187000 theoretical complexes, identifying large-scale trends in spin-state-dependent MR character and finding small HOMO-LUMO gap complexes while ensuring low MR character.
Affiliation(s)
- Fang Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
79
|
Duan C, Liu F, Nandy A, Kulik HJ. Semi-supervised Machine Learning Enables the Robust Detection of Multireference Character at Low Cost. J Phys Chem Lett 2020; 11:6640-6648. [PMID: 32692570 DOI: 10.1021/acs.jpclett.0c02018] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 05/26/2023]
Abstract
Multireference (MR) diagnostics are common tools for identifying strongly correlated electronic structure that makes single-reference (SR) methods (e.g., density functional theory or DFT) insufficient for accurate property prediction. However, MR diagnostics typically require computationally demanding correlated wave function theory (WFT) calculations, and diagnostics often disagree or fail to predict MR effects on properties. To overcome these challenges, we introduce a semi-supervised machine learning (ML) approach with virtual adversarial training (VAT) of an MR classifier using 15 WFT and DFT MR diagnostics as inputs. In semi-supervised learning, only the most extreme SR or MR points are labeled, and the remaining point labels are learned. The resulting VAT model outperforms the alternatives, as quantified by the distinct property distributions of SR- and MR-classified molecules. To reduce the cost of generating inputs to the VAT model, we leverage the VAT model's robustness to noisy inputs by replacing WFT MR diagnostics with regression predictions in an MR decision engine workflow that preserves excellent performance. We demonstrate the transferability of our approach to larger molecules and those with distinct chemical composition from the training set. This MR decision engine demonstrates promise as a low-cost, high-accuracy approach to the automatic detection of strong correlation for predictive high-throughput screening.
Affiliation(s)
- Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Fang Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
80
|
Moosavi SM, Nandy A, Jablonka KM, Ongari D, Janet JP, Boyd PG, Lee Y, Smit B, Kulik HJ. Understanding the diversity of the metal-organic framework ecosystem. Nat Commun 2020; 11:4068. [PMID: 32792486 PMCID: PMC7426948 DOI: 10.1038/s41467-020-17755-8] [Citation(s) in RCA: 147] [Impact Index Per Article: 36.8] [Received: 04/28/2020] [Accepted: 07/10/2020] [Indexed: 02/07/2023] Open
Abstract
Millions of distinct metal-organic frameworks (MOFs) can be made by combining metal nodes and organic linkers. At present, over 90,000 MOFs have been synthesized and over 500,000 predicted. This raises the question whether a new experimental or predicted structure adds new information. For MOF chemists, the chemical design space is a combination of pore geometry, metal nodes, organic linkers, and functional groups, but at present we do not have a formalism to quantify optimal coverage of chemical design space. In this work, we develop a machine learning method to quantify similarities of MOFs to analyse their chemical diversity. This diversity analysis identifies biases in the databases, and we show that such bias can lead to incorrect conclusions. The developed formalism in this study provides a simple and practical guideline to see whether new structures will have the potential for new insights, or constitute a relatively small variation of existing structures.
Affiliation(s)
- Seyed Mohamad Moosavi
- Laboratory of Molecular Simulation, Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Rue de l'Industrie 17, Sion, CH-1951, Valais, Switzerland
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Kevin Maik Jablonka
- Laboratory of Molecular Simulation, Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Rue de l'Industrie 17, Sion, CH-1951, Valais, Switzerland
| | - Daniele Ongari
- Laboratory of Molecular Simulation, Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Rue de l'Industrie 17, Sion, CH-1951, Valais, Switzerland
| | - Jon Paul Janet
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Peter G Boyd
- Laboratory of Molecular Simulation, Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Rue de l'Industrie 17, Sion, CH-1951, Valais, Switzerland
| | - Yongjin Lee
- School of Physical Science and Technology, ShanghaiTech University, 201210, Shanghai, China
| | - Berend Smit
- Laboratory of Molecular Simulation, Institut des Sciences et Ingénierie Chimiques, École Polytechnique Fédérale de Lausanne (EPFL), Rue de l'Industrie 17, Sion, CH-1951, Valais, Switzerland.
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
| |
|
81
|
Duan C, Liu F, Nandy A, Kulik HJ. Data-Driven Approaches Can Overcome the Cost-Accuracy Trade-Off in Multireference Diagnostics. J Chem Theory Comput 2020; 16:4373-4387. [PMID: 32536161 DOI: 10.1021/acs.jctc.0c00358] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 12/23/2022]
Abstract
High-throughput computational screening typically employs methods (i.e., density functional theory or DFT) that can fail to describe challenging molecules, such as those with strongly correlated electronic structure. In such cases, multireference (MR) correlated wavefunction theory (WFT) would be the appropriate choice but remains more challenging to carry out and automate than single-reference (SR) WFT or DFT. Numerous diagnostics have been proposed for identifying when MR character is likely to have an effect on the predictive power of SR calculations, but conflicting conclusions about diagnostic performance have been reached on small data sets. We compute 15 MR diagnostics, ranging from affordable DFT-based to more costly MR-WFT-based diagnostics, on a set of 3165 equilibrium and distorted small organic molecules containing up to six heavy atoms. Conflicting MR character assignments and low pairwise linear correlations among diagnostics are also observed over this set. We evaluate the ability of existing diagnostics to predict the percent recovery of the correlation energy, %Ecorr. None of the DFT-based diagnostics are nearly as predictive of %Ecorr as the best WFT-based diagnostics. To overcome the limitation of this cost-accuracy trade-off, we develop machine learning (ML, i.e., kernel ridge regression) models to predict WFT-based diagnostics from a combination of DFT-based diagnostics and a new, size-independent 3D geometric representation. The ML-predicted diagnostics correlate as well with MR effects as their computed (i.e., with WFT) values, significantly improving over the DFT-based diagnostics on which the models were trained. These ML models thus provide a promising approach to improve upon DFT-based diagnostic accuracy while remaining suitably low cost for high-throughput screening.
Affiliation(s)
- Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Fang Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
82
|
Janet JP, Ramesh S, Duan C, Kulik HJ. Accurate Multiobjective Design in a Space of Millions of Transition Metal Complexes with Neural-Network-Driven Efficient Global Optimization. ACS Cent Sci 2020; 6:513-524. [PMID: 32342001 PMCID: PMC7181321 DOI: 10.1021/acscentsci.0c00026] [Citation(s) in RCA: 81] [Impact Index Per Article: 20.3] [Received: 01/08/2020] [Indexed: 05/20/2023]
Abstract
The accelerated discovery of materials for real world applications requires the achievement of multiple design objectives. The multidimensional nature of the search necessitates exploration of multimillion compound libraries over which even density functional theory (DFT) screening is intractable. Machine learning (e.g., artificial neural network, ANN, or Gaussian process, GP) models for this task are limited by training data availability and predictive uncertainty quantification (UQ). We overcome such limitations by using efficient global optimization (EGO) with the multidimensional expected improvement (EI) criterion. EGO balances exploitation of a trained model with acquisition of new DFT data at the Pareto front, the region of chemical space that contains the optimal trade-off between multiple design criteria. We demonstrate this approach for the simultaneous optimization of redox potential and solubility in candidate M(II)/M(III) redox couples for redox flow batteries from a space of 2.8 M transition metal complexes designed for stability in practical redox flow battery (RFB) applications. We show that a multitask ANN with latent-distance-based UQ surpasses the generalization performance of a GP in this space. With this approach, ANN prediction and EI scoring of the full space are achieved in minutes. Starting from ca. 100 representative points, EGO improves both properties by over 3 standard deviations in only five generations. Analysis of lookahead errors confirms rapid ANN model improvement during the EGO process, achieving suitable accuracy for predictive design in the space of transition metal complexes. The ANN-driven EI approach achieves at least 500-fold acceleration over random search, identifying a Pareto-optimal design in around 5 weeks instead of 50 years.
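As a rough illustration of the expected improvement (EI) idea mentioned in this abstract, the sketch below uses the standard single-objective EI for a Gaussian prediction with mean `mu` and standard deviation `sigma`. It is not the multidimensional criterion used in the paper, and the function names, the maximization convention, and the candidate tuples are assumptions made for this example only.

```python
import math


def expected_improvement(mu, sigma, best):
    """Standard single-objective EI for maximization (illustrative only).

    mu, sigma: predicted mean and standard deviation for a candidate.
    best: best objective value observed so far.
    """
    if sigma <= 0.0:
        # With no predictive uncertainty, the improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf


def rank_by_ei(candidates, best):
    """Sort hypothetical (name, mu, sigma) tuples by EI, highest first."""
    return sorted(
        candidates,
        key=lambda c: expected_improvement(c[1], c[2], best),
        reverse=True,
    )
```

In this form, a candidate with the same predicted mean but larger uncertainty scores higher, which is how EI trades off exploitation of the model against acquisition of new data.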
Affiliation(s)
- Jon Paul Janet
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Sahasrajit Ramesh
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J. Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Phone: 617-253-4584
| |
|
83
|
Abstract
Quantum-mechanical/molecular-mechanical (QM/MM) methods are essential to the study of metalloproteins, but the relative importance of sampling and degree of QM treatment in achieving quantitative predictions is poorly understood. We study the relative magnitude of configurational and QM-region sensitivity of energetic and electronic properties in a representative Zn2+ metal binding site of a DNA methyltransferase. To quantify property variations, we analyze snapshots extracted from 250 ns of molecular dynamics simulation. To understand the degree of QM-region sensitivity, we perform analysis using QM regions ranging from a minimal 49-atom region consisting only of the Zn2+ metal and its four coordinating Cys residues up to a 628-atom QM region that includes residues within 12 Å of the metal center. Over the configurations sampled, we observe that illustrative properties (e.g., rigid Zn2+ removal energy) exhibit large fluctuations that are well captured with even minimal QM regions. Nevertheless, for both energetic and electronic properties, we observe a slow approach to asymptotic limits with similarly large changes in absolute values that converge only with larger (ca. 300-atom) QM region sizes. For the smaller QM regions, the electronic description of Zn2+ binding is incomplete: the metal binds too tightly and is too stabilized by the strong electrostatic potential of MM point charges, and the Zn-S bond covalency is overestimated. Overall, this work suggests that efficient sampling with QM/MM in small QM regions is an effective method to explore the influence of enzyme structure on target properties. At the same time, accurate descriptions of electronic and energetic properties require a larger QM region than the minimal metal-coordinating residues in order to converge treatment of both metal-local bonding and the overall electrostatic environment.
Affiliation(s)
- Rimsha Mehmood
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
84
|
Taylor MG, Yang T, Lin S, Nandy A, Janet JP, Duan C, Kulik HJ. Seeing Is Believing: Experimental Spin States from Machine Learning Model Structure Predictions. J Phys Chem A 2020; 124:3286-3299. [PMID: 32223165 PMCID: PMC7311053 DOI: 10.1021/acs.jpca.0c01458] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Indexed: 12/11/2022]
Abstract
Determination of ground-state spins of open-shell transition-metal complexes is critical to understanding catalytic and materials properties but also challenging with approximate electronic structure methods. As an alternative approach, we demonstrate how structure alone can be used to guide assignment of ground-state spin from experimentally determined crystal structures of transition-metal complexes. We first identify the limits of distance-based heuristics from distributions of metal–ligand bond lengths of over 2000 unique mononuclear Fe(II)/Fe(III) transition-metal complexes. To overcome these limits, we employ artificial neural networks (ANNs) to predict spin-state-dependent metal–ligand bond lengths and classify experimental ground-state spins based on agreement of experimental structures with the ANN predictions. Although the ANN is trained on hybrid density functional theory data, we exploit the method-insensitivity of geometric properties to enable assignment of ground states for the majority (ca. 80–90%) of structures. We demonstrate the utility of the ANN by data-mining the literature for spin-crossover (SCO) complexes, which have experimentally observed temperature-dependent geometric structure changes, by correctly assigning almost all (>95%) spin states in the 46 Fe(II) SCO complex set. This approach represents a promising complement to more conventional energy-based spin-state assignment from electronic structure theory at the low cost of a machine learning model.
Affiliation(s)
- Michael G Taylor
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Tzuhsiung Yang
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Sean Lin
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Jon Paul Janet
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
85
|
Nandy A, Chu DBK, Harper DR, Duan C, Arunachalam N, Cytter Y, Kulik HJ. Large-scale comparison of 3d and 4d transition metal complexes illuminates the reduced effect of exchange on second-row spin-state energetics. Phys Chem Chem Phys 2020; 22:19326-19341. [DOI: 10.1039/d0cp02977g] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 01/05/2023]
Abstract
The origin of distinct 3d vs. 4d transition metal complex sensitivity to exchange is explored over a large data set.
Affiliation(s)
- Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, USA
- Department of Chemistry
| | - Daniel B. K. Chu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, USA
| | - Daniel R. Harper
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, USA
- Department of Chemistry
| | - Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, USA
- Department of Chemistry
| | - Naveen Arunachalam
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, USA
| | - Yael Cytter
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, USA
| | - Heather J. Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, USA
| |
|
86
|
Liu F, Kulik HJ. Impact of Approximate DFT Density Delocalization Error on Potential Energy Surfaces in Transition Metal Chemistry. J Chem Theory Comput 2019; 16:264-277. [DOI: 10.1021/acs.jctc.9b00842] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 12/26/2022]
Affiliation(s)
- Fang Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J. Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
87
|
Qi HW, Kulik HJ. Reply to "Comment on 'Evaluating Unexpectedly Short Non-covalent Distances in X-ray Crystal Structures of Proteins with Electronic Structure Analysis'". J Chem Inf Model 2019; 59:3609-3610. [PMID: 31424928 DOI: 10.1021/acs.jcim.9b00606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- Helena W Qi
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
88
|
Janet JP, Duan C, Yang T, Nandy A, Kulik HJ. A quantitative uncertainty metric controls error in neural network-driven chemical discovery. Chem Sci 2019; 10:7913-7922. [PMID: 31588334 PMCID: PMC6764470 DOI: 10.1039/c9sc02298h] [Citation(s) in RCA: 81] [Impact Index Per Article: 16.2] [Received: 05/11/2019] [Accepted: 07/11/2019] [Indexed: 12/14/2022] Open
Abstract
Machine learning (ML) models, such as artificial neural networks, have emerged as a complement to high-throughput screening, enabling characterization of new compounds in seconds instead of hours. The promise of ML models to enable large-scale chemical space exploration can only be realized if it is straightforward to identify when molecules and materials are outside the model's domain of applicability. Established uncertainty metrics for neural network models are either costly to obtain (e.g., ensemble models) or rely on feature engineering (e.g., feature space distances), and each has limitations in estimating prediction errors for chemical space exploration. We introduce the distance to available data in the latent space of a neural network ML model as a low-cost, quantitative uncertainty metric that works for both inorganic and organic chemistry. The calibrated performance of this approach exceeds widely used uncertainty metrics and is readily applied to models of increasing complexity at no additional cost. Tightening latent distance cutoffs systematically drives down predicted model errors below training errors, thus enabling predictive error control in chemical discovery or identification of useful data points for active learning.
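The latent-distance idea in this abstract can be sketched as follows, under stated assumptions: latent vectors are taken as already extracted from some layer of a trained network, distance is Euclidean and averaged over the `k` nearest training points, and the cutoff is chosen by the user. The abstract does not specify the layer, the averaging, or the calibration step, so the function names and parameters here are hypothetical.

```python
import math


def latent_distance(query, train_latents, k=1):
    """Mean Euclidean distance from a query latent vector to its k nearest
    training latent vectors (an illustrative definition, not the paper's)."""
    if not train_latents:
        raise ValueError("train_latents must not be empty")
    dists = sorted(math.dist(query, t) for t in train_latents)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)


def flag_out_of_domain(queries, train_latents, cutoff, k=1):
    """Return True for each query whose latent distance exceeds the cutoff."""
    return [latent_distance(q, train_latents, k) > cutoff for q in queries]
```

Tightening `cutoff` keeps only predictions close to the training data in latent space, which is the sense in which a distance cutoff can be used to control error.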
Affiliation(s)
- Jon Paul Janet
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Tel: +1-617-253-4584
| | - Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Tel: +1-617-253-4584
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Tzuhsiung Yang
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Tel: +1-617-253-4584
| | - Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Tel: +1-617-253-4584
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Tel: +1-617-253-4584
| |
|
89
|
Zhao Q, Kulik HJ. Stable Surfaces That Bind Too Tightly: Can Range-Separated Hybrids or DFT+U Improve Paradoxical Descriptions of Surface Chemistry? J Phys Chem Lett 2019; 10:5090-5098. [PMID: 31411023 PMCID: PMC6748670 DOI: 10.1021/acs.jpclett.9b01650] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 06/07/2019] [Accepted: 08/14/2019] [Indexed: 05/25/2023]
Abstract
Approximate, semilocal density functional theory (DFT) suffers from delocalization error that can lead to a paradoxical model of catalytic surfaces that both overbind adsorbates yet are also too stable. We investigate the effect of two widely applied approaches for delocalization error correction, (i) affordable DFT+U (i.e., semilocal DFT augmented with a Hubbard U) and (ii) hybrid functionals with an admixture of Hartree-Fock (HF) exchange, on surface and adsorbate energies across a range of rutile transition metal oxides widely studied for their promise as water-splitting catalysts. We observe strongly row- and period-dependent trends with DFT+U, which increases surface formation energies only in early transition metals (e.g., Ti and V) and decreases adsorbate energies only in later transition metals (e.g., Ir and Pt). Both global and local hybrids destabilize surfaces and reduce adsorbate binding across the periodic table, in agreement with higher-level reference calculations. Density analysis reveals why hybrid functionals correct both quantities, whereas DFT+U does not. We recommend local, range-separated hybrids for the accurate modeling of catalysis in transition metal oxides at only a modest increase in computational cost over semilocal DFT.
Affiliation(s)
- Qing Zhao
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J. Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
90
|
Affiliation(s)
- Heather J. Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts
| |
|
91
|
Nandy A, Zhu J, Janet JP, Duan C, Getman RB, Kulik HJ. Machine Learning Accelerates the Discovery of Design Rules and Exceptions in Stable Metal–Oxo Intermediate Formation. ACS Catal 2019. [DOI: 10.1021/acscatal.9b02165] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Indexed: 12/19/2022]
Affiliation(s)
| | - Jiazhou Zhu
- Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, South Carolina 29634, United States
| | | | | | - Rachel B. Getman
- Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, South Carolina 29634, United States
| | | |
|
92
|
Yang Z, Liu F, Steeves AH, Kulik HJ. Quantum Mechanical Description of Electrostatics Provides a Unified Picture of Catalytic Action Across Methyltransferases. J Phys Chem Lett 2019; 10:3779-3787. [PMID: 31244268 DOI: 10.1021/acs.jpclett.9b01555] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 06/09/2023]
Abstract
Methyl transferases (MTases) are a well-studied class of enzymes for which competing enzymatic enhancement mechanisms have been suggested, ranging from structural methyl group CH···X hydrogen bonds (HBs) to electrostatic- and charge-transfer-driven stabilization of the transition state (TS). We identified all Class I MTases for which reasonable resolution (<2.0 Å) crystal structures could be used to form catalytically competent ternary complexes for multiscale (i.e., quantum-mechanical/molecular-mechanical or QM/MM) simulation of the SN2 methyl transfer reaction coordinate. The four Class I MTases studied have both distinct functions (e.g., protein repair or biosynthesis) and substrate nucleophiles (i.e., C, N, or O). While CH···X HBs stabilize all reactant complexes, no universal TS stabilization role is found for these interactions in MTases. A consistent picture is instead obtained through analysis of charge transfer and electrostatics, wherein much of cofactor-substrate charge separation is maintained in the TS region, and electrostatic potential is correlated with substrate nucleophilicity (i.e., intrinsic reactivity).
Affiliation(s)
- Zhongyue Yang
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Fang Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Adam H Steeves
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
93
|
Bajaj A, Liu F, Kulik HJ. Non-empirical, low-cost recovery of exact conditions with model-Hamiltonian inspired expressions in jmDFT. J Chem Phys 2019; 150:154115. [PMID: 31005112 DOI: 10.1063/1.5091563] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 01/21/2023] Open
Abstract
Density functional theory (DFT) is widely applied to both molecules and materials, but well known energetic delocalization and static correlation errors in practical exchange-correlation approximations limit quantitative accuracy. Common methods that correct energetic delocalization errors, such as the Hubbard U correction in DFT+U or Hartree-Fock exchange in global hybrids, do so at the cost of worsening static correlation errors. We recently introduced an alternate approach [Bajaj et al., J. Chem. Phys. 147, 191101 (2017)] known as judiciously modified DFT (jmDFT), wherein the deviation from exact behavior of semilocal functionals over both fractional spin and charge, i.e., the so-called flat plane, was used to motivate functional forms of second order analytic corrections. In this work, we introduce fully nonempirical expressions for all four coefficients in a DFT+U+J-inspired form of jmDFT, where all coefficients are obtained only from energies and eigenvalues of the integer-electron systems. We show good agreement for U and J coefficients obtained nonempirically as compared with the results of numerical fitting in a jmDFT U+J/J' correction. Incorporating the fully nonempirical jmDFT correction reduces and even eliminates the fractional spin error at the same time as eliminating the energetic delocalization error. We show that this approach extends beyond s-electron systems to higher angular momentum cases including p- and d-electrons. Finally, we diagnose some shortcomings of the current jmDFT approach that limit its ability to improve upon DFT results for cases such as weakly bound anions due to poor underlying semilocal functional behavior.
Affiliation(s)
- Akash Bajaj
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| | - Fang Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| |
|
94
|
Mehmood R, Qi HW, Steeves AH, Kulik HJ. The Protein’s Role in Substrate Positioning and Reactivity for Biosynthetic Enzyme Complexes: The Case of SyrB2/SyrB1. ACS Catal 2019. [DOI: 10.1021/acscatal.9b00865] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 12/25/2022]
|
95
|
Liu F, Yang T, Yang J, Xu E, Bajaj A, Kulik HJ. Bridging the Homogeneous-Heterogeneous Divide: Modeling Spin for Reactivity in Single Atom Catalysis. Front Chem 2019; 7:219. [PMID: 31041303 PMCID: PMC6476907 DOI: 10.3389/fchem.2019.00219] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 12/20/2018] [Accepted: 03/20/2019] [Indexed: 12/03/2022] Open
Abstract
Single atom catalysts (SACs) are emergent catalytic materials that have the promise of merging the scalability of heterogeneous catalysts with the high activity and atom economy of homogeneous catalysts. Computational, first-principles modeling can provide essential insight into SAC mechanism and active site configuration, where the sub-nm-scale environment can challenge even the highest-resolution experimental spectroscopic techniques. Nevertheless, the very properties that make SACs attractive in catalysis, such as localized d electrons of the isolated transition metal center, make them challenging to study with conventional computational modeling using density functional theory (DFT). For example, Fe/N-doped graphitic SACs have exhibited spin-state dependent reactivity that remains poorly understood. However, spin-state ordering in DFT is very sensitive to the nature of the functional approximation chosen. In this work, we develop accurate benchmarks from correlated wavefunction theory (WFT) for relevant octahedral complexes. We use those benchmarks to evaluate optimal DFT functional choice for predicting spin state ordering in small octahedral complexes as well as models of pyridinic and pyrrolic nitrogen environments expected in larger SACs. Using these guidelines, we determine Fe/N-doped graphene SAC model properties and reactivity as well as their sensitivities to DFT functional choice. Finally, we conclude with broad recommendations for computational modeling of open-shell transition metal single-atom catalysts.
Affiliation(s)
- Fang Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, United States
| | - Tzuhsiung Yang
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, United States
| | - Jing Yang
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, United States
| | - Eve Xu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, United States
| | - Akash Bajaj
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, United States; Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, United States
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, United States
| |
|
96
|
Qi HW, Kulik HJ. Evaluating Unexpectedly Short Non-covalent Distances in X-ray Crystal Structures of Proteins with Electronic Structure Analysis. J Chem Inf Model 2019; 59:2199-2211. [DOI: 10.1021/acs.jcim.9b00144] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Indexed: 01/01/2023]
Affiliation(s)
- Helena W. Qi
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J. Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
97
|
Duan C, Janet JP, Liu F, Nandy A, Kulik HJ. Learning from Failure: Predicting Electronic Structure Calculation Outcomes with Machine Learning Models. J Chem Theory Comput 2019; 15:2331-2345. [DOI: 10.1021/acs.jctc.9b00057] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Indexed: 12/27/2022]
|
98
|
Janet JP, Liu F, Nandy A, Duan C, Yang T, Lin S, Kulik HJ. Designing in the Face of Uncertainty: Exploiting Electronic Structure and Machine Learning Models for Discovery in Inorganic Chemistry. Inorg Chem 2019; 58:10592-10606. [PMID: 30834738 DOI: 10.1021/acs.inorgchem.9b00109] [Citation(s) in RCA: 59] [Impact Index Per Article: 11.8] [Indexed: 11/30/2022]
Abstract
Recent transformative advances in computing power and algorithms have made computational chemistry central to the discovery and design of new molecules and materials. First-principles simulations are increasingly accurate and applicable to large systems with the speed needed for high-throughput computational screening. Despite these strides, the combinatorial challenges associated with the vastness of chemical space mean that more than just fast and accurate computational tools are needed for accelerated chemical discovery. In transition-metal chemistry and catalysis, unique challenges arise. The variable spin, oxidation state, and coordination environments favored by elements with well-localized d or f electrons provide great opportunity for tailoring properties in catalytic or functional (e.g., magnetic) materials but also add layers of uncertainty to any design strategy. We outline five key mandates for realizing computationally driven accelerated discovery in inorganic chemistry: (i) fully automated simulation of new compounds, (ii) knowledge of prediction sensitivity or accuracy, (iii) faster-than-fast property prediction methods, (iv) maps for rapid chemical space traversal, and (v) a means to reveal design rules on the kilocompound scale. Through case studies in open-shell transition-metal chemistry, we describe how advances in methodology and software in each of these areas bring about new chemical insights. We conclude with our outlook on the next steps in this process toward realizing fully autonomous discovery in inorganic chemistry using computational chemistry.
Affiliation(s)
- Jon Paul Janet
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Fang Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States; Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States; Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Tzuhsiung Yang
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Sean Lin
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
99
|
Yang Z, Mehmood R, Wang M, Qi HW, Steeves AH, Kulik HJ. Revealing quantum mechanical effects in enzyme catalysis with large-scale electronic structure simulation. React Chem Eng 2019; 4:298-315. [PMID: 31572618 PMCID: PMC6768422 DOI: 10.1039/c8re00213d] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Indexed: 11/21/2022]
Abstract
Enzymes have evolved to facilitate challenging reactions at ambient conditions with specificity seldom matched by other catalysts. Computational modeling provides valuable insight into catalytic mechanism, and the large size of enzymes mandates multi-scale, quantum mechanical-molecular mechanical (QM/MM) simulations. Although QM/MM plays an essential role in balancing simulation cost to enable sampling with full QM treatment needed to understand electronic structure in enzyme active sites, the relative importance of these two strategies for understanding enzyme mechanism is not well known. We explore challenges in QM/MM for studying the reactivity and stability of three diverse enzymes: i) Mg²⁺-dependent catechol O-methyltransferase (COMT), ii) radical enzyme choline trimethylamine lyase (CutC), and iii) DNA methyltransferase (DNMT1), which has structural Zn²⁺ binding sites. In COMT, strong non-covalent interactions lead to long range coupling of electronic structure properties across the active site, but the more isolated nature of the metallocofactor in DNMT1 leads to faster convergence of some properties. We quantify these effects in COMT by computing covariance matrices of by-residue electronic structure properties during dynamics and along the reaction coordinate. In CutC, we observe spontaneous bond cleavage following initiation events, highlighting the importance of sampling and dynamics. We use electronic structure analysis to quantify the relative importance of CH···O and OH···O non-covalent interactions in imparting reactivity. These three diverse cases enable us to provide some general recommendations regarding QM/MM simulation of enzymes.
Affiliation(s)
- Zhongyue Yang
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Rimsha Mehmood
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Mengyi Wang
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
- Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Helena W. Qi
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Adam H. Steeves
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Heather J. Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
| |
|
100
|
Park YG, Sohn CH, Chen R, McCue M, Yun DH, Drummond GT, Ku T, Evans NB, Oak HC, Trieu W, Choi H, Jin X, Lilascharoen V, Wang J, Truttmann MC, Qi HW, Ploegh HL, Golub TR, Chen SC, Frosch MP, Kulik HJ, Lim BK, Chung K. Protection of tissue physicochemical properties using polyfunctional crosslinkers. Nat Biotechnol 2018; 37:nbt.4281. [PMID: 30556815 PMCID: PMC6579717 DOI: 10.1038/nbt.4281] [Citation(s) in RCA: 177] [Impact Index Per Article: 29.5] [Received: 03/28/2018] [Accepted: 09/26/2018] [Indexed: 12/31/2022]
Abstract
Understanding complex biological systems requires the system-wide characterization of both molecular and cellular features. Existing methods for spatial mapping of biomolecules in intact tissues suffer from information loss caused by degradation and tissue damage. We report a tissue transformation strategy named stabilization under harsh conditions via intramolecular epoxide linkages to prevent degradation (SHIELD), which uses a flexible polyepoxide to form controlled intra- and intermolecular cross-links with biomolecules. SHIELD preserves protein fluorescence and antigenicity, transcripts and tissue architecture under a wide range of harsh conditions. We applied SHIELD to interrogate system-level wiring, synaptic architecture, and molecular features of virally labeled neurons and their targets in mouse at single-cell resolution. We also demonstrated rapid three-dimensional phenotyping of core needle biopsies and human brain cells. SHIELD enables rapid, multiscale, integrated molecular phenotyping of both animal and clinical tissues.
Affiliation(s)
- Young-Gyun Park
- Institute for Medical Engineering and Science
- Picower Institute for Learning and Memory
| | - Chang Ho Sohn
- Institute for Medical Engineering and Science
- Picower Institute for Learning and Memory
| | - Ritchie Chen
- Institute for Medical Engineering and Science
- Picower Institute for Learning and Memory
| | - Margaret McCue
- Institute for Medical Engineering and Science
- Picower Institute for Learning and Memory
| | - Dae Hee Yun
- Institute for Medical Engineering and Science
- Picower Institute for Learning and Memory
| | - Gabrielle T. Drummond
- Institute for Medical Engineering and Science
- Picower Institute for Learning and Memory
| | - Taeyun Ku
- Institute for Medical Engineering and Science
- Picower Institute for Learning and Memory
| | - Nicholas B. Evans
- Institute for Medical Engineering and Science
- Picower Institute for Learning and Memory
| | | | | | - Heejin Choi
- Institute for Medical Engineering and Science
- Picower Institute for Learning and Memory
| | - Xin Jin
- Institute for Medical Engineering and Science
- Broad Institute of Harvard University and MIT, Cambridge, MA, USA
| | - Varoth Lilascharoen
- Neurobiology Section, Division of Biological Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Ji Wang
- Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
| | - Matthias C. Truttmann
- Program in Cellular and Molecular Medicine, Boston Children’s Hospital and Harvard Medical School
| | - Helena W. Qi
- Department of Chemical Engineering
- Department of Chemistry, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
| | - Hidde L. Ploegh
- Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
| | - Todd R. Golub
- Broad Institute of Harvard University and MIT, Cambridge, MA, USA
| | - Shih-Chi Chen
- Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
| | - Matthew P. Frosch
- C.S. Kubik Laboratory for Neuropathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | | | - Byung Kook Lim
- Neurobiology Section, Division of Biological Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Kwanghun Chung
- Institute for Medical Engineering and Science
- Department of Brain and Cognitive Sciences
- Broad Institute of Harvard University and MIT, Cambridge, MA, USA
| |
|