1
Sidorov P, Tsuji N. A Primer on 2D Descriptors in Selectivity Modeling for Asymmetric Catalysis. Chemistry 2024; 30:e202302837. [PMID: 38010242 DOI: 10.1002/chem.202302837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 11/21/2023] [Accepted: 11/23/2023] [Indexed: 11/29/2023]
Abstract
Machine learning has permeated all fields of research, including chemistry, and is now an integral part of the design of novel compounds with desired properties. In the field of asymmetric catalysis, the preference still lies with models based on a physical understanding of the catalysis phenomenon and the electronic and steric properties of catalysts. However, such models require quantum chemical calculations and are thus limited by their computational cost. Here, we highlight the recent advances in modeling catalyst selectivity by using the 2D structures of catalysts and substrates. While these have a less explicit mechanistic connection to the modeled property, 2D descriptors, such as topological indices, molecular fingerprints, and fragments, offer the tremendous advantages of low cost and high speed of calculations. This makes them optimal for the in-silico screening of large amounts of data. We provide an overview of the common quantitative structure-property relationship workflow, model building and validation techniques, applications of these methodologies in asymmetric catalysis design, and an outlook on improving the understanding of 2D-based models.
Affiliation(s)
- Pavel Sidorov
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Sapporo, 001-0021, Japan
- Nobuya Tsuji
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Sapporo, 001-0021, Japan
2
Dral PO, Ge F, Hou YF, Zheng P, Chen Y, Barbatti M, Isayev O, Wang C, Xue BX, Pinheiro Jr M, Su Y, Dai Y, Chen Y, Zhang L, Zhang S, Ullah A, Zhang Q, Ou Y. MLatom 3: A Platform for Machine Learning-Enhanced Computational Chemistry Simulations and Workflows. J Chem Theory Comput 2024; 20:1193-1213. [PMID: 38270978 PMCID: PMC10867807 DOI: 10.1021/acs.jctc.3c01203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 12/29/2023] [Accepted: 01/03/2024] [Indexed: 01/26/2024]
Abstract
Machine learning (ML) is increasingly becoming a common tool in computational chemistry. At the same time, the rapid development of ML methods requires a flexible software framework for designing custom workflows. MLatom 3 is a program package designed to leverage the power of ML to enhance typical computational chemistry simulations and to create complex workflows. This open-source package gives users plenty of choice: they can run simulations with command-line options, input files, or scripts that use MLatom as a Python package, both on their own computers and on the online XACS cloud computing service at XACScloud.com. Computational chemists can calculate energies and thermochemical properties, optimize geometries, run molecular and quantum dynamics, and simulate (ro)vibrational, one-photon UV/vis absorption, and two-photon absorption spectra with ML, quantum mechanical, and combined models. Users can choose from an extensive library of methods containing pretrained ML models and quantum mechanical approximations such as AIQM1 approaching coupled-cluster accuracy. Developers can build their own models using various ML algorithms. The great flexibility of MLatom is largely due to the extensive use of interfaces to many state-of-the-art software packages and libraries.
Affiliation(s)
- Pavlo O. Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen, Fujian 361005, China
- Fuchun Ge
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen, Fujian 361005, China
- Yi-Fan Hou
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen, Fujian 361005, China
- Peikun Zheng
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen, Fujian 361005, China
- Yuxinxin Chen
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen, Fujian 361005, China
- Mario Barbatti
- Aix Marseille University, CNRS, ICR, Marseille 13013, France
- Institut Universitaire de France, Paris 75231, France
- Olexandr Isayev
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Cheng Wang
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- iChem, Xiamen University, Xiamen, Fujian 361005, China
- Bao-Xin Xue
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen, Fujian 361005, China
- Max Pinheiro Jr
- Aix Marseille University, CNRS, ICR, Marseille 13013, France
- Yuming Su
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- iChem, Xiamen University, Xiamen, Fujian 361005, China
- Yiheng Dai
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- iChem, Xiamen University, Xiamen, Fujian 361005, China
- Yangtao Chen
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- iChem, Xiamen University, Xiamen, Fujian 361005, China
- Lina Zhang
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen, Fujian 361005, China
- Shuang Zhang
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen, Fujian 361005, China
- Arif Ullah
- School of Physics and Optoelectronic Engineering, Anhui University, Hefei 230601, China
- Quanhao Zhang
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen, Fujian 361005, China
- Yanchi Ou
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen, Fujian 361005, China
3
Kubečka J, Besel V, Neefjes I, Knattrup Y, Kurtén T, Vehkamäki H, Elm J. Computational Tools for Handling Molecular Clusters: Configurational Sampling, Storage, Analysis, and Machine Learning. ACS Omega 2023; 8:45115-45128. [PMID: 38046354 PMCID: PMC10688175 DOI: 10.1021/acsomega.3c07412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Revised: 10/25/2023] [Accepted: 10/26/2023] [Indexed: 12/05/2023]
Abstract
Computational modeling of atmospheric molecular clusters requires a comprehensive understanding of their complex configurational spaces, interaction patterns, stabilities against fragmentation, and even dynamic behaviors. To address these needs, we introduce the Jammy Key framework, a collection of automated scripts that facilitate and streamline molecular cluster modeling workflows. Jammy Key handles file manipulations between varieties of integrated third-party programs. The framework is divided into three main functionalities: (1) Jammy Key for configurational sampling (JKCS) to perform systematic configurational sampling of molecular clusters, (2) Jammy Key for quantum chemistry (JKQC) to analyze commonly used quantum chemistry output files and facilitate database construction, handling, and analysis, and (3) Jammy Key for machine learning (JKML) to manage machine learning methods in optimizing molecular cluster modeling. This automation and machine learning utilization significantly reduces manual labor, greatly speeds up the search for molecular cluster configurations, and thus increases the number of systems that can be studied. Following the example of the Atmospheric Cluster Database (ACDB) of Elm (ACS Omega, 4, 10965-10984, 2019), the molecular clusters modeled in our group using the Jammy Key framework have been stored in an improved online GitHub repository named ACDB 2.0. In this work, we present the Jammy Key package alongside its assorted applications, which underline its versatility. Using several illustrative examples, we discuss how to choose appropriate combinations of methodologies for treating particular cluster types, including reactive, multicomponent, charged, or radical clusters, as well as clusters containing flexible or multiconformer monomers or heavy atoms. Finally, we present a detailed example of using the tools for atmospheric acid-base clusters.
Affiliation(s)
- Jakub Kubečka
- Aarhus University, Department of Chemistry, Langelandsgade 140, Aarhus 8000, Denmark
- Vitus Besel
- University of Helsinki, Institute for Atmospheric and Earth System Research/Physics, Faculty of Science, P.O. Box 64, Helsinki 00140, Finland
- Ivo Neefjes
- University of Helsinki, Institute for Atmospheric and Earth System Research/Physics, Faculty of Science, P.O. Box 64, Helsinki 00140, Finland
- Yosef Knattrup
- Aarhus University, Department of Chemistry, Langelandsgade 140, Aarhus 8000, Denmark
- Theo Kurtén
- University of Helsinki, Institute for Atmospheric and Earth System Research/Chemistry, Faculty of Science, P.O. Box 64, Helsinki 00140, Finland
- Hanna Vehkamäki
- University of Helsinki, Institute for Atmospheric and Earth System Research/Physics, Faculty of Science, P.O. Box 64, Helsinki 00140, Finland
- Jonas Elm
- Aarhus University, Department of Chemistry, Langelandsgade 140, Aarhus 8000, Denmark
4
Darby JP, Kovács DP, Batatia I, Caro MA, Hart GLW, Ortner C, Csányi G. Tensor-Reduced Atomic Density Representations. Physical Review Letters 2023; 131:028001. [PMID: 37505943 DOI: 10.1103/physrevlett.131.028001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Accepted: 04/18/2023] [Indexed: 07/30/2023]
Abstract
Density-based representations of atomic environments that are invariant under Euclidean symmetries have become a widely used tool in the machine learning of interatomic potentials, broader data-driven atomistic modeling, and the visualization and analysis of material datasets. The standard mechanism used to incorporate chemical element information is to create separate densities for each element and form tensor products between them. This leads to a steep scaling in the size of the representation as the number of elements increases. Graph neural networks, which do not explicitly use density representations, escape this scaling by mapping the chemical element information into a fixed dimensional space in a learnable way. By exploiting symmetry, we recast this approach as tensor factorization of the standard neighbour-density-based descriptors and, using a new notation, identify connections to existing compression algorithms. In doing so, we form a compact tensor-reduced representation of the local atomic environment whose size does not depend on the number of chemical elements, that is systematically convergable, and that therefore remains applicable to a wide range of data analysis and regression tasks.
Affiliation(s)
- James P Darby
- Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry, CV4 7AL, United Kingdom
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom
- Dávid P Kovács
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom
- Ilyes Batatia
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom
- ENS Paris-Saclay, Université Paris-Saclay, 91190 Gif-sur-Yvette, France
- Miguel A Caro
- Department of Electrical Engineering and Automation, Aalto University, FIN-02150 Espoo, Finland
- Gus L W Hart
- Department of Physics and Astronomy, Brigham Young University, Provo, Utah, 84602, USA
- Christoph Ortner
- Department of Mathematics, University of British Columbia, 1984 Mathematics Road, Vancouver, British Columbia, Canada V6T 1Z2
- Gábor Csányi
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom
5
Huang B, von Rudorff GF, von Lilienfeld OA. The central role of density functional theory in the AI age. Science 2023; 381:170-175. [PMID: 37440654 DOI: 10.1126/science.abn3445] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 04/27/2023] [Accepted: 05/30/2023] [Indexed: 07/15/2023]
Abstract
Density functional theory (DFT) plays a pivotal role in chemical and materials science because of its relatively high predictive power, applicability, versatility, and computational efficiency. We review recent progress in machine learning (ML) model developments, which have relied heavily on DFT for synthetic data generation and for the design of model architectures. The general relevance of these developments is placed in a broader context for chemical and materials sciences. DFT-based ML models have reached high efficiency, accuracy, scalability, and transferability and pave the way to the routine use of successful experimental planning software within self-driving laboratories.
Affiliation(s)
- Bing Huang
- University of Vienna, Faculty of Physics, AT1090 Wien, Austria
- Guido Falk von Rudorff
- University Kassel, Department of Chemistry, 34132 Kassel, Germany
- Center for Interdisciplinary Nanostructure Science and Technology (CINSaT), 34132 Kassel, Germany
- O Anatole von Lilienfeld
- Vector Institute for Artificial Intelligence, Toronto, Ontario M5S 1M1, Canada
- Department of Chemistry, University of Toronto, St. George Campus, Toronto, Ontario M5S 3H6, Canada
- Department of Materials Science and Engineering, University of Toronto, St. George Campus, Toronto, Ontario M5S 3E4, Canada
- Department of Physics, University of Toronto, St. George Campus, Toronto, Ontario M5S 1A7, Canada
- Machine Learning Group, Technische Universität Berlin and Berlin Institute for the Foundations of Learning and Data, 10587 Berlin, Germany
6
Kubečka J, Knattrup Y, Engsvang M, Jensen AB, Ayoubi D, Wu H, Christiansen O, Elm J. Current and future machine learning approaches for modeling atmospheric cluster formation. Nature Computational Science 2023; 3:495-503. [PMID: 38177415 DOI: 10.1038/s43588-023-00435-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/21/2022] [Accepted: 03/16/2023] [Indexed: 01/06/2024]
Abstract
The formation of strongly bound atmospheric molecular clusters is the first step towards forming new aerosol particles. Recent advances in the application of machine learning models open an enormous opportunity for complementing expensive quantum chemical calculations with efficient machine learning predictions. In this Perspective, we present how data-driven approaches can be applied to accelerate cluster configurational sampling, thereby greatly increasing the number of chemically relevant systems that can be covered.
Affiliation(s)
- Jakub Kubečka
- Department of Chemistry, Aarhus University, Aarhus, Denmark
- Yosef Knattrup
- Department of Chemistry, Aarhus University, Aarhus, Denmark
- Daniel Ayoubi
- Department of Chemistry, Aarhus University, Aarhus, Denmark
- Haide Wu
- Department of Chemistry, Aarhus University, Aarhus, Denmark
- Jonas Elm
- Department of Chemistry, Aarhus University, Aarhus, Denmark
- iCLIMATE Aarhus University Interdisciplinary Centre for Climate Change, Aarhus, Denmark