1
Jia L, Brémond É, Zaida L, Gaüzère B, Tognetti V, Joubert L. Predicting redox potentials by graph-based machine learning methods. J Comput Chem 2024; 45:2383-2396. [PMID: 38923574 DOI: 10.1002/jcc.27380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 03/25/2024] [Accepted: 04/19/2024] [Indexed: 06/28/2024]
Abstract
The evaluation of oxidation and reduction potentials is a pivotal task in various chemical fields. However, their accurate prediction by theoretical computations, which is a complementary task and sometimes the only alternative to experimental measurement, is often resource-intensive and time-consuming. This paper addresses this challenge through the application of machine learning techniques, with a particular focus on graph-based methods (such as graph edit distances, graph kernels, and graph neural networks), which are reviewed to highlight their deep links with theoretical chemistry. To this aim, we establish the ORedOx159 database, a comprehensive, homogeneous (with reference values stemming from density functional theory calculations), and reliable resource containing 318 one-electron reduction and oxidation reactions and featuring 159 large organic compounds. Subsequently, we provide an instructive overview of good practice in machine learning and of commonly utilized machine learning models. We then assess their predictive performances on the ORedOx159 dataset through extensive analyses. Our simulations using descriptors that are computed in an almost instantaneous way result in a notable improvement in prediction accuracy, with mean absolute error (MAE) values equal to 5.6 kcal mol⁻¹ for reduction and 7.2 kcal mol⁻¹ for oxidation potentials, which paves the way toward efficient in silico design of new electrochemical systems.
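For readers who think in electrochemical units, the MAE values quoted in this abstract can be translated into volts. The sketch below is not taken from the paper: the function names are hypothetical, and it assumes a one-electron process together with the standard conversion 1 eV ≈ 23.0605 kcal mol⁻¹.

```python
# Illustrative helpers (hypothetical names, not from the paper).
KCAL_PER_MOL_PER_EV = 23.0605  # 1 eV is approximately 23.0605 kcal/mol


def mean_absolute_error(predicted, reference):
    """Return the mean absolute error between two equally long sequences."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def kcal_per_mol_to_volts(energy_kcal_per_mol, n_electrons=1):
    """Convert a free-energy error into a potential error for an n-electron process."""
    return energy_kcal_per_mol / (KCAL_PER_MOL_PER_EV * n_electrons)


if __name__ == "__main__":
    for label, mae in (("reduction", 5.6), ("oxidation", 7.2)):
        print(label, round(kcal_per_mol_to_volts(mae), 3), "V")
```

Under these assumptions, the reported MAEs correspond to roughly 0.24 V for reduction and 0.31 V for oxidation.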
Affiliation(s)
- Linlin Jia
  - The PRG Group, Institute of Computer Science, University of Bern, Bern, Switzerland
- Éric Brémond
  - Université Paris Cité, ITODYS, CNRS, Paris, France
- Benoit Gaüzère
  - LITIS, Univ Rouen Normandie, INSA Rouen Normandie, Université Le Havre Normandie, Normandie Univ, Rouen, France
- Vincent Tognetti
  - Normandy Univ., COBRA UMR 6014 & FR 3038, Université de Rouen, INSA Rouen, CNRS, Mont St Aignan Cedex, France
- Laurent Joubert
  - Normandy Univ., COBRA UMR 6014 & FR 3038, Université de Rouen, INSA Rouen, CNRS, Mont St Aignan Cedex, France
2
Hoffmann G, Guégan F, Labet V, Joubert L, Chermette H, Morell C, Tognetti V. Expanding horizons in conceptual density functional theory: Novel ensembles and descriptors to decipher reactivity patterns. J Comput Chem 2024; 45:1716-1726. [PMID: 38580454 DOI: 10.1002/jcc.27363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 02/16/2024] [Accepted: 02/22/2024] [Indexed: 04/07/2024]
Abstract
Conceptual density functional theory (CDFT) and the quantum reactivity descriptors stemming from it have proven to be valuable tools for understanding the chemical behavior of molecules. This article is intrinsically of dual character. The first part briefly reviews, in a deliberately didactic way, the main ensembles in CDFT, while the second part presents two additional ensembles, in which the chemical hardness acts as a natural variable, and their respective reactivity descriptors. The evaluation of these reactivity descriptors on common organic chemical reagents is presented and discussed.
Affiliation(s)
- Guillaume Hoffmann
  - Université de Lyon, Institut des Sciences Analytiques, UMR 5280, CNRS, Villeurbanne, France
- Frédéric Guégan
  - IC2MP UMR 7285, Université de Poitiers - CNRS, Poitiers, France
- Vanessa Labet
  - Sorbonne Université CNRS, MONARIS, UMR8233, Paris, France
- Laurent Joubert
  - Univ Rouen Normandie, INSA Rouen Normandie, CNRS, Normandie Univ, COBRA UMR 6014, INC3M FR 3038, Rouen, France
- Henry Chermette
  - Université de Lyon, Institut des Sciences Analytiques, UMR 5280, CNRS, Villeurbanne, France
- Christophe Morell
  - Université de Lyon, Institut des Sciences Analytiques, UMR 5280, CNRS, Villeurbanne, France
- Vincent Tognetti
  - Univ Rouen Normandie, INSA Rouen Normandie, CNRS, Normandie Univ, COBRA UMR 6014, INC3M FR 3038, Rouen, France
3
Sigmund LM, S SS, Albers A, Erdmann P, Paton RS, Greb L. Predicting Lewis Acidity: Machine Learning the Fluoride Ion Affinity of p-Block-Atom-Based Molecules. Angew Chem Int Ed Engl 2024; 63:e202401084. [PMID: 38452299 DOI: 10.1002/anie.202401084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Revised: 03/01/2024] [Accepted: 03/04/2024] [Indexed: 03/09/2024]
Abstract
"How strong is this Lewis acid?" is a question researchers often approach by calculating its fluoride ion affinity (FIA) with quantum chemistry. Here, we present FIA49k, an extensive FIA dataset with 48,986 data points calculated at the RI-DSD-BLYP-D3(BJ)/def2-QZVPP//PBEh-3c level of theory, including 13 different p-block atoms as the fluoride accepting site. The FIA49k dataset was used to train FIA-GNN, two message-passing graph neural networks, which predict gas and solution phase FIA values of molecules excluded from training with a mean absolute error of 14 kJ mol⁻¹ (r² = 0.93) from the SMILES string of the Lewis acid as the only input. The level of accuracy is notable, given the wide energetic range of 750 kJ mol⁻¹ spanned by FIA49k. The model's value was demonstrated with four case studies, including predictions for molecules extracted from the Cambridge Structural Database and by reproducing results from catalysis research available in the literature. Weaknesses of the model are evaluated and interpreted chemically. FIA-GNN and the FIA49k dataset can be reached via a free web app (www.grebgroup.de/fia-gnn).
Affiliation(s)
- Lukas M Sigmund
  - Anorganisch-Chemisches Institut, Ruprecht-Karls-Universität Heidelberg, Im Neuenheimer Feld 270, 69120, Heidelberg, Germany
  - Department of Chemistry, Colorado State University, 1301 Center Avenue, Fort Collins, CO, 80523, USA
- Shree Sowndarya S
  - Department of Chemistry, Colorado State University, 1301 Center Avenue, Fort Collins, CO, 80523, USA
- Andreas Albers
  - Anorganisch-Chemisches Institut, Ruprecht-Karls-Universität Heidelberg, Im Neuenheimer Feld 270, 69120, Heidelberg, Germany
- Philipp Erdmann
  - Anorganisch-Chemisches Institut, Ruprecht-Karls-Universität Heidelberg, Im Neuenheimer Feld 270, 69120, Heidelberg, Germany
- Robert S Paton
  - Department of Chemistry, Colorado State University, 1301 Center Avenue, Fort Collins, CO, 80523, USA
- Lutz Greb
  - Anorganisch-Chemisches Institut, Ruprecht-Karls-Universität Heidelberg, Im Neuenheimer Feld 270, 69120, Heidelberg, Germany
4
Eckhoff M, Diedrich JV, Mücke M, Proppe J. Quantitative Structure-Reactivity Relationships for Synthesis Planning: The Benzhydrylium Case. J Phys Chem A 2024; 128:343-354. [PMID: 38113457 PMCID: PMC10788916 DOI: 10.1021/acs.jpca.3c07289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 11/28/2023] [Accepted: 12/01/2023] [Indexed: 12/21/2023]
Abstract
Selective and feasible reactions are among the top targets in synthesis planning. Mayr's approach to quantifying chemical reactivity has greatly facilitated the planning process, but reactivity parameters for new compounds require time-consuming experiments. In the past decade, data-driven modeling has been gaining momentum in the field, as it shows promise in terms of efficient reactivity prediction. However, state-of-the-art models use quantum chemical data as input, which prevents access to real-time planning in organic synthesis. Here, we present a novel data-driven workflow for predicting reactivity parameters of molecules that takes only structural information as input, enabling de facto real-time reactivity predictions. We use the well-understood chemical space of benzhydrylium ions as an example to demonstrate the functionality of our approach and the performance of the resulting quantitative structure-reactivity relationships (QSRRs). Our results suggest that it is straightforward to build low-cost QSRR models that are accurate, interpretable, and transferable to unexplored systems within a given scope of application. Moreover, our QSRR approach suggests that Hammett σ parameters are only approximately additive.
Affiliation(s)
- Maike Eckhoff
  - Institute of Physical and Theoretical Chemistry, TU Braunschweig, Braunschweig 38106, Germany
- Johannes V. Diedrich
  - Institute of Physical and Theoretical Chemistry, TU Braunschweig, Braunschweig 38106, Germany
  - Institute of Physical Chemistry, University of Göttingen, Göttingen 37077, Germany
- Maike Mücke
  - Institute of Physical and Theoretical Chemistry, TU Braunschweig, Braunschweig 38106, Germany
  - Institute of Physical Chemistry, University of Göttingen, Göttingen 37077, Germany
- Jonny Proppe
  - Institute of Physical and Theoretical Chemistry, TU Braunschweig, Braunschweig 38106, Germany
5
Saini V. Machine learning prediction of empirical polarity using SMILES encoding of organic solvents. Mol Divers 2023; 27:2331-2343. [PMID: 36334165 DOI: 10.1007/s11030-022-10559-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Accepted: 10/26/2022] [Indexed: 11/07/2022]
Abstract
Machine-learning-based statistical models have played a significant role in increasing the speed and accuracy with which the chemical and physical properties of compounds can be predicted, compared with experimental and traditional ab initio and quantum mechanical approaches. The transformative impact of these techniques on the chemical sciences has changed the way experiments are designed. The last decade has seen the rise of computer-aided molecular design based on machine learning algorithms. The major challenge has been the generation of machine-readable data, in the form of descriptors and observations, for training the model, which can itself be time-consuming and computationally expensive if a molecular encoding based on atomic coordinates is used. In this study, we address this problem by using the SMILES representation of molecules to generate various topological, physicochemical, electronic, and steric descriptors with open-source cheminformatics packages. With the data generated by these packages, we developed a simple and explainable quantitative structure-property relationship model, using an artificial neural network based on 7 numerical descriptors and 1 categorical descriptor, to predict the empirical polarity of a wide variety of organic solvents. Since polarity reflects the various solute-solvent and solvent-solvent interactions taking place in an organic transformation, knowing it beforehand helps a chemist design better experiments. An ANN algorithm based on 8 descriptors was successfully employed to predict the ET(30) values of organic solvents.
Affiliation(s)
- Vaneet Saini
  - Department of Chemistry & Centre for Advanced Studies in Chemistry, Panjab University, Chandigarh, 160014, India.
6
Cador A, Tognetti V, Joubert L, Popelier PLA. Aza-Michael Addition in Explicit Solvent: A Relative Energy Gradient-Interacting Quantum Atoms Study. Chemphyschem 2023:e202300529. [PMID: 37728125 DOI: 10.1002/cphc.202300529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 09/09/2023] [Accepted: 09/17/2023] [Indexed: 09/21/2023]
Abstract
Aza-Michael additions are key reactions in organic synthesis. We investigate, from a theoretical and computational point of view, several examples ranging from weak to strong electrophiles in dimethylsulfoxide treated as explicit solvent. We use the REG-IQA method, which is a quantum topological energy decomposition (Interacting Quantum Atoms, IQA) coupled to a chemical-interpretation calculator (Relative Energy Gradient, REG). We focus on the rate-limiting addition step in order to unravel the different events taking place in this step, and understand the influence of solvent on the reaction, with an eye on predicting the Mayr electrophilicity. For the first time, a link is established between an REG-IQA analysis and experimental values.
Affiliation(s)
- Aël Cador
  - Normandy Univ., COBRA UMR 6014 & FR 3038, Université de Rouen, INSA Rouen, CNRS, 1 rue Tesnière, 76821, Mont St Aignan Cedex, France
- Vincent Tognetti
  - Normandy Univ., COBRA UMR 6014 & FR 3038, Université de Rouen, INSA Rouen, CNRS, 1 rue Tesnière, 76821, Mont St Aignan Cedex, France
- Laurent Joubert
  - Normandy Univ., COBRA UMR 6014 & FR 3038, Université de Rouen, INSA Rouen, CNRS, 1 rue Tesnière, 76821, Mont St Aignan Cedex, France
- Paul L A Popelier
  - Department of Chemistry, The University of Manchester, Manchester, M13 9PL, Great Britain
7
Dhakal P, Gassaway W, Shah JK. Mapping the frontier orbital energies of imidazolium-based cations using machine learning. J Chem Phys 2023; 159:064513. [PMID: 37579028 DOI: 10.1063/5.0155775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2023] [Accepted: 07/24/2023] [Indexed: 08/16/2023]
Abstract
The knowledge of the frontier orbital, highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO), energies is vital for studying chemical and electrochemical stability of compounds, their corrosion inhibition potential, reactivity, etc. Density functional theory (DFT) calculations provide a direct route to estimate these energies either in the gas-phase or condensed phase. However, the application of DFT methods becomes computationally intensive when hundreds of thousands of compounds are to be screened. Such is the case when all the isomers for the 1-alkyl-3-alkylimidazolium cation [CnCmim]+ (n = 1-10, m = 1-10) are considered. Enumerating the isomer space of [CnCmim]+ yields close to 386 000 cation structures. Calculating frontier orbital energies for each would be computationally very expensive and time-consuming using DFT. In this article, we develop a machine learning model based on the extreme gradient boosting method using a small subset of the isomer space and predict the HOMO and LUMO energies. Using the model, the HOMO energies are predicted with a mean absolute error (MAE) of 0.4 eV and the LUMO energies are predicted with a MAE of 0.2 eV. Inferences are also drawn on the type of the descriptors deemed important for the HOMO and LUMO energy estimates. Application of the machine learning model results in a drastic reduction in computational time required for such calculations.
Affiliation(s)
- Pratik Dhakal
  - School of Chemical Engineering, Oklahoma State University, Stillwater, Oklahoma 74078, USA
- Wyatt Gassaway
  - School of Chemical Engineering, Oklahoma State University, Stillwater, Oklahoma 74078, USA
- Jindal K Shah
  - School of Chemical Engineering, Oklahoma State University, Stillwater, Oklahoma 74078, USA
8
Abstract
Cyclopropanes that carry an electron-accepting group react as electrophiles in polar, ring-opening reactions. Analogous reactions at cyclopropanes with additional C2 substituents allow one to access difunctionalized products. Consequently, functionalized cyclopropanes are frequently used building blocks in organic synthesis. The polarization of the C1-C2 bond in 1-acceptor-2-donor-substituted cyclopropanes not only favorably enhances reactivity toward nucleophiles but also directs the nucleophilic attack toward the already substituted C2 position. Monitoring the kinetics of non-catalytic ring-opening reactions with a series of thiophenolates and other strong nucleophiles, such as azide ions, in DMSO provided the inherent SN2 reactivity of electrophilic cyclopropanes. The experimentally determined second-order rate constants k2 for cyclopropane ring-opening reactions were then compared to those of related Michael additions. Interestingly, cyclopropanes with aryl substituents at the C2 position reacted faster than their unsubstituted analogues. Variation of the electronic properties of the aryl groups at C2 gave rise to parabolic Hammett relationships.
Affiliation(s)
- Andreas Eitzinger
  - Department Chemie, Ludwig-Maximilians-Universität München, Butenandtstr. 5–13, 81377 München, Germany
- Armin R. Ofial
  - Department Chemie, Ludwig-Maximilians-Universität München, Butenandtstr. 5–13, 81377 München, Germany
9
Li L, Mayer RJ, Ofial AR, Mayr H. One-Bond-Nucleophilicity and -Electrophilicity Parameters: An Efficient Ordering System for 1,3-Dipolar Cycloadditions. J Am Chem Soc 2023; 145:7416-7434. [PMID: 36952671 DOI: 10.1021/jacs.2c13872] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2023]
Abstract
Diazoalkanes are ambiphilic 1,3-dipoles that undergo fast Huisgen cycloadditions with both electron-rich and electron-poor dipolarophiles but react slowly with alkenes of low polarity. Frontier molecular orbital (FMO) theory considering the 3-center-4-electron π-system of the propargyl fragment of diazoalkanes is commonly applied to rationalize these reactivity trends. However, we recently found that a change in the mechanism from cycloadditions to azo couplings takes place due to the existence of a previously overlooked lower-lying unoccupied molecular orbital. We now propose an alternative approach to analyze 1,3-dipolar cycloaddition reactions, which relies on the linear free energy relationship lg k2(20 °C) = sN(N + E) (eq 1) with two solvent-dependent parameters (N, sN) to characterize nucleophiles and one parameter (E) for electrophiles. Rate constants for the cycloadditions of diazoalkanes with dipolarophiles were measured and compared with those calculated for the formation of zwitterions by eq 1. The difference between experimental and predicted Gibbs energies of activation is interpreted as the energy of concert, i.e., the stabilization of the transition states by the concerted formation of two new bonds. By linking the plot of lg k2 vs N for nucleophilic dipolarophiles with that of lg k2 vs E for electrophilic dipolarophiles, one obtains V-shaped plots which provide absolute rate constants for the stepwise reactions on the borderlines. These plots furthermore predict relative reactivities of dipolarophiles in concerted, highly asynchronous cycloadditions more precisely than the classical correlations of rate constants with FMO energies or ionization potentials. DFT calculations using the SMD solvent model confirm these interpretations.
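The linear free energy relationship quoted in this abstract, lg k2(20 °C) = sN(N + E), is simple enough to evaluate directly. The following is a minimal sketch with hypothetical function names; the parameter values in the usage lines are made up for illustration and are not taken from the paper or from Mayr's database.

```python
def mayr_lg_k2(s_n: float, n: float, e: float) -> float:
    """Decadic logarithm of the second-order rate constant: lg k2 = sN * (N + E)."""
    return s_n * (n + e)


def mayr_k2(s_n: float, n: float, e: float) -> float:
    """Second-order rate constant k2 at 20 degrees C predicted by the relationship."""
    return 10.0 ** mayr_lg_k2(s_n, n, e)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to illustrate the arithmetic.
    s_n, n, e = 0.9, 10.0, -5.0
    print("lg k2 =", mayr_lg_k2(s_n, n, e))
    print("k2    =", mayr_k2(s_n, n, e))
```

In the abstract's terms, a measured rate constant that exceeds the value computed this way for stepwise zwitterion formation is what the authors interpret as the energy of concert.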
Affiliation(s)
- Le Li
  - Department Chemie, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377 München, Germany
- Robert J Mayer
  - CNRS, ISIS, Université de Strasbourg, 8 Allee Gaspard Monge, 67000 Strasbourg, France
- Armin R Ofial
  - Department Chemie, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377 München, Germany
- Herbert Mayr
  - Department Chemie, Ludwig-Maximilians-Universität München, Butenandtstr. 5-13, 81377 München, Germany
10
Abstract
Reactivity scales are useful research tools for chemists, both experimental and computational. However, to determine the reactivity of a single molecule, multiple measurements need to be carried out, which is a time-consuming and resource-intensive task. In this Tutorial Review, we present alternative approaches for the efficient generation of quantitative structure-reactivity relationships that are based on quantum chemistry, supervised learning, and uncertainty quantification. We observe a tendency for these relationships, first published in 2002, to become not only more predictive but also more interpretable over time.
Affiliation(s)
- Maike Vahl
  - Institute of Physical and Theoretical Chemistry, Technische Universität Braunschweig, Gaußstraße 17, 38106 Braunschweig, Germany.
- Jonny Proppe
  - Institute of Physical and Theoretical Chemistry, Technische Universität Braunschweig, Gaußstraße 17, 38106 Braunschweig, Germany.
11
Cuesta SA, Moreno M, López RA, Mora JR, Paz JL, Márquez EA. ElectroPredictor: An Application to Predict Mayr's Electrophilicity E through Implementation of an Ensemble Model Based on Machine Learning Algorithms. J Chem Inf Model 2023; 63:507-521. [PMID: 36594600 DOI: 10.1021/acs.jcim.2c01367] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2023]
Abstract
Electrophilicity (E) is one of the most important parameters to understand the reactivity of an organic molecule. Although the theoretical electrophilicity index (ω) has been associated with E in a small homologous series, the use of ω to predict E in a structurally heterogeneous set of compounds is not a trivial task. In this study, a robust ensemble model is created using Mayr's database of reactivity parameters. A combination of topological and quantum mechanical descriptors and different machine learning algorithms is employed for the model's development. The predictability of the model is assessed using different statistical parameters, and its validation is examined, including a training/test partition, an applicability domain, and a y-scrambling test. The global ensemble model presents a Q²(5-fold) of 0.909 and a Q²(ext) of 0.912, demonstrating an excellent predictive performance for E values and showing that ω is not a good descriptor for the prediction of E, especially for the case of neutral compounds. ElectroPredictor, a noncommercial Python application (https://github.com/mmoreno1/ElectroPredictor), is developed to predict E. QM9, a well-known large dataset containing 133,885 neutral molecules, is used to perform a virtual screening (94.0% coverage). Finally, the 10 most electrophilic molecules are analyzed as possible new Mayr's electrophiles, which have not yet been experimentally tested. This study confirms the necessity to build an ensemble model using nonlinear machine learning algorithms, topographic descriptors, and separating molecules into charged and neutral compounds to predict E with precision.
Affiliation(s)
- Sebastián A Cuesta
  - Instituto de Simulación Computacional (ISC-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador
  - Department of Chemistry, Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, U.K.
- Martín Moreno
  - Instituto de Simulación Computacional (ISC-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador
- Romina A López
  - Colegio San Ignacio de Loyola - Fe y Alegría, Ministerio de Educación, Quito 170901, Ecuador
- José R Mora
  - Instituto de Simulación Computacional (ISC-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador
- José Luis Paz
  - Departamento Académico de Química Inorgánica, Facultad de Química e Ingeniería Química, Universidad Nacional Mayor de San Marcos, Cercado de Lima, Lima 15081, Peru
- Edgar A Márquez
  - Grupo de Investigaciones en Química y Biología, Departamento de Química y Biología, Facultad de Ciencias Exactas, Universidad del Norte, Carrera 51B, Km 5, vía Puerto Colombia, Barranquilla 081007, Colombia
12
Cador A, Hoffmann G, Tognetti V, Joubert L. A theoretical study on aza-Michael additions. Theor Chem Acc 2022. [DOI: 10.1007/s00214-022-02921-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
13
Lustosa DM, Milo A. Mechanistic Inference from Statistical Models at Different Data-Size Regimes. ACS Catal 2022. [DOI: 10.1021/acscatal.2c01741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Affiliation(s)
- Danilo M. Lustosa
  - Department of Chemistry, Ben-Gurion University of the Negev, Beer Sheva 84105, Israel
- Anat Milo
  - Department of Chemistry, Ben-Gurion University of the Negev, Beer Sheva 84105, Israel
14
Proppe J, Kircher J. Uncertainty Quantification of Reactivity Scales. Chemphyschem 2022; 23:e202200061. [PMID: 35189024 PMCID: PMC9314972 DOI: 10.1002/cphc.202200061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2022] [Revised: 02/16/2022] [Indexed: 11/09/2022]
Abstract
According to Mayr, polar organic synthesis can be rationalized by a simple empirical relationship linking bimolecular rate constants to as few as three reactivity parameters. Here, we propose an extension to Mayr's reactivity method that is rooted in uncertainty quantification and transforms the reactivity parameters into probability distributions. Through uncertainty propagation, these distributions can be transformed into uncertainty estimates for bimolecular rate constants. Chemists can exploit these virtual error bars to enhance synthesis planning and to decrease the ambiguity of conclusions drawn from experimental data. We demonstrate the above using the example of the reference data set released by Mayr and co-workers [J. Am. Chem. Soc. 2001, 123, 9500; J. Am. Chem. Soc. 2012, 134, 13902]. As a by-product of the new approach, we obtain revised reactivity parameters for 36 π-nucleophiles and 32 benzhydrylium ions.
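The idea of propagating parameter distributions into rate-constant error bars can be illustrated with a small Monte Carlo sketch. This is not the authors' procedure: it assumes the three-parameter relationship lg k2 = sN(N + E), treats each parameter as an independent Gaussian (a simplifying assumption), and uses a hypothetical function name and made-up input values.

```python
import random
import statistics


def propagate_lg_k2(s_n, n, e, n_samples=10_000, seed=0):
    """Monte Carlo propagation of parameter uncertainty into lg k2.

    Each of s_n, n and e is a (mean, standard deviation) pair, assumed
    independent and normally distributed (an assumption of this sketch).
    Returns the mean and standard deviation of the sampled lg k2 values.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    samples = []
    for _ in range(n_samples):
        s_i = rng.gauss(s_n[0], s_n[1])
        n_i = rng.gauss(n[0], n[1])
        e_i = rng.gauss(e[0], e[1])
        samples.append(s_i * (n_i + e_i))
    return statistics.mean(samples), statistics.pstdev(samples)


if __name__ == "__main__":
    # Hypothetical parameter estimates with hypothetical uncertainties.
    mean, spread = propagate_lg_k2((0.9, 0.05), (10.0, 0.3), (-5.0, 0.2))
    print(f"lg k2 = {mean:.2f} +/- {spread:.2f}")
```

The standard deviation returned here plays the role of the "virtual error bar" on lg k2; correlated parameters would need a joint distribution rather than independent draws.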
Affiliation(s)
- Jonny Proppe
  - Georg-August University, Institute of Physical Chemistry, Tammannstrasse 6, 37077 Göttingen, Germany
  - Present address: Technische Universität Braunschweig, Institute of Physical and Theoretical Chemistry, Gaussstrasse 17, 38106 Braunschweig, Germany
- Johannes Kircher
  - Georg-August University, Institute of Physical Chemistry, Tammannstrasse 6, 37077 Göttingen, Germany
15
Dong J, Peng L, Yang X, Zhang Z, Zhang P. XGBoost-based intelligence yield prediction and reaction factors analysis of amination reaction. J Comput Chem 2022; 43:289-302. [PMID: 34862652 DOI: 10.1002/jcc.26791] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/24/2021] [Revised: 11/19/2021] [Accepted: 11/22/2021] [Indexed: 11/10/2022]
Abstract
The palladium-catalyzed Buchwald-Hartwig amination reaction plays an important role in drug synthesis. In the last few years, machine learning-assisted strategies have emerged and quickly gained attention. In this article, an integrated feature screening method based on importance and relevance is proposed to effectively filter high-dimensional feature descriptor data. Then, a regularized boosted-tree machine learning model, eXtreme Gradient Boosting, is introduced to predict reaction performance in multidimensional chemistry space. Furthermore, convergence, interpretability, generalization, and the internal association between reaction conditions and yields are explored, which provides assistance for the optimal design of coupling reaction systems and for evaluating the reaction conditions. Compared with recently published results, the proposed method requires fewer feature descriptors, takes less time, and achieves higher prediction accuracy.
Affiliation(s)
- Jing Dong
  - Henan Engineering Research Center for Artificial Intelligence Theory and Algorithms, School of Mathematics and Statistics, Henan University, Kaifeng, China
- Lichao Peng
  - National & Local Joint Engineering Research Center for Applied Technology of Hybrid Nanomaterials, Henan University, Kaifeng, China
- Xiaohui Yang
  - Henan Engineering Research Center for Artificial Intelligence Theory and Algorithms, School of Mathematics and Statistics, Henan University, Kaifeng, China
- Zelin Zhang
  - School of Computer and Information Engineering, Henan University, Kaifeng, China
- Puyu Zhang
  - College of Chemistry and Chemical Engineering, Henan University, Kaifeng, China
16
Saini V. A machine learning approach for predicting the fluorination strength of electrophilic fluorinating reagents. Phys Chem Chem Phys 2022; 24:26802-26812. [DOI: 10.1039/d2cp03281c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
Abstract
A neural network algorithm utilizing SMILES encoding of organic molecules was successfully employed for predicting the fluorination strength of a wide range of N–F fluorinating reagents.
Affiliation(s)
- Vaneet Saini
  - Department of Chemistry & Centre for Advanced Studies in Chemistry, Panjab University, Chandigarh 160014, India
17
Saini V, Kumar R. A machine learning approach for predicting the empirical polarity of organic solvents. New J Chem 2022. [DOI: 10.1039/d2nj02513b] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/20/2022]
Abstract
A neural network architecture was found to efficiently predict the empirical polarity parameter ET(30) using six simple-to-compute and interpretable quantum mechanical, topological, and categorical descriptors.
Affiliation(s)
- Vaneet Saini
  - Department of Chemistry & Centre for Advanced Studies in Chemistry, Panjab University, Chandigarh 160014, India
- Ranjeet Kumar
  - Department of Chemistry & Centre for Advanced Studies in Chemistry, Panjab University, Chandigarh 160014, India
18
Abstract
As more data are introduced in the building of models of chemical reactivity, the mechanistic component can be reduced until 'big data' applications are reached. These methods no longer depend on underlying mechanistic hypotheses, potentially learning them implicitly through extensive data training. Reactivity models often focus on reaction barriers, but can also be trained to directly predict lab-relevant properties, such as yields or conditions. Calculations with a quantum-mechanical component are still preferred for quantitative predictions of reactivity. Although big data applications tend to be more qualitative, they have the advantage of being broadly applicable to different kinds of reactions. There is a continuum of methods in between these extremes, such as methods that use quantum-derived data or descriptors in machine learning models. Here, we present an overview of the recent machine learning applications in the field of chemical reactivity from a mechanistic perspective. Starting with a summary of how reactivity questions are addressed by quantum-mechanical methods, we discuss methods that augment or replace quantum-based modelling with faster alternatives relying on machine learning.
19
Kadish D, Mood AD, Tavakoli M, Gutman ES, Baldi P, Van Vranken DL. Methyl Cation Affinities of Canonical Organic Functional Groups. J Org Chem 2021; 86:3721-3729. [PMID: 33596071 DOI: 10.1021/acs.joc.0c02327] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Abstract
Methyl cation affinities are calculated for the canonical nucleophilic functional groups in organic chemistry. These methyl cation affinities, calculated with a solvation model (MCA*), give an empirical correlation with the NsN term from the Mayr equation under aprotic conditions when they are scaled to the Mayr reference cation (4-MeOC6H4)2CH+ (Mayr E = 0). Highly reactive anionic nucleophiles were found to give a separate correlation, while some ylides and phosphorus compounds were determined to give a poor correlation. MCA*s are estimated for a broad range of simple molecules representing the canonical functional groups in organic chemistry. On the basis of a linear correlation, we estimate the range of nucleophilicities of organic functional groups, ranging from a C-C bond to a hypothetical tert-butyl carbanion, toward the reference electrophile to be about 50 orders of magnitude.
Affiliation(s)
- Dora Kadish
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Aaron D Mood
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Mohammadamin Tavakoli
  - Department of Computer Science, University of California, Irvine, California 92697, United States
- Eugene S Gutman
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Pierre Baldi
  - Department of Computer Science, University of California, Irvine, California 92697, United States
- David L Van Vranken
  - Department of Chemistry, University of California, Irvine, California 92697, United States