1
Chen LY, Li YP. Machine learning-guided strategies for reaction conditions design and optimization. Beilstein J Org Chem 2024; 20:2476-2492. [PMID: 39376489 PMCID: PMC11457048 DOI: 10.3762/bjoc.20.212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2024] [Accepted: 09/19/2024] [Indexed: 10/09/2024] Open
Abstract
This review surveys the recent advances and challenges in predicting and optimizing reaction conditions using machine learning techniques. The paper emphasizes the importance of acquiring and processing large and diverse datasets of chemical reactions, and the use of both global and local models to guide the design of synthetic processes. Global models exploit the information from comprehensive databases to suggest general reaction conditions for new reactions, while local models fine-tune the specific parameters for a given reaction family to improve yield and selectivity. The paper also identifies the current limitations and opportunities in this field, such as the data quality and availability, and the integration of high-throughput experimentation. The paper demonstrates how the combination of chemical engineering, data science, and ML algorithms can enhance the efficiency and effectiveness of reaction conditions design, and enable novel discoveries in synthetic chemistry.
Affiliation(s)
- Lung-Yi Chen
  - Department of Chemical Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei 10617, Taiwan
- Yi-Pei Li
  - Department of Chemical Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei 10617, Taiwan
  - Taiwan International Graduate Program on Sustainable Chemical Science and Technology (TIGP-SCST), No. 128, Sec. 2, Academia Road, Taipei 11529, Taiwan

2
Tom G, Schmid SP, Baird SG, Cao Y, Darvish K, Hao H, Lo S, Pablo-García S, Rajaonson EM, Skreta M, Yoshikawa N, Corapi S, Akkoc GD, Strieth-Kalthoff F, Seifrid M, Aspuru-Guzik A. Self-Driving Laboratories for Chemistry and Materials Science. Chem Rev 2024; 124:9633-9732. [PMID: 39137296 PMCID: PMC11363023 DOI: 10.1021/acs.chemrev.4c00055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2024]
Abstract
Self-driving laboratories (SDLs) promise an accelerated application of the scientific method. Through the automation of experimental workflows, along with autonomous experimental planning, SDLs hold the potential to greatly accelerate research in chemistry and materials discovery. This review provides an in-depth analysis of the state-of-the-art in SDL technology, its applications across various scientific disciplines, and the potential implications for research and industry. This review additionally provides an overview of the enabling technologies for SDLs, including their hardware, software, and integration with laboratory infrastructure. Most importantly, this review explores the diverse range of scientific domains where SDLs have made significant contributions, from drug discovery and materials science to genomics and chemistry. We provide a comprehensive review of existing real-world examples of SDLs, their different levels of automation, and the challenges and limitations associated with each domain.
Affiliation(s)
- Gary Tom
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada
- Stefan P. Schmid
  - Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 1, CH-8093 Zurich, Switzerland
- Sterling G. Baird
  - Acceleration Consortium, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
- Yang Cao
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - Acceleration Consortium, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
- Kourosh Darvish
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada
  - Acceleration Consortium, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
- Han Hao
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - Acceleration Consortium, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
- Stanley Lo
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
- Sergio Pablo-García
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
- Ella M. Rajaonson
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada
- Marta Skreta
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada
- Naruki Yoshikawa
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada
- Samantha Corapi
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
- Gun Deniz Akkoc
  - Forschungszentrum Jülich GmbH, Helmholtz Institute for Renewable Energy Erlangen-Nürnberg, Cauerstr. 1, 91058 Erlangen, Germany
  - Department of Chemical and Biological Engineering, Friedrich-Alexander Universität Erlangen-Nürnberg, Egerlandstr. 3, 91058 Erlangen, Germany
- Felix Strieth-Kalthoff
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - School of Mathematics and Natural Sciences, University of Wuppertal, Gaußstraße 20, 42119 Wuppertal, Germany
- Martin Seifrid
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - Department of Materials Science and Engineering, North Carolina State University, Raleigh, North Carolina 27695, United States of America
- Alán Aspuru-Guzik
  - Department of Chemistry, University of Toronto, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, 40 St. George St, Toronto, Ontario M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, 661 University Ave Suite 710, Toronto, Ontario M5G 1M1, Canada
  - Acceleration Consortium, 80 St. George St, Toronto, Ontario M5S 3H6, Canada
  - Department of Chemical Engineering & Applied Chemistry, University of Toronto, Toronto, Ontario M5S 3E5, Canada
  - Department of Materials Science & Engineering, University of Toronto, Toronto, Ontario M5S 3E4, Canada
  - Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), 661 University Ave, Toronto, Ontario M5G 1M1, Canada

3
Kalikadien AV, Mirza A, Hossaini AN, Sreenithya A, Pidko EA. Paving the road towards automated homogeneous catalyst design. Chempluschem 2024; 89:e202300702. [PMID: 38279609 DOI: 10.1002/cplu.202300702] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 12/20/2023] [Indexed: 01/28/2024]
Abstract
In the past decade, computational tools have become integral to catalyst design. They continue to offer significant support to experimental organic synthesis and catalysis researchers aiming for optimal reaction outcomes. More recently, data-driven approaches utilizing machine learning have garnered considerable attention for their expansive capabilities. This Perspective provides an overview of diverse initiatives in the realm of computational catalyst design and introduces our automated tools tailored for high-throughput in silico exploration of the chemical space. While valuable insights are gained through methods for high-throughput in silico exploration and analysis of chemical space, their degree of automation and modularity are key. We argue that the integration of data-driven, automated and modular workflows is key to enhancing homogeneous catalyst design on an unprecedented scale, contributing to the advancement of catalysis research.
Affiliation(s)
- Adarsh V Kalikadien
  - Inorganic Systems Engineering, Department of Chemical Engineering, Faculty of Applied Sciences, Delft University of Technology, Van der Maasweg 9, 2629 HZ, Delft, The Netherlands
- Adrian Mirza
  - Inorganic Systems Engineering, Department of Chemical Engineering, Faculty of Applied Sciences, Delft University of Technology, Van der Maasweg 9, 2629 HZ, Delft, The Netherlands
- Aydin Najl Hossaini
  - Inorganic Systems Engineering, Department of Chemical Engineering, Faculty of Applied Sciences, Delft University of Technology, Van der Maasweg 9, 2629 HZ, Delft, The Netherlands
- Avadakkam Sreenithya
  - Inorganic Systems Engineering, Department of Chemical Engineering, Faculty of Applied Sciences, Delft University of Technology, Van der Maasweg 9, 2629 HZ, Delft, The Netherlands
- Evgeny A Pidko
  - Inorganic Systems Engineering, Department of Chemical Engineering, Faculty of Applied Sciences, Delft University of Technology, Van der Maasweg 9, 2629 HZ, Delft, The Netherlands

4
Schoepfer A, Laplaza R, Wodrich MD, Waser J, Corminboeuf C. Reaction-Agnostic Featurization of Bidentate Ligands for Bayesian Ridge Regression of Enantioselectivity. ACS Catal 2024; 14:9302-9312. [PMID: 38933467 PMCID: PMC11197013 DOI: 10.1021/acscatal.4c02452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2024] [Revised: 05/22/2024] [Accepted: 05/22/2024] [Indexed: 06/28/2024]
Abstract
Chiral ligands are important components in asymmetric homogeneous catalysis, but their synthesis and screening can be both time-consuming and resource-intensive. Data-driven approaches, in contrast to screening procedures based on intuition, have the potential to reduce the time and resources needed for reaction optimization by more rapidly identifying an ideal catalyst. These approaches, however, are often nontransferable and cannot be applied across different reactions. To overcome this drawback, we introduce a general featurization strategy for bidentate ligands that is coupled with an automated feature selection pipeline and Bayesian ridge regression to perform multivariate linear regression modeling. This approach, which is applicable to any reaction, incorporates electronic, steric, and topological features (rigidity/flexibility, branching, geometry, and constitution) and is well-suited for early stage ligand optimization. Using only small data sets, our workflow capably predicts the enantioselectivity of four metal-catalyzed asymmetric reactions. Uncertainty estimates provided by Bayesian ridge regression permit the use of Bayesian optimization to efficiently explore pools of prospective ligands. Finally, we constructed the BDL-Cu-2023 data set, composed of 312 bidentate ligands extracted from the Cambridge Structural Database, and screened it with this procedure to identify ligand candidates for a challenging asymmetric oxy-alkynylation reaction.
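The abstract notes that the uncertainty estimates from Bayesian ridge regression make Bayesian optimization over a pool of ligands possible. One common way to combine a predicted mean with a predicted standard deviation is an upper-confidence-bound (UCB) score. The abstract does not say which acquisition function the authors use, so the sketch below is an assumption rather than their method. The ligand names, the predicted values, and the `kappa` parameter are hypothetical placeholders.

```python
from typing import Dict, List, Tuple


def upper_confidence_bound(mean: float, std: float, kappa: float = 1.0) -> float:
    """Score a candidate by its predicted mean plus kappa times its predictive std."""
    return mean + kappa * std


def rank_ligands(
    predictions: Dict[str, Tuple[float, float]], kappa: float = 1.0
) -> List[Tuple[str, float]]:
    """Return (ligand, score) pairs sorted from most to least promising."""
    scored = [
        (name, upper_confidence_bound(mean, std, kappa))
        for name, (mean, std) in predictions.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical (mean, std) predictions for three made-up ligands.
# These numbers are placeholders for illustration, not results from the paper.
candidates = {
    "L1": (1.2, 0.1),
    "L2": (1.0, 0.5),
    "L3": (0.8, 0.2),
}

exploit_order = [name for name, _ in rank_ligands(candidates, kappa=0.0)]
explore_order = [name for name, _ in rank_ligands(candidates, kappa=1.0)]
```

With `kappa=0.0` the ranking follows the predicted mean alone. A larger `kappa` promotes candidates whose prediction is less certain, which is how an uncertainty-aware model can steer the next experiments toward unexplored ligands.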
Affiliation(s)
- Alexandre A. Schoepfer
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
  - Laboratory of Catalysis and Organic Synthesis, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
  - National Center for Competence in Research-Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Ruben Laplaza
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
  - National Center for Competence in Research-Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Matthew D. Wodrich
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
  - National Center for Competence in Research-Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Jerome Waser
  - Laboratory of Catalysis and Organic Synthesis, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
  - National Center for Competence in Research-Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Clemence Corminboeuf
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
  - National Center for Competence in Research-Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland

5
Raghavan P, Rago AJ, Verma P, Hassan MM, Goshu GM, Dombrowski AW, Pandey A, Coley CW, Wang Y. Incorporating Synthetic Accessibility in Drug Design: Predicting Reaction Yields of Suzuki Cross-Couplings by Leveraging AbbVie's 15-Year Parallel Library Data Set. J Am Chem Soc 2024; 146:15070-15084. [PMID: 38768950 PMCID: PMC11157529 DOI: 10.1021/jacs.4c00098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Revised: 04/24/2024] [Accepted: 04/25/2024] [Indexed: 05/22/2024]
Abstract
Despite the increased use of computational tools to supplement medicinal chemists' expertise and intuition in drug design, predicting synthetic yields in medicinal chemistry endeavors remains an unsolved challenge. Existing design workflows could profoundly benefit from reaction yield prediction, as precious material waste could be reduced, and a greater number of relevant compounds could be delivered to advance the design, make, test, analyze (DMTA) cycle. In this work, we detail the evaluation of AbbVie's medicinal chemistry library data set to build machine learning models for the prediction of Suzuki coupling reaction yields. The combination of density functional theory (DFT)-derived features and Morgan fingerprints was identified to perform better than one-hot encoded baseline modeling, furnishing encouraging results. Overall, we observe modest generalization to unseen reactant structures within the 15-year retrospective library data set. Additionally, we compare predictions made by the model to those made by expert medicinal chemists, finding that the model can often predict both reaction success and reaction yields with greater accuracy. Finally, we demonstrate the application of this approach to suggest structurally and electronically similar building blocks to replace those predicted or observed to be unsuccessful prior to or after synthesis, respectively. The yield prediction model was used to select similar monomers predicted to have higher yields, resulting in greater synthesis efficiency of relevant drug-like molecules.
Affiliation(s)
- Priyanka Raghavan
  - Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, Massachusetts 02139, United States
- Alexander J. Rago
  - Advanced Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Pritha Verma
  - Advanced Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Majdi M. Hassan
  - RAIDERS Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Gashaw M. Goshu
  - Advanced Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Amanda W. Dombrowski
  - Advanced Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Abhishek Pandey
  - RAIDERS Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
- Connor W. Coley
  - Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, Massachusetts 02139, United States
- Ying Wang
  - Advanced Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States

6
Luchini G, Paton RS. Bottom-Up Atomistic Descriptions of Top-Down Macroscopic Measurements: Computational Benchmarks for Hammett Electronic Parameters. ACS Physical Chemistry Au 2024; 4:259-267. [PMID: 38800724 PMCID: PMC11117679 DOI: 10.1021/acsphyschemau.3c00045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2023] [Revised: 01/14/2024] [Accepted: 01/16/2024] [Indexed: 05/29/2024]
Abstract
The ability to relate substituent electronic effects to chemical reactivity is a cornerstone of physical organic chemistry and Linear Free Energy Relationships. The computation of electronic parameters is increasingly attractive since they can be obtained rapidly for structures and substituents without available experimental data and can be applied beyond aromatic substituents, for example, in studies of transition metal complexes and aliphatic and radical systems. Nevertheless, the description of "top-down" macroscopic observables, such as Hammett parameters using a "bottom-up" computational approach, poses several challenges for the practitioner. We have examined and benchmarked the performance of various computational charge schemes encompassing quantum mechanical methods that partition charge density, methods that fit charge to physical observables, and methods enhanced by semiempirical adjustments alongside NMR values. We study the locations of the atoms used to obtain these descriptors and their correlation with empirical Hammett parameters and rate differences resulting from electronic effects. These seemingly small choices have a much more significant impact than previously imagined, which outweighs the level of theory or basis set used. We observe a wide range of performance across the different computational protocols and observe stark and surprising differences in the ability of computational parameters to capture para- vs meta-electronic effects. In general, σm predictions fare much worse than σp. As a result, the choice of where to compute these descriptors (for the ring carbons or the attached H or other substituent atoms) affects their ability to capture experimental electronic differences. Density-based schemes, such as Hirshfeld charges, are more stable toward unphysical charge perturbations that result from nearby functional groups and outperform all other computational descriptors, including several commonly used basis set based schemes such as Natural Population Analysis. Using attached atoms also improves the statistical correlations. We obtained general linear relationships for the global prediction of experimental Hammett parameters from computed descriptors for use in statistical modeling studies.
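As a rough illustration of what a linear relationship between a computed descriptor and a Hammett parameter looks like in practice, the sketch below fits σ ≈ slope · q + intercept by ordinary least squares. The helper names `fit_line` and `predict_sigma` are hypothetical. The descriptor and σ values are synthetic placeholders chosen to lie on an exact line, not data or coefficients from the paper, and the authors' actual fitting protocol may differ.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all descriptor values are identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic placeholder data (NOT from the paper): a made-up computed
# descriptor q per substituent and a made-up "experimental" sigma value.
descriptor = [0.0, 1.0, 2.0, 3.0]
sigma = [-0.2, 0.1, 0.4, 0.7]

slope, intercept = fit_line(descriptor, sigma)


def predict_sigma(q: float) -> float:
    """Predict a Hammett-like parameter for a new descriptor value."""
    return slope * q + intercept
```

Once fitted, `predict_sigma` can be applied to a substituent that has a computed descriptor but no tabulated experimental value, which is the use case the abstract describes for statistical modeling studies.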
Affiliation(s)
- Guilian Luchini
  - Department of Chemistry, Colorado State University, 1301 Center Ave., Ft. Collins, Colorado 80523-1872, United States
- Robert S. Paton
  - Department of Chemistry, Colorado State University, 1301 Center Ave., Ft. Collins, Colorado 80523-1872, United States

7
van Gerwen P, Briling KR, Calvino Alonso Y, Franke M, Corminboeuf C. Benchmarking machine-readable vectors of chemical reactions on computed activation barriers. Digital Discovery 2024; 3:932-943. [PMID: 38756222 PMCID: PMC11094696 DOI: 10.1039/d3dd00175j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Accepted: 02/28/2024] [Indexed: 05/18/2024]
Abstract
In recent years, there has been a surge of interest in predicting computed activation barriers, to enable the acceleration of the automated exploration of reaction networks. Consequently, various predictive approaches have emerged, ranging from graph-based models to methods based on the three-dimensional structure of reactants and products. In tandem, many representations have been developed to predict experimental targets, which may hold promise for barrier prediction as well. Here, we bring together all of these efforts and benchmark various methods (Morgan fingerprints, the DRFP, the CGR representation-based Chemprop, SLATMd, B2Rl2, EquiReact and language model BERT + RXNFP) for the prediction of computed activation barriers on three diverse datasets.
Affiliation(s)
- Puck van Gerwen
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne 1015 Lausanne Switzerland
  - National Center for Competence in Research-Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne 1015 Lausanne Switzerland
- Ksenia R Briling
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne 1015 Lausanne Switzerland
- Yannick Calvino Alonso
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne 1015 Lausanne Switzerland
- Malte Franke
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne 1015 Lausanne Switzerland
- Clemence Corminboeuf
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne 1015 Lausanne Switzerland
  - National Center for Competence in Research-Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne 1015 Lausanne Switzerland

8
Williams WL, Gutiérrez-Valencia NE, Doyle AG. Branched-Selective Cross-Electrophile Coupling of 2-Alkyl Aziridines and (Hetero)aryl Iodides Using Ti/Ni Catalysis. J Am Chem Soc 2023; 145:24175-24183. [PMID: 37888947 DOI: 10.1021/jacs.3c08301] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 10/28/2023]
Abstract
The arylation of 2-alkyl aziridines by nucleophilic ring-opening or transition-metal-catalyzed cross-coupling enables facile access to biologically relevant β-phenethylamine derivatives. However, both approaches largely favor C-C bond formation at the less-substituted carbon of the aziridine, thus enabling access to only linear products. Consequently, despite the attractive bond disconnection that it poses, the synthesis of branched arylated products from 2-alkyl aziridines has remained inaccessible. Herein, we address this long-standing challenge and report the first branched-selective cross-coupling of 2-alkyl aziridines with aryl iodides. This unique selectivity is enabled by a Ti/Ni dual-catalytic system. We demonstrate the robustness of the method by a twofold approach: an additive screening campaign to probe functional group tolerance and a feature-driven substrate scope to study the effect of the local steric and electronic profile of each coupling partner on reactivity. Furthermore, the diversity of this feature-driven substrate scope enabled the generation of predictive reactivity models that guided mechanistic understanding. Mechanistic studies demonstrated that the branched selectivity arises from a Ti(III)-induced radical ring-opening of the aziridine.
Affiliation(s)
- Wendy L Williams
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, California 90095, United States
- Neyci E Gutiérrez-Valencia
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, California 90095, United States
- Abigail G Doyle
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, California 90095, United States

9
van Dijk L, Haas BC, Lim NK, Clagg K, Dotson JJ, Treacy SM, Piechowicz KA, Roytman VA, Zhang H, Toste FD, Miller SJ, Gosselin F, Sigman MS. Data Science-Enabled Palladium-Catalyzed Enantioselective Aryl-Carbonylation of Sulfonimidamides. J Am Chem Soc 2023; 145:20959-20967. [PMID: 37656964 DOI: 10.1021/jacs.3c06674] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 09/03/2023]
Abstract
New methods for the general asymmetric synthesis of sulfonimidamides are of great interest due to their applications in medicinal chemistry, agrochemical discovery, and academic research. We report a palladium-catalyzed cross-coupling method for the enantioselective aryl-carbonylation of sulfonimidamides. Using data science techniques, a virtual library of calculated bisphosphine ligand descriptors was used to guide reaction optimization by effectively sampling the catalyst chemical space. The optimized conditions identified using this approach provided the desired product in excellent yield and enantioselectivity. As the next step, a data science-driven strategy was also used to explore a diverse set of aryl and heteroaryl iodides, providing key information about the scope and limitations of the method. Furthermore, we tested a range of racemic sulfonimidamides for compatibility of this coupling partner. The developed method offers a general and efficient strategy for accessing enantioenriched sulfonimidamides, which should facilitate their application in industrial and academic settings.
Affiliation(s)
- Lucy van Dijk
  - Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Brittany C Haas
  - Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Ngiap-Kie Lim
  - Department of Small Molecule Process Chemistry, Genentech, Inc., South San Francisco, California 94080, United States
- Kyle Clagg
  - Department of Small Molecule Process Chemistry, Genentech, Inc., South San Francisco, California 94080, United States
- Jordan J Dotson
  - Department of Small Molecule Process Chemistry, Genentech, Inc., South San Francisco, California 94080, United States
- Sean M Treacy
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
- Katarzyna A Piechowicz
  - Department of Small Molecule Process Chemistry, Genentech, Inc., South San Francisco, California 94080, United States
- Vladislav A Roytman
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
- Haiming Zhang
  - Department of Small Molecule Process Chemistry, Genentech, Inc., South San Francisco, California 94080, United States
- F Dean Toste
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
- Scott J Miller
  - Department of Chemistry, Yale University, New Haven, Connecticut 06511, United States
- Francis Gosselin
  - Department of Small Molecule Process Chemistry, Genentech, Inc., South San Francisco, California 94080, United States
- Matthew S Sigman
  - Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States

10
Ruos ME, Kinney RG, Ring OT, Doyle AG. A General Photocatalytic Strategy for Nucleophilic Amination of Primary and Secondary Benzylic C-H Bonds. J Am Chem Soc 2023; 145:18487-18496. [PMID: 37565772 DOI: 10.1021/jacs.3c04912] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Indexed: 08/12/2023]
Abstract
We report a visible-light photoredox-catalyzed method that enables nucleophilic amination of primary and secondary benzylic C(sp3)-H bonds. A novel amidyl radical precursor and organic photocatalyst operate in tandem to transform primary and secondary benzylic C(sp3)-H bonds into carbocations via sequential hydrogen atom transfer (HAT) and oxidative radical-polar crossover. The resulting carbocation can be intercepted by a variety of N-centered nucleophiles, including nitriles (Ritter reaction), amides, carbamates, sulfonamides, and azoles, for the construction of pharmaceutically relevant C(sp3)-N bonds under unified reaction conditions. Mechanistic studies indicate that HAT is amidyl radical-mediated and that the photocatalyst operates via a reductive quenching pathway. These findings establish a mild, metal-free, and modular protocol for the rapid diversification of C(sp3)-H bonds to a library of aminated products.
Affiliation(s)
- Madeline E Ruos
  - Department of Chemistry and Biochemistry, University of California-Los Angeles, Los Angeles, California 90095, United States
- R Garrison Kinney
  - Department of Chemistry and Biochemistry, University of California-Los Angeles, Los Angeles, California 90095, United States
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Oliver T Ring
  - Department of Chemistry and Biochemistry, University of California-Los Angeles, Los Angeles, California 90095, United States
  - Early Chemical Development, Pharmaceutical Sciences, Biopharmaceuticals R&D, AstraZeneca, Gothenburg, SE-431 83 Mölndal, Sweden
- Abigail G Doyle
  - Department of Chemistry and Biochemistry, University of California-Los Angeles, Los Angeles, California 90095, United States

11
Chen K, Chen G, Li J, Huang Y, Wang E, Hou T, Heng PA. MetaRF: attention-based random forest for reaction yield prediction with a few trails. J Cheminform 2023; 15:43. [PMID: 37038222 PMCID: PMC10084704 DOI: 10.1186/s13321-023-00715-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/03/2023] [Accepted: 03/21/2023] [Indexed: 04/12/2023] Open
Abstract
Artificial intelligence has deeply revolutionized the field of medicinal chemistry with many impressive applications, but the success of these applications requires a massive amount of training samples with high-quality annotations, which seriously limits the wide usage of data-driven methods. In this paper, we focus on the reaction yield prediction problem, which assists chemists in selecting high-yield reactions in a new chemical space only with a few experimental trials. To attack this challenge, we first put forth MetaRF, an attention-based random forest model specially designed for the few-shot yield prediction, where the attention weight of a random forest is automatically optimized by the meta-learning framework and can be quickly adapted to predict the performance of new reagents while given a few additional samples. To improve the few-shot learning performance, we further introduce a dimension-reduction based sampling method to determine valuable samples to be experimentally tested and then learned. Our methodology is evaluated on three different datasets and acquires satisfactory performance on few-shot prediction. In high-throughput experimentation (HTE) datasets, the average yield of our methodology's top 10 high-yield reactions is relatively close to the results of ideal yield selection.
Affiliation(s)
- Kexin Chen
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, New Territories, Hong Kong SAR
| | - Yuansheng Huang
- College of Pharmaceutical Sciences, Zhejiang University, Zhejiang, China
| | - Ercheng Wang
- Zhejiang Lab, Zhejiang, China
- College of Pharmaceutical Sciences, Zhejiang University, Zhejiang, China
| | - Tingjun Hou
- College of Pharmaceutical Sciences, Zhejiang University, Zhejiang, China
| | - Pheng-Ann Heng
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, New Territories, Hong Kong SAR
- Zhejiang Lab, Zhejiang, China
| |
|
12
|
Tsuji N, Sidorov P, Zhu C, Nagata Y, Gimadiev T, Varnek A, List B. Predicting Highly Enantioselective Catalysts Using Tunable Fragment Descriptors. Angew Chem Int Ed Engl 2023; 62:e202218659. [PMID: 36688354 DOI: 10.1002/anie.202218659] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 12/17/2022] [Revised: 01/17/2023] [Accepted: 01/19/2023] [Indexed: 01/24/2023]
Abstract
Catalyst optimization processes typically rely on inductive and qualitative assumptions of chemists based on screening data. While machine learning models using molecular properties or calculated 3D structures enable quantitative data evaluation, costly quantum chemical calculations are often required. In contrast, readily available binary fingerprint descriptors are time- and cost-efficient, but their predictive performance remains insufficient. Here, we describe a machine learning model based on fragment descriptors, which are fine-tuned for asymmetric catalysis and represent cyclic or polyaromatic hydrocarbons, enabling robust and efficient virtual screening. Using training data with only moderate selectivities, we designed theoretically and validated experimentally new catalysts showing higher selectivities in a challenging asymmetric tetrahydropyran synthesis.
Affiliation(s)
- Nobuya Tsuji
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Sapporo, 001-0021, Japan
| | - Pavel Sidorov
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Sapporo, 001-0021, Japan
| | - Chendan Zhu
- Max-Planck-Institut für Kohlenforschung, 45470, Mülheim an der Ruhr, Germany
| | - Yuuya Nagata
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Sapporo, 001-0021, Japan
| | - Timur Gimadiev
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Sapporo, 001-0021, Japan
| | - Alexandre Varnek
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Sapporo, 001-0021, Japan
- Laboratory of Chemoinformatics, UMR 7140, CNRS, University of Strasbourg, 67081, Strasbourg, France
| | - Benjamin List
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Sapporo, 001-0021, Japan
- Max-Planck-Institut für Kohlenforschung, 45470, Mülheim an der Ruhr, Germany
| |
|
13
|
Alegre‐Requena JV, Sowndarya S. V. S, Pérez‐Soto R, Alturaifi TM, Paton RS. AQME: Automated quantum mechanical environments for researchers and educators. WIREs Computational Molecular Science 2023. [DOI: 10.1002/wcms.1663] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2023]
Affiliation(s)
- Juan V. Alegre‐Requena
- Dpto. de Química Inorgánica, Instituto de Síntesis Química y Catálisis Homogénea (ISQCH), CSIC‐Universidad de Zaragoza, Zaragoza, Spain
| | - Raúl Pérez‐Soto
- Department of Chemistry, Colorado State University, Fort Collins, Colorado, USA
| | - Turki M. Alturaifi
- Department of Chemistry, Colorado State University, Fort Collins, Colorado, USA
| | - Robert S. Paton
- Department of Chemistry, Colorado State University, Fort Collins, Colorado, USA
| |
|
14
|
Schleinitz J, Langevin M, Smail Y, Wehnert B, Grimaud L, Vuilleumier R. Machine Learning Yield Prediction from NiCOlit, a Small-Size Literature Data Set of Nickel Catalyzed C-O Couplings. J Am Chem Soc 2022; 144:14722-14730. [PMID: 35939717 DOI: 10.1021/jacs.2c05302] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 11/28/2022]
Abstract
Synthetic yield prediction using machine learning is intensively studied. Previous work has focused on two categories of data sets: high-throughput experimentation data, as an ideal case study, and data sets extracted from proprietary databases, which are known to have a strong reporting bias toward high yields. However, predicting yields using published reaction data remains elusive. To fill the gap, we built a data set on nickel-catalyzed cross-couplings extracted from organic reaction publications, including scope and optimization information. We demonstrate the importance of including optimization data as a source of failed experiments and emphasize how publication constraints shape the exploration of the chemical space by the synthetic community. While machine learning models still fail to perform out-of-sample predictions, this work shows that adding chemical knowledge enables fair predictions in a low-data regime. Eventually, we hope that this unique public database will foster further improvements of machine learning methods for reaction yield prediction in a more realistic context.
Affiliation(s)
- Jules Schleinitz
- LBM, Département de Chimie, École Normale Supérieure, PSL University, Sorbonne Université, CNRS, 75005 Paris, France
| | - Maxime Langevin
- PASTEUR, Département de Chimie, École Normale Supérieure, PSL University, Sorbonne Université, CNRS, 75005 Paris, France
- Molecular Design Sciences - Integrated Drug Discovery, Sanofi R&D, 94400 Vitry-Sur-Seine, France
| | - Yanis Smail
- UPMC, PSL University, Sorbonne Université, CNRS, 75005 Paris, France
| | - Benjamin Wehnert
- UPMC, PSL University, Sorbonne Université, CNRS, 75005 Paris, France
| | - Laurence Grimaud
- LBM, Département de Chimie, École Normale Supérieure, PSL University, Sorbonne Université, CNRS, 75005 Paris, France
| | - Rodolphe Vuilleumier
- PASTEUR, Département de Chimie, École Normale Supérieure, PSL University, Sorbonne Université, CNRS, 75005 Paris, France
| |
|