1
|
Chen LY, Li YP. Machine learning-guided strategies for reaction conditions design and optimization. Beilstein J Org Chem 2024; 20:2476-2492. [PMID: 39376489 PMCID: PMC11457048 DOI: 10.3762/bjoc.20.212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2024] [Accepted: 09/19/2024] [Indexed: 10/09/2024] Open
Abstract
This review surveys the recent advances and challenges in predicting and optimizing reaction conditions using machine learning techniques. The paper emphasizes the importance of acquiring and processing large and diverse datasets of chemical reactions, and the use of both global and local models to guide the design of synthetic processes. Global models exploit the information from comprehensive databases to suggest general reaction conditions for new reactions, while local models fine-tune the specific parameters for a given reaction family to improve yield and selectivity. The paper also identifies the current limitations and opportunities in this field, such as the data quality and availability, and the integration of high-throughput experimentation. The paper demonstrates how the combination of chemical engineering, data science, and ML algorithms can enhance the efficiency and effectiveness of reaction conditions design, and enable novel discoveries in synthetic chemistry.
Collapse
Affiliation(s)
- Lung-Yi Chen
- Department of Chemical Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei 10617, Taiwan
| | - Yi-Pei Li
- Department of Chemical Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei 10617, Taiwan
- Taiwan International Graduate Program on Sustainable Chemical Science and Technology (TIGP-SCST), No. 128, Sec. 2, Academia Road, Taipei 11529, Taiwan
| |
Collapse
|
2
|
Raghavan P, Rago AJ, Verma P, Hassan MM, Goshu GM, Dombrowski AW, Pandey A, Coley CW, Wang Y. Incorporating Synthetic Accessibility in Drug Design: Predicting Reaction Yields of Suzuki Cross-Couplings by Leveraging AbbVie's 15-Year Parallel Library Data Set. J Am Chem Soc 2024; 146:15070-15084. [PMID: 38768950 PMCID: PMC11157529 DOI: 10.1021/jacs.4c00098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2024] [Revised: 04/24/2024] [Accepted: 04/25/2024] [Indexed: 05/22/2024]
Abstract
Despite the increased use of computational tools to supplement medicinal chemists' expertise and intuition in drug design, predicting synthetic yields in medicinal chemistry endeavors remains an unsolved challenge. Existing design workflows could profoundly benefit from reaction yield prediction, as precious material waste could be reduced, and a greater number of relevant compounds could be delivered to advance the design, make, test, analyze (DMTA) cycle. In this work, we detail the evaluation of AbbVie's medicinal chemistry library data set to build machine learning models for the prediction of Suzuki coupling reaction yields. The combination of density functional theory (DFT)-derived features and Morgan fingerprints was identified to perform better than one-hot encoded baseline modeling, furnishing encouraging results. Overall, we observe modest generalization to unseen reactant structures within the 15-year retrospective library data set. Additionally, we compare predictions made by the model to those made by expert medicinal chemists, finding that the model can often predict both reaction success and reaction yields with greater accuracy. Finally, we demonstrate the application of this approach to suggest structurally and electronically similar building blocks to replace those predicted or observed to be unsuccessful prior to or after synthesis, respectively. The yield prediction model was used to select similar monomers predicted to have higher yields, resulting in greater synthesis efficiency of relevant drug-like molecules.
Collapse
Affiliation(s)
- Priyanka Raghavan
- Department
of Chemical Engineering, Massachusetts Institute
of Technology, 77 Massachusetts Ave, Cambridge, Massachusetts 02139, United States
| | - Alexander J. Rago
- Advanced
Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
| | - Pritha Verma
- Advanced
Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
| | - Majdi M. Hassan
- RAIDERS
Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
| | - Gashaw M. Goshu
- Advanced
Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
| | - Amanda W. Dombrowski
- Advanced
Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
| | - Abhishek Pandey
- RAIDERS
Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
| | - Connor W. Coley
- Department
of Chemical Engineering, Massachusetts Institute
of Technology, 77 Massachusetts Ave, Cambridge, Massachusetts 02139, United States
| | - Ying Wang
- Advanced
Chemistry Technologies Group, AbbVie, Inc., 1 N Waukegan Rd, North Chicago, Illinois 60064, United States
| |
Collapse
|
3
|
Ghiandoni GM, Evertsson E, Riley DJ, Tyrchan C, Rathi PC. Augmenting DMTA using predictive AI modelling at AstraZeneca. Drug Discov Today 2024; 29:103945. [PMID: 38460568 DOI: 10.1016/j.drudis.2024.103945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2023] [Revised: 02/27/2024] [Accepted: 03/05/2024] [Indexed: 03/11/2024]
Abstract
Design-Make-Test-Analyse (DMTA) is the discovery cycle through which molecules are designed, synthesised, and assayed to produce data that in turn are analysed to inform the next iteration. The process is repeated until viable drug candidates are identified, often requiring many cycles before reaching a sweet spot. The advent of artificial intelligence (AI) and cloud computing presents an opportunity to innovate drug discovery to reduce the number of cycles needed to yield a candidate. Here, we present the Predictive Insight Platform (PIP), a cloud-native modelling platform developed at AstraZeneca. The impact of PIP in each step of DMTA, as well as its architecture, integration, and usage, are discussed and used to provide insights into the future of drug discovery.
Collapse
Affiliation(s)
- Gian Marco Ghiandoni
- Augmented DMTA Platform, R&D IT, AstraZeneca, The Discovery Centre (DISC), Francis Crick Avenue, Cambridge CB2 0AA, UK.
| | - Emma Evertsson
- Research and Early Development, Respiratory and Immunology (R&I), Biopharmaceuticals R&D, AstraZeneca, Pepparedsleden, Mölndal, SE 43183, Sweden
| | - David J Riley
- Augmented DMTA Platform, R&D IT, AstraZeneca, The Discovery Centre (DISC), Francis Crick Avenue, Cambridge CB2 0AA, UK
| | - Christian Tyrchan
- Research and Early Development, Respiratory and Immunology (R&I), Biopharmaceuticals R&D, AstraZeneca, Pepparedsleden, Mölndal, SE 43183, Sweden
| | - Prakash Chandra Rathi
- Augmented DMTA Platform, R&D IT, AstraZeneca, The Discovery Centre (DISC), Francis Crick Avenue, Cambridge CB2 0AA, UK
| |
Collapse
|
4
|
Voinarovska V, Kabeshov M, Dudenko D, Genheden S, Tetko IV. When Yield Prediction Does Not Yield Prediction: An Overview of the Current Challenges. J Chem Inf Model 2024; 64:42-56. [PMID: 38116926 PMCID: PMC10778086 DOI: 10.1021/acs.jcim.3c01524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2023] [Revised: 11/29/2023] [Accepted: 11/30/2023] [Indexed: 12/21/2023]
Abstract
Machine Learning (ML) techniques face significant challenges when predicting advanced chemical properties, such as yield, feasibility of chemical synthesis, and optimal reaction conditions. These challenges stem from the high-dimensional nature of the prediction task and the myriad essential variables involved, ranging from reactants and reagents to catalysts, temperature, and purification processes. Successfully developing a reliable predictive model not only holds the potential for optimizing high-throughput experiments but can also elevate existing retrosynthetic predictive approaches and bolster a plethora of applications within the field. In this review, we systematically evaluate the efficacy of current ML methodologies in chemoinformatics, shedding light on their milestones and inherent limitations. Additionally, a detailed examination of a representative case study provides insights into the prevailing issues related to data availability and transferability in the discipline.
Collapse
Affiliation(s)
- Varvara Voinarovska
- Molecular
AI, Discovery Sciences R&D, AstraZeneca, 431 83 Gothenburg, Sweden
- TUM
Graduate School, Faculty of Chemistry, Technical
University of Munich, 85748 Garching, Germany
| | - Mikhail Kabeshov
- Molecular
AI, Discovery Sciences R&D, AstraZeneca, 431 83 Gothenburg, Sweden
| | - Dmytro Dudenko
- Enamine
Ltd., 78 Chervonotkatska str., 02094 Kyiv, Ukraine
| | - Samuel Genheden
- Molecular
AI, Discovery Sciences R&D, AstraZeneca, 431 83 Gothenburg, Sweden
| | - Igor V. Tetko
- Molecular
Targets and Therapeutics Center, Helmholtz Munich − Deutsches
Forschungszentrum für Gesundheit und Umwelt (GmbH), Institute of Structural Biology, 85764 Neuherberg, Germany
| |
Collapse
|
5
|
Liu W, Mulhearn J, Hao B, Cañellas S, Last S, Gómez JE, Jones A, De Vera A, Kumar K, Rodríguez R, Van Eynde L, Strambeanu II, Wolkenberg SE. Enabling Deoxygenative C(sp 2)-C(sp 3) Cross-Coupling for Parallel Medicinal Chemistry. ACS Med Chem Lett 2023; 14:853-859. [PMID: 37312855 PMCID: PMC10258906 DOI: 10.1021/acsmedchemlett.3c00118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2023] [Accepted: 05/10/2023] [Indexed: 06/15/2023] Open
Abstract
Herein we report the development of an automated deoxygenative C(sp2)-C(sp3) coupling of aryl bromide with alcohols to enable parallel medicinal chemistry. Alcohols are among the most diverse and abundant building blocks, but their usage as alkyl precursors has been limited. Although metallaphotoredox deoxygenative coupling is becoming a promising strategy to form C(sp2)-C(sp3) bond, the reaction setup limits its widespread application in library synthesis. To achieve high throughput and consistency, an automated workflow involving solid-dosing and liquid-handling robots has been developed. We have successfully demonstrated this high-throughput protocol is robust and consistent across three automation platforms. Furthermore, guided by cheminformatic analysis, we examined alcohols with comprehensive chemical space coverage and established a meaningful scope for medicinal chemistry applications. By accessing the rich diversity of alcohols, this automated protocol has the potential to substantially increase the impact of C(sp2)-C(sp3) cross-coupling in drug discovery.
Collapse
Affiliation(s)
- Wei Liu
- Discovery
Chemistry, Janssen Research & Development
LLC, 1400 McKean Road, Spring House, Pennsylvania 19477, United States
| | - James Mulhearn
- Discovery
Chemistry, Janssen Research & Development
LLC, 1400 McKean Road, Spring House, Pennsylvania 19477, United States
| | - Bo Hao
- Discovery
Chemistry, Janssen Research & Development
LLC, 1400 McKean Road, Spring House, Pennsylvania 19477, United States
| | - Santiago Cañellas
- Discovery
Chemistry, Janssen Research & Development LLC, Janssen-Cilag, S.A., E-45007 Toledo, Spain
| | - Stefaan Last
- Discovery
Chemistry, Janssen Research & Development
LLC, 2340 Beerse, Belgium
| | - José Enrique Gómez
- Discovery
Chemistry, Janssen Research & Development LLC, Janssen-Cilag, S.A., E-45007 Toledo, Spain
| | - Alexander Jones
- Discovery
Chemistry, Janssen Research & Development
LLC, 2340 Beerse, Belgium
| | - Alexander De Vera
- Discovery
Chemistry, Janssen Research & Development
LLC, 1400 McKean Road, Spring House, Pennsylvania 19477, United States
| | - Kiran Kumar
- Discovery
Chemistry, Janssen Research & Development
LLC, 1400 McKean Road, Spring House, Pennsylvania 19477, United States
| | - Raquel Rodríguez
- Discovery
Chemistry, Janssen Research & Development LLC, Janssen-Cilag, S.A., E-45007 Toledo, Spain
| | - Lars Van Eynde
- Discovery
Chemistry, Janssen Research & Development
LLC, 2340 Beerse, Belgium
| | - Iulia I. Strambeanu
- Discovery
Chemistry, Janssen Research & Development
LLC, 1400 McKean Road, Spring House, Pennsylvania 19477, United States
| | - Scott E. Wolkenberg
- Discovery
Chemistry, Janssen Research & Development
LLC, 1400 McKean Road, Spring House, Pennsylvania 19477, United States
| |
Collapse
|
6
|
Neves P, McClure K, Verhoeven J, Dyubankova N, Nugmanov R, Gedich A, Menon S, Shi Z, Wegner JK. Correction: Global reactivity models are impactful in industrial synthesis applications. J Cheminform 2023; 15:30. [PMID: 36864495 PMCID: PMC9983201 DOI: 10.1186/s13321-023-00705-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/04/2023] Open
Affiliation(s)
- Paulo Neves
- In-Silico Discovery and External Innovation (ISDEI), Janssen Research & Development, Janssen Pharmaceutica N.V, Beerse, Belgium.
| | - Kelly McClure
- Discovery Chemistry LJ, Janssen Research & Development, Janssen Pharmaceutica N.V, Philadelphia, USA
| | - Jonas Verhoeven
- grid.419619.20000 0004 0623 0341In-Silico Discovery and External Innovation (ISDEI), Janssen Research & Development, Janssen Pharmaceutica N.V, Beerse, Belgium
| | - Natalia Dyubankova
- grid.419619.20000 0004 0623 0341In-Silico Discovery and External Innovation (ISDEI), Janssen Research & Development, Janssen Pharmaceutica N.V, Beerse, Belgium
| | - Ramil Nugmanov
- grid.419619.20000 0004 0623 0341In-Silico Discovery and External Innovation (ISDEI), Janssen Research & Development, Janssen Pharmaceutica N.V, Beerse, Belgium
| | | | - Sairam Menon
- grid.419619.20000 0004 0623 0341Pharma R&D Information Tech, Janssen Research & Development, Janssen Pharmaceutica N.V, Beerse, Belgium
| | - Zhicai Shi
- Discovery Chemistry LJ, Janssen Research & Development, Janssen Pharmaceutica N.V, Philadelphia, USA
| | - Jörg K. Wegner
- grid.419619.20000 0004 0623 0341In-Silico Discovery and External Innovation (ISDEI), Janssen Research & Development, Janssen Pharmaceutica N.V, Beerse, Belgium
| |
Collapse
|