1
Gricourt G, Meyer P, Duigou T, Faulon JL. Artificial Intelligence Methods and Models for Retro-Biosynthesis: A Scoping Review. ACS Synth Biol 2024; 13:2276-2294. [PMID: 39047143 PMCID: PMC11334239 DOI: 10.1021/acssynbio.4c00091] [Received: 02/09/2024] [Revised: 06/14/2024] [Accepted: 06/14/2024] [Indexed: 07/27/2024]
Abstract
Retrosynthesis aims to efficiently plan the synthesis of desirable chemicals by strategically breaking down molecules into readily available building block compounds. Having a long history in chemistry, retro-biosynthesis has also been used in the fields of biocatalysis and synthetic biology. Artificial intelligence (AI) is driving us toward new frontiers in synthesis planning and the exploration of chemical spaces, arriving at an opportune moment for promoting bioproduction that would better align with green chemistry, enhancing environmental practices. In this review, we summarize the recent advancements in the application of AI methods and models for retrosynthetic and retro-biosynthetic pathway design. These techniques can be based either on reaction templates or generative models and require scoring functions and planning strategies to navigate through the retrosynthetic graph of possibilities. We finally discuss limitations and promising research directions in this field.
Affiliation(s)
- Guillaume Gricourt
  - Université Paris-Saclay, INRAE, AgroParisTech, Micalis Institute, 78350 Jouy-en-Josas, France
- Philippe Meyer
  - Université Paris-Saclay, INRAE, AgroParisTech, Micalis Institute, 78350 Jouy-en-Josas, France
- Thomas Duigou
  - Université Paris-Saclay, INRAE, AgroParisTech, Micalis Institute, 78350 Jouy-en-Josas, France
- Jean-Loup Faulon
  - Université Paris-Saclay, INRAE, AgroParisTech, Micalis Institute, 78350 Jouy-en-Josas, France
  - The University of Manchester, Manchester Institute of Biotechnology, Manchester M1 7DN, U.K.
2
Westerlund AM, Manohar Koki S, Kancharla S, Tibo A, Saigiridharan L, Kabeshov M, Mercado R, Genheden S. Do Chemformers Dream of Organic Matter? Evaluating a Transformer Model for Multistep Retrosynthesis. J Chem Inf Model 2024; 64:3021-3033. [PMID: 38602390 DOI: 10.1021/acs.jcim.3c01685] [Indexed: 04/12/2024]
Abstract
Synthesis planning of new pharmaceutical compounds is a well-known bottleneck in modern drug design. Template-free methods, such as transformers, have recently been proposed as an alternative to template-based methods for single-step retrosynthetic predictions. Here, we trained and evaluated a transformer model, called the Chemformer, for retrosynthesis predictions within drug discovery. The proprietary data set used for training comprised ∼18 M reactions from literature, patents, and electronic lab notebooks. Chemformer was evaluated for the purpose of both single-step and multistep retrosynthesis. We found that the single-step performance of Chemformer was especially good on reaction classes common in drug discovery, with most reaction classes showing a top-10 round-trip accuracy above 0.97. Moreover, Chemformer reached a higher round-trip accuracy compared to that of a template-based model. By analyzing multistep retrosynthesis experiments, we observed that Chemformer found synthetic routes, leading to commercial starting materials for 95% of the target compounds, an increase of more than 20% compared to the template-based model on a proprietary compound data set. In addition to this, we discovered that Chemformer suggested novel disconnections corresponding to reaction templates, which are not included in the template-based model. These findings were further supported by a publicly available ChEMBL compound data set. The conclusions drawn from this work allow for the design of a synthesis planning tool where template-based and template-free models work in harmony to optimize retrosynthetic recommendations.
Affiliation(s)
- Annie M Westerlund
  - Department of Molecular AI, Discovery Sciences, R&D, AstraZeneca, 43183 Mölndal, Sweden
- Siva Manohar Koki
  - Department of Molecular AI, Discovery Sciences, R&D, AstraZeneca, 43183 Mölndal, Sweden
  - Department of Computer Science and Engineering, Chalmers University of Technology, 412 96 Göteborg, Sweden
- Supriya Kancharla
  - Department of Molecular AI, Discovery Sciences, R&D, AstraZeneca, 43183 Mölndal, Sweden
  - Department of Computer Science and Engineering, Chalmers University of Technology, 412 96 Göteborg, Sweden
- Alessandro Tibo
  - Department of Molecular AI, Discovery Sciences, R&D, AstraZeneca, 43183 Mölndal, Sweden
- Mikhail Kabeshov
  - Department of Molecular AI, Discovery Sciences, R&D, AstraZeneca, 43183 Mölndal, Sweden
- Rocío Mercado
  - Department of Computer Science and Engineering, Chalmers University of Technology, 412 96 Göteborg, Sweden
- Samuel Genheden
  - Department of Molecular AI, Discovery Sciences, R&D, AstraZeneca, 43183 Mölndal, Sweden
3
Esheh J, Affes S. Effectiveness of Data Augmentation for Localization in WSNs Using Deep Learning for the Internet of Things. Sensors (Basel, Switzerland) 2024; 24:430. [PMID: 38257522 PMCID: PMC11154441 DOI: 10.3390/s24020430] [Received: 11/20/2023] [Revised: 12/14/2023] [Accepted: 12/22/2023] [Indexed: 01/24/2024]
Abstract
Wireless sensor networks (WSNs) have become widely popular and are extensively used for various sensor communication applications due to their flexibility and cost effectiveness, especially for applications where localization is a main challenge. Furthermore, the Dv-hop algorithm is a range-free localization algorithm commonly used in WSNs. Despite its simplicity and low hardware requirements, it does suffer from limitations in terms of localization accuracy. In this article, we develop an accurate Deep Learning (DL)-based range-free localization for WSN applications in the Internet of things (IoT). To improve the localization performance, we exploit a deep neural network (DNN) to correct the estimated distance between the unknown nodes (i.e., position-unaware) and the anchor nodes (i.e., position-aware) without burdening the IoT cost. DL needs large training data to yield accurate results, and the DNN is no stranger. The efficacy of machine learning, including DNNs, hinges on access to substantial training data for optimal performance. However, to address this challenge, we propose a solution through the implementation of a Data Augmentation Strategy (DAS). This strategy involves the strategic creation of multiple virtual anchors around the existing real anchors. Consequently, this process generates more training data and significantly increases data size. We prove that DAS can provide the DNNs with sufficient training data, and ultimately making it more feasible for WSNs and the IoT to fully benefit from low-cost DNN-aided localization. The simulation results indicate that the accuracy of the proposed (Dv-hop with DNN correction) surpasses that of Dv-hop.
Affiliation(s)
- Jehan Esheh
  - EMT Centre (Energy, Materials and Telecommunications), INRS (Institut National de la Recherche Scientifique), Université du Québec, Montréal, QC H5A 1K6, Canada
- Sofiene Affes
  - EMT Centre (Energy, Materials and Telecommunications), INRS (Institut National de la Recherche Scientifique), Université du Québec, Montréal, QC H5A 1K6, Canada