1
Roucairol M, Georgiou A, Cazenave T, Prischi F, Pardo OE. DrugSynthMC: An Atom-Based Generation of Drug-like Molecules with Monte Carlo Search. J Chem Inf Model 2024; 64:7097-7107. [PMID: 39249497 PMCID: PMC11423341 DOI: 10.1021/acs.jcim.4c01451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/10/2024]
Abstract
A growing number of deep learning (DL) methodologies have recently been developed to design novel compounds and expand the chemical space within virtual libraries. Most of these neural network approaches design molecules to specifically bind a target based on its structural information and/or knowledge of previously identified binders. Fewer attempts have been made to develop approaches for de novo design of virtual libraries, as synthesizability of generated molecules remains a challenge. In this work, we developed a new Monte Carlo Search (MCS) algorithm, DrugSynthMC (Drug Synthesis using Monte Carlo), in conjunction with DL and statistical-based priors to generate thousands of interpretable chemical structures and novel drug-like molecules per second. DrugSynthMC produces drug-like compounds using an atom-based search model that builds molecules as SMILES, character by character. Designed molecules follow Lipinski's "rule of 5", show a high proportion of highly water-soluble nontoxic predicted-to-be synthesizable compounds, and efficiently expand the chemical space within the libraries, without reliance on training data sets, synthesizability metrics, or enforcing during SMILES generation. Our approach can function with or without an underlying neural network and is thus easily explainable and versatile. This ease in drug-like molecule generation allows for future integration of score functions aimed at different target- or job-oriented goals. Thus, DrugSynthMC is expected to enable the functional assessment of large compound libraries covering an extensive novel chemical space, overcoming the limitations of existing drug collections. The software is available at https://github.com/RoucairolMilo/DrugSynthMC.
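The rule-of-5 criterion named in the abstract can be illustrated with a short check. This is a minimal sketch, not DrugSynthMC's code: the `Descriptors` container, the function names, and the `max_violations` parameter are hypothetical, and the property values are assumed to be computed elsewhere (for example by a cheminformatics toolkit). Only the four thresholds are the standard Lipinski limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties of one molecule (hypothetical container)."""
    mol_weight: float       # molecular weight in daltons
    log_p: float            # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-5 limits are exceeded."""
    return sum([
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


def passes_rule_of_five(d: Descriptors, max_violations: int = 0) -> bool:
    """True if the molecule exceeds at most `max_violations` limits.

    The tolerance is an assumption of this sketch; some screens allow one.
    """
    return lipinski_violations(d) <= max_violations


# Made-up descriptor values, for illustration only.
small = Descriptors(mol_weight=320.0, log_p=2.1, h_bond_donors=2, h_bond_acceptors=5)
heavy = Descriptors(mol_weight=650.0, log_p=6.2, h_bond_donors=2, h_bond_acceptors=5)
print(passes_rule_of_five(small), lipinski_violations(heavy))
```

A filter of this kind could be applied to each generated SMILES string after its descriptors are computed; whether DrugSynthMC checks the rule during or after generation is not stated in the abstract.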
Affiliation(s)
- Milo Roucairol
- LAMSADE, Université Paris-Dauphine, Pl. du Maréchal de Lattre de Tassigny, 75016 Paris, France
- Alexios Georgiou
- LAMSADE, Université Paris-Dauphine, Pl. du Maréchal de Lattre de Tassigny, 75016 Paris, France
- Tristan Cazenave
- LAMSADE, Université Paris-Dauphine, Pl. du Maréchal de Lattre de Tassigny, 75016 Paris, France
- Filippo Prischi
- Randall Centre for Cell and Molecular Biophysics, School of Basic and Medical Biosciences, King's College London, London SE1 1UL, United Kingdom
- Olivier E Pardo
- Division of Cancer, Department of Surgery and Cancer, Imperial College, Du Cane Road, London W12 0NN, United Kingdom
2
Chen S, Noh J, Jang J, Kim S, Gu GH, Jung Y. Reaction Templates: Bridging Synthesis Knowledge and Artificial Intelligence. Acc Chem Res 2024; 57:1964-1972. [PMID: 38924502 DOI: 10.1021/acs.accounts.4c00261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/28/2024]
Abstract
Conspectus: The field of chemical research boasts a long history of developing software to automate synthesis planning and reaction prediction. Early software relied heavily on expert systems, requiring significant effort to encode vast amounts of synthesis knowledge into a computer-readable format. However, recent advancements in deep learning have shifted the focus toward AI models, offering improved prediction capabilities. Despite these advancements, current AI models often lack the integration of known synthesis rules and intuitions, creating a gap that hinders interpretability and future development of the models. To bridge this gap, our research group has been actively working on incorporating reaction templates into deep learning models, achieving promising results across various applications. In this Account, we present our latest works to incorporate the known synthesis knowledge into the deep learning models through the utilization of reaction templates. We begin by highlighting the limitations of early computer programs heavily reliant on hand-coded rules. These programs, while providing a foundation for the field, presented limitations in scalability and adaptability. We then introduce SMARTS (SMILES arbitrary target specification), a popular Python-readable format for representing chemical reactions. This format of reaction encoding facilitates the quick integration of synthesis knowledge into AI models built using the Python language. With the SMARTS-based reaction templates, we introduce our recent efforts of developing an AI model for reaction-based molecule optimization. Subsequently, we discuss the recent efforts to automate the extraction of reaction templates from vast chemical reaction databases. This approach eliminates the previously required manual effort of encoding knowledge, a process that could be time-consuming and prone to error when dealing with large data sets. By customizing the automated extraction algorithm, we have developed powerful AI models for specific tasks such as retrosynthesis (LocalRetro), reaction outcome prediction (LocalTransform), and atom-to-atom mapping (LocalMapper). These models, aligned with the intuition of chemists, demonstrate the effectiveness of incorporating reaction templates into deep learning frameworks. Looking toward the future, we believe that utilizing reaction templates to connect known chemical knowledge and AI models holds immense potential for various applications. Not only can this approach significantly benefit future AI models focused on challenging tasks like reaction mechanism labeling and prediction, but we anticipate it can also extend its reach to the realm of inorganic synthesis. By integrating synthesis knowledge, we can not only achieve improved performance but also enhance the interpretability of AI models, paving the way for further advancements in AI-powered chemical synthesis.
Affiliation(s)
- Shuan Chen
- Department of Chemical and Biological Engineering, and Institute of Chemical Process, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, South Korea
- Juhwan Noh
- Chemical Data-Driven Research Center, Korea Research Institute of Chemical Technology (KRICT), 141 Gajeong-ro, Yuseong-gu, Daejeon 34114, South Korea
- Jidon Jang
- Data Convergence Drug Research Center, Korea Research Institute of Chemical Technology (KRICT), 141 Gajeong-ro, Yuseong-gu, Daejeon 34114, South Korea
- Seongmin Kim
- Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291, Daehak-ro, Yuseong-gu, Daejeon 34141, South Korea
- Geun Ho Gu
- Department of Energy Engineering, Korea Institute of Energy Technology (KENTECH), 21 Kentech-gil, Naju, Jeonnam 58330, South Korea
- Yousung Jung
- Department of Chemical and Biological Engineering, and Institute of Chemical Process, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, South Korea
- Institute of Engineering Research, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, South Korea
- Interdisciplinary Program in Artificial Intelligence, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, South Korea
3
Jiang X, Lu L, Li J, Jiang J, Zhang J, Zhou S, Wen H, Cai H, Luo X, Li Z, Wang J, Ju B, Bai R. Synthetically Feasible De Novo Molecular Design of Leads Based on a Reinforcement Learning Model: AI-Assisted Discovery of an Anti-IBD Lead Targeting CXCR4. J Med Chem 2024; 67:10057-10075. [PMID: 38863440 DOI: 10.1021/acs.jmedchem.4c00184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2024]
Abstract
Artificial intelligence (AI) de novo molecular generation provides leads with novel structures for drug discovery. However, the target affinity and synthesizability of the generated molecules present critical challenges for the successful application of AI technology. Therefore, we developed an advanced reinforcement learning model to bridge the gap between the theory of de novo molecular generation and the practical aspects of drug discovery. This model utilizes chemical reaction templates and commercially available building blocks as a starting point and employs forward reaction prediction to generate molecules, while real-time docking and drug-likeness predictions are conducted to ensure synthesizability and drug-likeness. We applied this model to design active molecules targeting the inflammation-related receptor CXCR4 and successfully prepared them according to the AI-proposed synthetic routes. Several molecules exhibited potent anti-CXCR4 and anti-inflammatory activity in subsequent in vitro and in vivo assays. The top-performing compound XVI alleviated symptoms related to inflammatory bowel disease and showed reasonable pharmacokinetic properties.
Affiliation(s)
- Xiaoying Jiang
- School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, PR China
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines; Engineering Laboratory of Development and Application of Traditional Chinese Medicines; Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, PR China
- Liuxin Lu
- School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, PR China
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines; Engineering Laboratory of Development and Application of Traditional Chinese Medicines; Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, PR China
- Junjie Li
- School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, PR China
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines; Engineering Laboratory of Development and Application of Traditional Chinese Medicines; Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, PR China
- Jing Jiang
- SanOmics AI Co. Ltd., Hangzhou 311103, PR China
- Jiapeng Zhang
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, PR China
- Shengbin Zhou
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, PR China
- Hao Wen
- School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, PR China
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines; Engineering Laboratory of Development and Application of Traditional Chinese Medicines; Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, PR China
- Hong Cai
- School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, PR China
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines; Engineering Laboratory of Development and Application of Traditional Chinese Medicines; Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, PR China
- Xinyu Luo
- School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, PR China
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines; Engineering Laboratory of Development and Application of Traditional Chinese Medicines; Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, PR China
- Zhen Li
- SanOmics AI Co. Ltd., Hangzhou 311103, PR China
- Jiahui Wang
- School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, PR China
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines; Engineering Laboratory of Development and Application of Traditional Chinese Medicines; Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, PR China
- Bin Ju
- SanOmics AI Co. Ltd., Hangzhou 311103, PR China
- Renren Bai
- School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, PR China
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines; Engineering Laboratory of Development and Application of Traditional Chinese Medicines; Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, PR China
4
Krishnan SR, Bung N, Srinivasan R, Roy A. Target-specific novel molecules with their recipe: Incorporating synthesizability in the design process. J Mol Graph Model 2024; 129:108734. [PMID: 38442440 DOI: 10.1016/j.jmgm.2024.108734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Revised: 02/14/2024] [Accepted: 02/15/2024] [Indexed: 03/07/2024]
Abstract
Application of artificial intelligence (AI) in drug discovery has led to several success stories in recent times. While traditional methods mostly relied upon screening large chemical libraries for early-stage drug design, de novo design can help identify novel target-specific molecules by sampling from a much larger chemical space. Although this has increased the possibility of finding diverse and novel molecules from previously unexplored chemical space, it has also posed a great challenge for medicinal chemists to synthesize at least some of the de novo designed novel molecules for experimental validation. To address this challenge, in this work, we propose a novel forward synthesis-based generative AI method, which is used to explore the synthesizable chemical space. The method uses a structure-based drug design framework, where the target protein structure and a target-specific seed fragment from co-crystal structures can be the initial inputs. A random fragment from a purchasable fragment library can also be the input if a target-specific fragment is unavailable. Then template-based forward synthesis route prediction and molecule generation are performed in parallel using the Monte Carlo Tree Search (MCTS) method, where the subsequent fragments for molecule growth can again be obtained from a purchasable fragment library. The rewards for each iteration of MCTS are computed using a drug-target affinity (DTA) model based on the docking pose of the generated reaction intermediates at the binding site of the target protein of interest. With the help of the proposed method, it is now possible to overcome one of the major obstacles posed to the AI-based drug design approaches through the ability of the method to design novel target-specific synthesizable molecules.
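The MCTS loop described in this abstract alternates between choosing which partial molecule to grow and feeding a reward back up the tree. A minimal sketch of those two steps follows, using the textbook UCB1 rule. The abstract does not say which selection policy the authors use, so UCB1, the `Node` class, and the exploration constant are assumptions of this sketch, and the reward values stand in for the output of a drug-target affinity model.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One tree node: a partial molecule after adding `fragment` (hypothetical)."""
    fragment: str
    visits: int = 0
    total_reward: float = 0.0
    children: list["Node"] = field(default_factory=list)


def ucb1(child: Node, parent_visits: int, c: float = math.sqrt(2)) -> float:
    """Mean reward plus an exploration bonus; unvisited children come first."""
    if child.visits == 0:
        return math.inf
    mean = child.total_reward / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def select_child(parent: Node) -> Node:
    """Pick the child with the highest UCB1 score."""
    return max(parent.children, key=lambda ch: ucb1(ch, parent.visits))


def backpropagate(path: list[Node], reward: float) -> None:
    """Add one simulation's reward to every node on the root-to-leaf path."""
    for node in path:
        node.visits += 1
        node.total_reward += reward


# Toy tree: two candidate fragments with made-up affinity rewards.
root = Node("seed", visits=10)
a = Node("frag_a", visits=5, total_reward=4.0)
b = Node("frag_b", visits=5, total_reward=2.0)
root.children = [a, b]
print(select_child(root).fragment)
```

With equal visit counts the exploration bonus is identical for both children, so the child with the higher mean reward is selected; a never-visited child would take priority over both.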
Affiliation(s)
- Navneet Bung
- TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad, 500081, India
- Rajgopal Srinivasan
- TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad, 500081, India
- Arijit Roy
- TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad, 500081, India.
5
Fromer JC, Coley CW. An algorithmic framework for synthetic cost-aware decision making in molecular design. Nat Comput Sci 2024; 4:440-450. [PMID: 38886590 DOI: 10.1038/s43588-024-00639-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Accepted: 05/07/2024] [Indexed: 06/20/2024]
Abstract
Small molecules exhibiting desirable property profiles are often discovered through an iterative process of designing, synthesizing and testing sets of molecules. The selection of molecules to synthesize from all possible candidates is a complex decision-making process that typically relies on expert chemist intuition. Here we propose a quantitative decision-making framework, SPARROW, that prioritizes molecules for evaluation by balancing expected information gain and synthetic cost. SPARROW integrates molecular design, property prediction and retrosynthetic planning to balance the utility of testing a molecule with the cost of batch synthesis. We demonstrate, through three case studies, that the developed algorithm captures the non-additive costs inherent to batch synthesis, leverages common reaction steps and intermediates, and scales to hundreds of molecules.
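The non-additive cost of batch synthesis mentioned above can be illustrated with a toy calculation: when two candidates share a reaction step, that step is paid for once. The sketch below is not SPARROW's algorithm. It brute-forces all subsets, which only works for a handful of molecules, and the utilities, route sets, step costs, and the `weight` trade-off parameter are all invented for illustration.

```python
from itertools import combinations


def batch_cost(routes: dict[str, set[str]], step_cost: dict[str, float],
               chosen: tuple[str, ...]) -> float:
    """Cost of making all chosen molecules; shared steps are counted once."""
    steps = set().union(*(routes[m] for m in chosen))
    return sum(step_cost[s] for s in steps)


def best_batch(utility: dict[str, float], routes: dict[str, set[str]],
               step_cost: dict[str, float], weight: float = 1.0):
    """Exhaustively find the subset maximizing utility minus weighted cost."""
    best, best_score = (), 0.0          # the empty batch scores zero
    molecules = sorted(utility)
    for k in range(1, len(molecules) + 1):
        for chosen in combinations(molecules, k):
            gain = sum(utility[m] for m in chosen)
            score = gain - weight * batch_cost(routes, step_cost, chosen)
            if score > best_score:
                best, best_score = chosen, score
    return best, best_score


# Made-up example: A and B share step s1; C needs its own expensive step.
utility = {"A": 3.0, "B": 3.0, "C": 2.5}
routes = {"A": {"s1", "s2"}, "B": {"s1", "s3"}, "C": {"s4"}}
step_cost = {"s1": 2.5, "s2": 1.0, "s3": 1.0, "s4": 3.0}
print(best_batch(utility, routes, step_cost))
```

In this toy data no single molecule is worth synthesizing on its own (each scores -0.5), yet A and B together score 1.5 because the shared step s1 is paid for only once. A production framework would need an optimization method that scales to hundreds of molecules rather than subset enumeration.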
Affiliation(s)
- Jenna C Fromer
- Department of Chemical Engineering, MIT, Cambridge, MA, USA
- Connor W Coley
- Department of Chemical Engineering, MIT, Cambridge, MA, USA.
- Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA, USA.
6
Lim H. Development of scoring-assisted generative exploration (SAGE) and its application to dual inhibitor design for acetylcholinesterase and monoamine oxidase B. J Cheminform 2024; 16:59. [PMID: 38790018 PMCID: PMC11127438 DOI: 10.1186/s13321-024-00845-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Accepted: 04/26/2024] [Indexed: 05/26/2024] Open
Abstract
De novo molecular design is the process of searching chemical space for drug-like molecules with desired properties, and deep learning has been recognized as a promising solution. In this study, I developed an effective computational method called Scoring-Assisted Generative Exploration (SAGE) to enhance chemical diversity and property optimization through virtual synthesis simulation, the generation of bridged bicyclic rings, and multiple scoring models for drug-likeness. In six protein targets, SAGE generated molecules with high scores within reasonable numbers of steps by optimizing target specificity without a constraint and even with multiple constraints such as synthetic accessibility, solubility, and metabolic stability. Furthermore, I suggested a top-ranked molecule with SAGE as a dual inhibitor of acetylcholinesterase and monoamine oxidase B through multiple desired property optimization. Therefore, SAGE can generate molecules with desired properties by optimizing multiple properties simultaneously, indicating the importance of de novo design strategies in the future of drug discovery and development. SCIENTIFIC CONTRIBUTION: The scientific contribution of this study lies in the development of the Scoring-Assisted Generative Exploration (SAGE) method, a novel computational approach that significantly enhances de novo molecular design. SAGE uniquely integrates virtual synthesis simulation, the generation of complex bridged bicyclic rings, and multiple scoring models to optimize drug-like properties comprehensively. By efficiently generating molecules that meet a broad spectrum of pharmacological criteria, including target specificity, synthetic accessibility, solubility, and metabolic stability, within a reasonable number of steps, SAGE represents a substantial advancement over traditional methods. Additionally, the application of SAGE to discover dual inhibitors for acetylcholinesterase and monoamine oxidase B not only demonstrates its potential to streamline and enhance the drug development process but also highlights its capacity to create more effective and precisely targeted therapies. This study emphasizes the critical and evolving role of de novo design strategies in reshaping the future of drug discovery and development, providing promising avenues for innovative therapeutic discoveries.
Affiliation(s)
- Hocheol Lim
- Bioinformatics and Molecular Design Research Center (BMDRC), Incheon, Republic of Korea.
7
Atz K, Cotos L, Isert C, Håkansson M, Focht D, Hilleke M, Nippa DF, Iff M, Ledergerber J, Schiebroek CCG, Romeo V, Hiss JA, Merk D, Schneider P, Kuhn B, Grether U, Schneider G. Prospective de novo drug design with deep interactome learning. Nat Commun 2024; 15:3408. [PMID: 38649351 PMCID: PMC11035696 DOI: 10.1038/s41467-024-47613-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Accepted: 04/02/2024] [Indexed: 04/25/2024] Open
Abstract
De novo drug design aims to generate molecules from scratch that possess specific chemical and pharmacological properties. We present a computational approach utilizing interactome-based deep learning for ligand- and structure-based generation of drug-like molecules. This method capitalizes on the unique strengths of both graph neural networks and chemical language models, offering an alternative to the need for application-specific reinforcement, transfer, or few-shot learning. It enables the "zero-shot" construction of compound libraries tailored to possess specific bioactivity, synthesizability, and structural novelty. In order to proactively evaluate the deep interactome learning framework for protein structure-based drug design, potential new ligands targeting the binding site of the human peroxisome proliferator-activated receptor (PPAR) subtype gamma are generated. The top-ranking designs are chemically synthesized and computationally, biophysically, and biochemically characterized. Potent PPAR partial agonists are identified, demonstrating favorable activity and the desired selectivity profiles for both nuclear receptors and off-target interactions. Crystal structure determination of the ligand-receptor complex confirms the anticipated binding mode. This successful outcome positively advocates interactome-based de novo design for application in bioorganic and medicinal chemistry, enabling the creation of innovative bioactive molecules.
Affiliation(s)
- Kenneth Atz
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Leandro Cotos
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Clemens Isert
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Maria Håkansson
- SARomics Biostructures AB, Medicon Village, SE-223 81, Lund, Sweden
- Dorota Focht
- SARomics Biostructures AB, Medicon Village, SE-223 81, Lund, Sweden
- Mattis Hilleke
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- David F Nippa
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, CH-4070, Basel, Switzerland
- Department of Pharmacy, Ludwig-Maximilians-Universität München, Butenandtstrasse 5, 81377, Munich, Germany
- Michael Iff
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Jann Ledergerber
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Carl C G Schiebroek
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Valentina Romeo
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, CH-4070, Basel, Switzerland
- Jan A Hiss
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Daniel Merk
- Department of Pharmacy, Ludwig-Maximilians-Universität München, Butenandtstrasse 5, 81377, Munich, Germany
- Petra Schneider
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Bernd Kuhn
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, CH-4070, Basel, Switzerland
- Uwe Grether
- Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, CH-4070, Basel, Switzerland
- Gisbert Schneider
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland.
8
Tropsha A, Isayev O, Varnek A, Schneider G, Cherkasov A. Integrating QSAR modelling and deep learning in drug discovery: the emergence of deep QSAR. Nat Rev Drug Discov 2024; 23:141-155. [PMID: 38066301 DOI: 10.1038/s41573-023-00832-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/21/2023] [Indexed: 02/08/2024]
Abstract
Quantitative structure-activity relationship (QSAR) modelling, an approach that was introduced 60 years ago, is widely used in computer-aided drug design. In recent years, progress in artificial intelligence techniques, such as deep learning, the rapid growth of databases of molecules for virtual screening and dramatic improvements in computational power have supported the emergence of a new field of QSAR applications that we term 'deep QSAR'. Marking a decade from the pioneering applications of deep QSAR to tasks involved in small-molecule drug discovery, we herein describe key advances in the field, including deep generative and reinforcement learning approaches in molecular design, deep learning models for synthetic planning and the application of deep QSAR models in structure-based virtual screening. We also reflect on the emergence of quantum computing, which promises to further accelerate deep QSAR applications and the need for open-source and democratized resources to support computer-aided drug design.
Affiliation(s)
- Artem Cherkasov
- University of British Columbia, Vancouver, BC, Canada.
- Photonic Inc., Coquitlam, BC, Canada.
9
Chen L, Yu L, Gao L. Potent antibiotic design via guided search from antibacterial activity evaluations. Bioinformatics 2023; 39:btad059. [PMID: 36707990 PMCID: PMC9897189 DOI: 10.1093/bioinformatics/btad059] [Citation(s) in RCA: 41] [Impact Index Per Article: 41.0] [Received: 09/26/2022] [Revised: 01/14/2023] [Accepted: 01/25/2023] [Indexed: 01/29/2023] Open
Abstract
MOTIVATION: The emergence of drug-resistant bacteria makes the discovery of new antibiotics an urgent issue, but finding new molecules with the desired antibacterial activity is an extremely difficult task. To address this challenge, we established a framework, MDAGS (Molecular Design via Attribute-Guided Search), to optimize and generate potent antibiotic molecules. RESULTS: By designing the antibacterial activity latent space and guiding the optimization of functional compounds based on this space, the model MDAGS can generate novel compounds with desirable antibacterial activity without the need for extensive expensive and time-consuming evaluations. Compared with existing antibiotics, candidate antibacterial compounds generated by MDAGS always possessed significantly better antibacterial activity and ensured high similarity. Furthermore, although without explicit constraints on similarity to known antibiotics, these candidate antibacterial compounds all exhibited the highest structural similarity to antibiotics of expected function in the DrugBank database query. Overall, our approach provides a viable solution to the problem of bacterial drug resistance. AVAILABILITY AND IMPLEMENTATION: Code of the model and datasets can be downloaded from GitHub (https://github.com/LiangYu-Xidian/MDAGS). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lu Chen
- School of Computer Science and Technology, Xidian University, Xi’an 710071, Shaanxi, China
- Liang Yu
- School of Computer Science and Technology, Xidian University, Xi’an 710071, Shaanxi, China
- Lin Gao
- School of Computer Science and Technology, Xidian University, Xi’an 710071, Shaanxi, China
10
Zhang Y, Luo M, Wu P, Wu S, Lee TY, Bai C. Application of Computational Biology and Artificial Intelligence in Drug Design. Int J Mol Sci 2022; 23:13568. [PMID: 36362355 PMCID: PMC9658956 DOI: 10.3390/ijms232113568] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/08/2022] [Revised: 10/29/2022] [Accepted: 11/03/2022] [Indexed: 08/24/2023] Open
Abstract
Traditional drug design requires a great amount of research time and developmental expense. Booming computational approaches, including computational biology, computer-aided drug design, and artificial intelligence, have the potential to expedite drug discovery by minimizing the time and financial cost. In recent years, computational approaches have been widely used to improve the efficacy and effectiveness of the drug discovery pipeline, leading to the approval of plenty of new drugs for marketing. The present review emphasizes the applications of these indispensable computational approaches in aiding target identification, lead discovery, and lead optimization. Some challenges of using these approaches for drug design are also discussed. Moreover, we propose a methodology for integrating various computational techniques into new drug discovery and design.
Affiliation(s)
- Yue Zhang
- School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
- Warshel Institute for Computational Biology, Shenzhen 518172, China
- Mengqi Luo
- School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- South China Hospital, Health Science Center, Shenzhen University, Shenzhen 518116, China
- Peng Wu
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen 518055, China
- Song Wu
- South China Hospital, Health Science Center, Shenzhen University, Shenzhen 518116, China
- Tzong-Yi Lee
- School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- Warshel Institute for Computational Biology, Shenzhen 518172, China
- Chen Bai
- School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- Warshel Institute for Computational Biology, Shenzhen 518172, China
11
Akhmetshin T, Lin A, Mazitov D, Zabolotna Y, Ziaikin E, Madzhidov T, Varnek A. HyFactor: A Novel Open-Source, Graph-Based Architecture for Chemical Structure Generation. J Chem Inf Model 2022; 62:3524-3534. [PMID: 35876159 DOI: 10.1021/acs.jcim.2c00744] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Graph-based architectures are becoming increasingly popular as a tool for structure generation. Here, we introduce a novel open-source architecture, HyFactor, in which, similar to the InChI linear notation, the number of hydrogens attached to the heavy atoms is considered instead of the bond types. HyFactor was benchmarked on the ZINC 250K, MOSES, and ChEMBL data sets against the conventional graph-based architecture ReFactor, which is our implementation of the DEFactor architecture reported in the literature. On average, HyFactor models contain about 20% fewer fitting parameters than those of ReFactor. The two architectures display similar validity, uniqueness, and reconstruction rates. Compared to the training set compounds, HyFactor generates more similar structures than ReFactor. This could be explained by the fact that the latter generates many open-chain analogues of cyclic structures in the training set. It has been demonstrated that the reconstruction error of heavy molecules can be significantly reduced using the data augmentation technique. The codes of HyFactor and ReFactor as well as all models obtained in this study are publicly available from our GitHub repository: https://github.com/Laboratoire-de-Chemoinformatique/HyFactor.
Affiliation(s)
- Tagir Akhmetshin
- Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 4 Blaise Pascal Str., 67081 Strasbourg, France
- Arkadii Lin
- Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 4 Blaise Pascal Str., 67081 Strasbourg, France
- Daniyar Mazitov
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry, Kazan Federal University, 18 Kremlyovskaya Str., 420008 Kazan, Russia
- Yuliana Zabolotna
- Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 4 Blaise Pascal Str., 67081 Strasbourg, France
- Evgenii Ziaikin
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry, Kazan Federal University, 18 Kremlyovskaya Str., 420008 Kazan, Russia
- Timur Madzhidov
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry, Kazan Federal University, 18 Kremlyovskaya Str., 420008 Kazan, Russia
- Alexandre Varnek
- Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 4 Blaise Pascal Str., 67081 Strasbourg, France
12
Liu CH, Korablyov M, Jastrzębski S, Włodarczyk-Pruszyński P, Bengio Y, Segler M. RetroGNN: Fast Estimation of Synthesizability for Virtual Screening and De Novo Design by Learning from Slow Retrosynthesis Software. J Chem Inf Model 2022; 62:2293-2300. [PMID: 35452226 DOI: 10.1021/acs.jcim.1c01476] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Abstract
De novo molecule design algorithms often result in chemically unfeasible or synthetically inaccessible molecules. A natural idea to mitigate this problem is to bias these algorithms toward more easily synthesizable molecules using a proxy score for synthetic accessibility. However, using currently available proxies can still result in highly unrealistic compounds. Here, we propose a novel approach, RetroGNN, to estimate synthesizability. First, we search for routes using synthesis planning software for a large number of random molecules. This information is then used to train a graph neural network to predict the outcome of the synthesis planner given the target molecule, in which the regression task can be used as a synthesizability scorer. We highlight how RetroGNN can be used in generative molecule-discovery pipelines together with other scoring functions. We evaluate our approach on several QSAR-based molecule design benchmarks, for which we find synthesizable molecules with state-of-the-art scores. Compared to the virtual screening of 5 million existing molecules from the ZINC database, using RetroGNNScore with a simple fragment-based de novo design algorithm finds molecules predicted to be more likely to possess the desired activity exponentially faster, while maintaining good druglike properties and being easier to synthesize. Importantly, our deep neural network can successfully filter out hard to synthesize molecules while achieving a 10^5 times speedup over using retrosynthesis planning software.
Affiliation(s)
- Cheng-Hao Liu
- Mila and Université de Montréal, 6666 St-Urbain Street, Montreal, Canada H2S 3H1
- Department of Chemistry, McGill University, 801 Sherbrooke Street W, Montreal, Canada H3A 0B8
- Maksym Korablyov
- Mila and Université de Montréal, 6666 St-Urbain Street, Montreal, Canada H2S 3H1
- Stanisław Jastrzębski
- Molecule.one, Warsaw 00-815, Poland
- Faculty of Mathematics and Computer Science, Jagiellonian University, 6 Łojasiewicza Street, 30-348 Kraków, Poland
- Yoshua Bengio
- Mila and Université de Montréal, 6666 St-Urbain Street, Montreal, Canada H2S 3H1
- Marwin Segler
- Institute of Organic Chemistry and Center for Multiscale Theory and Computation, Westfälische Wilhelms-Universität Münster, 48149 Münster, Germany
- Microsoft Research, 21 Station Road, Cambridge, U.K. CB1 2FB
13
Ghiandoni GM, Bodkin MJ, Chen B, Hristozov D, Wallace JEA, Webster J, Gillet VJ. RENATE: A Pseudo-retrosynthetic Tool for Synthetically Accessible de novo Design. Mol Inform 2022; 41:e2100207. [PMID: 34750989 PMCID: PMC9285524 DOI: 10.1002/minf.202100207] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/05/2021] [Accepted: 10/23/2021] [Indexed: 11/09/2022]
Abstract
Reaction-based de novo design refers to the generation of synthetically accessible molecules using transformation rules extracted from known reactions in the literature. In this context, we have previously described the extraction of reaction vectors from a reactions database and their coupling with a structure generation algorithm for the generation of novel molecules from a starting material. An issue when designing molecules from a starting material is the combinatorial explosion of possible product molecules that can be generated, especially for multistep syntheses. Here, we present the development of RENATE, a reaction-based de novo design tool, which is based on a pseudo-retrosynthetic fragmentation of a reference ligand and an inside-out approach to de novo design. The reference ligand is fragmented; each fragment is used to search for similar fragments as building blocks; the building blocks are combined into products using reaction vectors; and a synthetic route is suggested for each product molecule. The RENATE methodology is presented followed by a retrospective validation to recreate a set of approved drugs. Results show that RENATE can generate very similar or even identical structures to the corresponding input drugs, hence validating the fragmentation, search, and design heuristics implemented in the tool.
Affiliation(s)
- Gian Marco Ghiandoni
- Information School, University of Sheffield, Regent Court, 211 Portobello, Sheffield S1 4DP, UK
- Beining Chen
- Chemistry Department, University of Sheffield, Dainton Building, Brook Hill, Sheffield S3 7HF, UK
- James Webster
- Information School, University of Sheffield, Regent Court, 211 Portobello, Sheffield S1 4DP, UK
- Valerie J. Gillet
- Information School, University of Sheffield, Regent Court, 211 Portobello, Sheffield S1 4DP, UK
14
Bilodeau C, Jin W, Jaakkola T, Barzilay R, Jensen KF. Generative models for molecular discovery: Recent advances and challenges. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1608] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/15/2022]
Affiliation(s)
- Camille Bilodeau
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Wengong Jin
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Tommi Jaakkola
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Regina Barzilay
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Klavs F. Jensen
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
15
Kaitoh K, Yamanishi Y. Scaffold-Retained Structure Generator to Exhaustively Create Molecules in an Arbitrary Chemical Space. J Chem Inf Model 2022; 62:2212-2225. [DOI: 10.1021/acs.jcim.1c01130] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/23/2022]
Affiliation(s)
- Kazuma Kaitoh
- Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
- Yoshihiro Yamanishi
- Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
16
Green biomanufacturing promoted by automatic retrobiosynthesis planning and computational enzyme design. Chin J Chem Eng 2022. [DOI: 10.1016/j.cjche.2021.08.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
17
18
Abstract
Introduction: The popularity and success of advanced AI methods like deep neural networks have led to novel ways of exploring chemical space. Their opaque nature poses challenges for model evaluation regarding novelty, uniqueness, and distribution of the chemical space covered. However, these methods also promise to be able to explore uncharted chemical space in novel ways that do not rely directly on structural similarity. Areas covered: This review provides an overview of popular deep learning methods for chemical space exploration. Crucial aspects like choice of molecular representation, training for focused chemical space exploration, and criteria for assessing and validating chemical space coverage are discussed. Expert opinion: Deep learning offers great potential for chemical space exploration beyond conventional fragment-based methods. Given the rarity of prospective applications and considering the difficulty in assessing representativeness and comprehensiveness of chemical space covered, developing criteria for assessing and validating generative models is of great significance. Latent space models like variational autoencoders are conceptually appealing for inverse QSAR/QSPR approaches as neighborhood relationships in latent space can be trained to reflect property similarities. Future research in understanding and interpreting generative models might lead to a better understanding of biologically relevant properties of molecules.
Affiliation(s)
- Martin Vogt
- Department of Life Science Informatics, B-it, Limes Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich Wilhelms-Universität, Bonn, Germany
19
Multi-constraint molecular generation based on conditional transformer, knowledge distillation and reinforcement learning. Nat Mach Intell 2021. [DOI: 10.1038/s42256-021-00403-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 02/06/2023]
20
Takahashi T, Chikenji G, Tokita K. Lattice protein design using Bayesian learning. Phys Rev E 2021; 104:014404. [PMID: 34412286 DOI: 10.1103/physreve.104.014404] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/17/2020] [Accepted: 06/11/2021] [Indexed: 01/01/2023]
Abstract
Protein design is the inverse approach of the three-dimensional (3D) structure prediction for elucidating the relationship between the 3D structures and amino acid sequences. In general, the computation of the protein design involves a double loop: a loop for amino acid sequence changes and a loop for an exhaustive conformational search for each amino acid sequence. Herein, we propose a novel statistical mechanical design method using Bayesian learning, which can design lattice proteins without the exhaustive conformational search. We consider a thermodynamic hypothesis of the evolution of proteins and apply it to the prior distribution of amino acid sequences. Furthermore, we take the water effect into account in view of the grand canonical picture. As a result, on applying the 2D lattice hydrophobic-polar (HP) model, our design method successfully finds an amino acid sequence for which the target conformation has a unique ground state. However, the performance was not as good for the 3D lattice HP models compared to the 2D models. The performance of the 3D model improves when 20-letter lattice proteins are used. Furthermore, we find a strong linearity between the chemical potential of water and the number of surface residues, thereby revealing the relationship between protein structure and the effect of water molecules. The advantage of our method is that it greatly reduces computation time, because it does not require long calculations for the partition function corresponding to an exhaustive conformational search. As our method uses a general form of Bayesian learning and statistical mechanics and is not limited to lattice proteins, the results presented here elucidate some heuristics used successfully in previous protein design methods.
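For readers unfamiliar with the 2D lattice HP model mentioned in this abstract, the following minimal sketch shows the energy function usually associated with it. It assumes the textbook convention of -1 per pair of hydrophobic (H) residues that are lattice neighbours but not adjacent along the chain; the paper's exact parametrization, its water term, and its Bayesian design procedure are not reproduced. The name `hp_energy` is hypothetical.

```python
from typing import List, Tuple

Coord = Tuple[int, int]


def hp_energy(sequence: str, conformation: List[Coord]) -> int:
    """Contact energy of an HP chain on a 2D square lattice.

    Assumed convention: each non-bonded H-H lattice contact contributes -1.
    """
    if len(sequence) != len(conformation):
        raise ValueError("sequence and conformation must have the same length")
    if len(set(conformation)) != len(conformation):
        raise ValueError("conformation must be self-avoiding")
    # Consecutive residues must occupy neighbouring lattice sites.
    for (x1, y1), (x2, y2) in zip(conformation, conformation[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    index_at = {pos: i for i, pos in enumerate(conformation)}
    energy = 0
    for i, (x, y) in enumerate(conformation):
        if sequence[i] != "H":
            continue
        # Look only right and up so each contact is counted exactly once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    line = [(0, 0), (1, 0), (2, 0), (3, 0)]
    print(hp_energy("HPPH", square))  # ends of the chain touch: one H-H contact
    print(hp_energy("HPPH", line))    # extended chain: no contacts
```

Design in this model asks for a sequence whose target conformation is the unique lowest-energy state; the exhaustive approach evaluates a function like this over every conformation, which is the inner loop the abstract says the method avoids.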
Affiliation(s)
- Tomoei Takahashi
- Graduate School of Informatics, Nagoya University, Nagoya 464-8601, Japan
- George Chikenji
- Graduate School of Engineering, Nagoya University, Nagoya 464-8603, Japan
- Kei Tokita
- Graduate School of Informatics, Nagoya University, Nagoya 464-8601, Japan
21
Hermann E, Hermann G, Tremblay JC. Ethical Artificial Intelligence in Chemical Research and Development: A Dual Advantage for Sustainability. Science and Engineering Ethics 2021; 27:45. [PMID: 34231042 PMCID: PMC8260511 DOI: 10.1007/s11948-021-00325-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2021] [Accepted: 06/25/2021] [Indexed: 06/13/2023]
Abstract
Artificial intelligence can be a game changer to address the global challenge of humanity-threatening climate change by fostering sustainable development. Since chemical research and development lay the foundation for innovative products and solutions, this study presents a novel chemical research and development process backed with artificial intelligence and guiding ethical principles to account for both process- and outcome-related sustainability. Particularly in ethically salient contexts, ethical principles have to accompany research and development powered by artificial intelligence to promote social and environmental good and sustainability (beneficence) while preventing any harm (non-maleficence) for all stakeholders (i.e., companies, individuals, society at large) affected.
Affiliation(s)
- Erik Hermann
- IHP - Leibniz-Institut für innovative Mikroelektronik, Frankfurt (Oder), Germany
22
Fox S. Psychomotor Predictive Processing. Entropy (Basel, Switzerland) 2021; 23:806. [PMID: 34202804 PMCID: PMC8303599 DOI: 10.3390/e23070806] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/27/2021] [Revised: 06/22/2021] [Accepted: 06/23/2021] [Indexed: 02/06/2023]
Abstract
Psychomotor experience can be based on what people predict they will experience, rather than on sensory inputs. It has been argued that disconnects between human experience and sensory inputs can be addressed better through further development of predictive processing theory. In this paper, the scope of predictive processing theory is extended through three developments. First, by going beyond previous studies that have encompassed embodied cognition but have not addressed some fundamental aspects of psychomotor functioning. Second, by proposing a scientific basis for explaining predictive processing that spans objective neuroscience and subjective experience. Third, by providing an explanation of predictive processing that can be incorporated into the planning and operation of systems involving robots and other new technologies. This is necessary because such systems are becoming increasingly common and move us farther away from the hunter-gatherer lifestyles within which our psychomotor functioning evolved. For example, beliefs that workplace robots are threatening can generate anxiety, while wearing hardware, such as augmented reality headsets and exoskeletons, can impede the natural functioning of psychomotor systems. The primary contribution of the paper is the introduction of a new formulation of hierarchical predictive processing that is focused on psychomotor functioning.
Affiliation(s)
- Stephen Fox
- VTT Technical Research Centre of Finland, FI-02150 Espoo, Finland
23
Meyers J, Fabian B, Brown N. De novo molecular design and generative models. Drug Discov Today 2021; 26:2707-2715. [PMID: 34082136 DOI: 10.1016/j.drudis.2021.05.019] [Citation(s) in RCA: 89] [Impact Index Per Article: 29.7] [Received: 01/15/2021] [Revised: 04/21/2021] [Accepted: 05/26/2021] [Indexed: 02/09/2023]
Abstract
Molecular design strategies are integral to therapeutic progress in drug discovery. Computational approaches for de novo molecular design have been developed over the past three decades and, recently, thanks in part to advances in machine learning (ML) and artificial intelligence (AI), the drug discovery field has gained practical experience. Here, we review these learnings and present de novo approaches according to the coarseness of their molecular representation: that is, whether molecular design is modeled on an atom-based, fragment-based, or reaction-based paradigm. Furthermore, we emphasize the value of strong benchmarks, describe the main challenges to using these methods in practice, and provide a viewpoint on further opportunities for exploration and challenges to be tackled in the upcoming years.
Affiliation(s)
- Nathan Brown
- BenevolentAI, 4-8 Maple Street, London W1T 5HD, UK
24
Grisoni F, Huisman BJH, Button AL, Moret M, Atz K, Merk D, Schneider G. Combining generative artificial intelligence and on-chip synthesis for de novo drug design. Science Advances 2021; 7:eabg3338. [PMID: 34117066 PMCID: PMC8195470 DOI: 10.1126/sciadv.abg3338] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 12/27/2020] [Accepted: 04/23/2021] [Indexed: 05/24/2023]
Abstract
Automating the molecular design-make-test-analyze cycle accelerates hit and lead finding for drug discovery. Using deep learning for molecular design and a microfluidics platform for on-chip chemical synthesis, liver X receptor (LXR) agonists were generated from scratch. The computational pipeline was tuned to explore the chemical space of known LXRα agonists and generate novel molecular candidates. To ensure compatibility with automated on-chip synthesis, the chemical space was confined to the virtual products obtainable from 17 one-step reactions. Twenty-five de novo designs were successfully synthesized in flow. In vitro screening of the crude reaction products revealed 17 (68%) hits, with up to 60-fold LXR activation. The batch resynthesis, purification, and retesting of 14 of these compounds confirmed that 12 of them were potent LXR agonists. These results support the suitability of the proposed design-make-test-analyze framework as a blueprint for automated drug design with artificial intelligence and miniaturized bench-top synthesis.
Affiliation(s)
- Francesca Grisoni
- ETH Zurich, Department of Chemistry and Applied Biosciences, RETHINK, Zurich, Switzerland
- Eindhoven University of Technology, Department of Biomedical Engineering, Eindhoven, Netherlands
- Berend J H Huisman
- ETH Zurich, Department of Chemistry and Applied Biosciences, RETHINK, Zurich, Switzerland
- Alexander L Button
- ETH Zurich, Department of Chemistry and Applied Biosciences, RETHINK, Zurich, Switzerland
- University of Lausanne, Department of Computational Biology, Lausanne, Switzerland
- Michael Moret
- ETH Zurich, Department of Chemistry and Applied Biosciences, RETHINK, Zurich, Switzerland
- Kenneth Atz
- ETH Zurich, Department of Chemistry and Applied Biosciences, RETHINK, Zurich, Switzerland
- Daniel Merk
- ETH Zurich, Department of Chemistry and Applied Biosciences, RETHINK, Zurich, Switzerland
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, Frankfurt, Germany
- Gisbert Schneider
- ETH Zurich, Department of Chemistry and Applied Biosciences, RETHINK, Zurich, Switzerland
- ETH Singapore SEC Ltd, Singapore, Singapore
25
Vatansever S, Schlessinger A, Wacker D, Kaniskan HÜ, Jin J, Zhou M, Zhang B. Artificial intelligence and machine learning-aided drug discovery in central nervous system diseases: State-of-the-arts and future directions. Med Res Rev 2021; 41:1427-1473. [PMID: 33295676 PMCID: PMC8043990 DOI: 10.1002/med.21764] [Citation(s) in RCA: 102] [Impact Index Per Article: 34.0] [Received: 07/23/2020] [Revised: 10/30/2020] [Accepted: 11/20/2020] [Indexed: 01/11/2023]
Abstract
Neurological disorders significantly outnumber diseases in other therapeutic areas. However, developing drugs for central nervous system (CNS) disorders remains the most challenging area in drug discovery, accompanied by long timelines and high attrition rates. With the rapid growth of biomedical data enabled by advanced experimental technologies, artificial intelligence (AI) and machine learning (ML) have emerged as indispensable tools to draw meaningful insights and improve decision making in drug discovery. Thanks to the advancements in AI and ML algorithms, AI/ML-driven solutions now have unprecedented potential to accelerate the process of CNS drug discovery with a better success rate. In this review, we comprehensively summarize AI/ML-powered pharmaceutical discovery efforts and their implementations in the CNS area. After introducing the AI/ML models as well as the conceptualization and data preparation, we outline the applications of AI/ML technologies to several key procedures in drug discovery, including target identification, compound screening, hit/lead generation and optimization, drug response and synergy prediction, de novo drug design, and drug repurposing. We review the current state of the art of AI/ML-guided CNS drug discovery, focusing on blood-brain barrier permeability prediction and implementation into therapeutic discovery for neurological diseases. Finally, we discuss the major challenges and limitations of current approaches and possible future directions that may provide resolutions to these difficulties.
Affiliation(s)
- Sezen Vatansever
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Avner Schlessinger
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Mount Sinai Center for Therapeutics Discovery, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Daniel Wacker
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Mount Sinai Center for Therapeutics Discovery, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- H. Ümit Kaniskan
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Mount Sinai Center for Therapeutics Discovery, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Oncological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Jian Jin
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Mount Sinai Center for Therapeutics Discovery, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Oncological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Ming‐Ming Zhou
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Oncological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Bin Zhang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
26
Jiménez-Luna J, Grisoni F, Weskamp N, Schneider G. Artificial intelligence in drug discovery: recent advances and future perspectives. Expert Opin Drug Discov 2021; 16:949-959. [PMID: 33779453 DOI: 10.1080/17460441.2021.1909567] [Citation(s) in RCA: 97] [Impact Index Per Article: 32.3] [Indexed: 02/06/2023]
Abstract
Introduction: Artificial intelligence (AI) has inspired computer-aided drug discovery. The widespread adoption of machine learning, in particular deep learning, in multiple scientific disciplines, and the advances in computing hardware and software, among other factors, continue to fuel this development. Much of the initial skepticism regarding applications of AI in pharmaceutical discovery has started to vanish, consequently benefitting medicinal chemistry. Areas covered: The current status of AI in chemoinformatics is reviewed. The topics discussed herein include quantitative structure-activity/property relationship and structure-based modeling, de novo molecular design, and chemical synthesis prediction. Advantages and limitations of current deep learning applications are highlighted, together with a perspective on next-generation AI for drug discovery. Expert opinion: Deep learning-based approaches have only begun to address some fundamental problems in drug discovery. Certain methodological advances, such as message-passing models, spatial-symmetry-preserving networks, hybrid de novo design, and other innovative machine learning paradigms, will likely become commonplace and help address some of the most challenging questions. Open data sharing and model development will play a central role in the advancement of drug discovery with AI.
Affiliation(s)
- José Jiménez-Luna
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- Francesca Grisoni
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- Nils Weskamp
- Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an Der Riss, Germany
- Gisbert Schneider
- Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
27
Abstract
Molecular descriptors encode a variety of molecular representations for computer-assisted drug discovery. Here, we focus on the Weighted Holistic Atom Localization and Entity Shape (WHALES) descriptors, which were originally designed for scaffold hopping from natural products to synthetic molecules. WHALES descriptors capture molecular shape and partial charges simultaneously. We introduce the key aspects of the WHALES concept and provide a step-by-step guide on how to use these descriptors for virtual compound screening and scaffold hopping. The results presented can be reproduced by using the code freely available at github.com/ETHmodlab/scaffold_hopping_whales.
Affiliation(s)
- Francesca Grisoni
- Department of Chemistry and Applied Biosciences, RETHINK, ETH Zurich, Zurich, Switzerland
- Gisbert Schneider
- Department of Chemistry and Applied Biosciences, RETHINK, ETH Zurich, Zurich, Switzerland
28
Horwood J, Noutahi E. Molecular Design in Synthetically Accessible Chemical Space via Deep Reinforcement Learning. ACS Omega 2020; 5:32984-32994. [PMID: 33403260 PMCID: PMC7774092 DOI: 10.1021/acsomega.0c04153] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 08/27/2020] [Accepted: 10/27/2020] [Indexed: 06/12/2023]
Abstract
The fundamental goal of generative drug design is to propose optimized molecules that meet predefined activity, selectivity, and pharmacokinetic criteria. Despite recent progress, we argue that existing generative methods are limited in their ability to favorably shift the distributions of molecular properties during optimization. We instead propose a novel Reinforcement Learning framework for molecular design in which an agent learns to directly optimize through a space of synthetically accessible drug-like molecules. This becomes possible by defining transitions in our Markov decision process as chemical reactions and allows us to leverage synthetic routes as an inductive bias. We validate our method by demonstrating that it outperforms existing state-of-the-art approaches in the optimization of pharmacologically relevant objectives, while results on multi-objective optimization tasks suggest increased scalability to realistic pharmaceutical design problems.
Affiliation(s)
- Julien Horwood
- InVivo AI, Montreal, Quebec H2S 3H1, Canada
- Mila, Université de Montréal, Montreal, Quebec H2S 3H1, Canada
29
Kell DB, Samanta S, Swainston N. Deep learning and generative methods in cheminformatics and chemical biology: navigating small molecule space intelligently. Biochem J 2020; 477:4559-4580. [PMID: 33290527 PMCID: PMC7733676 DOI: 10.1042/bcj20200781] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 10/02/2020] [Revised: 11/11/2020] [Accepted: 11/12/2020] [Indexed: 12/15/2022]
Abstract
The number of 'small' molecules that may be of interest to chemical biologists - chemical space - is enormous, but the fraction that have ever been made is tiny. Most strategies are discriminative, i.e. have involved 'forward' problems (have molecule, establish properties). However, we normally wish to solve the much harder generative or inverse problem (describe desired properties, find molecule). 'Deep' (machine) learning based on large-scale neural networks underpins technologies such as computer vision, natural language processing, driverless cars, and world-leading performance in games such as Go; it can also be applied to the solution of inverse problems in chemical biology. In particular, recent developments in deep learning admit the in silico generation of candidate molecular structures and the prediction of their properties, thereby allowing one to navigate (bio)chemical space intelligently. These methods are revolutionary but require an understanding of both (bio)chemistry and computer science to be exploited to best advantage. We give a high-level (non-mathematical) background to the deep learning revolution, and set out the crucial issue for chemical biology and informatics as a two-way mapping from the discrete nature of individual molecules to the continuous but high-dimensional latent representation that may best reflect chemical space. A variety of architectures can do this; we focus on a particular type known as variational autoencoders. We then provide some examples of recent successes of these kinds of approach, and a look towards the future.
Affiliation(s)
- Douglas B. Kell
  - Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Crown St, Liverpool L69 7ZB, U.K.
  - Novo Nordisk Foundation Centre for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
- Soumitra Samanta
  - Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Crown St, Liverpool L69 7ZB, U.K.
- Neil Swainston
  - Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Crown St, Liverpool L69 7ZB, U.K.
|
30
|
Khemchandani Y, O'Hagan S, Samanta S, Swainston N, Roberts TJ, Bollegala D, Kell DB. DeepGraphMolGen, a multi-objective, computational strategy for generating molecules with desirable properties: a graph convolution and reinforcement learning approach. J Cheminform 2020; 12:53. [PMID: 33431037 PMCID: PMC7487898 DOI: 10.1186/s13321-020-00454-3] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2020] [Accepted: 08/18/2020] [Indexed: 02/03/2023] Open
Abstract
We address the problem of generating novel molecules with desired interaction properties as a multi-objective optimization problem. Interaction binding models are learned from binding data using graph convolution networks (GCNs). Since the experimentally obtained property scores are recognised as having potentially gross errors, we adopted a robust loss for the model. Combinations of these terms, including drug likeness and synthetic accessibility, are then optimized using reinforcement learning based on a graph convolution policy approach. Some of the molecules generated, while legitimate chemically, can have excellent drug-likeness scores but appear unusual. We provide an example based on the binding potency of small molecules to dopamine transporters. We extend our method successfully to use a multi-objective reward function, in this case for generating novel molecules that bind with dopamine transporters but not with those for norepinephrine. Our method should be generally applicable to the generation in silico of molecules with desirable properties.
Affiliation(s)
- Yash Khemchandani
  - Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown St, Liverpool, L69 7ZB, UK
  - Indian Institute of Technology Bombay, Powai, Mumbai, Maharashtra, 400 076, India
- Stephen O'Hagan
  - Dept of Chemistry, Manchester Institute of Biotechnology, The University of Manchester, 131 Princess St, Manchester, M1 7DN, UK
- Soumitra Samanta
  - Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown St, Liverpool, L69 7ZB, UK
- Neil Swainston
  - Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown St, Liverpool, L69 7ZB, UK
- Timothy J Roberts
  - Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown St, Liverpool, L69 7ZB, UK
- Danushka Bollegala
  - Dept of Computer Science, University of Liverpool, Ashton Building, Ashton Street, Liverpool, L69 3BX, UK
- Douglas B Kell
  - Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown St, Liverpool, L69 7ZB, UK
  - The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet 200, Kgs, 2800, Lyngby, Denmark
|
31
|
Coley CW, Eyke NS, Jensen KF. Autonomous Discovery in the Chemical Sciences Part I: Progress. Angew Chem Int Ed Engl 2020; 59:22858-22893. [DOI: 10.1002/anie.201909987] [Citation(s) in RCA: 100] [Impact Index Per Article: 25.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2019] [Indexed: 01/05/2023]
Affiliation(s)
- Connor W. Coley
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Natalie S. Eyke
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Klavs F. Jensen
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
|
32
|
Coley CW, Eyke NS, Jensen KF. Autonome Entdeckung in den chemischen Wissenschaften, Teil I: Fortschritt [Autonomous Discovery in the Chemical Sciences, Part I: Progress]. Angew Chem Int Ed Engl 2020. [DOI: 10.1002/ange.201909987] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Affiliation(s)
- Connor W. Coley
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Natalie S. Eyke
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Klavs F. Jensen
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
|
33
|
Abstract
INTRODUCTION Deep discriminative and generative neural-network models are becoming an integral part of the modern approach to ligand-based novel drug discovery. The variety of different architectures of neural networks, the methods of their training, and the procedures of generating new molecules require expert knowledge to choose the most suitable approach. AREAS COVERED Three different approaches to deep learning use in ligand-based drug discovery are considered: virtual screening, neural generative models, and mutation-based structure generation. Several architectures of neural networks for building either discriminative or generative models are considered in this paper, including deep multilayer neural networks, different kinds of convolutional neural networks, recurrent neural networks, and several types of autoencoders. Several kinds of learning frameworks are also considered, including adversarial learning and reinforcement learning. Different types of representations for generating molecules, including SMILES, graphs, and several alternative string representations are also considered. EXPERT OPINION Two kinds of problem should be solved in order to make the models built using deep neural networks, especially generative models, a valuable option in ligand-based drug discovery: the issue of interpretability and explainability of deep-learning models and the issue of synthetic accessibility of novel compounds designed by deep-learning algorithms.
Affiliation(s)
- Igor I Baskin
  - Faculty of Physics, M.V. Lomonosov Moscow State University, Moscow, Russia
  - Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia
|
34
|
Yoshimori A, Kawasaki E, Kanai C, Tasaka T. Strategies for Design of Molecular Structures with a Desired Pharmacophore Using Deep Reinforcement Learning. Chem Pharm Bull (Tokyo) 2020; 68:227-233. [DOI: 10.1248/cpb.c19-00625] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
|
35
|
Grisoni F, Moret M, Lingwood R, Schneider G. Bidirectional Molecule Generation with Recurrent Neural Networks. J Chem Inf Model 2020; 60:1175-1183. [PMID: 31904964 DOI: 10.1021/acs.jcim.9b00943] [Citation(s) in RCA: 89] [Impact Index Per Article: 22.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]
Abstract
Recurrent neural networks (RNNs) are able to generate de novo molecular designs using simplified molecular input line entry system (SMILES) string representations of the chemical structure. RNN-based structure generation is usually performed unidirectionally, by growing SMILES strings from left to right. However, there is no natural start or end of a small molecule, and SMILES strings are intrinsically nonunivocal representations of molecular graphs. These properties motivate bidirectional structure generation. Here, bidirectional generative RNNs for SMILES-based molecule design are introduced. To this end, two established bidirectional methods were implemented, and a new method for SMILES string generation and data augmentation is introduced: the bidirectional molecule design by alternate learning (BIMODAL). These three bidirectional strategies were compared to the unidirectional forward RNN approach for SMILES string generation, in terms of the (i) novelty, (ii) scaffold diversity, and (iii) chemical-biological relevance of the computer-generated molecules. The results positively advocate bidirectional strategies for SMILES-based molecular de novo design, with BIMODAL showing superior results to the unidirectional forward RNN for most of the criteria in the tested conditions. The code of the methods and the pretrained models can be found at https://github.com/ETHmodlab/BIMODAL.
Affiliation(s)
- Francesca Grisoni
  - Department of Chemistry and Applied Biosciences, RETHINK, ETH Zurich, Vladimir-Prelog-Weg 4, 8093 Zurich, Switzerland
- Michael Moret
  - Department of Chemistry and Applied Biosciences, RETHINK, ETH Zurich, Vladimir-Prelog-Weg 4, 8093 Zurich, Switzerland
- Robin Lingwood
  - Department of Chemistry and Applied Biosciences, RETHINK, ETH Zurich, Vladimir-Prelog-Weg 4, 8093 Zurich, Switzerland
- Gisbert Schneider
  - Department of Chemistry and Applied Biosciences, RETHINK, ETH Zurich, Vladimir-Prelog-Weg 4, 8093 Zurich, Switzerland
|
36
|
Schneider P, Walters WP, Plowright AT, Sieroka N, Listgarten J, Goodnow RA, Fisher J, Jansen JM, Duca JS, Rush TS, Zentgraf M, Hill JE, Krutoholow E, Kohler M, Blaney J, Funatsu K, Luebkemann C, Schneider G. Rethinking drug design in the artificial intelligence era. Nat Rev Drug Discov 2019. [DOI: 10.1038/s41573-019-0050-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/29/2022]
|
37
|
Rethinking drug design in the artificial intelligence era. Nat Rev Drug Discov 2019; 19:353-364. [DOI: 10.1038/s41573-019-0050-3] [Citation(s) in RCA: 222] [Impact Index Per Article: 44.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 10/28/2019] [Indexed: 12/17/2022]
|
38
|
Rodrigues T. The good, the bad, and the ugly in chemical and biological data for machine learning. Drug Discov Today Technol 2019; 32-33:3-8. [PMID: 33386092 PMCID: PMC7382642 DOI: 10.1016/j.ddtec.2020.07.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/15/2020] [Revised: 07/08/2020] [Accepted: 07/09/2020] [Indexed: 02/05/2023]
Abstract
Machine learning and artificial intelligence (ML/AI) have become important research tools in molecular medicine and chemistry. Their rise and recent success in drug discovery promises a rapid progression of development pipelines while reshaping how fundamental and clinical research is conducted. By taking advantage of the ever-growing wealth of publicly available and proprietary data, learning algorithms now provide an attractive means to generate statistically motivated research hypotheses. Hitherto unknown data patterns may guide and prioritize experiments, and augment expert intuition. Therefore, data is a key component in the model building workflow. Herein, I aim to discuss types of chemical and biological data according to their quality and reemphasize general recommendations for their use in ML/AI.
Affiliation(s)
- Tiago Rodrigues
  - Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina da Universidade de Lisboa, Av. Prof. Egas Moniz, 1649-028 Lisboa, Portugal
  - Research Institute for Medicines (iMed.ULisboa), Faculdade de Farmácia, Universidade de Lisboa, Av. Prof. Gama Pinto, 1649-003 Lisboa, Portugal
|
39
|
Machine learning for target discovery in drug development. Curr Opin Chem Biol 2019; 56:16-22. [PMID: 31734566 DOI: 10.1016/j.cbpa.2019.10.003] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2019] [Revised: 10/01/2019] [Accepted: 10/03/2019] [Indexed: 12/15/2022]
Abstract
The discovery of macromolecular targets for bioactive agents is currently a bottleneck for the informed design of chemical probes and drug leads. Typically, activity profiling against genetically manipulated cell lines or chemical proteomics is pursued to shed light on their biology and deconvolute drug-target networks. By taking advantage of the ever-growing wealth of publicly available bioactivity data, learning algorithms now provide an attractive means to generate statistically motivated research hypotheses and thereby prioritize biochemical screens. Here, we highlight recent successes in machine intelligence for target identification and discuss challenges and opportunities for drug discovery.
|