1
Wang J, Zhu F. Multi-objective molecular generation via clustered Pareto-based reinforcement learning. Neural Netw 2024; 179:106596. [PMID: 39163823 DOI: 10.1016/j.neunet.2024.106596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 06/16/2024] [Accepted: 08/01/2024] [Indexed: 08/22/2024]
Abstract
De novo molecular design is the process of learning knowledge from existing data to propose new chemical structures that satisfy the desired properties. By using de novo design to generate compounds in a directed manner, better solutions can be obtained in large chemical libraries at lower comparison cost. Drug design, however, must take multiple factors into account. For example, in polypharmacology, molecules that activate or inhibit multiple target proteins produce multiple pharmacological activities and are less susceptible to drug resistance. However, most existing molecular generation methods either focus only on affinity for a single target or fail to effectively balance the relationship between multiple targets, resulting in insufficient validity and desirability of the generated molecules. To address these problems, an approach called clustered Pareto-based reinforcement learning (CPRL) is proposed. In CPRL, a pre-trained model is constructed to grasp existing molecular knowledge in a supervised learning manner. In addition, the clustered Pareto optimization algorithm is presented to find the best solution between different objectives. The algorithm first extracts an update set from the sampled molecules through the designed aggregation-based molecular clustering. Then, the final reward is computed by constructing the Pareto frontier ranking of the molecules from the update set. To explore the vast chemical space, a reinforcement learning agent is designed in CPRL that can be updated under the guidance of the final reward to balance multiple properties. Furthermore, to increase the internal diversity of the molecules, a fixed-parameter exploration model is used for sampling in conjunction with the agent. The experimental results demonstrate that CPRL is capable of balancing multiple properties of the molecule and has higher desirability and validity, reaching 0.9551 and 0.9923, respectively.
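The abstract does not give CPRL's ranking or reward formula, so the following is only a minimal sketch of generic Pareto frontier ranking (non-dominated sorting) over per-molecule objective scores, assuming every objective is to be maximized. The function names `dominates`, `pareto_ranks`, and `rank_rewards`, the example scores, and the linear rank-to-reward mapping are all illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b on every objective and
    strictly better on at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_ranks(scores: List[Sequence[float]]) -> List[int]:
    """Assign each point its front index: 0 for the Pareto frontier,
    1 for the frontier of what remains once front 0 is removed, and so on."""
    remaining = set(range(len(scores)))
    ranks = [0] * len(scores)
    front = 0
    while remaining:
        # A point belongs to the current front if no other remaining point dominates it.
        current = {
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        }
        for i in current:
            ranks[i] = front
        remaining -= current
        front += 1
    return ranks


def rank_rewards(ranks: List[int]) -> List[float]:
    """Hypothetical reward mapping: front 0 gets 1.0, later fronts get
    linearly smaller positive values. This mapping is an assumption."""
    if not ranks:
        return []
    worst = max(ranks)
    return [1.0 - r / (worst + 1) for r in ranks]


# Hypothetical two-objective scores for five sampled molecules.
example_scores = [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4), (0.1, 0.1)]
example_ranks = pareto_ranks(example_scores)
example_rewards = rank_rewards(example_ranks)
```

In this example the first three score pairs are mutually non-dominated and share front 0, while the last two fall into successively later fronts. A rank-based reward of this kind depends only on the relative ordering of molecules, not on how the individual objectives are scaled.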
Affiliation(s)
- Jing Wang
- School of Computer Science and Technology, Soochow University, Suzhou, 215006, China.
- Fei Zhu
- School of Computer Science and Technology, Soochow University, Suzhou, 215006, China.

2
Li B, Tan K, Lao AR, Wang H, Zheng H, Zhang L. A comprehensive review of artificial intelligence for pharmacology research. Front Genet 2024; 15:1450529. [PMID: 39290983 PMCID: PMC11405247 DOI: 10.3389/fgene.2024.1450529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Accepted: 08/26/2024] [Indexed: 09/19/2024] Open
Abstract
With the innovation and advancement of artificial intelligence, more and more artificial intelligence techniques are employed in drug research, biomedical frontier research, and clinical medicine practice, especially in the field of pharmacology research. Thus, this review focuses on the applications of artificial intelligence in drug discovery, compound pharmacokinetic prediction, and clinical pharmacology. We briefly introduce the basic knowledge and development of artificial intelligence, present a comprehensive review, summarize the latest studies, and discuss the strengths and limitations of artificial intelligence models. Additionally, we highlight several important studies and point out possible research directions.
Affiliation(s)
- Bing Li
- College of Computer Science, Sichuan University, Chengdu, China
- Kan Tan
- College of Computer Science, Sichuan University, Chengdu, China
- Angelyn R Lao
- Department of Mathematics and Statistics, De La Salle University, Manila, Philippines
- Haiying Wang
- School of Computing, Ulster University, Belfast, United Kingdom
- Huiru Zheng
- School of Computing, Ulster University, Belfast, United Kingdom
- Le Zhang
- College of Computer Science, Sichuan University, Chengdu, China

3
Renz P, Luukkonen S, Klambauer G. Diverse Hits in De Novo Molecule Design: Diversity-Based Comparison of Goal-Directed Generators. J Chem Inf Model 2024; 64:5756-5761. [PMID: 39029090 PMCID: PMC11323242 DOI: 10.1021/acs.jcim.4c00519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Revised: 07/10/2024] [Accepted: 07/11/2024] [Indexed: 07/21/2024]
Abstract
Since the rise of generative AI models, many goal-directed molecule generators have been proposed as tools for discovering novel drug candidates. However, molecule generators often produce highly similar molecules and tend to overemphasize conformity to an imperfect scoring function rather than capturing the true underlying properties sought. We rectify these two shortcomings by offering diversity-based evaluations using the #Circles metric and considering constraints on scoring function calls or computation time. Our findings highlight the superior performance of SMILES-based autoregressive models in generating diverse sets of desired molecules compared to graph-based models or genetic algorithms.
Affiliation(s)
- Philipp Renz
- Johannes Kepler University Linz, Altenbergerstraße 69, Linz, AT 4040, Austria
- Sohvi Luukkonen
- Johannes Kepler University Linz, ELLIS Unit Linz, LIT AI Lab, Institute for Machine Learning, Altenbergerstraße 69, Linz, AT 4040, Austria
- Günter Klambauer
- Johannes Kepler University Linz, ELLIS Unit Linz, LIT AI Lab, Institute for Machine Learning, Altenbergerstraße 69, Linz, AT 4040, Austria

4
Nguyen ATN, Nguyen DTN, Koh HY, Toskov J, MacLean W, Xu A, Zhang D, Webb GI, May LT, Halls ML. The application of artificial intelligence to accelerate G protein-coupled receptor drug discovery. Br J Pharmacol 2024; 181:2371-2384. [PMID: 37161878 DOI: 10.1111/bph.16140] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 09/26/2022] [Revised: 04/14/2023] [Accepted: 04/27/2023] [Indexed: 05/11/2023] Open
Abstract
The application of artificial intelligence (AI) approaches to drug discovery for G protein-coupled receptors (GPCRs) is a rapidly expanding area. Artificial intelligence can be used at multiple stages during the drug discovery process, from aiding our understanding of the fundamental actions of GPCRs to the discovery of new ligand-GPCR interactions or the prediction of clinical responses. Here, we provide an overview of the concepts behind artificial intelligence, including the subfields of machine learning and deep learning. We summarise the published applications of artificial intelligence to different stages of the GPCR drug discovery process. Finally, we reflect on the benefits and limitations of artificial intelligence and share our vision for the exciting potential for further development of applications to aid GPCR drug discovery. In addition to making the drug discovery process "faster, smarter and cheaper," we anticipate that the application of artificial intelligence will create exciting new opportunities for GPCR drug discovery. LINKED ARTICLES: This article is part of a themed issue Therapeutic Targeting of G Protein-Coupled Receptors: hot topics from the Australasian Society of Clinical and Experimental Pharmacologists and Toxicologists 2021 Virtual Annual Scientific Meeting. To view the other articles in this section visit http://onlinelibrary.wiley.com/doi/10.1111/bph.v181.14/issuetoc.
Affiliation(s)
- Anh T N Nguyen
- Drug Discovery Biology Theme, Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, Victoria, Australia
- Diep T N Nguyen
- Department of Information Technology, Faculty of Engineering and Technology, Vietnam National University, Cau Giay, Hanoi, Vietnam
- Huan Yee Koh
- Drug Discovery Biology Theme, Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, Victoria, Australia
- Monash Data Futures Institute and Department of Data Science and Artificial Intelligence, Monash University, Clayton, Victoria, Australia
- Jason Toskov
- Monash DeepNeuron, Monash University, Clayton, Victoria, Australia
- William MacLean
- Monash DeepNeuron, Monash University, Clayton, Victoria, Australia
- Andrew Xu
- Monash DeepNeuron, Monash University, Clayton, Victoria, Australia
- Daokun Zhang
- Drug Discovery Biology Theme, Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, Victoria, Australia
- Monash Data Futures Institute and Department of Data Science and Artificial Intelligence, Monash University, Clayton, Victoria, Australia
- Geoffrey I Webb
- Monash Data Futures Institute and Department of Data Science and Artificial Intelligence, Monash University, Clayton, Victoria, Australia
- Lauren T May
- Drug Discovery Biology Theme, Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, Victoria, Australia
- Michelle L Halls
- Drug Discovery Biology Theme, Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, Victoria, Australia

5
Yoo S, Kim J. Adapt-cMolGPT: A Conditional Generative Pre-Trained Transformer with Adapter-Based Fine-Tuning for Target-Specific Molecular Generation. Int J Mol Sci 2024; 25:6641. [PMID: 38928346 PMCID: PMC11203498 DOI: 10.3390/ijms25126641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2024] [Revised: 06/09/2024] [Accepted: 06/14/2024] [Indexed: 06/28/2024] Open
Abstract
Small-molecule drug design aims to generate compounds that target specific proteins, playing a crucial role in the early stages of drug discovery. Recently, research has emerged that utilizes the GPT model, which has achieved significant success in various fields, to generate molecular compounds. However, due to the persistent challenge of small datasets in the pharmaceutical field, there has been some degradation in the performance of generating target-specific compounds. To address this issue, we propose an enhanced target-specific drug generation model, Adapt-cMolGPT, which modifies molecular representation and optimizes the fine-tuning process. In particular, we introduce a new fine-tuning method that incorporates an adapter module into a pre-trained base model and alternates weight updates by sections. We evaluated the proposed model through multiple experiments and demonstrated performance improvements compared to previous models. In the experimental results, Adapt-cMolGPT generated a greater number of novel and valid compounds compared to other models, with these generated compounds exhibiting properties similar to those of real molecular data. These results indicate that our proposed method is highly effective in designing drugs targeting specific proteins.
Affiliation(s)
- Soyoung Yoo
- Department of Artificial Intelligence, Sejong University, Seoul 05006, Republic of Korea;
- Junghyun Kim
- Department of Artificial Intelligence, Sejong University, Seoul 05006, Republic of Korea;
- Deep Learning Architecture Research Center, Sejong University, Seoul 05006, Republic of Korea

6
Gangwal A, Lavecchia A. Unleashing the power of generative AI in drug discovery. Drug Discov Today 2024; 29:103992. [PMID: 38663579 DOI: 10.1016/j.drudis.2024.103992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Revised: 03/22/2024] [Accepted: 04/18/2024] [Indexed: 05/04/2024]
Abstract
Artificial intelligence (AI) is revolutionizing drug discovery by enhancing precision, reducing timelines and costs, and enabling AI-driven computer-aided drug design. This review focuses on recent advancements in deep generative models (DGMs) for de novo drug design, exploring diverse algorithms and their profound impact. It critically analyses the challenges that are intricately interwoven into these technologies, proposing strategies to unlock their full potential. It features case studies of both successes and failures in advancing drugs to clinical trials with AI assistance. Last, it outlines a forward-looking plan for optimizing DGMs in de novo drug design, thereby fostering faster and more cost-effective drug development.
Affiliation(s)
- Amit Gangwal
- Department of Natural Product Chemistry, Shri Vile Parle Kelavani Mandal's Institute of Pharmacy, Dhule 424001, Maharashtra, India
- Antonio Lavecchia
- "Drug Discovery" Laboratory, Department of Pharmacy, University of Naples Federico II, I-80131 Naples, Italy.

7
Lin S, Chen W, Alqahtani MS, Elkamchouchi DH, Ge Y, Lu Y, Zhang G, Wang M. Exploring the therapeutic potential of layered double hydroxides and transition metal dichalcogenides through the convergence of rheumatology and nanotechnology using generative adversarial network. Environ Res 2024; 241:117262. [PMID: 37839531 DOI: 10.1016/j.envres.2023.117262] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/20/2023] [Revised: 09/10/2023] [Accepted: 09/27/2023] [Indexed: 10/17/2023]
Abstract
Two-dimensional layered double hydroxides (LDHs) are widely used in the biomedical domain due to their biocompatibility, biodegradability, controlled drug loading and release capabilities, and improved cellular permeability. The interaction of LDHs with biological systems could facilitate targeted drug delivery and make them an attractive option for various biomedical applications. Rheumatoid arthritis (RA) requires targeted drug delivery for optimum therapeutic outcomes. In this study, stacked double hydroxide nanocomposites with dextran sulphate modification (LDH-DS) were developed that exhibit both targeting and pH-sensitivity for rheumatological conditions. This research examines the loading, release kinetics, and efficiency of the therapeutics of interest in the LDH-based drug delivery system. The LDH-DS particles had a mean size of 300.1 ± 8.12 nm and a zeta potential of -12.11 ± 0.4 mV. The encapsulation efficiency was 48.52%, and the loading efficacy was 16.81%. In vitro release tests indicate that the drug's discharge is modified more rapidly in PBS at pH 5.4 compared to pH 5.6, which later reached 7.3, showing the case sensitivity to pH. A generative adversarial network (GAN) is used to analyze the drug delivery system in rheumatology. The GAN model achieved high accuracy and classification rates of 99.3% and 99.0%, respectively, and a validity of 99.5%. The second and third administrations resulted in a significant change with p-values of 0.001 and 0.05, respectively. This investigation unequivocally demonstrated that LDH functions as a biocompatible drug delivery matrix, significantly improving delivery effectiveness.
Affiliation(s)
- Suxian Lin
- Department of Rheumatology, Wenzhou People's Hospital, Wenzhou, 325000, China
- Weiwei Chen
- Department of Rheumatology, Wenzhou People's Hospital, Wenzhou, 325000, China
- Mohammed S Alqahtani
- Radiological Sciences Department, College of Applied Medical Sciences, King Khalid University, Abha 61421, Saudi Arabia; BioImaging Unit, Space Research Centre, Michael Atiyah Building, University of Leicester, Leicester, LE1 7RH, U.K
- Dalia H Elkamchouchi
- Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
- Yisu Ge
- College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou 325100, China
- Yanjie Lu
- Department of Digital Media Technology, Hangzhou Dianzi University, Hangzhou 310018, China
- Guodao Zhang
- Department of Digital Media Technology, Hangzhou Dianzi University, Hangzhou 310018, China.
- Mudan Wang
- Department of Nephrology, Wenzhou People's Hospital, Wenzhou, 325000, China.

8
Angelo JS, Guedes IA, Barbosa HJC, Dardenne LE. Multi-and many-objective optimization: present and future in de novo drug design. Front Chem 2023; 11:1288626. [PMID: 38192501 PMCID: PMC10773868 DOI: 10.3389/fchem.2023.1288626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Accepted: 11/27/2023] [Indexed: 01/10/2024] Open
Abstract
De novo drug design (dnDD) aims to create new molecules that satisfy multiple conflicting objectives. Since several desired properties can be considered in the optimization process, dnDD is naturally categorized as a many-objective optimization problem (ManyOOP), where more than three objectives must be simultaneously optimized. However, a large number of objectives typically poses several challenges that affect the choice and the design of optimization methodologies. Herein, we cover the application of multi- and many-objective optimization methods, particularly those based on Evolutionary Computation and Machine Learning techniques, to shed light on their potential application in dnDD. Additionally, we comprehensively analyze how molecular properties used in the optimization process are applied as either objectives or constraints to the problem. Finally, we discuss future research in many-objective optimization for dnDD, highlighting two important possible impacts: i) its integration with the development of multi-target approaches to accelerate the discovery of innovative and more efficacious drug therapies and ii) its role as a catalyst for new developments in more fundamental and general methodological frameworks in the field.
Affiliation(s)
- Laurent E. Dardenne
- Coordenação de Modelagem Computacional, Laboratório Nacional de Computação Científica, Petrópolis, Brazil

9
Zou J, Zhao L, Shi S. Generation of focused drug molecule library using recurrent neural network. J Mol Model 2023; 29:361. [PMID: 37932607 DOI: 10.1007/s00894-023-05772-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2023] [Accepted: 10/26/2023] [Indexed: 11/08/2023]
Abstract
CONTEXT: With the wide application of deep learning in drug research and development, de novo molecular design methods based on recurrent neural networks (RNNs) have strong advantages in drug molecule generation. The RNN model can be used to learn the internal chemical structure of molecules, which is similar to a natural language processing task. Although techniques for generating target-specific molecular libraries based on RNN models are mature, research related to drug design and screening continues around the clock. Research based on de novo drug design methods to generate larger quantities of valid compounds is necessary. METHODS: In this study, a molecular generation model based on RNN was designed, which abandoned the traditional stacked RNN and introduced the Nested long short-term memory network structure. To enrich the library of focused molecules for specific targets, we fine-tuned the model using active molecules from novel coronavirus pneumonia and screened the molecules using machine learning models. Following rigorous screening, the selected molecules underwent molecular docking with the SARS-CoV-2 M-pro receptor using AutoDock2.4 to identify the top 3 potential inhibitors. Subsequently, 100-ns molecular dynamics simulations were conducted using Amber22. Molecule parameterization involved the GAFF2 force field, while the proteins were modeled using the ff19SB force field, with solvation facilitated by a truncated octahedral TIP3P solvent environment. Upon completion of molecular dynamics simulations, stability of ligand-protein complexes was assessed by analysis of RMSD, H-bonds, and MM-GBSA. The results indicate that the model can complete the task of de novo drug design and that the generated molecules have the potential to be ideal drug molecules.
Affiliation(s)
- Jinping Zou
- Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China
- Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
- Long Zhao
- Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China
- Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
- Shaoping Shi
- Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China.
- Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China.

10
Mullowney MW, Duncan KR, Elsayed SS, Garg N, van der Hooft JJJ, Martin NI, Meijer D, Terlouw BR, Biermann F, Blin K, Durairaj J, Gorostiola González M, Helfrich EJN, Huber F, Leopold-Messer S, Rajan K, de Rond T, van Santen JA, Sorokina M, Balunas MJ, Beniddir MA, van Bergeijk DA, Carroll LM, Clark CM, Clevert DA, Dejong CA, Du C, Ferrinho S, Grisoni F, Hofstetter A, Jespers W, Kalinina OV, Kautsar SA, Kim H, Leao TF, Masschelein J, Rees ER, Reher R, Reker D, Schwaller P, Segler M, Skinnider MA, Walker AS, Willighagen EL, Zdrazil B, Ziemert N, Goss RJM, Guyomard P, Volkamer A, Gerwick WH, Kim HU, Müller R, van Wezel GP, van Westen GJP, Hirsch AKH, Linington RG, Robinson SL, Medema MH. Artificial intelligence for natural product drug discovery. Nat Rev Drug Discov 2023; 22:895-916. [PMID: 37697042 DOI: 10.1038/s41573-023-00774-7] [Citation(s) in RCA: 33] [Impact Index Per Article: 33.0] [Accepted: 07/20/2023] [Indexed: 09/13/2023]
Abstract
Developments in computational omics technologies have provided new means to access the hidden diversity of natural products, unearthing new potential for drug discovery. In parallel, artificial intelligence approaches such as machine learning have led to exciting developments in the computational drug design field, facilitating biological activity prediction and de novo drug design for molecular targets of interest. Here, we describe current and future synergies between these developments to effectively identify drug candidates from the plethora of molecules produced by nature. We also discuss how to address key challenges in realizing the potential of these synergies, such as the need for high-quality datasets to train deep learning algorithms and appropriate strategies for algorithm validation.
Affiliation(s)
- Katherine R Duncan
- Strathclyde Institute of Pharmacy and Biomedical Sciences, University of Strathclyde, Glasgow, UK
- Somayah S Elsayed
- Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
- Neha Garg
- School of Chemistry and Biochemistry, Center for Microbial Dynamics and Infection, Georgia Institute of Technology, Atlanta, GA, USA
- Justin J J van der Hooft
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
- Department of Biochemistry, University of Johannesburg, Johannesburg, South Africa
- Nathaniel I Martin
- Biological Chemistry Group, Institute of Biology, Leiden University, Leiden, The Netherlands
- David Meijer
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
- Barbara R Terlouw
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
- Friederike Biermann
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
- Institute of Molecular Bio Science, Goethe-University Frankfurt, Frankfurt am Main, Germany
- LOEWE Center for Translational Biodiversity Genomics (TBG), Frankfurt am Main, Germany
- Kai Blin
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark
- Marina Gorostiola González
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
- ONCODE institute, Leiden, The Netherlands
- Eric J N Helfrich
- Institute of Molecular Bio Science, Goethe-University Frankfurt, Frankfurt am Main, Germany
- LOEWE Center for Translational Biodiversity Genomics (TBG), Frankfurt am Main, Germany
- Florian Huber
- Center for Digitalization and Digitality, Hochschule Düsseldorf, Düsseldorf, Germany
- Stefan Leopold-Messer
- Institut für Mikrobiologie, Eidgenössische Technische Hochschule (ETH) Zürich, Zürich, Switzerland
- Kohulan Rajan
- Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena, Jena, Germany
- Tristan de Rond
- School of Chemical Sciences, University of Auckland, Auckland, New Zealand
- Jeffrey A van Santen
- Department of Chemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Maria Sorokina
- Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller University, Jena, Germany
- Pharmaceuticals R&D, Bayer AG, Berlin, Germany
- Marcy J Balunas
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI, USA
- Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
- Mehdi A Beniddir
- Équipe "Chimie des Substances Naturelles", Université Paris-Saclay, CNRS, BioCIS, Orsay, France
- Doris A van Bergeijk
- Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
- Laura M Carroll
- Structural and Computational Biology Unit, EMBL, Heidelberg, Germany
- Chase M Clark
- Division of Pharmaceutical Sciences, School of Pharmacy, University of Wisconsin-Madison, Madison, WI, USA
- Chao Du
- Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
- Francesca Grisoni
- Institute for Complex Molecular Systems, Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Utrecht, The Netherlands
- Willem Jespers
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
- Olga V Kalinina
- Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany
- Drug Bioinformatics, Medical Faculty, Saarland University, Homburg, Germany
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
- Hyunwoo Kim
- College of Pharmacy and Integrated Research Institute for Drug Development, Dongguk University Seoul, Goyang-si, Republic of Korea
- Tiago F Leao
- Center for Nuclear Energy in Agriculture, University of São Paulo, Piracicaba, Brazil
- Joleen Masschelein
- Center for Microbiology, VIB-KU Leuven, Heverlee, Belgium
- Department of Biology, KU Leuven, Heverlee, Belgium
- Evan R Rees
- Division of Pharmaceutical Sciences, School of Pharmacy, University of Wisconsin-Madison, Madison, WI, USA
- Raphael Reher
- Institute of Pharmaceutical Biology and Biotechnology, University of Marburg, Marburg, Germany
- Institute of Pharmacy, Martin-Luther-University Halle-Wittenberg, Halle (Saale), Germany
- Daniel Reker
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Duke Microbiome Center, Duke University, Durham, NC, USA
- Philippe Schwaller
- Laboratory of Artificial Chemical Intelligence, Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Michael A Skinnider
- Adapsyn Bioscience, Hamilton, Ontario, Canada
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Allison S Walker
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
- Egon L Willighagen
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
- Barbara Zdrazil
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridgeshire, UK
- Nadine Ziemert
- Interfaculty Institute for Microbiology and Infection Medicine Tuebingen (IMIT), Institute for Bioinformatics and Medical Informatics (IBMI), University of Tuebingen, Tuebingen, Germany
- Pierre Guyomard
- Bonsai team, CRIStAL - Centre de Recherche en Informatique Signal et Automatique de Lille, Université de Lille, Villeneuve d'Ascq Cedex, France
- Andrea Volkamer
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
- In silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité - Universitätsmedizin Berlin, Berlin, Germany
- William H Gerwick
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA
- Hyun Uk Kim
- Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea
- Rolf Müller
- Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany
- Department of Pharmacy, Saarland University, Saarbrücken, Germany
- German Center for infection research (DZIF), Braunschweig, Germany
- Helmholtz International Lab for Anti-Infectives, Saarbrücken, Germany
- Gilles P van Wezel
- Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
- Netherlands Institute of Ecology, NIOO-KNAW, Wageningen, The Netherlands
- Gerard J P van Westen
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands.
- Anna K H Hirsch
- Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany.
- Department of Pharmacy, Saarland University, Saarbrücken, Germany.
- German Center for infection research (DZIF), Braunschweig, Germany.
- Helmholtz International Lab for Anti-Infectives, Saarbrücken, Germany.
- Roger G Linington
- Department of Chemistry, Simon Fraser University, Burnaby, British Columbia, Canada.
- Serina L Robinson
- Department of Environmental Microbiology, Eawag: Swiss Federal Institute for Aquatic Science and Technology, Dübendorf, Switzerland.
- Marnix H Medema
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands.
- Institute of Biology, Leiden University, Leiden, The Netherlands.

11
Bae B, Bae H, Nam H. LOGICS: Learning optimal generative distribution for designing de novo chemical structures. J Cheminform 2023; 15:77. [PMID: 37674239 PMCID: PMC10483765 DOI: 10.1186/s13321-023-00747-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Accepted: 08/23/2023] [Indexed: 09/08/2023] Open
Abstract
In recent years, the field of computational drug design has made significant strides in the development of artificial intelligence (AI) models for the generation of de novo chemical compounds with desired properties and biological activities, such as enhanced binding affinity to target proteins. These high-affinity compounds have the potential to be developed into more potent therapeutics for a broad spectrum of diseases. Due to the lack of data required for the training of deep generative models, however, some of these approaches have fine-tuned their molecular generators using data obtained from a separate predictor. While these studies show that generative models can produce structures with the desired target properties, it remains unclear whether the diversity of the generated structures and the span of their chemical space align with the distribution of the intended target molecules. In this study, we present a novel generative framework, LOGICS, a framework for Learning Optimal Generative distribution Iteratively for designing target-focused Chemical Structures. We address the exploration-exploitation dilemma, which weighs the choice between exploring new options and exploiting current knowledge. To tackle this issue, we incorporate experience memory and employ a layered tournament selection approach to refine the fine-tuning process. The proposed method was applied to the binding affinity optimization of two target proteins of different protein classes, κ-opioid receptors, and PIK3CA, and the quality and the distribution of the generative molecules were evaluated. The results showed that LOGICS outperforms competing state-of-the-art models and generates more diverse de novo chemical structures with optimized properties. The source code is available at the GitHub repository ( https://github.com/GIST-CSBL/LOGICS ).
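The abstract above mentions an experience memory combined with tournament selection. As a rough illustration only, the following sketch shows plain (single-layer) tournament selection over a small memory of scored molecules. It is not the layered variant used by LOGICS, whose details are not given here; the function name, the SMILES strings, and the scores are all hypothetical.

```python
import random


def tournament_select(memory, scores, n_select, tournament_size=3, rng=None):
    """Generic tournament selection (an illustrative assumption, not the LOGICS code).

    For each pick, draw `tournament_size` distinct entries at random from the
    memory and keep the one with the highest score.
    """
    rng = rng or random.Random(0)
    selected = []
    for _ in range(n_select):
        # Sample contender indices without replacement.
        contenders = rng.sample(range(len(memory)), tournament_size)
        winner = max(contenders, key=lambda i: scores[i])
        selected.append(memory[winner])
    return selected


# Hypothetical experience memory: SMILES strings with made-up predicted scores.
memory = ["CCO", "c1ccccc1", "CC(=O)O", "CCN", "CCCC"]
scores = [0.2, 0.9, 0.5, 0.1, 0.4]

picked = tournament_select(memory, scores, n_select=4, tournament_size=3,
                           rng=random.Random(42))
```

A small tournament size keeps lower-scoring entries in play (exploration), while a tournament as large as the memory always returns the top-scoring entry (exploitation).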
Affiliation(s)
- Bongsung Bae
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Buk-Gu, Gwangju, 61005, Republic of Korea
| | - Haelee Bae
- AI Graduate School, Gwangju Institute of Science and Technology (GIST), Buk-Gu, Gwangju, 61005, Republic of Korea
| | - Hojung Nam
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Buk-Gu, Gwangju, 61005, Republic of Korea.
- AI Graduate School, Gwangju Institute of Science and Technology (GIST), Buk-Gu, Gwangju, 61005, Republic of Korea.
- Center for AI-Applied High Efficiency Drug Discovery (AHEDD), Gwangju Institute of Science and Technology (GIST), Buk-Gu, Gwangju, 61005, Republic of Korea.
| |
|
12
|
Šícho M, Luukkonen S, van den Maagdenberg HW, Schoenmaker L, Béquignon OJM, van Westen GJP. DrugEx: Deep Learning Models and Tools for Exploration of Drug-Like Chemical Space. J Chem Inf Model 2023; 63:3629-3636. [PMID: 37272707 PMCID: PMC10306259 DOI: 10.1021/acs.jcim.3c00434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Indexed: 06/06/2023]
Abstract
The discovery of novel molecules with desirable properties is a classic challenge in medicinal chemistry. With the recent advancements of machine learning, there has been a surge of de novo drug design tools. However, few resources exist that are user-friendly as well as easily customizable. In this application note, we present the new versatile open-source software package DrugEx for multiobjective reinforcement learning. This package contains the consolidated and redesigned scripts from the prior DrugEx papers including multiple generator architectures, a variety of scoring tools, and multiobjective optimization methods. It has a flexible application programming interface and can readily be used via the command line interface or the graphical user interface GenUI. The DrugEx package is publicly available at https://github.com/CDDLeiden/DrugEx.
Affiliation(s)
- Martin Šícho
- Leiden Academic Centre for Drug Research, Leiden University, 55 Einsteinweg, 2333 CC, Leiden, The Netherlands
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, 166 28, Prague, Czech Republic
| | - Sohvi Luukkonen
- Leiden Academic Centre for Drug Research, Leiden University, 55 Einsteinweg, 2333 CC, Leiden, The Netherlands
| | | | - Linde Schoenmaker
- Leiden Academic Centre for Drug Research, Leiden University, 55 Einsteinweg, 2333 CC, Leiden, The Netherlands
| | - Olivier J. M. Béquignon
- Leiden Academic Centre for Drug Research, Leiden University, 55 Einsteinweg, 2333 CC, Leiden, The Netherlands
| | - Gerard J. P. van Westen
- Leiden Academic Centre for Drug Research, Leiden University, 55 Einsteinweg, 2333 CC, Leiden, The Netherlands
| |
|
13
|
Srivathsa AV, Sadashivappa NM, Hegde AK, Radha S, Mahesh AR, Ammunje DN, Sen D, Theivendren P, Govindaraj S, Kunjiappan S, Pavadai P. A Review on Artificial Intelligence Approaches and Rational Approaches in Drug Discovery. Curr Pharm Des 2023; 29:1180-1192. [PMID: 37132148 DOI: 10.2174/1381612829666230428110542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Revised: 02/06/2023] [Accepted: 02/27/2023] [Indexed: 05/04/2023]
Abstract
Artificial intelligence (AI) speeds up the drug development process and reduces its time, as well as the cost which is of enormous importance in outbreaks such as COVID-19. It uses a set of machine learning algorithms that collects the available data from resources, categorises, processes and develops novel learning methodologies. Virtual screening is a successful application of AI, which is used in screening huge drug-like databases and filtering to a small number of compounds. The brain's thinking of AI is its neural networking which uses techniques such as Convoluted Neural Network (CNN), Recursive Neural Network (RNN) or Generative Adversarial Neural Network (GANN). The application ranges from small molecule drug discovery to the development of vaccines. In the present review article, we discussed various techniques of drug design, structure and ligand-based, pharmacokinetics and toxicity prediction using AI. The rapid phase of discovery is the need of the hour and AI is a targeted approach to achieve this.
Affiliation(s)
- Anjana Vidya Srivathsa
- Department of Pharmaceutical Chemistry, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
| | - Nandini Markuli Sadashivappa
- Department of Pharmaceutical Chemistry, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
| | - Apeksha Krishnamurthy Hegde
- Department of Pharmaceutical Chemistry, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
| | - Srimathi Radha
- Department of Pharmaceutical Chemistry, SRM College of Pharmacy, Faculty of Medicine and Health Sciences, SRM Institute of Science and Technology, Chengalpattu District, Kattankulathur, Tamil Nadu, 603203, India
| | - Agasa Ramu Mahesh
- Department of Pharmaceutical Chemistry, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
| | - Damodar Nayak Ammunje
- Department of Pharmacology, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
| | - Debanjan Sen
- Department of Pharmaceutical Chemistry, BCDA College of Pharmacy & Technology, Hridaypur, Kolkata, 700127, West Bengal, India
| | - Panneerselvam Theivendren
- Department of Pharmaceutical Chemistry, Swamy Vivekanandha College of Pharmacy, Elayampalayam, Tiruchengode, 637205, India
| | - Saravanan Govindaraj
- Department of Pharmaceutical Chemistry, MNR College of Pharmacy, Fasalwadi, Sangareddy, 502 001, India
| | - Selvaraj Kunjiappan
- Department of Biotechnology, Kalasalingam Academy of Research and Education, Krishnankoil, 626126, India
| | - Parasuraman Pavadai
- Department of Pharmaceutical Chemistry, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
| |
|
14
|
Wang J, Zeng Y, Sun H, Wang J, Wang X, Jin R, Wang M, Zhang X, Cao D, Chen X, Hsieh CY, Hou T. Molecular Generation with Reduced Labeling through Constraint Architecture. J Chem Inf Model 2023. [PMID: 37184885 DOI: 10.1021/acs.jcim.3c00579] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2023]
Abstract
In the past few years, a number of machine learning (ML)-based molecular generative models have been proposed for generating molecules with desirable properties, but they all require a large amount of label data of pharmacological and physicochemical properties. However, experimental determination of these labels, especially bioactivity labels, is very expensive. In this study, we analyze the dependence of various multi-property molecule generation models on biological activity label data and propose Frag-G/M, a fragment-based multi-constraint molecular generation framework based on conditional transformer, recurrent neural networks (RNNs), and reinforcement learning (RL). The experimental results illustrate that, using the same number of labels, Frag-G/M can generate more desired molecules than the baselines (several times more than the baselines). Moreover, compared with the known active compounds, the molecules generated by Frag-G/M exhibit higher scaffold diversity than those generated by the baselines, thus making it more promising to be used in real-world drug discovery scenarios.
Affiliation(s)
- Jike Wang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, Zhejiang 310058, P. R. China
- School of Computer Science, Wuhan University, Wuhan, Hubei 430072, P. R. China
| | - Yundian Zeng
- College of Control Science and Engineering, Zhejiang University, Hangzhou, Zhejiang 310027, P. R. China
| | - Huiyong Sun
- Department of Medicinal Chemistry, China Pharmaceutical University, Nanjing, Jiangsu 210009, P. R. China
| | - Junmei Wang
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
| | - Xiaorui Wang
- State Key Laboratory of Quality Research in Chinese Medicines, Macau University of Science and Technology, Macau 999078, P. R. China
| | - Ruofan Jin
- College of Life Science, Zhejiang University, Hangzhou, Zhejiang 310027, P. R. China
| | - Mingyang Wang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, Zhejiang 310058, P. R. China
| | - Xujun Zhang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, Zhejiang 310058, P. R. China
| | - Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410004, P. R. China
| | - Xi Chen
- School of Computer Science, Wuhan University, Wuhan, Hubei 430072, P. R. China
| | - Chang-Yu Hsieh
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, Zhejiang 310058, P. R. China
| | - Tingjun Hou
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, Zhejiang 310058, P. R. China
| |
|
15
|
Liu X, Zhang W, Tong X, Zhong F, Li Z, Xiong Z, Xiong J, Wu X, Fu Z, Tan X, Liu Z, Zhang S, Jiang H, Li X, Zheng M. MolFilterGAN: a progressively augmented generative adversarial network for triaging AI-designed molecules. J Cheminform 2023; 15:42. [PMID: 37031191 PMCID: PMC10082991 DOI: 10.1186/s13321-023-00711-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2022] [Accepted: 03/14/2023] [Indexed: 04/10/2023] Open
Abstract
Artificial intelligence (AI)-based molecular design methods, especially deep generative models for generating novel molecule structures, have gratified our imagination to explore unknown chemical space without relying on brute-force exploration. However, whether designed by AI or human experts, the molecules need to be accessibly synthesized and biologically evaluated, and the trial-and-error process remains a resources-intensive endeavor. Therefore, AI-based drug design methods face a major challenge of how to prioritize the molecular structures with potential for subsequent drug development. This study indicates that common filtering approaches based on traditional screening metrics fail to differentiate AI-designed molecules. To address this issue, we propose a novel molecular filtering method, MolFilterGAN, based on a progressively augmented generative adversarial network. Comparative analysis shows that MolFilterGAN outperforms conventional screening approaches based on drug-likeness or synthetic ability metrics. Retrospective analysis of AI-designed discoidin domain receptor 1 (DDR1) inhibitors shows that MolFilterGAN significantly increases the efficiency of molecular triaging. Further evaluation of MolFilterGAN on eight external ligand sets suggests that MolFilterGAN is useful in triaging or enriching bioactive compounds across a wide range of target types. These results highlighted the importance of MolFilterGAN in evaluating molecules integrally and further accelerating molecular discovery especially combined with advanced AI generative models.
Affiliation(s)
- Xiaohong Liu
- Shanghai Institute for Advanced Immunochemical Studies, and School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- AlphaMa Inc., No. 108, Yuxin Road, Suzhou Industrial Park, Suzhou, 215128, China
| | - Wei Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
| | - Xiaochu Tong
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
| | - Feisheng Zhong
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
| | - Zhaojun Li
- AlphaMa Inc., No. 108, Yuxin Road, Suzhou Industrial Park, Suzhou, 215128, China
| | - Zhaoping Xiong
- Shanghai Institute for Advanced Immunochemical Studies, and School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
| | - Jiacheng Xiong
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
| | - Xiaolong Wu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, China
| | - Zunyun Fu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
| | - Xiaoqin Tan
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- ByteDance AI Lab, No. 1999 Yishan Road, Shanghai, 201103, China
| | - Zhiguo Liu
- AlphaMa Inc., No. 108, Yuxin Road, Suzhou Industrial Park, Suzhou, 215128, China
| | - Sulin Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
| | - Hualiang Jiang
- Shanghai Institute for Advanced Immunochemical Studies, and School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
| | - Xutong Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China.
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China.
| | - Mingyue Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China.
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China.
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 310024, Hangzhou, China.
| |
|
16
|
Hu P, Zou J, Yu J, Shi S. De novo drug design based on Stack-RNN with multi-objective reward-weighted sum and reinforcement learning. J Mol Model 2023; 29:121. [PMID: 36991180 DOI: 10.1007/s00894-023-05523-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/03/2023] [Accepted: 03/20/2023] [Indexed: 03/31/2023]
Abstract
CONTEXT: In recent decades, drug development has become extremely important as different new diseases have emerged. However, drug discovery is a long and complex process with a very low success rate, and methods are needed to improve the efficiency of the process and reduce the possibility of failure. Among them, drug design from scratch has become a promising approach. Molecules are generated from scratch, reducing the reliance on trial and error and prefabricated molecular repositories, but the optimization of their molecular properties is still a challenging multi-objective optimization problem. METHODS: In this study, two stack-augmented recurrent neural networks were used to compose a generative model for generating drug-like molecules, and then reinforcement learning was used for optimization to generate molecules with desirable properties, such as binding affinity and the logarithm of the partition coefficient between octanol and water. In addition, a memory storage network was added to increase the internal diversity of the generated molecules. For multi-objective optimization, we proposed a new approach which utilized the magnitude of different attribute reward values to assign different weights to molecular optimization. The proposed model not only solves the problem that the properties of the generated molecules are extremely biased towards a certain attribute due to the possible conflict between the attributes, but also improves various properties of the generated molecules compared with the traditional weighted sum and alternating weighted sum: the molecular validity reaches 97.3%, the internal diversity is 0.8613, and the proportion of desirable molecules increases from 55.9% to 92%.
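To make the contrast between a fixed weighted sum and a reward-dependent weighting concrete, here is a minimal sketch. The abstract does not state the actual weighting formula, so the rule used below (weights proportional to how far each reward falls short of 1, for rewards scaled to [0, 1]) is purely an assumption chosen for illustration, and both function names are hypothetical.

```python
def fixed_weighted_sum(rewards, weights):
    """Traditional scalarization: the weights stay the same for every molecule."""
    return sum(w * r for w, r in zip(weights, rewards))


def magnitude_weighted_sum(rewards, eps=1e-8):
    """Reward-dependent scalarization (assumed rule, not the paper's formula).

    Each objective's weight is proportional to its shortfall (1 - reward), so
    the objective that currently lags receives the largest weight.
    Rewards are assumed to lie in [0, 1].
    """
    gaps = [1.0 - r + eps for r in rewards]
    total = sum(gaps)
    weights = [g / total for g in gaps]
    combined = sum(w * r for w, r in zip(weights, rewards))
    return combined, weights


# Hypothetical per-objective rewards: e.g. binding affinity = 0.9, logP = 0.3.
rewards = [0.9, 0.3]
fixed = fixed_weighted_sum(rewards, [0.5, 0.5])
combined, weights = magnitude_weighted_sum(rewards)
```

Under this assumed rule, a molecule that scores well on one objective but poorly on the other receives a lower combined reward than under equal fixed weights, which discourages optimizing one property at the expense of the other.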
Affiliation(s)
- Pengwei Hu
- Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China
- Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
| | - Jinping Zou
- Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China
- Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
| | - Jialin Yu
- Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China
- Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
| | - Shaoping Shi
- Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China.
- Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China.
| |
|
17
|
Liu X, Ye K, van Vlijmen HWT, IJzerman AP, van Westen GJP. DrugEx v3: scaffold-constrained drug design with graph transformer-based reinforcement learning. J Cheminform 2023; 15:24. [PMID: 36803659 PMCID: PMC9940339 DOI: 10.1186/s13321-023-00694-z] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 04/09/2022] [Accepted: 02/06/2023] [Indexed: 02/22/2023] Open
Abstract
Rational drug design often starts from specific scaffolds to which side chains/substituents are added or modified due to the large drug-like chemical space available to search for novel drug-like molecules. With the rapid growth of deep learning in drug discovery, a variety of effective approaches have been developed for de novo drug design. In previous work we proposed a method named DrugEx, which can be applied in polypharmacology based on multi-objective deep reinforcement learning. However, the previous version is trained under fixed objectives and does not allow users to input any prior information (i.e. a desired scaffold). In order to improve the general applicability, we updated DrugEx to design drug molecules based on scaffolds which consist of multiple fragments provided by users. Here, a Transformer model was employed to generate molecular structures. The Transformer is a multi-head self-attention deep learning model containing an encoder to receive scaffolds as input and a decoder to generate molecules as output. In order to deal with the graph representation of molecules a novel positional encoding for each atom and bond based on an adjacency matrix was proposed, extending the architecture of the Transformer. The graph Transformer model contains growing and connecting procedures for molecule generation starting from a given scaffold based on fragments. Moreover, the generator was trained under a reinforcement learning framework to increase the number of desired ligands. As a proof of concept, the method was applied to design ligands for the adenosine A2A receptor (A2AAR) and compared with SMILES-based methods. The results show that 100% of the generated molecules are valid and most of them had a high predicted affinity value towards A2AAR with given scaffolds.
Affiliation(s)
- Xuhan Liu
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands (grid.5132.5, 0000 0001 2312 1970)
| | - Kai Ye
- School of Electrics and Information Engineering, Xi’an Jiaotong University, 28 Xianning W Rd, Xi’an, China (grid.43169.39, 0000 0001 0599 1243)
| | - Herman W. T. van Vlijmen
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands (grid.5132.5, 0000 0001 2312 1970)
- Janssen Pharmaceutica NV, Turnhoutseweg 30, B-2340 Beerse, Belgium (grid.419619.2, 0000 0004 0623 0341)
| | - Adriaan P. IJzerman
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands (grid.5132.5, 0000 0001 2312 1970)
| | - Gerard J. P. van Westen
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands (grid.5132.5, 0000 0001 2312 1970)
| |
|
18
|
Zhang Y, Li S, Xing M, Yuan Q, He H, Sun S. Universal Approach to De Novo Drug Design for Target Proteins Using Deep Reinforcement Learning. ACS Omega 2023; 8:5464-5474. [PMID: 36816653 PMCID: PMC9933084 DOI: 10.1021/acsomega.2c06653] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/15/2022] [Accepted: 01/05/2023] [Indexed: 05/28/2023]
Abstract
In drug design, the design and manufacture of safe and effective compounds is a long-term, complex, and complicated process. Therefore, developing a new rapid and generalizable drug design method is of great value. This study aimed to propose a general model based on reinforcement learning combined with drug-target interaction, which could be used to design new molecules according to different protein targets. The method adopted recurrent neural network molecular modeling and took the drug-target affinity model as the reward function of optimal molecular generation. It did not need to know the three-dimensional structure and active sites of protein targets but only required the information of a one-dimensional amino acid sequence. This approach was demonstrated to produce drugs highly similar to marketed drugs and design molecules with a better binding energy.
Affiliation(s)
- Yunjiang Zhang
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China
| | - Shuyuan Li
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China
| | - Miaojuan Xing
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China
| | - Qing Yuan
- Department of Chemistry and Chemical Engineering, Beijing University of Technology, Beijing 100124, China
| | - Hong He
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China
| | - Shaorui Sun
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China
| |
|
19
|
Kumar M, Nguyen TPN, Kaur J, Singh TG, Soni D, Singh R, Kumar P. Opportunities and challenges in application of artificial intelligence in pharmacology. Pharmacol Rep 2023; 75:3-18. [PMID: 36624355 PMCID: PMC9838466 DOI: 10.1007/s43440-022-00445-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 10/20/2022] [Revised: 12/23/2022] [Accepted: 12/25/2022] [Indexed: 01/11/2023]
Abstract
Artificial intelligence (AI) is a machine science that can mimic human behaviour like intelligent analysis of data. AI functions with specialized algorithms and integrates with deep and machine learning. Living in the digital world can generate a huge amount of medical data every day. Therefore, we need an automated and reliable evaluation tool that can make decisions more accurately and faster. Machine learning has the potential to learn, understand and analyse the data used in healthcare systems. In the last few years, AI is known to be employed in various fields in pharmaceutical science especially in pharmacological research. It helps in the analysis of preclinical (laboratory animals) and clinical (in human) trial data. AI also plays important role in various processes such as drug discovery/manufacturing, diagnosis of big data for disease identification, personalized treatment, clinical trial research, radiotherapy, surgical robotics, smart electronic health records, and epidemic outbreak prediction. Moreover, AI has been used in the evaluation of biomarkers and diseases. In this review, we explain various models and general processes of machine learning and their role in pharmacological science. Therefore, AI with deep learning and machine learning could be relevant in pharmacological research.
Affiliation(s)
- Mandeep Kumar
- Department of Pharmacy, Unit of Pharmacology and Toxicology, University of Genoa, Genoa, Italy
| | - T P Nhung Nguyen
- Department of Pharmacy, Unit of Pharmacology and Toxicology, University of Genoa, Genoa, Italy
- Department of Pharmacy, Da Nang University of Medical Technology and Pharmacy, Da Nang, Vietnam
| | - Jasleen Kaur
- Department of Pharmacology and Toxicology, National Institute of Pharmaceutical Education and Research (NIPER), Lucknow, Uttar Pradesh, 226002, India
| | | | - Divya Soni
- Department of Pharmacology, Central University of Punjab, Ghudda, Bathinda, Punjab, 151401, India
| | - Randhir Singh
- Department of Pharmacology, Central University of Punjab, Ghudda, Bathinda, Punjab, 151401, India
| | - Puneet Kumar
- Department of Pharmacology, Central University of Punjab, Ghudda, Bathinda, Punjab, 151401, India.
| |
|
20
|
Noguchi S, Inoue J. Exploration of Chemical Space Guided by PixelCNN for Fragment-Based De Novo Drug Discovery. J Chem Inf Model 2022; 62:5988-6001. [PMID: 36454646 DOI: 10.1021/acs.jcim.2c01345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2022]
Abstract
We report a novel framework for achieving fragment-based molecular design using pixel convolutional neural network (PixelCNN) combined with the simplified molecular input line entry system (SMILES) as molecular representation. While a widely used recurrent neural network (RNN) assumes monotonically decaying correlations in strings, PixelCNN captures a periodicity among characters of SMILES. Thus, PixelCNN provides us with a novel solution for the analysis of chemical space by extracting the periodicity of molecular structures that will be buried in SMILES. Moreover, this characteristic enables us to generate molecules by combining several simple building blocks, such as a benzene ring and side-chain structures, which contributes to the effective exploration of chemical space by step-by-step searching for molecules from a target fragment. In conclusion, PixelCNN could be a powerful approach focusing on the periodicity of molecules to explore chemical space for the fragment-based molecular design.
Affiliation(s)
- Satoshi Noguchi
- Department of Advanced Interdisciplinary Studies, The University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo 153-8904, Japan
| | - Junya Inoue
- Institute for Industrial Science, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-0082, Japan; Department of Materials Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo, Tokyo 113-8656, Japan; Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo 153-8904, Japan
| |
|
21
|
Kim H, Ko S, Kim BJ, Ryu SJ, Ahn J. Predicting chemical structure using reinforcement learning with a stack-augmented conditional variational autoencoder. J Cheminform 2022; 14:83. [PMID: 36494855 PMCID: PMC9733204 DOI: 10.1186/s13321-022-00666-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2022] [Accepted: 12/03/2022] [Indexed: 12/13/2022] Open
Abstract
In this paper, a reinforcement learning model is proposed that can maximize the predicted binding affinity between a generated molecule and target proteins. The generator in the proposed model is the Stacked Conditional Variational AutoEncoder (Stack-CVAE), which acts as the agent in reinforcement learning so that the resulting chemical formulas have the desired chemical properties and show high binding affinity with specific target proteins. We generated 1000 chemical formulas using the chemical properties of sorafenib and the three target kinases of sorafenib. Then, we confirmed that Stack-CVAE generates more valid and unique chemical compounds with the desired chemical properties and predicted binding affinity than other generative models. A more detailed analysis of the 100 top-scoring molecules shows that they are novel and not found in existing chemical databases. Moreover, they show significantly higher predicted binding affinity scores for Raf kinases than for other kinases. Furthermore, they are highly druggable and synthesizable.
Affiliation(s)
- Hwanhee Kim
- Department of Computer Science and Engineering, Incheon National University, Incheon, 22012 Republic of Korea
| | - Soohyun Ko
- GenesisEgo, Seoul, 04382 Republic of Korea
| | - Byung Ju Kim
- UBLBio Corporation, Suwon, 16679 Republic of Korea
| | - Sung Jin Ryu
- UBLBio Corporation, Suwon, 16679 Republic of Korea
| | - Jaegyoon Ahn
- Department of Computer Science and Engineering, Incheon National University, Incheon, 22012 Republic of Korea
| |
|
22
|
Chadi MA, Mousannif H, Aamouche A. Conditional reduction of the loss value versus reinforcement learning for biassing a de-novo drug design generator. J Cheminform 2022; 14:65. [PMID: 36167559 PMCID: PMC9516832 DOI: 10.1186/s13321-022-00643-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2022] [Accepted: 09/07/2022] [Indexed: 11/10/2022] Open
Abstract
Deep learning has demonstrated promising results in de novo drug design. Often, the general pipeline consists of training a generative model (G) to learn the building rules of valid molecules, then using a biassing technique such as reinforcement learning (RL) to focus G on the desired chemical space. However, this sequential training of the same model for different tasks is known to be prone to a catastrophic forgetting (CF) phenomenon. This work presents a novel yet simple approach to bias G with significantly less CF than RL. The proposed method relies on backpropagating a reduced value of the cross-entropy loss used to train G according to the proportion of desired molecules that the biased-G can generate. We named our approach CRLV, short for conditional reduction of the loss value. We compared the two biased models (RL-biased-G and CRLV-biased-G) for four different objectives related to de novo drug design. CRLV-biased-G outperformed RL-biased-G in all four objectives and manifested appreciably less CF. Besides, an intersection analysis between molecules generated by the RL-biased-G and the CRLV-biased-G revealed that, given the low percentage of overlap between the two, they can be used jointly without losing diversity to further increase the desirability. Finally, we show that the difficulty of an objective is proportional to (i) its frequency in the dataset used to train G and (ii) the associated structural variance (SV), which is a new parameter we introduced in this paper, calling for novel exploration techniques for such difficult objectives.
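The loss-reduction idea in the abstract above can be illustrated with a minimal Python sketch. The abstract does not give the actual scaling rule, so the linear factor `1 - desired_fraction` and the function names `cross_entropy` and `reduced_loss` are assumptions made purely for illustration, not the CRLV formula from the paper.

```python
import math


def cross_entropy(probs, target_index):
    """Cross-entropy of one predicted token distribution against the true token."""
    return -math.log(probs[target_index])


def reduced_loss(ce_loss, desired_fraction):
    """Scale a cross-entropy loss down as more sampled molecules are 'desired'.

    ASSUMPTION: a linear factor (1 - desired_fraction). The abstract only says
    the loss value is reduced according to the proportion of desired molecules;
    the exact rule is not stated there.
    """
    if not 0.0 <= desired_fraction <= 1.0:
        raise ValueError("desired_fraction must lie in [0, 1]")
    return ce_loss * (1.0 - desired_fraction)


# Hypothetical numbers: the generator assigns probability 0.5 to the correct
# next token, and 25% of its recent samples meet the objective.
loss = cross_entropy([0.5, 0.3, 0.2], target_index=0)
scaled = reduced_loss(loss, desired_fraction=0.25)
```

Under this assumed rule, the gradient signal shrinks as the generator produces more desired molecules, which is one plausible way to limit drift away from what the model learned earlier.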
Affiliation(s)
- Mohamed-Amine Chadi
- Laboratoire Ingénierie des Systems Informatiques (LISI), Department of Computer Science, Faculty of Sciences Semlalia, Cadi Ayyad University, 40000, Marrakech, Morocco.
| | - Hajar Mousannif
- Laboratoire Ingénierie des Systems Informatiques (LISI), Department of Computer Science, Faculty of Sciences Semlalia, Cadi Ayyad University, 40000, Marrakech, Morocco
| | - Ahmed Aamouche
- Laboratoire Ingénierie des Systèmes et Applications (LISA), Ecole Nationale des Sciences Appliquées de Marrakech, Cadi Ayyad University, BP 575, Avenue Abdelkrim Khattabi, 40000, Marrakech, Morocco
| |
|
23
|
Wang J, Wang X, Sun H, Wang M, Zeng Y, Jiang D, Wu Z, Liu Z, Liao B, Yao X, Hsieh CY, Cao D, Chen X, Hou T. ChemistGA: A Chemical Synthesizable Accessible Molecular Generation Algorithm for Real-World Drug Discovery. J Med Chem 2022; 65:12482-12496. [PMID: 36065998 DOI: 10.1021/acs.jmedchem.2c01179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Many deep learning (DL)-based molecular generative models have been proposed to design novel molecules. These models may perform well on benchmarks, but they usually do not take real-world constraints into account, such as available training data set, synthetic accessibility, and scaffold diversity in drug discovery. In this study, a new algorithm, ChemistGA, was proposed by combining the traditional heuristic algorithm with DL, in which the crossover of the traditional genetic algorithm (GA) was redefined by DL in conjunction with GA, and an innovative backcrossing operation was implemented to generate desired molecules. Our results clearly show that ChemistGA not only retains the strength of the traditional GA but also greatly enhances the synthetic accessibility and success rate of the generated molecules with desired properties. Calculations on the two benchmarks illustrate that ChemistGA achieves impressive performance among the state-of-the-art baselines, and it opens a new avenue for the application of generative models to real-world drug discovery scenarios.
Affiliation(s)
- Jike Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China; School of Computer Science, Wuhan University, Wuhan 430072, Hubei, P. R. China; CarbonSilicon AI Technology Co., Ltd, Hangzhou 310018, Zhejiang, P. R. China
| | - Xiaorui Wang
- CarbonSilicon AI Technology Co., Ltd, Hangzhou 310018, Zhejiang, P. R. China; State Key Laboratory of Quality Research in Chinese Medicine, Macau University of Science and Technology, Taipa 999078, Macau (SAR), P. R. China
| | - Huiyong Sun
- Department of Medicinal Chemistry, China Pharmaceutical University, Nanjing 210009, Jiangsu, P. R. China
| | - Mingyang Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China; CarbonSilicon AI Technology Co., Ltd, Hangzhou 310018, Zhejiang, P. R. China
| | - Yundian Zeng
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Dejun Jiang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China; CarbonSilicon AI Technology Co., Ltd, Hangzhou 310018, Zhejiang, P. R. China
| | - Zhenxing Wu
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Zeyi Liu
- DAMTP, Centre for Mathematical Sciences, University of Cambridge, Cambridge CB3 0WA, U.K.
| | - Ben Liao
- Tencent Quantum Laboratory, Tencent, Shenzhen 518057, Guangdong, P. R. China
| | - Xiaojun Yao
- State Key Laboratory of Quality Research in Chinese Medicine, Macau University of Science and Technology, Taipa 999078, Macau(SAR), P. R. China
| | - Chang-Yu Hsieh
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China; Tencent Quantum Laboratory, Tencent, Shenzhen 518057, Guangdong, P. R. China
| | - Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410004, Hunan, P. R. China
| | - Xi Chen
- School of Computer Science, Wuhan University, Wuhan 430072, Hubei, P. R. China
| | - Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| |
|
24
|
Tan RK, Liu Y, Xie L. Reinforcement learning for systems pharmacology-oriented and personalized drug design. Expert Opin Drug Discov 2022; 17:849-863. [PMID: 35510835 PMCID: PMC9824901 DOI: 10.1080/17460441.2022.2072288] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 01/11/2023]
Abstract
INTRODUCTION Many multi-genic systemic diseases such as neurological disorders, inflammatory diseases, and the majority of cancers do not have effective treatments yet. Reinforcement learning powered systems pharmacology is a potentially effective approach to designing personalized therapies for untreatable complex diseases. AREAS COVERED In this survey, state-of-the-art reinforcement learning methods and their latest applications to drug design are reviewed. The challenges on harnessing reinforcement learning for systems pharmacology and personalized medicine are discussed. Potential solutions to overcome the challenges are proposed. EXPERT OPINION In spite of successful application of advanced reinforcement learning techniques to target-based drug discovery, new reinforcement learning strategies are needed to address systems pharmacology-oriented personalized de novo drug design.
Affiliation(s)
- Ryan K. Tan
- Department of Computer Science, Hunter College, The City University of New York
| | - Yang Liu
- Department of Computer Science, Hunter College, The City University of New York
| | - Lei Xie
- Department of Computer Science, Hunter College, The City University of New York; Ph.D. Program in Computer Science, Biology & Biochemistry, The Graduate Center, The City University of New York; Helen and Robert Appel Alzheimer’s Disease Research Institute, Feil Family Brain & Mind Research Institute, Weill Cornell Medicine, Cornell University. Correspondence should be addressed to Lei Xie -
| |
|
25
|
Chen Z, Liu X, Zhao P, Li C, Wang Y, Li F, Akutsu T, Bain C, Gasser RB, Li J, Yang Z, Gao X, Kurgan L, Song J. iFeatureOmega: an integrative platform for engineering, visualization and analysis of features from molecular sequences, structural and ligand data sets. Nucleic Acids Res 2022; 50:W434-W447. [PMID: 35524557 PMCID: PMC9252729 DOI: 10.1093/nar/gkac351] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 03/21/2022] [Revised: 04/22/2022] [Accepted: 04/25/2022] [Indexed: 01/07/2023] Open
Abstract
The rapid accumulation of molecular data motivates development of innovative approaches to computationally characterize sequences, structures and functions of biological and chemical molecules in an efficient, accessible and accurate manner. Notwithstanding several computational tools that characterize protein or nucleic acids data, there are no one-stop computational toolkits that comprehensively characterize a wide range of biomolecules. We address this vital need by developing a holistic platform that generates features from sequence and structural data for a diverse collection of molecule types. Our freely available and easy-to-use iFeatureOmega platform generates, analyzes and visualizes 189 representations for biological sequences, structures and ligands. To the best of our knowledge, iFeatureOmega provides the largest scope when directly compared to the current solutions, in terms of the number of feature extraction and analysis approaches and coverage of different molecules. We release three versions of iFeatureOmega including a webserver, command line interface and graphical interface to satisfy needs of experienced bioinformaticians and less computer-savvy biologists and biochemists. With the assistance of iFeatureOmega, users can encode their molecular data into representations that facilitate construction of predictive models and analytical studies. We highlight benefits of iFeatureOmega based on three research applications, demonstrating how it can be used to accelerate and streamline research in bioinformatics, computational biology, and cheminformatics areas. The iFeatureOmega webserver is freely available at http://ifeatureomega.erc.monash.edu and the standalone versions can be downloaded from https://github.com/Superzchen/iFeatureOmega-GUI/ and https://github.com/Superzchen/iFeatureOmega-CLI/.
Affiliation(s)
- Zhen Chen
- Collaborative Innovation Center of Henan Grain Crops, Henan Agricultural University, Zhengzhou 450046, China
- Center for Crop Genome Engineering, Henan Agricultural University, Zhengzhou 450046, China
| | - Xuhan Liu
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden 2333 CC, The Netherlands
| | - Pei Zhao
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences (CAAS), Anyang 455000, China
| | - Chen Li
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia
| | - Yanan Wang
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia
| | - Fuyi Li
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia
| | - Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan
| | - Chris Bain
- Monash Data Future Institutes, Monash University, Melbourne, Victoria 3800, Australia
| | - Robin B Gasser
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
| | - Junzhou Li
- Collaborative Innovation Center of Henan Grain Crops, Henan Agricultural University, Zhengzhou 450046, China
| | - Zuoren Yang
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences (CAAS), Anyang 455000, China
| | - Xin Gao
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
| | - Jiangning Song
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia
- Monash Data Future Institutes, Monash University, Melbourne, Victoria 3800, Australia
| |
|
26
|
Abbasi M, Santos BP, Pereira TC, Sofia R, Monteiro NRC, Simões CJV, Brito R, Ribeiro B, Oliveira JL, Arrais JP. Designing optimized drug candidates with Generative Adversarial Network. J Cheminform 2022; 14:40. [PMID: 35754029 PMCID: PMC9233801 DOI: 10.1186/s13321-022-00623-6] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 03/01/2022] [Accepted: 06/13/2022] [Indexed: 12/03/2022] Open
Abstract
Drug design is an important area of study for pharmaceutical businesses. However, low efficacy, off-target delivery, time consumption, and high cost are challenges and can create barriers that impact this process. Deep Learning models are emerging as a promising solution to perform de novo drug design, i.e., to generate drug-like molecules tailored to specific needs. However, stereochemistry, which is unavoidable in target-oriented molecules, has not been explicitly considered in the generated molecules. This paper proposes a framework based on a Feedback Generative Adversarial Network (GAN) that includes an optimization strategy by incorporating Encoder-Decoder, GAN, and Predictor deep models interconnected with a feedback loop. The Encoder-Decoder converts the string notations of molecules into latent space vectors, effectively creating a new type of molecular representation. At the same time, the GAN can learn and replicate the training data distribution and, therefore, generate new compounds. The feedback loop is designed to incorporate and evaluate the generated molecules according to the multiobjective desired property at every epoch of training to ensure a steady shift of the generated distribution towards the space of the targeted properties. Moreover, to develop a more precise set of molecules, we also incorporate a multiobjective optimization selection technique based on a non-dominated sorting genetic algorithm. The results demonstrate that the proposed framework can generate realistic, novel molecules that span the chemical space. The proposed Encoder-Decoder model correctly reconstructs 99% of the datasets, including stereochemical information. The model's ability to find uncharted regions of the chemical space was successfully shown by optimizing the unbiased GAN to generate molecules with a high binding affinity to the Kappa Opioid and Adenosine [Formula: see text] receptors. Furthermore, the generated compounds exhibit high internal and external diversity (0.88 and 0.94, respectively) and uniqueness.
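The non-dominated sorting step mentioned in the abstract above can be sketched in Python as follows. This is a generic illustration of Pareto-front ranking, assuming every objective is to be maximized; the function names and the example scores are hypothetical and are not taken from the paper's implementation.

```python
def dominates(a, b):
    """True if score vector a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_fronts(scores):
    """Group candidate indices into successive Pareto fronts (front 0 is the best)."""
    remaining = set(range(len(scores)))
    fronts = []
    while remaining:
        # A candidate belongs to the current front if no other remaining
        # candidate dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical (affinity to target A, affinity to target B) scores for five molecules.
example_scores = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.2, 0.9), (0.1, 0.1)]
example_fronts = non_dominated_fronts(example_scores)
# example_fronts == [[0, 1, 3], [2], [4]]
```

A selection step would then keep molecules from the earliest fronts first. This quadratic-time version favors readability; genetic-algorithm libraries typically use faster bookkeeping for large populations.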
Affiliation(s)
- Maryam Abbasi
- Univ Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Coimbra, Portugal
| | - Beatriz P. Santos
- Univ Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Coimbra, Portugal
| | - Tiago C. Pereira
- IEETA, Department of Electronics, Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal
| | - Raul Sofia
- Univ Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Coimbra, Portugal
| | - Nelson R. C. Monteiro
- Univ Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Coimbra, Portugal
| | | | - Rui Brito
- BSIM Therapeutics, Instituto Pedro Nunes, Coimbra, Portugal
| | - Bernardete Ribeiro
- Univ Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Coimbra, Portugal
| | - José L. Oliveira
- IEETA, Department of Electronics, Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal
| | - Joel P. Arrais
- Univ Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Coimbra, Portugal
| |
|
27
|
Wang J, Bhattarai A, Do HN, Akhter S, Miao Y. Molecular Simulations and Drug Discovery of Adenosine Receptors. Molecules 2022; 27:2054. [PMID: 35408454 PMCID: PMC9000248 DOI: 10.3390/molecules27072054] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/31/2022] [Revised: 03/18/2022] [Accepted: 03/20/2022] [Indexed: 02/02/2023] Open
Abstract
G protein-coupled receptors (GPCRs) represent the largest family of human membrane proteins. Four subtypes of adenosine receptors (ARs), the A1AR, A2AAR, A2BAR and A3AR, each with a unique pharmacological profile and distribution within the tissues in the human body, mediate many physiological functions and serve as critical drug targets for treating numerous human diseases including cancer, neuropathic pain, cardiac ischemia, stroke and diabetes. The A1AR and A3AR preferentially couple to the Gi/o proteins, while the A2AAR and A2BAR prefer coupling to the Gs proteins. Adenosine receptors were the first subclass of GPCRs that had experimental structures determined in complex with distinct G proteins. Here, we will review recent studies in molecular simulations and computer-aided drug discovery of the adenosine receptors and also highlight their future research opportunities.
Affiliation(s)
| | | | | | | | - Yinglong Miao
- Center for Computational Biology and Department of Molecular Biosciences, University of Kansas, Lawrence, KS 66047, USA; (J.W.); (A.B.); (H.N.D.); (S.A.)
| |
|
28
|
Martinelli DD. Generative machine learning for de novo drug discovery: A systematic review. Comput Biol Med 2022; 145:105403. [PMID: 35339849 DOI: 10.1016/j.compbiomed.2022.105403] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 01/14/2022] [Revised: 03/10/2022] [Accepted: 03/11/2022] [Indexed: 02/08/2023]
Abstract
Recent research on artificial intelligence indicates that machine learning algorithms can auto-generate novel drug-like molecules. Generative models have revolutionized de novo drug discovery, rendering the explorative process more efficient. Several model frameworks and input formats have been proposed to enhance the performance of intelligent algorithms in generative molecular design. In this systematic literature review of experimental articles and reviews over the last five years, machine learning models, challenges associated with computational molecule design along with proposed solutions, and molecular encoding methods are discussed. A query-based search of the PubMed, ScienceDirect, Springer, Wiley Online Library, arXiv, MDPI, bioRxiv, and IEEE Xplore databases yielded 87 studies. Twelve additional studies were identified via citation searching. Of the articles in which machine learning was implemented, six prominent algorithms were identified: long short-term memory recurrent neural networks (LSTM-RNNs), variational autoencoders (VAEs), generative adversarial networks (GANs), adversarial autoencoders (AAEs), evolutionary algorithms, and gated recurrent unit recurrent neural networks (GRU-RNNs). Furthermore, eight central challenges were designated: homogeneity of generated molecular libraries, deficient synthesizability, limited assay data, model interpretability, incapacity for multi-property optimization, incomparability, restricted molecule size, and uncertainty in model evaluation. Molecules were encoded either as strings, which were occasionally augmented using randomization, as 2D graphs, or as 3D graphs. Statistical analysis and visualization are performed to illustrate how approaches to machine learning in de novo drug design have evolved over the past five years. Finally, future opportunities and reservations are discussed.
|
29
|
Design of SARS-CoV-2 Mpro, PLpro dual-target inhibitors based on deep reinforcement learning and virtual screening. Future Med Chem 2022; 14:393-405. [PMID: 35220726 PMCID: PMC8920029 DOI: 10.4155/fmc-2021-0269] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Indexed: 12/23/2022] Open
Abstract
Background: Since December 2019, SARS-CoV-2 has continued to spread rapidly around the world. Effective drugs may provide a long-term strategy to combat this virus. The main protease (Mpro) and papain-like protease (PLpro) are two important targets for the inhibition of SARS-CoV-2 virus replication and proliferation. Materials & methods: In this study, deep reinforcement learning, covalent docking and molecular dynamics simulations were used to identify novel compounds that have the potential to inhibit both Mpro and PLpro. Results and conclusion: Three compounds were identified that can effectively occupy both the Mpro and the PLpro protein cavities and form high-frequency contacts with key amino acid residues (Mpro: His41, Cys145, Glu166; PLpro: Cys111). These three compounds can be further investigated as potential lead compounds for SARS-CoV-2 inhibitors.
|
30
|
Wang M, Sun H, Wang J, Pang J, Chai X, Xu L, Li H, Cao D, Hou T. Comprehensive assessment of deep generative architectures for de novo drug design. Brief Bioinform 2021; 23:6470970. [PMID: 34929743 DOI: 10.1093/bib/bbab544] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/27/2021] [Revised: 11/24/2021] [Accepted: 11/25/2021] [Indexed: 01/20/2023] Open
Abstract
Recently, deep learning (DL)-based de novo drug design has emerged as a new trend in pharmaceutical research, and numerous DL-based methods have been developed for the generation of novel compounds with desired properties. However, a comprehensive understanding of the advantages and disadvantages of these methods is still lacking. In this study, the performances of different generative models were evaluated by analyzing the properties of the generated molecules in different scenarios, such as goal-directed (rediscovery, optimization and scaffold hopping of active compounds) and target-specific (generation of novel compounds for a given target) tasks. Overall, the DL-based models have significant advantages over the baseline models built by the traditional methods in learning the physicochemical property distributions of the training sets and may be more suitable for target-specific tasks. However, neither the baselines nor the DL-based generative models can fully exploit the scaffolds of the training sets, and the molecules generated by the DL-based methods even have lower scaffold diversity than those generated by the traditional models. Moreover, our assessment illustrates that the DL-based methods do not exhibit obvious advantages over the genetic algorithm-based baselines in goal-directed tasks. We believe that our study provides valuable guidance for the effective use of generative models in de novo drug design.
Affiliation(s)
- Mingyang Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Huiyong Sun
- Department of Medicinal Chemistry, China Pharmaceutical University, Nanjing 210009, Jiangsu, P. R. China
| | - Jike Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Jinping Pang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Xin Chai
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Lei Xu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, Jiangsu, China
| | - Honglin Li
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science & Technology, Shanghai 200237, P. R. China
| | - Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
| | - Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| |
|
31
|
Grebner C, Matter H, Hessler G. Artificial Intelligence in Compound Design. Methods Mol Biol 2021; 2390:349-382. [PMID: 34731477 DOI: 10.1007/978-1-0716-1787-8_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2023]
Abstract
Artificial intelligence has seen an incredibly fast development in recent years. Many novel technologies for property prediction of drug molecules as well as for the design of novel molecules were introduced by different research groups. These artificial intelligence-based design methods can be applied for suggesting novel chemical motifs in lead generation or scaffold hopping as well as for optimization of desired property profiles during lead optimization. In lead generation, broad sampling of the chemical space for identification of novel motifs is required, while in the lead optimization phase, a detailed exploration of the chemical neighborhood of a current lead series is advantageous. These different requirements for successful design outcomes render different combinations of artificial intelligence technologies useful. Overall, we observe that a combination of different approaches with tailored scoring and evaluation schemes appears beneficial for efficient artificial intelligence-based compound design.
Affiliation(s)
- Christoph Grebner
- Sanofi-Aventis Deutschland GmbH, R&D, Integrated Drug Discovery, Frankfurt am Main, Germany
| | - Hans Matter
- Sanofi-Aventis Deutschland GmbH, R&D, Integrated Drug Discovery, Frankfurt am Main, Germany
| | - Gerhard Hessler
- Sanofi-Aventis Deutschland GmbH, R&D, Integrated Drug Discovery, Frankfurt am Main, Germany.
| |
|
32
|
Xu Z, Wauchope OR, Frank AT. Navigating Chemical Space by Interfacing Generative Artificial Intelligence and Molecular Docking. J Chem Inf Model 2021; 61:5589-5600. [PMID: 34633194 DOI: 10.1021/acs.jcim.1c00746] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 11/29/2022]
Abstract
Here, we report the implementation and application of a simple, structure-aware framework to generate target-specific screening libraries. Our approach combines advances in generative artificial intelligence (AI) with conventional molecular docking to explore chemical space conditioned on the unique physicochemical properties of the active site of a biomolecular target. As a demonstration, we used our framework, which we refer to as sample-and-dock, to construct focused libraries for cyclin-dependent kinase type-2 (CDK2) and the active site of the main protease (Mpro) of the SARS-CoV-2 virus. We envision that the sample-and-dock framework could be used to generate theoretical maps of the chemical space specific to a given target and so provide information about its molecular recognition characteristics.
Affiliation(s)
- Ziqiao Xu
- Chemistry Department, University of Michigan, 930 North University Avenue, Ann Arbor, Michigan 48109, United States
| | - Orrette R Wauchope
- Department of Natural Sciences, City University of New York, Baruch College, New York, New York 10010, United States
| | - Aaron T Frank
- Biophysics Program, University of Michigan, 930 North University Avenue, Ann Arbor, Michigan 48109, United States
|
33
|
Molecular generation by Fast Assembly of (Deep)SMILES fragments. J Cheminform 2021; 13:88. [PMID: 34775976 PMCID: PMC8591910 DOI: 10.1186/s13321-021-00566-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/25/2021] [Accepted: 11/02/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND In recent years, in silico molecular design has been regaining interest. To generate molecules with optimized properties on a computer, scoring functions can be coupled with a molecular generator to design novel molecules with a desired property profile. RESULTS In this article, a simple method is described to generate only valid molecules at high frequency ([Formula: see text] molecule/s using a single CPU core), given a molecular training set. The proposed method generates diverse SMILES (or DeepSMILES) encoded molecules while also showing some propensity for training set distribution matching. When working with DeepSMILES, the method reaches peak performance ([Formula: see text] molecule/s) because it relies almost exclusively on string operations. The "Fast Assembly of SMILES Fragments" software is released as open source at https://github.com/UnixJunkie/FASMIFRA. Experiments regarding speed, training set distribution matching, molecular diversity and benchmarks against several other methods are also shown.
|
34
|
Liu X, Ye K, van Vlijmen HWT, Emmerich MTM, IJzerman AP, van Westen GJP. DrugEx v2: de novo design of drug molecules by Pareto-based multi-objective reinforcement learning in polypharmacology. J Cheminform 2021; 13:85. [PMID: 34772471 PMCID: PMC8588612 DOI: 10.1186/s13321-021-00561-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 04/29/2021] [Accepted: 10/12/2021] [Indexed: 12/03/2022] Open
Abstract
In polypharmacology drugs are required to bind to multiple specific targets, for example to enhance efficacy or to reduce resistance formation. Although deep learning has achieved a breakthrough in de novo design in drug discovery, most of its applications only focus on a single drug target to generate drug-like active molecules. However, in reality drug molecules often interact with more than one target which can have desired (polypharmacology) or undesired (toxicity) effects. In a previous study we proposed a new method named DrugEx that integrates an exploration strategy into RNN-based reinforcement learning to improve the diversity of the generated molecules. Here, we extended our DrugEx algorithm with multi-objective optimization to generate drug-like molecules towards multiple targets or one specific target while avoiding off-targets (the two adenosine receptors, A1AR and A2AAR, and the potassium ion channel hERG in this study). In our model, we applied an RNN as the agent and machine learning predictors as the environment. Both the agent and the environment were pre-trained in advance and then interplayed under a reinforcement learning framework. The concept of evolutionary algorithms was merged into our method such that crossover and mutation operations were implemented by the same deep learning model as the agent. During the training loop, the agent generates a batch of SMILES-based molecules. Subsequently scores for all objectives provided by the environment are used to construct Pareto ranks of the generated molecules. For this ranking a non-dominated sorting algorithm and a Tanimoto-based crowding distance algorithm using chemical fingerprints are applied. Here, we adopted GPU acceleration to speed up the process of Pareto optimization. The final reward of each molecule is calculated based on the Pareto ranking with the ranking selection algorithm. 
The agent is trained under the guidance of the reward to make sure it can generate desired molecules after convergence of the training process. All in all we demonstrate generation of compounds with a diverse predicted selectivity profile towards multiple targets, offering the potential of high efficacy and low toxicity.
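The Pareto-ranking step described in this abstract can be illustrated with a minimal sketch. It assumes every objective is to be maximised and peels off successive non-dominated fronts by brute force. The linear mapping from front index to reward in `rank_rewards` is an illustrative assumption, not the ranking selection scheme used by DrugEx v2, and the Tanimoto-based crowding distance is not reproduced here.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_fronts(scores: List[Sequence[float]]) -> List[List[int]]:
    """Peel off successive non-dominated fronts; each front is a list of indices."""
    remaining = set(range(len(scores)))
    fronts: List[List[int]] = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def rank_rewards(scores: List[Sequence[float]]) -> List[float]:
    """Hypothetical reward: 1.0 for the first front, decreasing linearly per front."""
    fronts = pareto_fronts(scores)
    rewards = [0.0] * len(scores)
    for depth, front in enumerate(fronts):
        for i in front:
            rewards[i] = 1.0 - depth / len(fronts)
    return rewards


# Toy batch: two predicted objective scores per molecule (made-up values).
scores = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.1, 0.9), (0.1, 0.1)]
fronts = pareto_fronts(scores)
rewards = rank_rewards(scores)
```

In this toy batch, molecules 0, 1 and 3 are mutually non-dominated and share the first front, so they all receive the top reward. A crowding-distance term would then be needed to break ties within a front in favour of structurally diverse molecules.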
Affiliation(s)
- Xuhan Liu
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
| | - Kai Ye
- School of Electronics and Information Engineering, Xi'an Jiaotong University, 28 Xianning W Rd, Xi'an, China
| | - Herman W T van Vlijmen
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, 2333 CC, Leiden, The Netherlands; Janssen Pharmaceutica NV, Turnhoutseweg 30, 2340, Beerse, Belgium
| | - Michael T M Emmerich
- Leiden Institute of Advanced Computer Science, Niels Bohrweg 1, 2333 CA, Leiden, The Netherlands
| | - Adriaan P IJzerman
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
| | - Gerard J P van Westen
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, 2333 CC, Leiden, The Netherlands.
|
35
|
Deep Learning Applied to Ligand-Based De Novo Drug Design. Methods Mol Biol 2021; 2390:273-299. [PMID: 34731474 DOI: 10.1007/978-1-0716-1787-8_12] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 10/19/2022]
Abstract
In recent years, the application of deep generative models to suggest virtual compounds has become a new and powerful tool in drug discovery projects. The idea behind this review is to offer an updated view on de novo design approaches based on artificial intelligence (AI) algorithms, with a particular focus on ligand-based methods. We start this review with a brief overview of the most relevant de novo design approaches developed before the use of AI techniques. We then describe the neural network architectures most commonly employed today in ligand-based de novo design, together with an up-to-date list of more than 100 deep generative models found in the literature (2017-2020). To show how deep generative approaches are applied in a drug discovery context, we report all currently available studies in which generated compounds have been synthesized and their biological activity tested. Finally, we discuss what we envisage as beneficial future directions for further application of deep generative models in de novo drug design.
|
36
|
Jia P, Pei J, Wang G, Pan X, Zhu Y, Wu Y, Ouyang L. The roles of computer-aided drug synthesis in drug development. Green Synthesis and Catalysis 2021. [DOI: 10.1016/j.gresc.2021.11.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/08/2023] Open
|
37
|
Multi-constraint molecular generation based on conditional transformer, knowledge distillation and reinforcement learning. Nat Mach Intell 2021. [DOI: 10.1038/s42256-021-00403-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 02/06/2023]
|
38
|
Sicho M, Liu X, Svozil D, van Westen GJP. GenUI: interactive and extensible open source software platform for de novo molecular generation and cheminformatics. J Cheminform 2021; 13:73. [PMID: 34563271 PMCID: PMC8465716 DOI: 10.1186/s13321-021-00550-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/27/2021] [Accepted: 09/05/2021] [Indexed: 03/05/2023] Open
Abstract
Many contemporary cheminformatics methods, including computer-aided de novo drug design, hold promise to significantly accelerate and reduce the cost of drug discovery. Thanks to this attractive outlook, the field has thrived and in the past few years has seen especially significant growth, mainly due to the emergence of novel methods based on deep neural networks. This growth is also apparent in the development of novel de novo drug design methods, with many new generative algorithms now available. However, widespread adoption of new generative techniques in fields like medicinal chemistry or chemical biology is still lagging behind the most recent developments. On closer inspection, this is not surprising: successfully integrating the most recent de novo drug design methods into existing processes and pipelines requires close collaboration between diverse groups of experimental and theoretical scientists. Therefore, to accelerate the adoption of both modern and traditional de novo molecular generators, we developed Generator User Interface (GenUI), a software platform that makes it possible to integrate molecular generators within a feature-rich graphical user interface that is easy to use by experts of diverse backgrounds. GenUI is implemented as a web service and its interfaces offer access to cheminformatics tools for data preprocessing, model building, molecule generation, and interactive chemical space visualization. Moreover, the platform is easy to extend with customizable frontend React.js components and backend Python extensions. GenUI is open source and a recently developed de novo molecular generator, DrugEx, was integrated as a proof of principle. In this work, we present the architecture and implementation details of GenUI and discuss how it can facilitate collaboration in the disparate communities interested in de novo molecular generation and computer-aided drug discovery.
Affiliation(s)
- M. Sicho
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, 166 28 Prague, Czech Republic
| | - X. Liu
- Computational Drug Discovery, Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands
| | - D. Svozil
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, 166 28 Prague, Czech Republic
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the ASCR, v. v. i., Vídeňská 1083, 142 20 Prague 4, Czech Republic
| | - G. J. P. van Westen
- Computational Drug Discovery, Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands
|
39
|
|
40
|
Pereira T, Abbasi M, Oliveira JL, Ribeiro B, Arrais J. Optimizing blood-brain barrier permeation through deep reinforcement learning for de novo drug design. Bioinformatics 2021; 37:i84-i92. [PMID: 34252946 PMCID: PMC8336597 DOI: 10.1093/bioinformatics/btab301] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/30/2022] Open
Abstract
MOTIVATION The process of placing new drugs into the market is time-consuming, expensive and complex. The application of computational methods for designing molecules with bespoke properties can contribute to saving resources throughout this process. However, the fundamental properties to be optimized are often either not considered or in conflict with each other. In this work, we propose a novel approach to consider both the biological property and the bioavailability of compounds through a deep reinforcement learning framework for the targeted generation of compounds. We aim to obtain a promising set of compounds that are selective for the adenosine A2A receptor and that simultaneously have the necessary solubility and permeability across the blood-brain barrier to reach the site of action. The cornerstone of the framework is a recurrent neural network architecture, the Generator. It seeks to learn the building rules of valid molecules in order to sample new compounds. Also, two Predictors are trained to estimate the properties of interest of the new molecules. Finally, the fine-tuning of the Generator was performed with reinforcement learning, integrated with multi-objective optimization and exploratory techniques to ensure that the Generator is adequately biased. RESULTS The biased Generator can generate an interesting set of molecules, with approximately 85% having the two fundamental properties biased as desired. Thus, this approach has transformed a general molecule generator into a model focused on optimizing specific objectives. Furthermore, the molecules' synthesizability and drug-likeness demonstrate the potential applicability of the de novo drug design in medicinal chemistry. AVAILABILITY AND IMPLEMENTATION All code is publicly available at https://github.com/larngroup/De-Novo-Drug-Design. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Tiago Pereira
- CSUC/DEI, University of Coimbra, Coimbra 3030-290, Portugal; IEETA/DETI, University of Aveiro, Aveiro 3810-193, Portugal
| | - Maryam Abbasi
- CSUC/DEI, University of Coimbra, Coimbra 3030-290, Portugal
| | | | | | - Joel Arrais
- CSUC/DEI, University of Coimbra, Coimbra 3030-290, Portugal
|
41
|
Papadopoulos K, Giblin KA, Janet JP, Patronov A, Engkvist O. De novo design with deep generative models based on 3D similarity scoring. Bioorg Med Chem 2021; 44:116308. [PMID: 34280849 DOI: 10.1016/j.bmc.2021.116308] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/10/2021] [Revised: 07/01/2021] [Accepted: 07/05/2021] [Indexed: 01/25/2023]
Abstract
We have demonstrated the utility of a 3D shape and pharmacophore similarity scoring component in molecular design with a deep generative model trained with reinforcement learning. Using Dopamine receptor type 2 (DRD2) as an example and its antagonist haloperidol 1 as a starting point in a ligand-based design context, we have shown in a retrospective study that a 3D similarity enabled generative model can discover new leads in the absence of any other information. It can be efficiently used for scaffold hopping and generation of novel series. 3D similarity based models were compared against 2D QSAR based models, indicating a significant degree of orthogonality of the generated outputs, with the former having a more diverse output. In addition, when the two scoring components are combined for training of the generative model, the result is a more efficient exploration of desirable chemical space compared to the individual components.
Affiliation(s)
| | - Kathryn A Giblin
- Medicinal Chemistry, Research and Early Development, Oncology R&D, AstraZeneca, Cambridge, UK
| | - Jon Paul Janet
- Medicinal Chemistry, Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
| | - Atanas Patronov
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
| | - Ola Engkvist
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
|
42
|
Zhu S, Wu M, Huang Z, An J. Trends in application of advancing computational approaches in GPCR ligand discovery. Exp Biol Med (Maywood) 2021; 246:1011-1024. [PMID: 33641446 PMCID: PMC8113737 DOI: 10.1177/1535370221993422] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/14/2022] Open
Abstract
G protein-coupled receptors (GPCRs) comprise the most important superfamily of protein targets in current ligand discovery and drug development. GPCRs are integral membrane proteins that play key roles in various cellular signaling processes. Therefore, GPCR signaling pathways are closely associated with numerous diseases, including cancer and several neurological, immunological, and hematological disorders. Computer-aided drug design (CADD) can expedite the process of GPCR drug discovery and potentially reduce the actual cost of research and development. Increasing knowledge of biological structures, as well as improvements in computing power and algorithms, have led to unprecedented use of CADD for the discovery of novel GPCR modulators. Similarly, machine learning approaches are now widely applied in various fields of drug target research. This review briefly summarizes the application of emerging CADD methodologies, as well as novel machine learning techniques, in GPCR structural studies and bioligand discovery in the past few years. Recent novel computational strategies and feasible workflows are updated, and representative cases addressing challenging issues on olfactory receptors, biased agonism, and drug-induced cardiotoxic effects are highlighted to provide insights into future GPCR drug discovery.
Affiliation(s)
- Siyu Zhu
- Division of Infectious Diseases and Global Public Health, Department of Medicine, School of Medicine, University of California at San Diego, La Jolla, CA 92093, USA
- Ciechanover Institute of Precision and Regenerative Medicine, School of Life and Health Sciences, Chinese University of Hong Kong, Shenzhen 518172, China
| | - Meixian Wu
- Division of Infectious Diseases and Global Public Health, Department of Medicine, School of Medicine, University of California at San Diego, La Jolla, CA 92093, USA
| | - Ziwei Huang
- Division of Infectious Diseases and Global Public Health, Department of Medicine, School of Medicine, University of California at San Diego, La Jolla, CA 92093, USA
- Ciechanover Institute of Precision and Regenerative Medicine, School of Life and Health Sciences, Chinese University of Hong Kong, Shenzhen 518172, China
- School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - Jing An
- Division of Infectious Diseases and Global Public Health, Department of Medicine, School of Medicine, University of California at San Diego, La Jolla, CA 92093, USA
|
43
|
Pereira T, Abbasi M, Ribeiro B, Arrais JP. Diversity oriented Deep Reinforcement Learning for targeted molecule generation. J Cheminform 2021; 13:21. [PMID: 33750461 PMCID: PMC7944916 DOI: 10.1186/s13321-021-00498-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 11/13/2020] [Accepted: 02/22/2021] [Indexed: 11/10/2022] Open
Abstract
In this work, we explore the potential of deep learning to streamline the process of identifying new potential drugs through the computational generation of molecules with interesting biological properties. Two deep neural networks compose our targeted generation framework: the Generator, which is trained to learn the building rules of valid molecules using SMILES string notation, and the Predictor, which evaluates the newly generated compounds by predicting their affinity for the desired target. Then, the Generator is optimized through Reinforcement Learning to produce molecules with bespoke properties. The innovation of this approach is the exploratory strategy applied during the reinforcement training process, which seeks to add novelty to the generated compounds. This training strategy employs two Generators interchangeably to sample new SMILES: the initially trained model, which remains fixed, and a copy of it that is updated during training to uncover the most promising molecules. The evolution of the reward assigned by the Predictor determines how often each one is employed to select the next token of the molecule. This strategy establishes a compromise between the need to acquire more information about the chemical space and the need to sample new molecules with the experience gained so far. To demonstrate the effectiveness of the method, the Generator is trained to design molecules with an optimized partition coefficient and also high inhibitory power against the Adenosine [Formula: see text] and [Formula: see text] opioid receptors. The results reveal that the model can effectively steer the newly generated molecules in the desired direction. More importantly, it was possible to find promising sets of unique and diverse molecules, which was the main purpose of the newly implemented strategy.
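The two-Generator sampling idea in this abstract can be sketched as follows. At each token step, either the fixed model or the updated copy supplies the next-token distribution. The toy generators, the probability `p_fixed`, and the update rule in `update_p_fixed` are hypothetical stand-ins chosen for illustration. The abstract states only that the evolution of the reward governs how often each model is used, not how.

```python
import random
from typing import Callable, Dict, List

TokenDist = Dict[str, float]                   # token -> probability
Generator = Callable[[List[str]], TokenDist]   # prefix -> next-token distribution


def sample_token(dist: TokenDist, rng: random.Random) -> str:
    """Draw one token according to the given distribution."""
    tokens, weights = zip(*dist.items())
    return rng.choices(tokens, weights=weights, k=1)[0]


def sample_sequence(fixed: Generator, agent: Generator, p_fixed: float,
                    rng: random.Random, max_len: int = 20,
                    end: str = "<eos>") -> List[str]:
    """Sample token by token, picking the fixed model with probability p_fixed."""
    seq: List[str] = []
    for _ in range(max_len):
        model = fixed if rng.random() < p_fixed else agent
        tok = sample_token(model(seq), rng)
        if tok == end:
            break
        seq.append(tok)
    return seq


def update_p_fixed(p_fixed: float, prev_reward: float, new_reward: float,
                   step: float = 0.05, lo: float = 0.05, hi: float = 0.5) -> float:
    """Hypothetical schedule: rely more on the agent while the reward improves,
    otherwise fall back on the fixed model to explore more."""
    p = p_fixed - step if new_reward > prev_reward else p_fixed + step
    return min(hi, max(lo, p))


# Toy stand-ins for the two networks (made-up distributions).
def fixed_gen(prefix: List[str]) -> TokenDist:
    return {"C": 0.5, "N": 0.2, "O": 0.2, "<eos>": 0.1}


def agent_gen(prefix: List[str]) -> TokenDist:
    return {"C": 0.8, "<eos>": 0.2}


seq = sample_sequence(fixed_gen, agent_gen, p_fixed=0.3, rng=random.Random(0))
agent_only = sample_sequence(fixed_gen, agent_gen, p_fixed=0.0, rng=random.Random(1))
```

With `p_fixed=0.0` only the updated model is consulted, so the toy output can contain only the token `C`. Raising `p_fixed` lets tokens that the agent has stopped proposing re-enter the samples, which is the exploratory effect the abstract describes.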
Affiliation(s)
- Tiago Pereira
- Department of Informatics Engineering, Centre for Informatics and Systems of the University of Coimbra, University of Coimbra, Pinhal de Marrocos, Coimbra, Portugal
| | - Maryam Abbasi
- Department of Informatics Engineering, Centre for Informatics and Systems of the University of Coimbra, University of Coimbra, Pinhal de Marrocos, Coimbra, Portugal
| | - Bernardete Ribeiro
- Department of Informatics Engineering, Centre for Informatics and Systems of the University of Coimbra, University of Coimbra, Pinhal de Marrocos, Coimbra, Portugal
| | - Joel P. Arrais
- Department of Informatics Engineering, Centre for Informatics and Systems of the University of Coimbra, University of Coimbra, Pinhal de Marrocos, Coimbra, Portugal
|
44
|
Mouchlis VD, Afantitis A, Serra A, Fratello M, Papadiamantis AG, Aidinis V, Lynch I, Greco D, Melagraki G. Advances in de Novo Drug Design: From Conventional to Machine Learning Methods. Int J Mol Sci 2021; 22:1676. [PMID: 33562347 PMCID: PMC7915729 DOI: 10.3390/ijms22041676] [Citation(s) in RCA: 101] [Impact Index Per Article: 33.7] [Received: 12/16/2020] [Revised: 01/31/2021] [Accepted: 01/31/2021] [Indexed: 12/11/2022] Open
Abstract
De novo drug design is a computational approach that generates novel molecular structures from atomic building blocks with no a priori relationships. Conventional methods include structure-based and ligand-based design, which depend on the properties of the active site of a biological target or its known active binders, respectively. Artificial intelligence, including machine learning, is an emerging field that has positively impacted the drug discovery process. Deep reinforcement learning is a subdivision of machine learning that combines artificial neural networks with reinforcement-learning architectures. This method has successfully been employed to develop novel de novo drug design approaches using a variety of artificial networks including recurrent neural networks, convolutional neural networks, generative adversarial networks, and autoencoders. This review article summarizes advances in de novo drug design, from conventional growth algorithms to advanced machine-learning methodologies and highlights hot topics for further development.
Affiliation(s)
| | - Antreas Afantitis
- Department of ChemoInformatics, NovaMechanics Ltd., Nicosia 1046, Cyprus
| | - Angela Serra
- Faculty of Medicine and Health Technology, Tampere University, 33520 Tampere, Finland
- BioMEdiTech Institute, Tampere University, 33520 Tampere, Finland
| | - Michele Fratello
- Faculty of Medicine and Health Technology, Tampere University, 33520 Tampere, Finland
- BioMEdiTech Institute, Tampere University, 33520 Tampere, Finland
| | - Anastasios G. Papadiamantis
- Department of ChemoInformatics, NovaMechanics Ltd., Nicosia 1046, Cyprus
- School of Geography, Earth and Environmental Sciences, University of Birmingham, Birmingham B15 2TT, UK
| | - Vassilis Aidinis
- Institute for Bioinnovation, Biomedical Sciences Research Center Alexander Fleming, Fleming 34, 16672 Athens, Greece
| | - Iseult Lynch
- School of Geography, Earth and Environmental Sciences, University of Birmingham, Birmingham B15 2TT, UK
| | - Dario Greco
- Faculty of Medicine and Health Technology, Tampere University, 33520 Tampere, Finland
- BioMEdiTech Institute, Tampere University, 33520 Tampere, Finland
- Institute of Biotechnology, University of Helsinki, 00014 Helsinki, Finland
- Finnish Center for Alternative Methods (FICAM), Tampere University, 33520 Tampere, Finland
| | - Georgia Melagraki
- Division of Physical Sciences & Applications, Hellenic Military Academy, 16672 Vari, Greece
|
45
|
Kim H, Kim E, Lee I, Bae B, Park M, Nam H. Artificial Intelligence in Drug Discovery: A Comprehensive Review of Data-driven and Machine Learning Approaches. Biotechnol Bioprocess Eng 2021; 25:895-930. [PMID: 33437151 PMCID: PMC7790479 DOI: 10.1007/s12257-020-0049-y] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 02/13/2020] [Revised: 05/27/2020] [Accepted: 06/03/2020] [Indexed: 02/07/2023]
Abstract
As expenditure on drug development increases exponentially, the overall drug discovery process requires a sustainable revolution. Since artificial intelligence (AI) is leading the fourth industrial revolution, AI can be considered a viable solution for unstable drug research and development. Generally, AI is applied to fields with sufficient data such as computer vision and natural language processing, but there are many efforts to revolutionize the existing drug discovery process by applying AI. This review provides a comprehensive, organized summary of the recent research trends in the AI-guided drug discovery process, including target identification, hit identification, ADMET prediction, lead optimization, and drug repositioning. The main data sources in each field are also summarized in this review. In addition, an in-depth analysis of the remaining challenges and limitations is provided, along with proposals for promising future directions in each of the aforementioned areas.
Affiliation(s)
- Hyunho Kim
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
| | - Eunyoung Kim
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
| | - Ingoo Lee
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
| | - Bongsung Bae
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
| | - Minsu Park
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
| | - Hojung Nam
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
|
46
|
Liu X, IJzerman AP, van Westen GJP. Computational Approaches for De Novo Drug Design: Past, Present, and Future. Methods Mol Biol 2021; 2190:139-165. [PMID: 32804364 DOI: 10.1007/978-1-0716-0826-5_6] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 12/13/2022]
Abstract
Drug discovery is time- and resource-consuming. Computational approaches applied in de novo drug design therefore play an important role in improving efficiency and decreasing the cost of developing novel drugs. Over several decades, a variety of methods have been proposed and applied in practice. Traditionally, drug design problems have been treated as combinatorial optimization in discrete chemical space. Hence, optimization methods were exploited to search for new drug molecules that meet multiple objectives. With the accumulation of data and the development of machine learning methods, computational drug design methods have gradually shifted to a new paradigm. There has been particular interest in the potential application of deep learning methods to drug design. In this chapter, we give a brief description of these two different de novo methods, compare their application scopes and discuss their possible development in the future.
Affiliation(s)
- Xuhan Liu
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
| | - Adriaan P IJzerman
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
| | - Gerard J P van Westen
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands.
|
47
|
Issa NT, Stathias V, Schürer S, Dakshanamurthy S. Machine and deep learning approaches for cancer drug repurposing. Semin Cancer Biol 2021; 68:132-142. [PMID: 31904426 PMCID: PMC7723306 DOI: 10.1016/j.semcancer.2019.12.011] [Citation(s) in RCA: 103] [Impact Index Per Article: 34.3] [Received: 07/18/2019] [Revised: 10/31/2019] [Accepted: 12/15/2019] [Indexed: 02/07/2023]
Abstract
Knowledge of the underpinnings of cancer initiation, progression and metastasis has increased exponentially in recent years. Advanced "omics" coupled with machine learning and artificial intelligence (deep learning) methods have helped elucidate targets and pathways critical to those processes that may be amenable to pharmacologic modulation. However, the current anti-cancer therapeutic armamentarium continues to lag behind. As the cost of developing a new drug remains prohibitively expensive, repurposing of existing approved and investigational drugs is sought after given known safety profiles and reduction in the cost barrier. Notably, successes in oncologic drug repurposing have been infrequent. Computational in-silico strategies have been developed to aid in modeling biological processes to find new disease-relevant targets and discovering novel drug-target and drug-phenotype associations. Machine and deep learning methods have especially enabled leaps in those successes. This review will discuss these methods as they pertain to cancer biology as well as immunomodulation for drug repurposing opportunities in oncologic diseases.
Affiliation(s)
- Naiem T Issa
- Dr. Phillip Frost Department of Dermatology and Cutaneous Surgery, University of Miami School of Medicine, Miami, FL, USA
| | - Vasileios Stathias
- Department of Molecular and Cellular Pharmacology, University of Miami School of Medicine, Miami, FL, USA
| | - Stephan Schürer
- Department of Molecular and Cellular Pharmacology, University of Miami School of Medicine, Miami, FL, USA
| | - Sivanesan Dakshanamurthy
- Department of Oncology, Lombardi Comprehensive Cancer Center, Georgetown University Medical Center, Washington, DC, USA.
|
48
|
Blaschke T, Engkvist O, Bajorath J, Chen H. Memory-assisted reinforcement learning for diverse molecular de novo design. J Cheminform 2020; 12:68. [PMID: 33292554 PMCID: PMC7654024 DOI: 10.1186/s13321-020-00473-0] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Received: 08/02/2020] [Accepted: 10/29/2020] [Indexed: 12/23/2022] Open
Abstract
In de novo molecular design, recurrent neural networks (RNN) have been shown to be effective methods for sampling and generating novel chemical structures. Using a technique called reinforcement learning (RL), an RNN can be tuned to target a particular section of chemical space with optimized desirable properties using a scoring function. However, ligands generated by current RL methods so far tend to have relatively low diversity, and sometimes even result in duplicate structures when optimizing towards desired properties. Here, we propose a new method to address the low diversity issue in RL for molecular design. Memory-assisted RL is an extension of the known RL, with the introduction of a so-called memory unit. As proof of concept, we applied our method to generate structures with a desired AlogP value. In a second case study, we applied our method to design ligands for the dopamine type 2 receptor and the 5-hydroxytryptamine type 1A receptor. For both receptors, a machine learning model was developed to predict whether generated molecules were active or not for the receptor. In both case studies, it was found that memory-assisted RL led to the generation of more compounds predicted to be active having higher chemical diversity, thus achieving better coverage of chemical space of known ligands compared to established RL methods.
Affiliation(s)
- Thomas Blaschke: Hit Discovery, Discovery Sciences, R&D, AstraZeneca Gothenburg, Mölndal, Sweden
- Ola Engkvist: Hit Discovery, Discovery Sciences, R&D, AstraZeneca Gothenburg, Mölndal, Sweden
- Jürgen Bajorath: Department of Life Science Informatics, LIMES Program Unit Chemical Biology and Medicinal Chemistry B-IT, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, Bonn, 53115, Germany
- Hongming Chen: Centre of Chemistry and Chemical Biology, Guangzhou Regenerative Medicine and Health-Guangdong Laboratory, Science Park, Guangzhou, China
|
49
|
Blaschke T, Arús-Pous J, Chen H, Margreitter C, Tyrchan C, Engkvist O, Papadopoulos K, Patronov A. REINVENT 2.0: An AI Tool for De Novo Drug Design. J Chem Inf Model 2020; 60:5918-5922. [PMID: 33118816 DOI: 10.1021/acs.jcim.0c00915] [Citation(s) in RCA: 161] [Impact Index Per Article: 40.3] [Indexed: 12/21/2022]
Abstract
In the past few years, we have witnessed a renaissance of the field of molecular de novo drug design. The advancements in deep learning and artificial intelligence (AI) have triggered an avalanche of ideas on how to translate such techniques to a variety of domains including the field of drug design. A range of architectures have been devised to find the optimal way of generating chemical compounds by using either graph- or string (SMILES)-based representations. With this application note, we aim to offer the community a production-ready tool for de novo design, called REINVENT. It can be effectively applied on drug discovery projects that are striving to resolve either exploration or exploitation problems while navigating the chemical space. It can facilitate the idea generation process by bringing to the researcher's attention the most promising compounds. REINVENT's code is publicly available at https://github.com/MolecularAI/Reinvent.
Affiliation(s)
- Thomas Blaschke: Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Pepparedsleden 1, Gothenburg 43183, Sweden
- Josep Arús-Pous: Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Pepparedsleden 1, Gothenburg 43183, Sweden; Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Hongming Chen: Chemistry and Chemical Biology Centre, Guangzhou Regenerative Medicine and Health Guangdong Laboratory, Science Park, 510530 Guangzhou, China
- Christian Margreitter: Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Pepparedsleden 1, Gothenburg 43183, Sweden
- Christian Tyrchan: Medicinal Chemistry, Early RIA, Biopharmaceuticals R&D, AstraZeneca, Pepparedsleden 1, Gothenburg 43183, Sweden
- Ola Engkvist: Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Pepparedsleden 1, Gothenburg 43183, Sweden
- Kostas Papadopoulos: Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Pepparedsleden 1, Gothenburg 43183, Sweden
- Atanas Patronov: Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Pepparedsleden 1, Gothenburg 43183, Sweden
|
50
|
Renz P, Van Rompaey D, Wegner JK, Hochreiter S, Klambauer G. On failure modes in molecule generation and optimization. Drug Discov Today Technol 2020; 32-33:55-63. [PMID: 33386095 DOI: 10.1016/j.ddtec.2020.09.003] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Received: 04/17/2020] [Revised: 09/08/2020] [Accepted: 09/14/2020] [Indexed: 11/29/2022]
Abstract
There has been a wave of generative models for molecules triggered by advances in the field of Deep Learning. These generative models are often used to optimize chemical compounds towards particular properties or a desired biological activity. The evaluation of generative models remains challenging and suggested performance metrics or scoring functions often do not cover all relevant aspects of drug design projects. In this work, we highlight some unintended failure modes in molecular generation and optimization and how these evade detection by current performance metrics.
Affiliation(s)
- Philipp Renz: LIT AI Lab & Institute for Machine Learning, Johannes Kepler University Linz, Altenberger Strasse 69, A-4040 Linz, Austria
- Dries Van Rompaey: High Dimensional Biology and Discovery Data Sciences, Janssen Research & Development, Janssen Pharmaceutica N.V., Turnhoutseweg 30, Beerse B-2340, Belgium
- Jörg Kurt Wegner: High Dimensional Biology and Discovery Data Sciences, Janssen Research & Development, Janssen Pharmaceutica N.V., Turnhoutseweg 30, Beerse B-2340, Belgium
- Sepp Hochreiter: LIT AI Lab & Institute for Machine Learning, Johannes Kepler University Linz, Altenberger Strasse 69, A-4040 Linz, Austria
- Günter Klambauer: LIT AI Lab & Institute for Machine Learning, Johannes Kepler University Linz, Altenberger Strasse 69, A-4040 Linz, Austria
|