1
Chen S, Xie J, Ye R, Xu DD, Yang Y. Structure-aware dual-target drug design through collaborative learning of pharmacophore combination and molecular simulation. Chem Sci 2024; 15:10366-10380. [PMID: 38994407 PMCID: PMC11234869 DOI: 10.1039/d4sc00094c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2024] [Accepted: 06/09/2024] [Indexed: 07/13/2024] Open
Abstract
Dual-target drug design has gained significant attention in the treatment of complex diseases, such as cancers and autoimmune disorders. A widely employed design strategy is combining pharmacophores to leverage the knowledge of structure-activity relationships of both targets. Unfortunately, pharmacophore combination often struggles with long and expensive trial and error, because the protein pockets of the two targets impose complex structural constraints. In this study, we propose AIxFuse, a structure-aware dual-target drug design method that learns pharmacophore fusion patterns to satisfy the dual-target structural constraints simulated by molecular docking. AIxFuse employs two self-play reinforcement learning (RL) agents to learn pharmacophore selection and fusion by comprehensive feedback including dual-target molecular docking scores. Collaboratively, the molecular docking scores are learned by active learning (AL). Through collaborative RL and AL, AIxFuse learns to generate molecules with multiple desired properties. AIxFuse is shown to outperform state-of-the-art methods in generating dual-target drugs against glycogen synthase kinase-3 beta (GSK3β) and c-Jun N-terminal kinase 3 (JNK3). When applied to another task against retinoic acid receptor-related orphan receptor γ-t (RORγt) and dihydroorotate dehydrogenase (DHODH), AIxFuse exhibits consistent performance while compared methods suffer from performance drops, leading to a 5 times higher performance in success rate. Docking studies demonstrate that AIxFuse can generate molecules concurrently satisfying the binding mode required by both targets. Further free energy perturbation calculation indicates that the generated candidates have promising binding free energies against both targets.
Affiliation(s)
- Sheng Chen
- School of Computer Science and Engineering, Sun Yat-sen University Guangzhou 510006 China
- AixplorerBio Inc. Jiaxing 314031 China
- Junjie Xie
- School of Computer Science and Engineering, Sun Yat-sen University Guangzhou 510006 China
- AixplorerBio Inc. Jiaxing 314031 China
- Yuedong Yang
- School of Computer Science and Engineering, Sun Yat-sen University Guangzhou 510006 China
2
Ahmad S, Raza K. An extensive review on lung cancer therapeutics using machine learning techniques: state-of-the-art and perspectives. J Drug Target 2024; 32:635-646. [PMID: 38662768 DOI: 10.1080/1061186x.2024.2347358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2024] [Accepted: 04/18/2024] [Indexed: 05/07/2024]
Abstract
There are over 100 types of human cancer, accounting for millions of deaths every year. Lung cancer alone claims over 1.8 million lives per year and is expected to surpass 3.2 million by 2050, which underscores the urgent need for rapid drug development and repurposing initiatives. The application of AI emerges as a pivotal solution to developing anti-cancer therapeutics. This state-of-the-art review aims to explore the various applications of AI in lung cancer therapeutics. Predictive models can analyse large datasets, including clinical data, genetic information, and treatment outcomes, for novel drug design and to generate personalised treatment recommendations, potentially optimising therapeutic strategies, enhancing treatment efficacy, and minimising adverse effects. A thorough literature review study was conducted based on articles indexed in PubMed and Scopus. We compiled the use of various machine learning approaches, including CNN, RNN, GAN, VAEs, and other AI techniques, enhancing efficiency with accuracy exceeding 95%, which is validated through a computer-aided drug design process. AI can revolutionise lung cancer therapeutics, streamlining processes and saving biological scientists' time and effort; however, further research is needed to overcome challenges and fully unlock AI's potential in lung cancer therapeutics.
Affiliation(s)
- Shaban Ahmad
- Department of Computer Science, Jamia Millia Islamia, New Delhi, India
- Khalid Raza
- Department of Computer Science, Jamia Millia Islamia, New Delhi, India
3
Shen C, Song J, Hsieh CY, Cao D, Kang Y, Ye W, Wu Z, Wang J, Zhang O, Zhang X, Zeng H, Cai H, Chen Y, Chen L, Luo H, Zhao X, Jian T, Chen T, Jiang D, Wang M, Ye Q, Wu J, Du H, Shi H, Deng Y, Hou T. DrugFlow: An AI-Driven One-Stop Platform for Innovative Drug Discovery. J Chem Inf Model 2024. [PMID: 38920405 DOI: 10.1021/acs.jcim.4c00621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/27/2024]
Abstract
Artificial intelligence (AI)-aided drug design has demonstrated unprecedented effects on modern drug discovery, but there is still an urgent need for user-friendly interfaces that bridge the gap between these sophisticated tools and scientists, particularly those who are less computer savvy. Herein, we present DrugFlow, an AI-driven one-stop platform that offers a clean, convenient, and cloud-based interface to streamline early drug discovery workflows. By seamlessly integrating a range of innovative AI algorithms, covering molecular docking, quantitative structure-activity relationship modeling, molecular generation, ADMET (absorption, distribution, metabolism, excretion and toxicity) prediction, and virtual screening, DrugFlow can offer effective AI solutions for almost all crucial stages in early drug discovery, including hit identification and hit/lead optimization. We hope that the platform can provide sufficiently valuable guidance to aid real-world drug design and discovery. The platform is available at https://drugflow.com.
Affiliation(s)
- Chao Shen
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Jianfei Song
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Chang-Yu Hsieh
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410004, Hunan, China
- Yu Kang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Wenling Ye
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Zhenxing Wu
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Jike Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Odin Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Xujun Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Hao Zeng
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Heng Cai
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Yu Chen
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Linkang Chen
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Hao Luo
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Xinda Zhao
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Tianye Jian
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Tong Chen
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Dejun Jiang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Mingyang Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Qing Ye
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Jialu Wu
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Hongyan Du
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Hui Shi
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Yafeng Deng
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Department of Automation, Tsinghua University, Beijing 100084, China
- Tingjun Hou
- Hangzhou Carbonsilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
4
Alberga D, Lamanna G, Graziano G, Delre P, Lomuscio MC, Corriero N, Ligresti A, Siliqi D, Saviano M, Contino M, Stefanachi A, Mangiatordi GF. DeLA-DrugSelf: Empowering multi-objective de novo design through SELFIES molecular representation. Comput Biol Med 2024; 175:108486. [PMID: 38653065 DOI: 10.1016/j.compbiomed.2024.108486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Revised: 04/08/2024] [Accepted: 04/15/2024] [Indexed: 04/25/2024]
Abstract
In this paper, we introduce DeLA-DrugSelf, an upgraded version of DeLA-Drug [J. Chem. Inf. Model. 62 (2022) 1411-1424], which incorporates essential advancements for automated multi-objective de novo design. Unlike its predecessor, which relies on SMILES notation for molecular representation, DeLA-DrugSelf employs a novel and robust molecular representation string named SELFIES (SELF-referencing Embedded String). The generation process in DeLA-DrugSelf not only involves substitutions to the initial string representing the starting query molecule but also incorporates insertions and deletions. This enhancement makes DeLA-DrugSelf significantly more adept at executing data-driven scaffold decoration and lead optimization strategies. Remarkably, DeLA-DrugSelf explicitly addresses the SELFIES-related collapse issue, considering only collapse-free compounds during generation. These compounds undergo a rigorous quality metrics evaluation, highlighting substantial advancements in terms of drug-likeness, uniqueness, and novelty compared to the molecules generated by the previous version of the algorithm. To evaluate the potential of DeLA-DrugSelf as a mutational operator within a genetic algorithm framework for multi-objective optimization, we employed a fitness function based on Pareto dominance. Our objectives focused on target-oriented properties aimed at optimizing known cannabinoid receptor 2 (CB2R) ligands. The results obtained indicate that DeLA-DrugSelf, available as a user-friendly web platform (https://www.ba.ic.cnr.it/softwareic/delaself/), can effectively contribute to the data-driven optimization of starting bioactive molecules based on user-defined parameters.
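The Pareto-dominance criterion mentioned in the abstract above can be illustrated with a minimal sketch. This is not the DeLA-DrugSelf implementation; the function names `dominates` and `pareto_front`, the example scores, and the assumption that every objective is maximized are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b, assuming all objectives are maximized.

    a dominates b when a is at least as good on every objective
    and strictly better on at least one.
    """
    if len(a) != len(b):
        raise ValueError("objective vectors must have the same length")
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the candidates that no other candidate dominates (input order preserved)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (objective_1, objective_2) scores for four candidate molecules.
candidates = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.2, 0.9)]
front = pareto_front(candidates)
```

Here `(0.4, 0.4)` is dominated by `(0.5, 0.5)`, so it is dropped, while the other three candidates remain because each is best on some trade-off between the two objectives.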
Affiliation(s)
- Domenico Alberga
- CNR - Institute of Crystallography, Via Amendola 122/o, 70126, Bari, Italy
- Giuseppe Lamanna
- CNR - Institute of Crystallography, Via Amendola 122/o, 70126, Bari, Italy
- Giovanni Graziano
- Department of Pharmacy - Pharmaceutical Sciences, University of Bari "Aldo Moro", via E. Orabona, 4, I-70125, Bari, Italy
- Pietro Delre
- CNR - Institute of Crystallography, Via Amendola 122/o, 70126, Bari, Italy
- Nicola Corriero
- CNR - Institute of Crystallography, Via Amendola 122/o, 70126, Bari, Italy
- Alessia Ligresti
- CNR - Institute of Biomolecular Chemistry, Via Campi Flegrei 34, 80078, Pozzuoli, Italy
- Dritan Siliqi
- CNR - Institute of Crystallography, Via Amendola 122/o, 70126, Bari, Italy
- Michele Saviano
- CNR - Institute of Crystallography, Via Vivaldi 43, 81100, Caserta, Italy
- Marialessandra Contino
- Department of Pharmacy - Pharmaceutical Sciences, University of Bari "Aldo Moro", via E. Orabona, 4, I-70125, Bari, Italy
- Angela Stefanachi
- Department of Pharmacy - Pharmaceutical Sciences, University of Bari "Aldo Moro", via E. Orabona, 4, I-70125, Bari, Italy
5
Thomas M, O'Boyle NM, Bender A, De Graaf C. MolScore: a scoring, evaluation and benchmarking framework for generative models in de novo drug design. J Cheminform 2024; 16:64. [PMID: 38816825 PMCID: PMC11141043 DOI: 10.1186/s13321-024-00861-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Accepted: 05/15/2024] [Indexed: 06/01/2024] Open
Abstract
Generative models are undergoing rapid research and application to de novo drug design. To facilitate their application and evaluation, we present MolScore. MolScore already contains many drug-design-relevant scoring functions commonly used in benchmarks, such as molecular similarity, molecular docking, predictive models, synthesizability, and more. In addition, it provides performance metrics to evaluate generative model performance based on the chemistry generated. With this unification of functionality, MolScore re-implements commonly used benchmarks in the field (such as GuacaMol, MOSES, and MolOpt). Moreover, new benchmarks can be created trivially. We demonstrate this by testing a chemical language model with reinforcement learning on three new tasks of increasing complexity related to the design of 5-HT2a ligands that utilise either molecular descriptors, 266 pre-trained QSAR models, or dual molecular docking. Lastly, MolScore can be integrated into an existing Python script with just three lines of code. This framework is a step towards unifying generative model application and evaluation as applied to drug design for both practitioners and researchers. The framework can be found on GitHub and downloaded directly from the Python Package Index.
Scientific Contribution: MolScore is an open-source platform to facilitate generative molecular design and evaluation thereof for application in drug design. This platform takes important steps towards unifying existing benchmarks, providing a platform to share new benchmarks, and improving customisation, flexibility and usability for practitioners over existing solutions.
Affiliation(s)
- Morgan Thomas
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, UK.
- Noel M O'Boyle
- Computational Chemistry, Nxera Pharma, Steinmetz Building, Granta Park, Great Abington, Cambridge, CB21 6DG, UK
- Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, UK
- Chris De Graaf
- Computational Chemistry, Nxera Pharma, Steinmetz Building, Granta Park, Great Abington, Cambridge, CB21 6DG, UK
6
Gou R, Yang J, Guo M, Chen Y, Xue W. CNSMolGen: A Bidirectional Recurrent Neural Network-Based Generative Model for De Novo Central Nervous System Drug Design. J Chem Inf Model 2024; 64:4059-4070. [PMID: 38739718 DOI: 10.1021/acs.jcim.4c00504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Central nervous system (CNS) drugs have had a significant impact on treating a wide range of neurodegenerative and psychiatric disorders. In recent years, deep learning-based generative models have shown great potential for accelerating drug discovery and improving efficacy. However, specific applications of these techniques in CNS drug discovery have not been widely reported. In this study, we developed the CNSMolGen model, which uses a framework of bidirectional recurrent neural networks (Bi-RNNs) for de novo molecular design of CNS drugs. Results showed that the pretrained model was able to generate more than 90% of completely new molecular structures, which possessed the properties of CNS drug molecules and were synthesizable. In addition, transfer learning was performed on small data sets with specific biological activities to evaluate the potential application of the model for CNS drug optimization. Here, we used drugs against the classical CNS disease target serotonin transporter (SERT) as a fine-tuned data set and generated a focused database against the target protein. The potential biological activities of the generated molecules were verified by using the physics-based induced-fit docking study. The success of this model demonstrates its potential in CNS drug design and optimization, which provides a new impetus for future CNS drug development.
Affiliation(s)
- Rongpei Gou
- Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Jingyi Yang
- Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Menghan Guo
- Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Yingjun Chen
- Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Weiwei Xue
- Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
7
Tang Q, Ratnayake R, Seabra G, Jiang Z, Fang R, Cui L, Ding Y, Kahveci T, Bian J, Li C, Luesch H, Li Y. Morphological profiling for drug discovery in the era of deep learning. Brief Bioinform 2024; 25:bbae284. [PMID: 38886164 PMCID: PMC11182685 DOI: 10.1093/bib/bbae284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2024] [Revised: 05/13/2024] [Accepted: 06/03/2024] [Indexed: 06/20/2024] Open
Abstract
Morphological profiling is a valuable tool in phenotypic drug discovery. The advent of high-throughput automated imaging has enabled the capturing of a wide range of morphological features of cells or organisms in response to perturbations at the single-cell resolution. Concurrently, significant advances in machine learning and deep learning, especially in computer vision, have led to substantial improvements in analyzing large-scale high-content images at high throughput. These efforts have facilitated understanding of compound mechanism of action, drug repurposing, and characterization of cell morphodynamics under perturbation, ultimately contributing to the development of novel therapeutics. In this review, we provide a comprehensive overview of the recent advances in the field of morphological profiling. We summarize the image profiling analysis workflow, survey a broad spectrum of analysis strategies encompassing feature engineering- and deep learning-based approaches, and introduce publicly available benchmark datasets. We place a particular emphasis on the application of deep learning in this pipeline, covering cell segmentation, image representation learning, and multimodal learning. Additionally, we illuminate the application of morphological profiling in phenotypic drug discovery and highlight potential challenges and opportunities in this field.
Affiliation(s)
- Qiaosi Tang
- Calico Life Sciences, South San Francisco, CA 94080, United States
- Ranjala Ratnayake
- Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, United States
- Gustavo Seabra
- Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, United States
- Zhe Jiang
- Department of Computer & Information Science & Engineering, University of Florida, Gainesville, FL 32611, United States
- Ruogu Fang
- Department of Computer & Information Science & Engineering, University of Florida, Gainesville, FL 32611, United States
- J. Crayton Pruitt Family Department of Biomedical Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, FL 32611, United States
- Lina Cui
- Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, United States
- Yousong Ding
- Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, United States
- Tamer Kahveci
- Department of Computer & Information Science & Engineering, University of Florida, Gainesville, FL 32611, United States
- Jiang Bian
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL 32611, United States
- Chenglong Li
- Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, United States
- Hendrik Luesch
- Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, United States
- Yanjun Li
- Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, United States
- Department of Computer & Information Science & Engineering, University of Florida, Gainesville, FL 32611, United States
8
Tang X, Dai H, Knight E, Wu F, Li Y, Li T, Gerstein M. A survey of generative AI for de novo drug design: new frontiers in molecule and protein generation. Brief Bioinform 2024; 25:bbae338. [PMID: 39007594 DOI: 10.1093/bib/bbae338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2024] [Revised: 05/21/2024] [Accepted: 06/27/2024] [Indexed: 07/16/2024] Open
Abstract
Artificial intelligence (AI)-driven methods can vastly improve the historically costly drug design process, with various generative models already in widespread use. Generative models for de novo drug design, in particular, focus on the creation of novel biological compounds entirely from scratch, representing a promising future direction. Rapid development in the field, combined with the inherent complexity of the drug design process, creates a difficult landscape for new researchers to enter. In this survey, we organize de novo drug design into two overarching themes: small molecule and protein generation. Within each theme, we identify a variety of subtasks and applications, highlighting important datasets, benchmarks, and model architectures and comparing the performance of top models. We take a broad approach to AI-driven drug design, allowing for both micro-level comparisons of various methods within each subtask and macro-level observations across different fields. We discuss parallel challenges and approaches between the two applications and highlight future directions for AI-driven de novo drug design as a whole. An organized repository of all covered sources is available at https://github.com/gersteinlab/GenAI4Drug.
Affiliation(s)
- Xiangru Tang
- Department of Computer Science, Yale University, New Haven, CT 06520, United States
- Howard Dai
- Department of Computer Science, Yale University, New Haven, CT 06520, United States
- Elizabeth Knight
- School of Medicine, Yale University, New Haven, CT 06520, United States
- Fang Wu
- Computer Science Department, Stanford University, CA 94305, United States
- Yunyang Li
- Department of Computer Science, Yale University, New Haven, CT 06520, United States
- Tianxiao Li
- Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT 06520, United States
- Mark Gerstein
- Department of Computer Science, Yale University, New Haven, CT 06520, United States
- Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT 06520, United States
- Department of Statistics & Data Science, Yale University, New Haven, CT 06520, United States
- Department of Biomedical Informatics & Data Science, Yale University, New Haven, CT 06520, United States
- Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, CT 06520, United States
9
Wang S, Liang D, Wang J, Dong K, Zhang Y, Liang H, Xu X, Song T. FraHMT: A Fragment-Oriented Heterogeneous Graph Molecular Generation Model for Target Proteins. J Chem Inf Model 2024; 64:3718-3732. [PMID: 38644797 DOI: 10.1021/acs.jcim.4c00252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
The molecular generation task stands as a pivotal step in the domains of computational chemistry and drug discovery, aiming to computationally generate molecular structures for specific properties. In contrast to previous models that focused primarily on SMILES strings or molecular graphs, our model placed a special emphasis on the substructure information of molecules, enabling the model to learn richer chemical rules and structure features from fragments and chemical reaction information of molecules. To accomplish this, we fragmented the molecules to construct heterogeneous graph representations based on atom and fragment information. Then our model mapped the heterogeneous graph data into a latent vector space by using an encoder and employed a self-regressive generative model as a decoder for molecular generation. Additionally, we performed transfer learning on the model using a small set of ligand molecules known to be active against the target protein to generate molecules that bind better to the target protein. Experimental results demonstrate that our model is highly competitive with state-of-the-art models. It can generate valid and diverse molecules with favorable physicochemical properties and drug-likeness. Importantly, it produces novel molecules with high docking scores against the target proteins.
Affiliation(s)
- Shuang Wang
- College of Computer Science and Technology, China University of Petroleum, QingDao 266580, China
- Dingming Liang
- College of Computer Science and Technology, China University of Petroleum, QingDao 266580, China
- Jianmin Wang
- College of Computer Science and Technology, China University of Petroleum, QingDao 266580, China
- The Interdisciplinary Graduate Program in Integrative Biotechnology, Yonsei University, Incheon 21983, Republic of Korea
- Kaiyu Dong
- College of Computer Science and Technology, China University of Petroleum, QingDao 266580, China
- Yunjing Zhang
- College of Computer Science and Technology, China University of Petroleum, QingDao 266580, China
- Huicong Liang
- Marine Biomedical Institute of Qingdao, School of Medicine and Pharmacy, Ocean University of China, QingDao 266580, China
- Ximing Xu
- Marine Biomedical Institute of Qingdao, School of Medicine and Pharmacy, Ocean University of China, QingDao 266580, China
- Tao Song
- College of Computer Science and Technology, China University of Petroleum, QingDao 266580, China
- Department of Artificial Intelligence, Faculty of Computer Science, Polytechnical University of Madrid, Madrid 28031, Spain
10
Xia W, Xiao J, Bian H, Zhang J, Zhang JZH, Zhang H. Deep Learning-Based construction of a Drug-Like compound database and its application in virtual screening of HsDHODH inhibitors. Methods 2024; 225:44-51. [PMID: 38518843 DOI: 10.1016/j.ymeth.2024.03.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2023] [Revised: 01/24/2024] [Accepted: 03/13/2024] [Indexed: 03/24/2024] Open
Abstract
Virtual screening relies heavily on compound databases, but screening against commercial databases is disadvantageous because they contain patent-protected compounds and compounds with high toxicity and side effects. Therefore, this paper uses a generative recurrent neural network (RNN) with long short-term memory (LSTM) cells to learn the properties of drug compounds in DrugBank, aiming to obtain a new virtual screening database of compounds with drug-like properties. Ultimately, a database of 26,316 compounds is obtained by this method. To evaluate the potential of this database, a series of tests are performed, including chemical space, ADME properties, compound fragmentation, and synthesizability analysis. The results show that the database has good drug-like properties and relatively novel backbones, and its potential in virtual screening is further tested. Finally, a series of seedling compounds with completely new backbones are obtained through docking and binding free energy calculations.
Affiliation(s)
- Wei Xia
- Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Jin Xiao
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Key Laboratory of Green Chemistry & Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University at Shanghai, 200062, China
- Hengwei Bian
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Key Laboratory of Green Chemistry & Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University at Shanghai, 200062, China.
| | - Jiajun Zhang
- Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - John Z H Zhang
- Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Key Laboratory of Green Chemistry & Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University at Shanghai, 200062, China; NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China; Department of Chemistry, New York University, NY, NY10003, USA; Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi, 030006, China.
| | - Haiping Zhang
- Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China.
| |
|
11
|
Agu PC, Obulose CN. Piquing artificial intelligence towards drug discovery: Tools, techniques, and applications. Drug Dev Res 2024; 85:e22159. [PMID: 38375772 DOI: 10.1002/ddr.22159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 01/12/2024] [Accepted: 01/29/2024] [Indexed: 02/21/2024]
Abstract
The purpose of this study was to discuss how artificial intelligence (AI) methods have affected the field of drug development. It looks at how AI models and data resources are reshaping the drug development process by offering more affordable and expedient options to conventional approaches. The paper opens with an overview of well-known information sources for drug development. The discussion then moves on to molecular representation techniques that make it possible to convert data into representations that computers can understand. The paper also gives a general overview of the algorithms used in the creation of drug discovery models based on AI. In particular, the paper looks at how AI algorithms might be used to forecast drug toxicity, drug bioactivity, and drug physicochemical properties. De novo drug design, binding affinity prediction, and other AI-based models for drug-target interaction were covered in deeper detail. Modern applications of AI in nanomedicine design and pharmacological synergism/antagonism prediction were also covered. The potential advantages of AI in drug development are highlighted as the evaluation comes to a close. It underlines how AI may greatly speed up and improve the efficiency of drug discovery, resulting in the creation of new and better medicines. To fully realize the promise of AI in drug discovery, the review acknowledges the difficulties that come with its uses in this field and advocates for more study and development.
Affiliation(s)
- Peter Chinedu Agu
- Department of Biochemistry, College of Science, Evangel University, Akaeze, Ebonyi State, Nigeria
| | - Chidiebere Nwiboko Obulose
- Department of Computer Sciences, Our Savior Institute of Science, Agriculture, and Technology (OSISATECH Polytechnic), Enugu, Nigeria
| |
|
12
|
Jones J, Clark RD, Lawless MS, Miller DW, Waldman M. The AI-driven Drug Design (AIDD) platform: an interactive multi-parameter optimization system integrating molecular evolution with physiologically based pharmacokinetic simulations. J Comput Aided Mol Des 2024; 38:14. [PMID: 38499823 DOI: 10.1007/s10822-024-00552-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 02/13/2024] [Indexed: 03/20/2024]
Abstract
Computer-aided drug design has advanced rapidly in recent years, and multiple instances of in silico designed molecules advancing to the clinic have demonstrated the contribution of this field to medicine. Properly designed and implemented platforms can drastically reduce drug development timelines and costs. While such efforts were initially focused primarily on target affinity/activity, it is now appreciated that other parameters are equally important in the successful development of a drug and its progression to the clinic, including pharmacokinetic properties as well as absorption, distribution, metabolic, excretion and toxicological (ADMET) properties. In the last decade, several programs have been developed that incorporate these properties into the drug design and optimization process and to varying degrees, allowing for multi-parameter optimization. Here, we introduce the Artificial Intelligence-driven Drug Design (AIDD) platform, which automates the drug design process by integrating high-throughput physiologically-based pharmacokinetic simulations (powered by GastroPlus) and ADMET predictions (powered by ADMET Predictor) with an advanced evolutionary algorithm that is quite different than current generative models. AIDD uses these and other estimates in iteratively performing multi-objective optimizations to produce novel molecules that are active and lead-like. Here we describe the AIDD workflow and details of the methodologies involved therein. We use a dataset of triazolopyrimidine inhibitors of the dihydroorotate dehydrogenase from Plasmodium falciparum to illustrate how AIDD generates novel sets of molecules.
Affiliation(s)
- Jeremy Jones
- Simulations Plus, Inc., 42505 10th Street West, Lancaster, CA, 93534‑7059, USA.
| | - Robert D Clark
- The Indiana University Luddy School of Informatics, Computing and Engineering, 700 N. Woodlawn Avenue, Bloomington, IN, 47408, USA
| | - Michael S Lawless
- Simulations Plus, Inc., 42505 10th Street West, Lancaster, CA, 93534‑7059, USA
| | - David W Miller
- Simulations Plus, Inc., 42505 10th Street West, Lancaster, CA, 93534‑7059, USA
| | - Marvin Waldman
- Simulations Plus, Inc., 42505 10th Street West, Lancaster, CA, 93534‑7059, USA
| |
|
13
|
Wang M, Wu Z, Wang J, Weng G, Kang Y, Pan P, Li D, Deng Y, Yao X, Bing Z, Hsieh CY, Hou T. Genetic Algorithm-Based Receptor Ligand: A Genetic Algorithm-Guided Generative Model to Boost the Novelty and Drug-Likeness of Molecules in a Sampling Chemical Space. J Chem Inf Model 2024; 64:1213-1228. [PMID: 38302422 DOI: 10.1021/acs.jcim.3c01964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2024]
Abstract
Deep learning-based de novo molecular design has recently gained significant attention. While numerous DL-based generative models have been successfully developed for designing novel compounds, the majority of the generated molecules lack sufficiently novel scaffolds or high drug-like profiles. The aforementioned issues may not be fully captured by commonly used metrics for the assessment of molecular generative models, such as novelty, diversity, and quantitative estimation of the drug-likeness score. To address these limitations, we proposed a genetic algorithm-guided generative model called GARel (genetic algorithm-based receptor-ligand interaction generator), a novel framework for training a DL-based generative model to produce drug-like molecules with novel scaffolds. To efficiently train the GARel model, we utilized dense net to update the parameters based on molecules with novel scaffolds and drug-like features. To demonstrate the capability of the GARel model, we used it to design inhibitors for three targets: AA2AR, EGFR, and SARS-Cov2. The results indicate that GARel-generated molecules feature more diverse and novel scaffolds and possess more desirable physicochemical properties and favorable docking scores. Compared with other generative models, GARel makes significant progress in balancing novelty and drug-likeness, providing a promising direction for the further development of DL-based de novo design methodology with potential impacts on drug discovery.
Affiliation(s)
- Mingyang Wang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- CarbonSilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
| | - Zhengjian Wu
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- School of Computer Science, Wuhan University, Wuhan 430072, Hubei, China
| | - Jike Wang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- CarbonSilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
| | - Gaoqi Weng
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Yu Kang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Peichen Pan
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Dan Li
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Yafeng Deng
- CarbonSilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
| | - Xiaojun Yao
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery Macau Institute for Applied Research in Medicine and Health State Key Laboratory of Quality Research in Chinese Medicine, Macau University of Science and Technology, Taipa, Macau 999078, China
| | - Zhitong Bing
- Institute of Modern Physics, Chinese Academy of Sciences, Lanzhou, Gansu 730000, China
| | - Chang-Yu Hsieh
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Tingjun Hou
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
| |
|
14
|
Zhang H, Huang J, Xie J, Huang W, Yang Y, Xu M, Lei J, Chen H. GRELinker: A Graph-Based Generative Model for Molecular Linker Design with Reinforcement and Curriculum Learning. J Chem Inf Model 2024; 64:666-676. [PMID: 38241022 DOI: 10.1021/acs.jcim.3c01700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/13/2024]
Abstract
Fragment-based drug discovery (FBDD) is widely used in drug design. One useful strategy in FBDD is designing linkers for linking fragments to optimize their molecular properties. In the current study, we present a novel generative fragment linking model, GRELinker, which utilizes a gated-graph neural network combined with reinforcement and curriculum learning to generate molecules with desirable attributes. The model has been shown to be efficient in multiple tasks, including controlling log P, optimizing synthesizability or predicted bioactivity of compounds, and generating molecules with high 3D similarity but low 2D similarity to the lead compound. Specifically, our model outperforms the previously reported reinforcement learning (RL) built-in method DRlinker on these benchmark tasks. Moreover, GRELinker has been successfully used in an actual FBDD case to generate optimized molecules with enhanced affinities by employing the docking score as the scoring function in RL. Besides, the implementation of curriculum learning in our framework enables the generation of structurally complex linkers more efficiently. These results demonstrate the benefits and feasibility of GRELinker in linker design for molecular optimization and drug discovery.
Affiliation(s)
- Hao Zhang
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
| | - Jinchao Huang
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
| | - Junjie Xie
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
| | - Weifeng Huang
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
| | - Yuedong Yang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
| | - Mingyuan Xu
- Guangzhou National Laboratory, Guangzhou International Bio Island, No. 9 Xin Dao Huan Bei Road, Guangzhou 510005, China
| | - Jinping Lei
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
| | - Hongming Chen
- Guangzhou National Laboratory, Guangzhou International Bio Island, No. 9 Xin Dao Huan Bei Road, Guangzhou 510005, China
| |
|
15
|
Izmailyan R, Matevosyan M, Khachatryan H, Shavina A, Gevorgyan S, Ghazaryan A, Tirosyan I, Gabrielyan Y, Ayvazyan M, Martirosyan B, Harutyunyan V, Zakaryan H. Discovery of new antiviral agents through artificial intelligence: In vitro and in vivo results. Antiviral Res 2024; 222:105818. [PMID: 38280564 DOI: 10.1016/j.antiviral.2024.105818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 01/12/2024] [Accepted: 01/18/2024] [Indexed: 01/29/2024]
Abstract
In this research, we employed a deep reinforcement learning (RL)-based molecule design platform to generate a diverse set of compounds targeting the neuraminidase (NA) of influenza A and B viruses. A total of 60,291 compounds were generated, of which 86.5 % displayed superior physicochemical properties compared to oseltamivir. After narrowing down the selection through computational filters, nine compounds with non-sialic acid-like structures were selected for in vitro experiments. We identified two compounds, DS-22-inf-009 and DS-22-inf-021 that effectively inhibited the NAs of both influenza A and B viruses (IAV and IBV), including H275Y mutant strains at low micromolar concentrations. Molecular dynamics simulations revealed a similar pattern of interaction with amino acid residues as oseltamivir. In cell-based assays, DS-22-inf-009 and DS-22-inf-021 inhibited IAV and IBV in a dose-dependent manner with EC50 values ranging from 0.29 μM to 2.31 μM. Furthermore, animal experiments showed that both DS-22-inf-009 and DS-22-inf-021 exerted antiviral activity in mice, conferring 65 % and 85 % protection from IAV (H1N1 pdm09), and 65 % and 100 % protection from IBV (Yamagata lineage), respectively. Thus, these findings demonstrate the potential of RL to generate compounds with promising antiviral properties.
Affiliation(s)
- Roza Izmailyan
- Laboratory of Antiviral Drug Discovery, Institute of Molecular Biology of NAS, Hasratyan 7, 0014, Yerevan, Armenia
| | | | - Hamlet Khachatryan
- Laboratory of Antiviral Drug Discovery, Institute of Molecular Biology of NAS, Hasratyan 7, 0014, Yerevan, Armenia; Denovo Sciences Inc., 0060, Yerevan, Armenia
| | - Anastasiya Shavina
- Laboratory of Antiviral Drug Discovery, Institute of Molecular Biology of NAS, Hasratyan 7, 0014, Yerevan, Armenia; Denovo Sciences Inc., 0060, Yerevan, Armenia
| | - Smbat Gevorgyan
- Laboratory of Antiviral Drug Discovery, Institute of Molecular Biology of NAS, Hasratyan 7, 0014, Yerevan, Armenia; Denovo Sciences Inc., 0060, Yerevan, Armenia
| | - Artur Ghazaryan
- Laboratory of Antiviral Drug Discovery, Institute of Molecular Biology of NAS, Hasratyan 7, 0014, Yerevan, Armenia
| | | | | | | | | | | | - Hovakim Zakaryan
- Laboratory of Antiviral Drug Discovery, Institute of Molecular Biology of NAS, Hasratyan 7, 0014, Yerevan, Armenia; Denovo Sciences Inc., 0060, Yerevan, Armenia.
| |
|
16
|
Kutsal M, Ucar F, Kati N. Computational drug discovery on human immunodeficiency virus with a customized long short-term memory variational autoencoder deep-learning architecture. CPT Pharmacometrics Syst Pharmacol 2024; 13:308-316. [PMID: 38010989 PMCID: PMC10864928 DOI: 10.1002/psp4.13085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2023] [Revised: 11/01/2023] [Accepted: 11/07/2023] [Indexed: 11/29/2023] Open
Abstract
Despite attempts to control the spread of human immunodeficiency virus (HIV) through the use of anti-HIV medications, the absence of an effective vaccine continues to present a significant obstacle. In addition, the development of drug resistance by HIV underscores the necessity for computational drug discovery methods to identify novel therapies. This investigation specifically focused on employing a long short-term memory (LSTM) variational autoencoder deep-learning architecture for computational drug discovery in relation to HIV. Our data set comprised simplified molecular input line entry system (SMILES)-encoded compounds, which were used to train the LSTM autoencoder. Remarkably, our model achieved a training accuracy of 91%, with a data set containing 1377 compounds. Leveraging the generative model derived from the training phase, we generated potential new drugs for combating HIV and assessed their interaction with the virus using a previously developed artificial intelligence model. Lastly, we verified the drug-likeness of our computationally generated compounds in accordance with Lipinski's rule of five. Overall, our study presents a promising approach to computational drug discovery in the ongoing battle against HIV.
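The final filtering step mentioned in this abstract, Lipinski's rule of five, can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the `Descriptors` container, the function names, and the example values are hypothetical, and the descriptors are assumed to have been computed beforehand by a cheminformatics toolkit. The thresholds (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors) are the standard ones, and tolerating a single violation is a common convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed descriptors for one molecule (hypothetical container)."""
    mol_weight: float      # molecular weight in Da
    logp: float            # octanol-water partition coefficient
    h_bond_donors: int     # count of N-H and O-H groups
    h_bond_acceptors: int  # count of N and O atoms


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five thresholds are exceeded."""
    checks = (
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    )
    return sum(checks)


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Accept a molecule with at most `max_violations` violations (assumed convention)."""
    return lipinski_violations(d) <= max_violations


# Illustrative, approximate values for a small aspirin-like molecule.
small = Descriptors(mol_weight=180.2, logp=1.2, h_bond_donors=1, h_bond_acceptors=4)
print(lipinski_violations(small), passes_rule_of_five(small))
```

In a generative workflow such a check would typically run over every sampled SMILES string after descriptor calculation, keeping only the molecules that pass.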
Affiliation(s)
- Mucahit Kutsal
- Institute of Theoretical Physics and Astrophysics, Quantum Information Technology, University of Gdańsk, Gdańsk, Poland
| | - Ferhat Ucar
- Faculty of Technology, Software Engineering, Fırat University, Elazig, Turkey
| | - Nida Kati
- Faculty of Technology, Materials and Metallurgical Engineering, Fırat University, Elazig, Turkey
| |
|
17
|
Mehrzadi A, Rezaee E, Gharaghani S, Fakhar Z, Mirhosseini SM. A Molecular Generative Model of COVID-19 Main Protease Inhibitors Using Long Short-Term Memory-Based Recurrent Neural Network. J Comput Biol 2024; 31:83-98. [PMID: 38054946 DOI: 10.1089/cmb.2023.0064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023] Open
Abstract
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused a serious threat to public health and prompted researchers to find anti-coronavirus 2019 (COVID-19) compounds. In this study, the long short-term memory-based recurrent neural network was used to generate new inhibitors for the coronavirus. First, the model was trained to generate drug compounds in the form of valid simplified molecular-input line-entry system strings. Then, the structures of COVID-19 main protease inhibitors were applied to fine-tune the model. After fine-tuning, the network could generate new molecular structures as novel SARS-CoV-2 main protease inhibitors. Molecular docking showed that some generated compounds have the proper affinity to the active site of the protease. Molecular dynamics simulations explored binding free energies of the compounds over simulation trajectories. In addition, in silico absorption, distribution, metabolism, and excretion studies showed that some novel compounds could be formulated as orally active agents. Based on molecular docking and molecular dynamics simulation studies, compound AADH possessed significant binding affinity and presumably inhibition against the SARS-CoV-2 main protease enzyme. Therefore, the proposed deep learning-based model was capable of generating promising anti-COVID-19 drugs.
Affiliation(s)
- Arash Mehrzadi
- Department of Electrical, Computer and IT Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran
| | - Elham Rezaee
- Department of Pharmaceutical Chemistry, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Sajjad Gharaghani
- Department of Bioinformatics, Laboratory of Bioinformatics and Drug Design (LBD), University of Tehran, Tehran, Iran
| | - Zeynab Fakhar
- Department of Bioinformatics, Laboratory of Bioinformatics and Drug Design (LBD), University of Tehran, Tehran, Iran
| | | |
|
18
|
Qin R, Zhang H, Huang W, Shao Z, Lei J. Deep learning-based design and screening of benzimidazole-pyrazine derivatives as adenosine A2B receptor antagonists. J Biomol Struct Dyn 2023:1-17. [PMID: 38133953 DOI: 10.1080/07391102.2023.2295974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2023] [Accepted: 12/11/2023] [Indexed: 12/24/2023]
Abstract
The adenosine A2B receptor (A2BAR) is considered a novel potential target for the immunotherapy of cancer, and A2BAR antagonists have an inhibitory effect on tumor growth, proliferation, and metastasis. In our previous studies, we identified a class of benzimidazole-pyrazine scaffolds whose derivatives exhibited the antagonistic effect but lacked subtype selectivity towards A2BAR. In this work, we developed a scaffold-based protocol that incorporates a deep generative model and multilayer virtual screening to design benzimidazole-pyrazine derivatives as potential selective A2BAR antagonists. By utilizing a generative model with reported A2BAR antagonists as the training set, we built up a scaffold-focused library of benzimidazole-pyrazine derivatives and processed a virtual screening protocol to discover potential A2BAR antagonists. Finally, five molecules with different Bemis-Murcko scaffolds were identified and exhibited higher binding free energies than the reference molecule 12o. Further computational analysis revealed that the 3-benzyl derivative ABA-1266 presented high selectivity toward A2BAR and showed favorable druggability, providing a basis for the future development of potent, selective A2BAR antagonists. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Rui Qin
- School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Hao Zhang
- School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
| | - Weifeng Huang
- School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
| | - Zhenglin Shao
- School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
| | - Jinping Lei
- School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
| |
|
19
|
Angelo JS, Guedes IA, Barbosa HJC, Dardenne LE. Multi- and many-objective optimization: present and future in de novo drug design. Front Chem 2023; 11:1288626. [PMID: 38192501 PMCID: PMC10773868 DOI: 10.3389/fchem.2023.1288626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Accepted: 11/27/2023] [Indexed: 01/10/2024] Open
Abstract
De novo drug design (dnDD) aims to create new molecules that satisfy multiple conflicting objectives. Since several desired properties can be considered in the optimization process, dnDD is naturally categorized as a many-objective optimization problem (ManyOOP), where more than three objectives must be simultaneously optimized. However, a large number of objectives typically poses several challenges that affect the choice and the design of optimization methodologies. Herein, we cover the application of multi- and many-objective optimization methods, particularly those based on Evolutionary Computation and Machine Learning techniques, to shed light on their potential application in dnDD. Additionally, we comprehensively analyze how molecular properties used in the optimization process are applied as either objectives or constraints to the problem. Finally, we discuss future research in many-objective optimization for dnDD, highlighting two important possible impacts: i) its integration with the development of multi-target approaches to accelerate the discovery of innovative and more efficacious drug therapies and ii) its role as a catalyst for new developments in more fundamental and general methodological frameworks in the field.
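The notion of simultaneously optimizing conflicting objectives is usually formalized through Pareto dominance. The sketch below is a generic illustration, not a method taken from the review: the function names and the example objective vectors are hypothetical, and every objective is assumed to be minimized (for example, a docking score and a synthetic-accessibility score where lower is better).

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the non-dominated points, preserving input order (O(n^2) scan)."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points)
    ]


# Hypothetical (objective_1, objective_2) pairs for four candidate molecules.
candidates = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
print(pareto_front(candidates))
```

Here `(3.0, 3.0)` is dropped because `(2.0, 2.0)` beats it on both objectives, while the other three represent different trade-offs. With many objectives, most candidates tend to become mutually non-dominated, which is one of the challenges the abstract alludes to.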
Affiliation(s)
| | | | | | - Laurent E. Dardenne
- Coordenação de Modelagem Computacional, Laboratório Nacional de Computação Científica, Petrópolis, Brazil
| |
|
20
|
Lu H, Wei Z, Wang X, Zhang K, Liu H. GraphGPT: A Graph Enhanced Generative Pretrained Transformer for Conditioned Molecular Generation. Int J Mol Sci 2023; 24:16761. [PMID: 38069085 PMCID: PMC10706000 DOI: 10.3390/ijms242316761] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2023] [Revised: 11/16/2023] [Accepted: 11/23/2023] [Indexed: 12/18/2023] Open
Abstract
Condition-based molecular generation can generate a large number of molecules with particular properties, expanding the virtual drug screening library, and accelerating the process of drug discovery. In this study, we combined a molecular graph structure and sequential representations using a generative pretrained transformer (GPT) architecture for generating molecules conditionally. The incorporation of graph structure information facilitated a better comprehension of molecular topological features, and the augmentation of a sequential contextual understanding of GPT architecture facilitated molecular generation. The experiments indicate that our model efficiently produces molecules with the desired properties, with valid and unique metrics that are close to 100%. Faced with the typical task of generating molecules based on a scaffold in drug discovery, our model is able to preserve scaffold information and generate molecules with low similarity and specified properties.
Affiliation(s)
| | | | | | | | - Hao Liu
- College of Computer Science and Technology, Ocean University of China, Qingdao 266100, China
| |
|
21
|
da Fonseca AM, Cabongo SQ, Caluaco BJ, Colares RP, Fernandes CFC, Dos Santos HS, de Lima-Neto P, Marinho ES. The search for new efficient inhibitors of SARS-COV-2 through the De novo drug design developed by artificial intelligence. J Biomol Struct Dyn 2023; 41:9890-9906. [PMID: 36420665 DOI: 10.1080/07391102.2022.2148128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Accepted: 11/10/2022] [Indexed: 11/25/2022]
Abstract
The pandemic caused by SARS-CoV-2 is a viral infection that has generated one of the most significant health problems worldwide. Previous studies report the main protease (Mpro) as a potential target for this virus, as it is considered a crucial enzyme in mediating replication and viral transcription. This work presented the construction of new bioactive compounds for possible inhibition. The de novo molecular drug design method, based on the incremental construction of a ligand model within a receptor model, was used, producing new structures with the help of artificial intelligence. The search algorithm and the scoring function responsible for predicting orientation and affinity in the molecular target at the time of coupling showed, as a result of the simulation, the compound with the highest bioaffinity value, Hit 998, with an energy of -17.62 kcal/mol and synthetic viability close to 50%. Hit 1103 presented better synthetic viability (80%), with an affinity energy of -10.28 kcal/mol. Both were compared with the reference ligand N3, with a binding affinity of -7.5 kcal/mol. ADMET tests demonstrated that simulated compounds have a low risk of metabolic activation and do not exert effective distribution in the CNS, suggesting a pharmacokinetic mechanism based on local action, even with high topological polarity, which resulted in low oral bioavailability. In conclusion, MMGBSA, H-bonds, RMSD, SASA, and RMSF values were also obtained through molecular dynamics to verify the stability of the receptor-ligand complex within the active protein site to seek new therapeutic propositions in the fight against the pandemic. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Aluísio Marques da Fonseca
- Mestrado Acadêmico em Sociobiodiversidades e Tecnologias Sustentáveis - MASTS, Instituto de Engenharias e Desenvolvimento Sustentável, Universidade da Integração Internacional da Lusofonia Afro-Brasileira, Acarape, CE, Brazil
| | - Sadrack Queque Cabongo
- Instituto de Ciências Exatas e da Natureza, Universidade da Integração Internacional da Lusofonia Afro-Brasileira, Acarape, CE, Brazil
| | - Bernardino Joaquim Caluaco
- Instituto de Ciências Exatas e da Natureza, Universidade da Integração Internacional da Lusofonia Afro-Brasileira, Acarape, CE, Brazil
| | - Regilany Paulo Colares
- Instituto de Ciências Exatas e da Natureza, Universidade da Integração Internacional da Lusofonia Afro-Brasileira, Acarape, CE, Brazil
| | | | | | - Pedro de Lima-Neto
- Department of Analytical Chemistry and Physical Chemistry, Science Center, Federal University of Ceara, Fortaleza, CE, Brazil
| | - Emmanuel Silva Marinho
- Grupo de química Teorica e Eletroquimica-GQTE, Universidade Estadual do Ceará, Limoeiro do Norte, CE, Brazil
| |
|
22
|
Haroon S, C A H, A S J. Generative Pre-trained Transformer (GPT) based model with relative attention for de novo drug design. Comput Biol Chem 2023; 106:107911. [PMID: 37450999 DOI: 10.1016/j.compbiolchem.2023.107911] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/22/2023] [Revised: 06/24/2023] [Accepted: 06/28/2023] [Indexed: 07/18/2023]
Abstract
De novo drug design refers to the process of designing new drug molecules from scratch using computational methods. In contrast to other computational methods that primarily focus on modifying existing molecules, designing from scratch enables the exploration of new chemical space and the potential discovery of novel molecules with enhanced properties. In this research, we proposed a model that utilizes Generative Pre-trained Transformer (GPT) architecture and relative attention for de novo drug design. GPT is a language model that utilizes transformer architecture to predict the next word or token in a given sequence. Representation of molecules using SMILES notation has enabled the use of next-token prediction techniques in de novo drug design. GPT uses attention mechanisms to capture the dependencies and relationships between different tokens in a sequence and allows the model to focus on the most important information when processing the input. Relative attention is a variant of the attention mechanism, which allows the model to capture the relative distances and relationships between tokens in the input sequence. In the standard attention mechanism, positional information is typically encoded using fixed-position embeddings. In relative attention, positional information is supplied dynamically during attention calculation by incorporating relative positional encodings, enabling the model to quickly learn the syntax of new unseen tokens. Relative attention enables the GPT model to better understand the relative positions of tokens in the sequence, which can be particularly useful when dealing with limited dataset sizes or generating target-specific drugs. The proposed model was trained on benchmark datasets, and performance was compared with other generative models. We show that relative attention and transfer learning could enable the GPT model to generate molecules with improved validity, uniqueness, and novelty in the context of de novo drug design. 
To illustrate the effectiveness of relative attention, the model was trained using transfer learning on three target-specific datasets, and the performance was compared with standard attention.
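The relative-attention idea described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a single attention head, a causal mask, and learned embeddings indexed by the clipped distance between query and key positions (a common formulation of relative attention). The function name, parameter names, and shapes are all hypothetical.

```python
import numpy as np

def relative_causal_attention(x, w_q, w_k, w_v, rel_k, max_dist):
    """Single-head causal self-attention with relative-position key embeddings.

    x:      (T, d_model) token embeddings, e.g. for SMILES tokens
    w_q, w_k, w_v: (d_model, d_head) projection matrices
    rel_k:  (max_dist + 1, d_head), one embedding per clipped distance i - j
    """
    T = x.shape[0]
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d_head = q.shape[1]

    # Distance from query position i back to key position j, clipped to max_dist.
    idx = np.arange(T)
    dist = np.clip(idx[:, None] - idx[None, :], 0, max_dist)    # (T, T)

    content = q @ k.T                                            # token-token term
    position = np.einsum("id,ijd->ij", q, rel_k[dist])           # token-distance term
    scores = (content + position) / np.sqrt(d_head)

    # Causal mask: position i may only attend to positions j <= i.
    scores = np.where(idx[None, :] <= idx[:, None], scores, -np.inf)

    # Numerically stable softmax over each row.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v, weights

# Toy usage with random parameters (illustrative sizes only).
rng = np.random.default_rng(0)
T, d_model, d_head, max_dist = 5, 8, 4, 3
x = rng.normal(size=(T, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))
rel_k = rng.normal(size=(max_dist + 1, d_head))
out, weights = relative_causal_attention(x, w_q, w_k, w_v, rel_k, max_dist)
```

Because the positional term depends only on the distance between two tokens, the same embedding is reused wherever that distance occurs in the sequence, in contrast to fixed absolute position embeddings.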
Affiliation(s)
- Suhail Haroon
- Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Kerala 682022, India.
- Hafsath C A
- Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Kerala 682022, India
- Jereesh A S
- Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Kerala 682022, India.
23
Feng H, Wang R, Zhan CG, Wei GW. Multiobjective Molecular Optimization for Opioid Use Disorder Treatment Using Generative Network Complex. J Med Chem 2023; 66:12479-12498. [PMID: 37623046 PMCID: PMC11037444 DOI: 10.1021/acs.jmedchem.3c01053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/26/2023]
Abstract
Opioid use disorder (OUD) has emerged as a significant global public health issue, necessitating the discovery of new medications. In this study, we propose a deep generative model that combines a stochastic differential equation (SDE)-based diffusion model with a pretrained autoencoder. The molecular generator enables efficient generation of molecules that target multiple opioid receptors, including mu, kappa, and delta. Additionally, we assess the ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties of the generated molecules to identify druglike compounds. We develop a molecular optimization approach to enhance the pharmacokinetic properties of some lead compounds. Advanced binding affinity predictors were built using molecular fingerprints, including autoencoder embeddings, transformer embeddings, and topological Laplacians. Our process yields druglike molecules that can be used in highly focused experimental studies to further evaluate their pharmacological effects. Our machine learning platform serves as a valuable tool for designing effective molecules to address OUD.
Affiliation(s)
- Hongsong Feng
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Rui Wang
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Chang-Guo Zhan
- Department of Pharmaceutical Sciences, University of Kentucky, Lexington, Kentucky 40506, United States
- Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
24
Lima JPO, da Fonseca AM, Marinho GS, da Rocha MN, Marinho EM, dos Santos HS, Freire RM, Marinho ES, de Lima-Neto P, Fechine PBA. De novo design of bioactive phenol and chromone derivatives for inhibitors of Spike glycoprotein of SARS-CoV-2 in silico. 3 Biotech 2023; 13:301. [PMID: 37588795 PMCID: PMC10425314 DOI: 10.1007/s13205-023-03695-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2023] [Accepted: 06/29/2023] [Indexed: 08/18/2023] Open
Abstract
This work presents the synthesis of 12 phenol and chromone derivatives, prepared as analogs, and an in silico study of these derivatives as a therapeutic alternative against SARS-CoV-2, the pathogen responsible for the COVID-19 pandemic, using its S-glycoprotein as the macromolecular target. After an initial screening to rank the products, the structure with the best binding energy to the target was selected. As a result, derivative 4 was submitted to a molecular growth study using artificial intelligence, in which 8436 initial structures were obtained and passed through filters for interaction with, and similarity to, the active glycoprotein pocket using the MolAICal computational package. This generated 557 hits with an active configuration, which is very promising compared with the BLA reference ligand for inhibiting the biological target. These compounds were also simulated by molecular dynamics to verify their stability within the active protein site, in search of new therapeutic propositions against the pandemic. Hits 48 and 250 are the most active compounds against SARS-CoV-2. In summary, the results show that Hit 250 would be more active than the natural compound and could be developed further for testing against SARS-CoV-2. The study employs the de novo approach to design new drugs, combining artificial intelligence and molecular dynamics simulations to create efficient molecular structures. This research aims to contribute to the development of effective therapeutic strategies against the pandemic.
Affiliation(s)
- Joan Petrus Oliveira Lima
- Advanced Materials Chemistry Group (GQMat)-Department of Analytical Chemistry and Physical Chemistry, Federal University of Ceará, Campus Pici, Fortaleza, Ceará 60455-970 Brazil
- Aluísio Marques da Fonseca
- Mestrado Acadêmico em Sociobiodiversidades e Tecnologias Sustentáveis-MASTS, Instituto de Engenharias e Desenvolvimento Sustentável, Universidade da Integração Internacional da Lusofonia Afro-Brasileira, Acarape, CE 62785-000 Brazil
- Gabrielle Silva Marinho
- Faculdade de Filosofia Dom Aureliano Matos-FAFIDAM, Universidade Estadual do Ceará, Centro, Limoeiro do Norte, CE 62930-000 Brazil
- Matheus Nunes da Rocha
- Faculdade de Filosofia Dom Aureliano Matos-FAFIDAM, Universidade Estadual do Ceará, Centro, Limoeiro do Norte, CE 62930-000 Brazil
- Emanuelle Machado Marinho
- Advanced Materials Chemistry Group (GQMat)-Department of Analytical Chemistry and Physical Chemistry, Federal University of Ceará, Campus Pici, Fortaleza, Ceará 60455-970 Brazil
- Emmanuel Silva Marinho
- Faculdade de Filosofia Dom Aureliano Matos-FAFIDAM, Universidade Estadual do Ceará, Centro, Limoeiro do Norte, CE 62930-000 Brazil
- Pedro de Lima-Neto
- Advanced Materials Chemistry Group (GQMat)-Department of Analytical Chemistry and Physical Chemistry, Federal University of Ceará, Campus Pici, Fortaleza, Ceará 60455-970 Brazil
- Pierre Basílio Almeida Fechine
- Advanced Materials Chemistry Group (GQMat)-Department of Analytical Chemistry and Physical Chemistry, Federal University of Ceará, Campus Pici, Fortaleza, Ceará 60455-970 Brazil
25
Gu Y, Li J, Kang H, Zhang B, Zheng S. Employing Molecular Conformations for Ligand-Based Virtual Screening with Equivariant Graph Neural Network and Deep Multiple Instance Learning. Molecules 2023; 28:5982. [PMID: 37630234 PMCID: PMC10459669 DOI: 10.3390/molecules28165982] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2023] [Revised: 07/27/2023] [Accepted: 08/03/2023] [Indexed: 08/27/2023] Open
Abstract
Ligand-based virtual screening (LBVS) is a promising approach for rapid and low-cost screening of potentially bioactive molecules in the early stage of drug discovery. Compared with traditional similarity-based machine learning methods, deep learning frameworks for LBVS can more effectively extract high-order molecule structure representations from molecular fingerprints or structures. However, the 3D conformation of a molecule largely influences its bioactivity and physical properties, and has rarely been considered in previous deep learning-based LBVS methods. Moreover, a relevant bioactivity benchmark dataset is still lacking. To address these issues, we introduce a novel end-to-end deep learning architecture trained from molecular conformers for LBVS. We first extracted molecule conformers from multiple public molecular bioactivity data sources and consolidated them into a large-scale bioactivity benchmark dataset, which in total includes millions of endpoints and molecules corresponding to 954 targets. Then, we devised a deep learning-based LBVS method called EquiVS to learn molecule representations from conformers for bioactivity prediction. Specifically, a graph convolutional network (GCN) and an equivariant graph neural network (EGNN) are sequentially stacked to learn high-order molecule-level and conformer-level representations, followed by attention-based deep multiple-instance learning (MIL) to aggregate these representations and then predict the potential bioactivity of the query molecule on a given target. We conducted various experiments to validate the data quality of our benchmark dataset, and confirmed that EquiVS achieved better performance than 10 traditional machine learning or deep learning-based LBVS methods. Further ablation studies demonstrate the significant contribution of molecular conformation to bioactivity prediction, as well as the reasonableness and non-redundancy of the deep learning architecture in EquiVS. Finally, a model interpretation case study on CDK2 shows the potential of EquiVS in optimal conformer discovery. The overall study shows that our proposed benchmark dataset and EquiVS method have promising prospects in virtual screening applications.
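The attention-based multiple-instance aggregation mentioned in this abstract can be sketched generically as follows. This is a minimal illustration of attention MIL pooling over per-conformer embeddings, assuming a tanh scoring layer; it is not the EquiVS architecture itself, and the function name, weight shapes, and random inputs are hypothetical.

```python
import numpy as np

def attention_mil_pool(instances, w_hidden, w_score):
    """Aggregate a bag of instance embeddings into one bag embedding.

    instances: (n_conformers, d) one embedding per conformer of a molecule
    w_hidden:  (d, h) projection into the attention space
    w_score:   (h,)   vector that turns each hidden row into a scalar score
    """
    hidden = np.tanh(instances @ w_hidden)      # (n_conformers, h)
    logits = hidden @ w_score                   # (n_conformers,)
    logits = logits - logits.max()              # stabilise the softmax
    weights = np.exp(logits)
    weights = weights / weights.sum()           # attention weight per conformer
    bag = weights @ instances                   # (d,) weighted average
    return bag, weights

# Toy usage: six conformer embeddings of dimension four.
rng = np.random.default_rng(1)
conformers = rng.normal(size=(6, 4))
bag, weights = attention_mil_pool(
    conformers, rng.normal(size=(4, 3)), rng.normal(size=3)
)
```

The attention weights sum to one, so they can be read as the relative contribution of each conformer to the molecule-level prediction, which is the kind of signal an interpretation study on conformers could inspect.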
Affiliation(s)
- Yaowen Gu
- Institute of Medical Information (IMI), Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing 100020, China
- Department of Chemistry, New York University, New York, NY 10027, USA
- Jiao Li
- Institute of Medical Information (IMI), Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing 100020, China
- Hongyu Kang
- Institute of Medical Information (IMI), Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing 100020, China
- Department of Biomedical Engineering, School of Life Science, Beijing Institute of Technology, Beijing 100081, China
- Bowen Zhang
- Beijing StoneWise Technology Co., Ltd., Beijing 100080, China
- Si Zheng
- Institute of Medical Information (IMI), Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing 100020, China
- Institute for Artificial Intelligence, Department of Computer Science and Technology, BNRist, Tsinghua University, Beijing 100084, China
26
Scaini MC, Piccin L, Bassani D, Scapinello A, Pellegrini S, Poggiana C, Catoni C, Tonello D, Pigozzo J, Dall’Olmo L, Rosato A, Moro S, Chiarion-Sileni V, Menin C. Molecular Modeling Unveils the Effective Interaction of B-RAF Inhibitors with Rare B-RAF Insertion Variants. Int J Mol Sci 2023; 24:12285. [PMID: 37569660 PMCID: PMC10418914 DOI: 10.3390/ijms241512285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 07/27/2023] [Accepted: 07/29/2023] [Indexed: 08/13/2023] Open
Abstract
The Food and Drug Administration (FDA) has approved MAPK inhibitors as a treatment exclusively for melanoma patients carrying a mutation in codon V600 of the BRAF gene. However, BRAF mutations outside the V600 codon may occur in a small percentage of melanomas. Although these rare variants may cause B-RAF activation, their value in predicting response to B-RAF inhibitor treatments is still poorly understood. We exploited an integrated approach for mutation detection, tumor evolution tracking, and assessment of response to treatment in a metastatic melanoma patient carrying the rare p.T599dup B-RAF mutation. He was referred for Dabrafenib/Trametinib targeted therapy and showed an initial dramatic response. In parallel, in silico ligand-based homology modeling was set up and performed on this and an additional rare B-RAF variant (p.A598_T599insV) to unveil and justify the success of the B-RAF inhibitory activity of Dabrafenib, showing that it could bind both these variants in a manner similar to how it binds and inhibits the V600E mutant. These findings open up the possibility of broadening the spectrum of BRAF inhibitor-sensitive mutations beyond mutations at codon V600, suggesting that B-RAF V600 WT melanomas should undergo more specific investigations before ruling out the possibility of targeted therapy.
Affiliation(s)
- Maria Chiara Scaini
- Immunology and Molecular Oncology Unit, Veneto Institute of Oncology IOV-IRCCS, 35128 Padua, Italy
- Luisa Piccin
- Melanoma Unit, Oncology 2 Unit, Veneto Institute of Oncology IOV-IRCCS, 35128 Padua, Italy
- Davide Bassani
- Molecular Modeling Section (MMS), Department of Pharmaceutical and Pharmacological Sciences, University of Padova, 35131 Padua, Italy
- Antonio Scapinello
- Anatomy and Pathological Histology Unit, Veneto Institute of Oncology IOV-IRCCS, 35128 Padua, Italy
- Stefania Pellegrini
- Immunology and Molecular Oncology Unit, Veneto Institute of Oncology IOV-IRCCS, 35128 Padua, Italy
- Cristina Poggiana
- Immunology and Molecular Oncology Unit, Veneto Institute of Oncology IOV-IRCCS, 35128 Padua, Italy
- Cristina Catoni
- Immunology and Molecular Oncology Unit, Veneto Institute of Oncology IOV-IRCCS, 35128 Padua, Italy
- Debora Tonello
- Immunology and Molecular Oncology Unit, Veneto Institute of Oncology IOV-IRCCS, 35128 Padua, Italy
- Jacopo Pigozzo
- Melanoma Unit, Oncology 2 Unit, Veneto Institute of Oncology IOV-IRCCS, 35128 Padua, Italy
- Luigi Dall’Olmo
- Soft-Tissue, Peritoneum and Melanoma Surgical Oncology Unit, Veneto Institute of Oncology IOV-IRCCS, 35128 Padua, Italy
- Department of Surgery, Oncology and Gastroenterology (DISCOG), University of Padua, 35128 Padua, Italy
- Antonio Rosato
- Immunology and Molecular Oncology Unit, Veneto Institute of Oncology IOV-IRCCS, 35128 Padua, Italy
- Department of Surgery, Oncology and Gastroenterology (DISCOG), University of Padua, 35128 Padua, Italy
- Stefano Moro
- Molecular Modeling Section (MMS), Department of Pharmaceutical and Pharmacological Sciences, University of Padova, 35131 Padua, Italy
- Vanna Chiarion-Sileni
- Melanoma Unit, Oncology 2 Unit, Veneto Institute of Oncology IOV-IRCCS, 35128 Padua, Italy
- Chiara Menin
- Immunology and Molecular Oncology Unit, Veneto Institute of Oncology IOV-IRCCS, 35128 Padua, Italy
27
Kaneko H. Molecular Descriptors, Structure Generation, and Inverse QSAR/QSPR Based on SELFIES. ACS Omega 2023; 8:21781-21786. [PMID: 37360490 PMCID: PMC10286088 DOI: 10.1021/acsomega.3c01332] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Accepted: 05/29/2023] [Indexed: 06/28/2023]
Abstract
For inverse QSAR/QSPR in conventional molecular design, several chemical structures must be generated and their molecular descriptors must be calculated. However, there is no one-to-one correspondence between the generated chemical structures and molecular descriptors. In this paper, molecular descriptors, structure generation, and inverse QSAR/QSPR based on self-referencing embedded strings (SELFIES), a 100% robust molecular string representation, are proposed. A one-hot vector is converted from SELFIES to SELFIES descriptors x, and an inverse analysis of the QSAR/QSPR model y = f(x) with the objective variable y and molecular descriptor x is conducted. Thus, x values that achieve a target y value are obtained. Based on these values, SELFIES strings or molecules are generated, meaning that inverse QSAR/QSPR is performed successfully. The SELFIES descriptors and SELFIES-based structure generation are verified using datasets of actual compounds. The successful construction of SELFIES-descriptor-based QSAR/QSPR models with predictive abilities comparable to those of models based on other fingerprints is confirmed. A large number of molecules with one-to-one relationships with the values of the SELFIES descriptors are generated. Furthermore, as a case study of inverse QSAR/QSPR, molecules with target y values are generated successfully. The Python code for the proposed method is available at https://github.com/hkaneko1985/dcekit.
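The one-hot SELFIES descriptors described in this abstract can be illustrated with a small pure-Python sketch. It is not the code from the dcekit repository: the tokenizer is a plain regular expression over bracketed symbols, the padding symbol `[nop]` and the two example strings are assumptions chosen for illustration, and all function names are hypothetical.

```python
import re

def tokenize(selfies_string):
    """Split a SELFIES string such as '[C][=O]' into its bracketed symbols."""
    return re.findall(r"\[[^\]]*\]", selfies_string)

def one_hot_descriptors(selfies_strings, pad="[nop]"):
    """Return (rows, vocab): one flattened one-hot vector x per input string."""
    tokenized = [tokenize(s) for s in selfies_strings]
    max_len = max(len(tokens) for tokens in tokenized)
    vocab = sorted({tok for tokens in tokenized for tok in tokens} | {pad})
    index = {tok: i for i, tok in enumerate(vocab)}
    rows = []
    for tokens in tokenized:
        padded = tokens + [pad] * (max_len - len(tokens))
        row = [0] * (max_len * len(vocab))
        for position, tok in enumerate(padded):
            row[position * len(vocab) + index[tok]] = 1
        rows.append(row)
    return rows, vocab

def decode(row, vocab, pad="[nop]"):
    """Invert one_hot_descriptors for a single row, dropping padding symbols."""
    n = len(vocab)
    tokens = [vocab[row[p * n:(p + 1) * n].index(1)] for p in range(len(row) // n)]
    return "".join(tok for tok in tokens if tok != pad)

# Toy usage with two illustrative SELFIES strings.
rows, vocab = one_hot_descriptors(["[C][C][O]", "[C][=O]"])
recovered = [decode(row, vocab) for row in rows]
```

Each descriptor vector decodes back to exactly one symbol sequence, which is the one-to-one relationship between structures and descriptors that the abstract emphasises for the inverse analysis.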
28
Bjerrum EJ, Margreitter C, Blaschke T, Kolarova S, de Castro RLR. Faster and more diverse de novo molecular optimization with double-loop reinforcement learning using augmented SMILES. J Comput Aided Mol Des 2023:10.1007/s10822-023-00512-6. [PMID: 37329395 DOI: 10.1007/s10822-023-00512-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Accepted: 05/29/2023] [Indexed: 06/19/2023]
Abstract
Using generative deep learning models and reinforcement learning together can effectively generate new molecules with desired properties. By employing a multi-objective scoring function, thousands of high-scoring molecules can be generated, making this approach useful for drug discovery and material science. However, the application of these methods can be hindered by computationally expensive or time-consuming scoring procedures, particularly when a large number of function calls are required as feedback in the reinforcement learning optimization. Here, we propose the use of double-loop reinforcement learning with simplified molecular line entry system (SMILES) augmentation to improve the efficiency and speed of the optimization. By adding an inner loop that augments the generated SMILES strings to non-canonical SMILES for use in additional reinforcement learning rounds, we can both reuse the scoring calculations on the molecular level, thereby speeding up the learning process, as well as offer additional protection against mode collapse. We find that employing between 5 and 10 augmentation repetitions is optimal for the scoring functions tested and is further associated with an increased diversity in the generated compounds, improved reproducibility of the sampling runs and the generation of molecules of higher similarity to known ligands.
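The control flow of the double-loop idea in this abstract can be sketched as below. This is a schematic under stated assumptions, not the authors' implementation: `sample_batch`, `score`, `augment`, and `update` are hypothetical callables standing in for the generative model's sampler, the expensive scoring function, a SMILES randomizer that returns a non-canonical string for the same molecule, and the reinforcement learning update.

```python
def double_loop_round(sample_batch, score, augment, update, n_augment):
    """Run one outer round with n_augment inner rounds that reuse the scores.

    sample_batch(): returns a list of SMILES strings from the generator
    score(smiles):  expensive scoring call, made once per sampled molecule
    augment(smiles): returns another SMILES string for the same molecule
    update(batch, scores): applies one reinforcement learning update
    """
    smiles_batch = sample_batch()
    scores = [score(s) for s in smiles_batch]      # scored once per molecule

    update(smiles_batch, scores)                   # outer-loop update

    for _ in range(n_augment):                     # inner loop
        variants = [augment(s) for s in smiles_batch]
        # The molecules are unchanged, so the cached scores still apply.
        update(variants, scores)
    return scores
```

With `n_augment` set between 5 and 10, the range the abstract reports as optimal for the scoring functions tested, each expensive score is used for 6 to 11 updates instead of one.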
Affiliation(s)
- Raquel López-Ríos de Castro
- Odyssey Therapeutics, Cambridge, MA, USA
- Department of Physics and Department of Chemistry, King's College, London, UK
29
Yoo J, Kim TY, Joung I, Song SO. Industrializing AI/ML during the end-to-end drug discovery process. Curr Opin Struct Biol 2023; 79:102528. [PMID: 36736243 DOI: 10.1016/j.sbi.2023.102528] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/15/2022] [Revised: 12/16/2022] [Accepted: 12/20/2022] [Indexed: 02/04/2023]
Abstract
Drug discovery aims to select proper targets and drug candidates to address unmet clinical needs. The end-to-end drug discovery process includes all stages of drug discovery from target identification to drug candidate selection. Recently, several artificial intelligence and machine learning (AI/ML)-based drug discovery companies have attempted to build data-driven platforms spanning the end-to-end drug discovery process. The ability to identify elusive targets essentially leads to the diversification of discovery pipelines, thereby increasing the ability to address unmet needs. Modern ML technologies are complementing traditional computer-aided drug discovery by accelerating candidate optimization in innovative ways. This review summarizes recent developments in AI/ML methods from target identification to molecule optimization, and concludes with an overview of current industrial trends in end-to-end AI/ML platforms.
Affiliation(s)
- Jiho Yoo
- Standigm Inc., 3F, 70 Nonhyeon-ro 85-gil, Gangnam-gu, Seoul, South Korea, 06234
- Tae Yong Kim
- Standigm Inc., 3F, 70 Nonhyeon-ro 85-gil, Gangnam-gu, Seoul, South Korea, 06234
- InSuk Joung
- Standigm Inc., 3F, 70 Nonhyeon-ro 85-gil, Gangnam-gu, Seoul, South Korea, 06234
- Sang Ok Song
- Standigm Inc., 3F, 70 Nonhyeon-ro 85-gil, Gangnam-gu, Seoul, South Korea, 06234
30
Koutroumpa NM, Papavasileiou KD, Papadiamantis AG, Melagraki G, Afantitis A. A Systematic Review of Deep Learning Methodologies Used in the Drug Discovery Process with Emphasis on In Vivo Validation. Int J Mol Sci 2023; 24:6573. [PMID: 37047543 PMCID: PMC10095548 DOI: 10.3390/ijms24076573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2022] [Revised: 03/24/2023] [Accepted: 03/28/2023] [Indexed: 04/05/2023] Open
Abstract
The discovery and development of new drugs are extremely long and costly processes. Recent progress in artificial intelligence has made a positive impact on the drug development pipeline. Numerous challenges have been addressed with the growing exploitation of drug-related data and the advancement of deep learning technology. Several model frameworks have been proposed to enhance the performance of deep learning algorithms in molecular design. However, only a few have had an immediate impact on drug development, since computational results may not be confirmed experimentally. This systematic review aims to summarize the different deep learning architectures that are used in the drug discovery process and validated by further in vivo experiments. For each presented study, the proposed molecule or peptide that has been generated or identified by the deep learning model has been biologically evaluated in animal models. These state-of-the-art studies highlight that even if artificial intelligence in drug discovery is still in its infancy, it has great potential to accelerate the drug discovery cycle, reduce the required costs, and contribute to the integration of the 3R (Replacement, Reduction, Refinement) principles. Out of all the reviewed scientific articles, seven algorithms were identified: recurrent neural networks, specifically long short-term memory networks (LSTM-RNNs); Autoencoders (AEs) and their Wasserstein Autoencoder (WAE) and Variational Autoencoder (VAE) variants; Convolutional Neural Networks (CNNs); Direct Message Passing Neural Networks (D-MPNNs); and Multitask Deep Neural Networks (MTDNNs). LSTM-RNNs were the most used architectures, with molecules or peptide sequences as inputs.
Affiliation(s)
- Nikoletta-Maria Koutroumpa
- Department of ChemoInformatics, NovaMechanics Ltd., Nicosia 1070, Cyprus
- School of Chemical Engineering, National Technical University of Athens, 157 80 Athens, Greece
- Division of Data Driven Innovation, Entelos Institute, Larnaca 6059, Cyprus
- Konstantinos D. Papavasileiou
- Department of ChemoInformatics, NovaMechanics Ltd., Nicosia 1070, Cyprus
- Division of Data Driven Innovation, Entelos Institute, Larnaca 6059, Cyprus
- Department of ChemoInformatics, NovaMechanics MIKE., 185 45 Piraeus, Greece
- Anastasios G. Papadiamantis
- Department of ChemoInformatics, NovaMechanics Ltd., Nicosia 1070, Cyprus
- Division of Data Driven Innovation, Entelos Institute, Larnaca 6059, Cyprus
- Georgia Melagraki
- Division of Physical Sciences & Applications, Hellenic Military Academy, 166 73 Vari, Greece
- Antreas Afantitis
- Department of ChemoInformatics, NovaMechanics Ltd., Nicosia 1070, Cyprus
- Division of Data Driven Innovation, Entelos Institute, Larnaca 6059, Cyprus
- Department of ChemoInformatics, NovaMechanics MIKE., 185 45 Piraeus, Greece
31
Chen W, Liu X, Zhang S, Chen S. Artificial intelligence for drug discovery: Resources, methods, and applications. Mol Ther Nucleic Acids 2023; 31:691-702. [PMID: 36923950 PMCID: PMC10009646 DOI: 10.1016/j.omtn.2023.02.019] [Citation(s) in RCA: 24] [Impact Index Per Article: 24.0] [Indexed: 02/19/2023]
Abstract
Conventional wet laboratory testing, validations, and synthetic procedures are costly and time-consuming for drug discovery. Advancements in artificial intelligence (AI) techniques have revolutionized their applications to drug discovery. Combined with accessible data resources, AI techniques are changing the landscape of drug discovery. In the past decades, a series of AI-based models have been developed for various steps of drug discovery. These models have been used as complements of conventional experiments and have accelerated the drug discovery process. In this review, we first introduced the widely used data resources in drug discovery, such as ChEMBL and DrugBank, followed by the molecular representation schemes that convert data into computer-readable formats. Meanwhile, we summarized the algorithms used to develop AI-based models for drug discovery. Subsequently, we discussed the applications of AI techniques in pharmaceutical analysis including predicting drug toxicity, drug bioactivity, and drug physicochemical property. Furthermore, we introduced the AI-based models for de novo drug design, drug-target structure prediction, drug-target interaction, and binding affinity prediction. Moreover, we also highlighted the advanced applications of AI in drug synergism/antagonism prediction and nanomedicine design. Finally, we discussed the challenges and future perspectives on the applications of AI to drug discovery.
Affiliation(s)
- Wei Chen
- State Key Laboratory of Southwestern Chinese Medicine Resources, Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Xuesong Liu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Sanyin Zhang
- State Key Laboratory of Southwestern Chinese Medicine Resources, Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Shilin Chen
- State Key Laboratory of Southwestern Chinese Medicine Resources, Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
32
Wang J, Mao J, Wang M, Le X, Wang Y. Explore drug-like space with deep generative models. Methods 2023; 210:52-59. [PMID: 36682423 DOI: 10.1016/j.ymeth.2023.01.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 12/08/2022] [Revised: 01/05/2023] [Accepted: 01/17/2023] [Indexed: 01/20/2023] Open
Abstract
The process of design/discovery of drugs involves the identification and design of novel molecules that have the desired properties and bind well to a given disease-relevant target. One of the main challenges to effectively identify potential drug candidates is to explore the vast drug-like chemical space to find novel chemical structures with desired physicochemical properties and biological characteristics. Moreover, the chemical space of currently available molecular libraries is only a small fraction of the total possible drug-like chemical space. Deep molecular generative models have received much attention and provide an alternative approach to the design and discovery of molecules. To efficiently explore the drug-like space, we first constructed the drug-like dataset and then performed the generative design of drug-like molecules using a Conditional Randomized Transformer approach with the molecular access system (MACCS) fingerprint as a condition and compared it with previously published molecular generative models. The results show that the deep molecular generative model explores the wider drug-like chemical space. The generated drug-like molecules share the chemical space with known drugs, and the drug-like space captured by the combination of quantitative estimation of drug-likeness (QED) and quantitative estimate of protein-protein interaction targeting drug-likeness (QEPPI) can cover a larger drug-like space. Finally, we show the potential application of the model in design of inhibitors of MDM2-p53 protein-protein interaction. Our results demonstrate the potential application of deep molecular generative models for guided exploration in drug-like chemical space and molecular design.
Affiliation(s)
- Jianmin Wang
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Korea
- Jiashun Mao
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Korea
- Meng Wang
- Department of Biostatistics, School of Public Health, Harbin Medical University
- Xiangyang Le
- Department of Medicinal Chemistry, Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China
- Yunyun Wang
- School of Pharmacy and Jiangsu Province Key Laboratory for Inflammation and Molecular Drug Target, Nantong University, Nantong 226001, China
33
PETrans: De Novo Drug Design with Protein-Specific Encoding Based on Transfer Learning. Int J Mol Sci 2023; 24:1146. [PMID: 36674658 PMCID: PMC9865828 DOI: 10.3390/ijms24021146] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/02/2022] [Revised: 12/29/2022] [Accepted: 01/04/2023] [Indexed: 01/11/2023] Open
Abstract
Recent years have seen tremendous success in the design of novel drug molecules through deep generative models. Nevertheless, existing methods only generate drug-like molecules, which require additional structural optimization to be developed into actual drugs. In this study, a deep learning method for generating target-specific ligands was proposed. This method is useful when the dataset for target-specific ligands is limited. Deep learning methods can extract and learn features (representations) in a data-driven way with little or no human participation. Generative pretraining (GPT) was used to extract the contextual features of the molecule. Three different protein-encoding methods were used to extract the physicochemical properties and amino acid information of the target protein. Protein-encoding and molecular sequence information are combined to guide molecule generation. Transfer learning was used to fine-tune the pretrained model to generate molecules with better binding ability to the target protein. The model was validated using three different targets. The docking results show that our model is capable of generating new molecules with higher docking scores for the target proteins.
|
34
|
Chang Y, Hawkins BA, Du JJ, Groundwater PW, Hibbs DE, Lai F. A Guide to In Silico Drug Design. Pharmaceutics 2022; 15:49. [PMID: 36678678 PMCID: PMC9867171 DOI: 10.3390/pharmaceutics15010049] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 11/09/2022] [Revised: 12/16/2022] [Accepted: 12/17/2022] [Indexed: 12/28/2022]
Abstract
The drug discovery process is a rocky path that is full of challenges, with the result that very few candidates progress from hit compound to a commercially available product, often due to factors, such as poor binding affinity, off-target effects, or physicochemical properties, such as solubility or stability. This process is further complicated by high research and development costs and time requirements. It is thus important to optimise every step of the process in order to maximise the chances of success. As a result of the recent advancements in computer power and technology, computer-aided drug design (CADD) has become an integral part of modern drug discovery to guide and accelerate the process. In this review, we present an overview of the important CADD methods and applications, such as in silico structure prediction, refinement, modelling and target validation, that are commonly used in this area.
Affiliation(s)
- Yiqun Chang
- Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
| | - Bryson A. Hawkins
- Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
| | - Jonathan J. Du
- Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
| | - Paul W. Groundwater
- Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
| | - David E. Hibbs
- Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
| | - Felcia Lai
- Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
|
35
|
Zhang Y, Luo M, Wu P, Wu S, Lee TY, Bai C. Application of Computational Biology and Artificial Intelligence in Drug Design. Int J Mol Sci 2022; 23:13568. [PMID: 36362355 PMCID: PMC9658956 DOI: 10.3390/ijms232113568] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/08/2022] [Revised: 10/29/2022] [Accepted: 11/03/2022] [Indexed: 08/24/2023]
Abstract
Traditional drug design requires a great amount of research time and developmental expense. Booming computational approaches, including computational biology, computer-aided drug design, and artificial intelligence, have the potential to improve the efficiency of drug discovery by minimizing the time and financial cost. In recent years, computational approaches have been widely used to improve the efficacy and effectiveness of the drug discovery pipeline, leading to the approval of many new drugs for marketing. The present review emphasizes the applications of these indispensable computational approaches in aiding target identification, lead discovery, and lead optimization. Some challenges of using these approaches for drug design are also discussed. Moreover, we propose a methodology for integrating various computational techniques into new drug discovery and design.
Affiliation(s)
- Yue Zhang
- School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
- Warshel Institute for Computational Biology, Shenzhen 518172, China
| | - Mengqi Luo
- School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- South China Hospital, Health Science Center, Shenzhen University, Shenzhen 518116, China
| | - Peng Wu
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen 518055, China
| | - Song Wu
- South China Hospital, Health Science Center, Shenzhen University, Shenzhen 518116, China
| | - Tzong-Yi Lee
- School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- Warshel Institute for Computational Biology, Shenzhen 518172, China
| | - Chen Bai
- School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- Warshel Institute for Computational Biology, Shenzhen 518172, China
|
36
|
D'Souza S, Kv P, Balaji S. Training recurrent neural networks as generative neural networks for molecular structures: how does it impact drug discovery? Expert Opin Drug Discov 2022; 17:1071-1079. [PMID: 36216812 DOI: 10.1080/17460441.2023.2134340] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/29/2022]
Abstract
INTRODUCTION: Deep learning approaches have become popular in recent years in de novo drug design. Generative models for molecule generation and optimization have shown promising results. Models trained on different chemical data could regenerate molecules that were similar to the query molecule, thus supporting lead optimization. Recurrent neural network-based generative models have demonstrated application in low-data drug discovery, fragment-based drug design, and lead optimization. AREAS COVERED: In this review, we provide an overview of recurrent neural network models and their variants for molecule generation, with recent examples. The input representations of molecules as SMILES and molecular graphs are discussed. The evaluation benchmarks and metrics used in generative neural network models are also highlighted. For this, the ScienceDirect, Web of Science, and Google Scholar databases were searched with the article's keywords and their combinations to retrieve the most relevant and up-to-date information. EXPERT OPINION: The simplicity of SMILES notation makes it suitable for training a sequence-based model such as a recurrent neural network. However, models that can be trained on molecular graphs to generate synthesizable molecular structures could open new possibilities for valid molecule generation and synthetic feasibility.
Affiliation(s)
- Sofia D'Souza
- Department of Computer Science and Engineering, Manipal Institute of Technology, MAHE, Manipal, India
| | - Prema Kv
- Department of Computer Science and Engineering, Manipal Institute of Technology, MAHE, Manipal, India
| | - Seetharaman Balaji
- Department of Computer Science and Engineering, Manipal Institute of Technology, MAHE, Manipal, India
|
37
|
Wang J, Chu Y, Mao J, Jeon HN, Jin H, Zeb A, Jang Y, Cho KH, Song T, No KT. De novo molecular design with deep molecular generative models for PPI inhibitors. Brief Bioinform 2022; 23:6643455. [PMID: 35830870 DOI: 10.1093/bib/bbac285] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 04/07/2022] [Revised: 06/14/2022] [Accepted: 06/20/2022] [Indexed: 12/27/2022]
Abstract
We construct a protein-protein interaction (PPI) targeted drug-likeness dataset and propose a deep molecular generative framework to generate novel drug-like molecules from the features of the seed compounds. This framework draws inspiration from published molecular generative models, uses the key features associated with PPI inhibitors as input, and develops deep molecular generative models for de novo molecular design of PPI inhibitors. For the first time, a quantitative estimation index for compounds targeting PPI was applied to the evaluation of the molecular generation model for de novo design of PPI-targeted compounds. Our results indicate that the generated molecules had better PPI-targeted drug-likeness and drug-likeness. Additionally, our model exhibits performance comparable to several other state-of-the-art molecule generation models. The generated molecules share chemical space with iPPI-DB inhibitors, as demonstrated by chemical space analysis. The peptide characterization-oriented design of PPI inhibitors and the ligand-based design of PPI inhibitors are explored. Finally, we suggest that this framework will be an important step forward for the de novo design of PPI-targeted therapeutics.
Affiliation(s)
- Jianmin Wang
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea; Bioinformatics and Molecular Design Research Center (BMDRC), Incheon 21983, Republic of Korea
| | - Yanyi Chu
- State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
| | - Jiashun Mao
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea; Bioinformatics and Molecular Design Research Center (BMDRC), Incheon 21983, Republic of Korea
| | - Hyeon-Nae Jeon
- Bioinformatics and Molecular Design Research Center (BMDRC), Incheon 21983, Republic of Korea; Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
| | - Haiyan Jin
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea; Bioinformatics and Molecular Design Research Center (BMDRC), Incheon 21983, Republic of Korea
| | - Amir Zeb
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea; Department of Natural and Basic Sciences, University of Turbat, 92600, Pakistan
| | - Yuil Jang
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea; Bioinformatics and Molecular Design Research Center (BMDRC), Incheon 21983, Republic of Korea
| | - Kwang-Hwi Cho
- School of Systems Biomedical Science, Soongsil University, Seoul, Republic of Korea
| | - Tao Song
- School of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, Shandong, China
| | - Kyoung Tai No
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea; Bioinformatics and Molecular Design Research Center (BMDRC), Incheon 21983, Republic of Korea
|
38
|
Yu J, Wang J, Zhao H, Gao J, Kang Y, Cao D, Wang Z, Hou T. Organic Compound Synthetic Accessibility Prediction Based on the Graph Attention Mechanism. J Chem Inf Model 2022; 62:2973-2986. [PMID: 35675668 DOI: 10.1021/acs.jcim.2c00038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Accurate estimation of the synthetic accessibility of small molecules is needed in many phases of drug discovery. Several expert-crafted scoring methods and descriptor-based quantitative structure-activity relationship (QSAR) models have been developed for synthetic accessibility assessment, but their practical applications in drug discovery are still quite limited because of relatively low prediction accuracy and poor model interpretability. In this study, we proposed a data-driven interpretable prediction framework called GASA (Graph Attention-based assessment of Synthetic Accessibility) to evaluate the synthetic accessibility of small molecules by classifying compounds as easy- (ES) or hard-to-synthesize (HS). GASA is a graph neural network (GNN) architecture that makes self-feature deduction by applying an attention mechanism to automatically capture the most important structural features related to synthetic accessibility. Sampling around the hypothetical classification boundary was used to improve the ability of GASA to distinguish structurally similar molecules. GASA was extensively evaluated and compared with two descriptor-based machine learning methods (random forest, RF; eXtreme gradient boosting, XGBoost) and four existing scores (SYBA: SYnthetic Bayesian Accessibility; SCScore: Synthetic Complexity score; RAscore: Retrosynthetic Accessibility score; SAscore: Synthetic Accessibility score). Our analysis demonstrates that GASA achieved remarkable performance in distinguishing similar molecules compared with other methods and had a broader applicability domain. In addition, we show how GASA learns the important features that affect molecular synthetic accessibility by assigning attention weights to different atoms. An online prediction service for GASA is offered at http://cadd.zju.edu.cn/gasa/.
Affiliation(s)
- Jiahui Yu
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Jike Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China; School of Computer Science, Wuhan University, Wuhan 430072, Hubei, P. R. China
| | - Hong Zhao
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Junbo Gao
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Yu Kang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410004, Hunan, P. R. China
| | - Zhe Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
|
39
|
Wang M, Hsieh CY, Wang J, Wang D, Weng G, Shen C, Yao X, Bing Z, Li H, Cao D, Hou T. RELATION: A Deep Generative Model for Structure-Based De Novo Drug Design. J Med Chem 2022; 65:9478-9492. [PMID: 35713420 DOI: 10.1021/acs.jmedchem.2c00732] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Indexed: 01/30/2023]
Abstract
Deep learning (DL)-based de novo molecular design has recently gained considerable traction. Many DL-based generative models have been successfully developed to design novel molecules, but most of them are ligand-centric and the role of the 3D geometries of target binding pockets in molecular generation has not been well-exploited. Here, we proposed a new 3D-based generative model called RELATION. In the RELATION model, the BiTL algorithm was specifically designed to extract and transfer the desired geometric features of the protein-ligand complexes to a latent space for generation. The pharmacophore conditioning and docking-based Bayesian sampling were applied to efficiently navigate the vast chemical space for the design of molecules with desired geometric properties and pharmacophore features. As a proof of concept, the RELATION model was used to design inhibitors for two targets, AKT1 and CDK2. The calculation results demonstrated that the RELATION model could efficiently generate novel molecules with favorable binding affinity and pharmacophore features.
Affiliation(s)
- Mingyang Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Chang-Yu Hsieh
- Tencent, Tencent Quantum Lab, Shenzhen 518057, Guangdong, P. R. China
| | - Jike Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Dong Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Gaoqi Weng
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Chao Shen
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Xiaojun Yao
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, Macau Institute for Applied Research in Medicine and Health, State Key Laboratory of Quality Research in Chinese Medicine, Macau University of Science and Technology, Taipa 999078, Macau, P. R. China
| | - Zhitong Bing
- Institute of Modern Physics, Chinese Academy of Sciences, Lanzhou 730000, P. R. China
| | - Honglin Li
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science & Technology, Shanghai 200237, P. R. China
| | - Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
| | - Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
|
40
|
A comprehensive review of Artificial Intelligence and Network based approaches to drug repurposing in Covid-19. Biomed Pharmacother 2022; 153:113350. [PMID: 35777222 PMCID: PMC9236981 DOI: 10.1016/j.biopha.2022.113350] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 05/25/2022] [Revised: 06/22/2022] [Accepted: 06/24/2022] [Indexed: 11/26/2022]
Abstract
Conventional drug discovery and development is a tedious and time-consuming process, because of which it has failed to keep the pace required to mitigate threats and meet the demands of viral and recurring diseases such as Covid-19. The main reasons for this delay in traditional drug development are high attrition rates, extensive time requirements, and huge financial investment with significant risk. An effective alternative to de novo drug discovery is drug repurposing. Previous studies have shown that network-based approaches and analysis are a versatile platform for repurposing, as network biology is used to model the interactions between a variety of biological concepts. Herein, we provide a comprehensive background of machine learning and deep learning in drug repurposing, focusing specifically on the applications of the network-based approach to drug repurposing in Covid-19 and on the data sources and tools used. Furthermore, the use of network proximity, network diffusion, and AI in network-based drug repurposing for Covid-19 is explained. Finally, limitations of network-based approaches, both general and network-specific, are stated along with future recommendations for better network-based models.
|
41
|
Cheng F, Tuncbag N. Editorial overview: Artificial intelligence (AI) methodologies in structural biology. Curr Opin Struct Biol 2022; 74:102387. [PMID: 35589509 DOI: 10.1016/j.sbi.2022.102387] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/30/2022]
Affiliation(s)
- Feixiong Cheng
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA; Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA; Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA.
| | - Nurcan Tuncbag
- Department of Chemical and Biological Engineering, College of Engineering, Koc University, Istanbul, 34450, Turkey; School of Medicine, Koc University, Istanbul, 34450, Turkey.
|