1
Jiang X, Lu L, Li J, Jiang J, Zhang J, Zhou S, Wen H, Cai H, Luo X, Li Z, Wang J, Ju B, Bai R. Synthetically Feasible De Novo Molecular Design of Leads Based on a Reinforcement Learning Model: AI-Assisted Discovery of an Anti-IBD Lead Targeting CXCR4. J Med Chem 2024; 67:10057-10075. [PMID: 38863440 DOI: 10.1021/acs.jmedchem.4c00184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2024]
Abstract
Artificial intelligence (AI) de novo molecular generation provides leads with novel structures for drug discovery. However, the target affinity and synthesizability of the generated molecules present critical challenges for the successful application of AI technology. Therefore, we developed an advanced reinforcement learning model to bridge the gap between the theory of de novo molecular generation and the practical aspects of drug discovery. This model utilizes chemical reaction templates and commercially available building blocks as a starting point and employs forward reaction prediction to generate molecules, while real-time docking and drug-likeness predictions are conducted to ensure synthesizability and drug-likeness. We applied this model to design active molecules targeting the inflammation-related receptor CXCR4 and successfully prepared them according to the AI-proposed synthetic routes. Several molecules exhibited potent anti-CXCR4 and anti-inflammatory activity in subsequent in vitro and in vivo assays. The top-performing compound XVI alleviated symptoms related to inflammatory bowel disease and showed reasonable pharmacokinetic properties.
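The abstract above describes scoring generated molecules by docking and drug-likeness during reinforcement learning. As a rough illustration only, one way such signals could be folded into a single scalar reward is sketched below. The `Candidate` type, the `reward` function, the weights, and the score conventions (more negative docking score is better, drug-likeness in the range 0 to 1) are all assumptions made for this sketch and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical container for one generated molecule and its predicted scores."""
    smiles: str
    docking_score: float  # assumed convention: more negative means stronger predicted binding
    drug_likeness: float  # assumed to lie in [0, 1]


def clamp01(x: float) -> float:
    """Clip a value to the closed interval [0, 1]."""
    return min(max(x, 0.0), 1.0)


def reward(c: Candidate, dock_floor: float = -12.0,
           w_dock: float = 0.7, w_like: float = 0.3) -> float:
    """Combine the two scores into a reward in [0, 1]; the weights are illustrative."""
    # Rescale docking so that a score of 0 or above maps to 0 and dock_floor or below maps to 1.
    dock_term = clamp01(c.docking_score / dock_floor)
    like_term = clamp01(c.drug_likeness)
    return w_dock * dock_term + w_like * like_term
```

A weighted sum is only one possible design; a product or a thresholded reward would penalise molecules that fail on either criterion more strongly.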
Affiliation(s)
- Xiaoying Jiang
- School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, PR China
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines; Engineering Laboratory of Development and Application of Traditional Chinese Medicines; Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, PR China
- Liuxin Lu
- School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, PR China
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines; Engineering Laboratory of Development and Application of Traditional Chinese Medicines; Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, PR China
- Junjie Li
- School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, PR China
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines; Engineering Laboratory of Development and Application of Traditional Chinese Medicines; Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, PR China
- Jing Jiang
- SanOmics AI Co. Ltd., Hangzhou 311103, PR China
- Jiapeng Zhang
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, PR China
- Shengbin Zhou
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, PR China
- Hao Wen
- School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, PR China
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines; Engineering Laboratory of Development and Application of Traditional Chinese Medicines; Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, PR China
- Hong Cai
- School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, PR China
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines; Engineering Laboratory of Development and Application of Traditional Chinese Medicines; Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, PR China
- Xinyu Luo
- School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, PR China
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines; Engineering Laboratory of Development and Application of Traditional Chinese Medicines; Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, PR China
- Zhen Li
- SanOmics AI Co. Ltd., Hangzhou 311103, PR China
- Jiahui Wang
- School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, PR China
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines; Engineering Laboratory of Development and Application of Traditional Chinese Medicines; Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, PR China
- Bin Ju
- SanOmics AI Co. Ltd., Hangzhou 311103, PR China
- Renren Bai
- School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, PR China
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines; Engineering Laboratory of Development and Application of Traditional Chinese Medicines; Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, PR China
2
Wei W, Fang J, Yang N, Li Q, Hu L, Zhao L, Han J. AC-ModNet: Molecular Reverse Design Network Based on Attribute Classification. Int J Mol Sci 2024; 25:6940. [PMID: 39000049 PMCID: PMC11241775 DOI: 10.3390/ijms25136940] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2024] [Revised: 06/13/2024] [Accepted: 06/22/2024] [Indexed: 07/14/2024] Open
Abstract
Deep generative models are becoming a tool of choice for exploring the molecular space. One important application area of deep generative models is the reverse design of drug compounds for given attributes (solubility, ease of synthesis, etc.). Although there are many generative models, these models cannot generate specific intervals of attributes. This paper proposes an AC-ModNet model that effectively combines VAE with AC-GAN to generate molecular structures in specific attribute intervals. The AC-ModNet is trained and evaluated using the open 250K ZINC dataset. In comparison with related models, our method performs best in the FCD and Frag model evaluation indicators. Moreover, we prove the AC-ModNet created molecules have potential application value in drug design by comparing and analyzing them with medical records in the PubChem database. The results of this paper will provide a new method for machine learning drug reverse design.
Affiliation(s)
- Ning Yang
- School of Automation, Northwestern Polytechnical University, Xi’an 710072, China; (W.W.); (J.F.); (Q.L.); (L.H.); (L.Z.); (J.H.)
3
Liu Y, Yu H, Duan X, Zhang X, Cheng T, Jiang F, Tang H, Ruan Y, Zhang M, Zhang H, Zhang Q. TransGEM: a molecule generation model based on Transformer with gene expression data. Bioinformatics 2024; 40:btae189. [PMID: 38632084 PMCID: PMC11078772 DOI: 10.1093/bioinformatics/btae189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 03/26/2024] [Accepted: 04/16/2024] [Indexed: 04/19/2024] Open
Abstract
MOTIVATION: It is difficult to generate new molecules with desirable bioactivity through ligand-based de novo drug design, and receptor-based de novo drug design is constrained by disease target information availability. The combination of artificial intelligence and phenotype-based de novo drug design can generate new bioactive molecules, independent from disease target information. Gene expression profiles can be used to characterize biological phenotypes. The Transformer model can be utilized to capture the associations between gene expression profiles and molecular structures due to its remarkable ability in processing contextual information.
RESULTS: We propose TransGEM (Transformer-based model from gene expression to molecules), which is a phenotype-based de novo drug design model. A specialized gene expression encoder is used to embed gene expression difference values between diseased cell lines and their corresponding normal tissue cells into the TransGEM model. The results demonstrate that the TransGEM model can generate molecules with desirable evaluation metrics and property distributions. Case studies illustrate that the TransGEM model can generate structurally novel molecules with good binding affinity to disease target proteins. The majority of genes with high attention scores obtained from the TransGEM model are associated with the onset of the disease, indicating the potential of these genes as disease targets. Therefore, this study provides a new paradigm for de novo drug design, and it will promote phenotype-based drug discovery.
AVAILABILITY AND IMPLEMENTATION: The code is available at https://github.com/hzauzqy/TransGEM.
Affiliation(s)
- Yanguang Liu
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Hailong Yu
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Xinya Duan
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Xiaomin Zhang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Ting Cheng
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Feng Jiang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Hao Tang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Yao Ruan
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Miao Zhang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Hongyu Zhang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Qingye Zhang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
4
Singh S, Singh PK, Sachan K, Kumar M, Bhardwaj P. Automation of Drug Discovery through Cutting-edge In-silico Research in Pharmaceuticals: Challenges and Future Scope. Curr Comput Aided Drug Des 2024; 20:723-735. [PMID: 37807412 DOI: 10.2174/0115734099260187230921073932] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2023] [Revised: 08/05/2023] [Accepted: 08/18/2023] [Indexed: 10/10/2023]
Abstract
The rapidity and high-throughput nature of in silico technologies make them advantageous for predicting the properties of a large array of substances. In silico approaches can be used for compounds intended for synthesis at the beginning of drug development when there is either no or very little compound available. In silico approaches can be used for impurities or degradation products. Quantifying drugs and related substances (RS) with pharmaceutical drug analysis (PDA) can also improve drug discovery (DD) by providing additional avenues to pursue. Potential future applications of PDA include combining it with other methods to make in silico predictions about drugs and RS. One possible outcome of this is a determination of the drug potential of nontoxic RS. ADME estimation, QSAR research, molecular docking, bioactivity prediction, and toxicity testing all involve impurity profiling. Before committing to DD, RS with minimal toxicity can be utilised in silico. The efficacy of molecular docking in getting a medication to market is still debated despite its refinement and improvement. Biomedical labs and pharmaceutical companies were hesitant to adopt molecular docking algorithms for drug screening despite their decades of development and improvement. Despite the widespread use of "force fields" to represent the energy exerted within and between molecules, it has been impossible to reliably predict or compute the binding affinities between proteins and potential binding medications.
Affiliation(s)
- Smita Singh
- Department of Pharmaceutics, SRM Modinagar College of Pharmacy, SRM Institute of Science and Technology, Delhi NCR Campus, Modinagar, Ghaziabad, India
- Pranjal Kumar Singh
- Department of Pharmaceutics, SRM Modinagar College of Pharmacy, SRM Institute of Science and Technology, Delhi NCR Campus, Modinagar, Ghaziabad, India
- Kapil Sachan
- KIET School of Pharmacy, KIET Group of Institutions, Ghaziabad, India
- Mukesh Kumar
- IIMT College of Medical Sciences, IIMT University, Ganga Nagar, Meerut, India
- Poonam Bhardwaj
- NKBR College of Pharmacy and Research Center, Phaphunda, Meerut, India
5
Xiong Y, Wang Y, Wang Y, Li C, Yusong P, Wu J, Wang Y, Gu L, Butch CJ. Improving drug discovery with a hybrid deep generative model using reinforcement learning trained on a Bayesian docking approximation. J Comput Aided Mol Des 2023; 37:507-517. [PMID: 37550462 DOI: 10.1007/s10822-023-00523-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Accepted: 07/17/2023] [Indexed: 08/09/2023]
Abstract
Generative approaches to molecular design have been an area of intense study in recent years as a method to generate new pharmaceuticals with desired properties. Often though, these types of efforts are constrained by limited experimental activity data, resulting in either models that generate molecules with poor performance or models that are overfit and produce close analogs of known molecules. In this paper, we reduce this data dependency for the generation of new chemotypes by incorporating docking scores of known and de novo molecules to expand the applicability domain of the reward function and diversify the compounds generated during reinforcement learning. Our approach employs a deep generative model initially trained using a combination of limited known drug activity and an approximate docking score provided by a second machine learned Bayes regression model, with final evaluation of high scoring compounds by a full docking simulation. This strategy results in molecules with docking scores improved by 10-20% compared to molecules of similar size, while being 130× faster than a docking only approach on a typical GPU workstation. We also show that the increased docking scores correlate with (1) docking poses with interactions similar to known inhibitors and (2) result in higher MM-GBSA binding energies comparable to the energies of known DDR1 inhibitors, demonstrating that the Bayesian model contains sufficient information for the network to learn to efficiently interact with the binding pocket during reinforcement learning. This outcome shows that the combination of the learned latent molecular representation along with the feature-based docking regression is sufficient for reinforcement learning to infer the relationship between the molecules and the receptor binding site, which suggests that our method can be a powerful tool for the discovery of new chemotypes with potential therapeutic applications.
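The abstract above pairs a cheap approximate docking score with a full docking simulation reserved for high-scoring compounds. The general shape of such a two-stage screen might look like the sketch below. The function `two_stage_screen` and both scoring callables are hypothetical stand-ins introduced here; in particular, `surrogate` is a placeholder for any fast approximate scorer and is not the paper's Bayesian regression model. Lower scores are assumed to be better.

```python
from typing import Callable, List, Tuple


def two_stage_screen(
    molecules: List[str],
    surrogate: Callable[[str], float],
    full_dock: Callable[[str], float],
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Rank all molecules with a cheap scorer, then re-score only the best top_k."""
    # Stage 1: the cheap surrogate is evaluated on every molecule (lower is better).
    shortlist = sorted(molecules, key=surrogate)[:top_k]
    # Stage 2: the expensive scorer is called only on the shortlist.
    rescored = [(mol, full_dock(mol)) for mol in shortlist]
    return sorted(rescored, key=lambda pair: pair[1])


# Toy scorers for illustration only: longer strings get better (more negative) scores.
toy_molecules = ["A", "BB", "CCC", "DDDD"]
toy_result = two_stage_screen(
    toy_molecules,
    surrogate=lambda m: -float(len(m)),
    full_dock=lambda m: -float(len(m)) - 0.5,
)
print(toy_result)  # [('DDDD', -4.5), ('CCC', -3.5)]
```

The saving comes from calling the expensive scorer `top_k` times rather than once per molecule; how well this works depends on how closely the surrogate tracks the full score.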
Affiliation(s)
- Youjin Xiong
- Department of Biomedical Engineering, Nanjing University, Nanjing, 210093, China
- Yiqing Wang
- Icekredit Incorporated, Shanghai, 200120, China
- Yisheng Wang
- Department of Biomedical Engineering, Nanjing University, Nanjing, 210093, China
- Chenmei Li
- Department of Biomedical Engineering, Nanjing University, Nanjing, 210093, China
- Peng Yusong
- Department of Biomedical Engineering, Nanjing University, Nanjing, 210093, China
- Junyu Wu
- Icekredit Incorporated, Shanghai, 200120, China
- Yiqing Wang
- Department of Biomedical Engineering, Nanjing University, Nanjing, 210093, China
- Lingyun Gu
- Department of Information Systems Technology and Design, Singapore University of Technology and Design, Singapore, Singapore
- Christopher J Butch
- Department of Biomedical Engineering, Nanjing University, Nanjing, 210093, China
6
Janet JP, Mervin L, Engkvist O. Artificial intelligence in molecular de novo design: Integration with experiment. Curr Opin Struct Biol 2023; 80:102575. [PMID: 36966692 DOI: 10.1016/j.sbi.2023.102575] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/23/2022] [Revised: 02/09/2023] [Accepted: 02/18/2023] [Indexed: 06/04/2023]
Abstract
In this mini review, we capture the latest progress of applying artificial intelligence (AI) techniques based on deep learning architectures to molecular de novo design with a focus on integration with experimental validation. We will cover the progress and experimental validation of novel generative algorithms, the validation of QSAR models and how AI-based molecular de novo design is starting to become connected with chemistry automation. While progress has been made in the last few years, it is still early days. The experimental validations conducted thus far should be considered proof-of-principle, providing confidence that the field is moving in the right direction.
Affiliation(s)
- Jon Paul Janet
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Lewis Mervin
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
- Ola Engkvist
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
7
Abate C, Decherchi S, Cavalli A. Graph neural networks for conditional de novo drug design. WIREs Computational Molecular Science 2023. [DOI: 10.1002/wcms.1651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/15/2023]
Affiliation(s)
- Carlo Abate
- Fondazione Istituto Italiano di Tecnologia Genoa Italy
- Università degli Studi di Bologna Bologna Italy
- Andrea Cavalli
- Fondazione Istituto Italiano di Tecnologia Genoa Italy
- Università degli Studi di Bologna Bologna Italy
8
To Affinity and Beyond: A Personal Reflection on the Design and Discovery of Drugs. Molecules 2022; 27:7624. [DOI: 10.3390/molecules27217624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Revised: 10/19/2022] [Accepted: 10/21/2022] [Indexed: 11/09/2022] Open
Abstract
Faced with new and as yet unmet medical need, the stark underperformance of the pharmaceutical discovery process is well described if not perfectly understood. Driven primarily by profit rather than societal need, the search for new pharmaceutical products—small molecule drugs, biologicals, and vaccines—is neither properly funded nor sufficiently systematic. Many innovative approaches remain significantly underused and severely underappreciated, while dominant methodologies are replete with problems and limitations. Design is a component of drug discovery that is much discussed but seldom realised. In and of itself, technical innovation alone is unlikely to fulfil all the possibilities of drug discovery if the necessary underlying infrastructure remains unaltered. A fundamental revision in attitudes, with greater reliance on design powered by computational approaches, as well as a move away from the commercial imperative, is thus essential to capitalise fully on the potential of pharmaceutical intervention in healthcare.
9
Spenke F, Hartke B. Graph-based Automated Macro-Molecule Assembly. J Chem Inf Model 2022; 62:3714-3723. [PMID: 35938711 DOI: 10.1021/acs.jcim.2c00609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
We present a general molecular framework assembly algorithm that takes a largely arbitrary molecular fragment database and a user-supplied target template graph as input. Automatic assembly of molecular fragments from the database, following a prescribed, user-supplied set of connection rules, then turns the template graph into an actual, chemically reasonable molecular framework. Assembly capabilities of our algorithm are tested by producing several abstract, closed-loop shapes. To indicate a few of many possible application areas we demonstrate a host-guest complex and a road toward catalysis. Postassembly substituent exchange can be used to produce electric fields of desired values at desired points inside the framework or at its surface as a stepping stone toward rationally designed, artificial heterogeneous catalysts.
Affiliation(s)
- Florian Spenke
- Institute for Physical Chemistry, Christian-Albrechts-University, Olshausenstrasse 40, Kiel 24098, Germany
- Bernd Hartke
- Institute for Physical Chemistry, Christian-Albrechts-University, Olshausenstrasse 40, Kiel 24098, Germany
10
Kim J, Park S, Min D, Kim W. Comprehensive Survey of Recent Drug Discovery Using Deep Learning. Int J Mol Sci 2021; 22:9983. [PMID: 34576146 PMCID: PMC8470987 DOI: 10.3390/ijms22189983] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 08/24/2021] [Revised: 09/09/2021] [Accepted: 09/10/2021] [Indexed: 02/07/2023] Open
Abstract
Drug discovery based on artificial intelligence has been in the spotlight recently as it significantly reduces the time and cost required for developing novel drugs. With the advancement of deep learning (DL) technology and the growth of drug-related data, numerous deep-learning-based methodologies are emerging at all steps of drug development processes. In particular, pharmaceutical chemists have faced significant issues with regard to selecting and designing potential drugs for a target of interest to enter preclinical testing. The two major challenges are prediction of interactions between drugs and druggable targets and generation of novel molecular structures suitable for a target of interest. Therefore, we reviewed recent deep-learning applications in drug-target interaction (DTI) prediction and de novo drug design. In addition, we introduce a comprehensive summary of a variety of drug and protein representations, DL models, and commonly used benchmark datasets or tools for model training and testing. Finally, we present the remaining challenges for the promising future of DL-based DTI prediction and de novo drug design.
Affiliation(s)
- Jintae Kim
- KaiPharm Co., Ltd., Seoul 03759, Korea; (J.K.); (S.P.)
- Sera Park
- KaiPharm Co., Ltd., Seoul 03759, Korea; (J.K.); (S.P.)
- Dongbo Min
- Computer Vision Lab, Department of Computer Science and Engineering, Ewha Womans University, Seoul 03760, Korea
- Wankyu Kim
- KaiPharm Co., Ltd., Seoul 03759, Korea; (J.K.); (S.P.)
- System Pharmacology Lab, Department of Life Sciences, Ewha Womans University, Seoul 03760, Korea