1
Singh A, Tanwar M, Singh TP, Sharma S, Sharma P. An escape from ESKAPE pathogens: A comprehensive review on current and emerging therapeutics against antibiotic resistance. Int J Biol Macromol 2024; 279:135253. [PMID: 39244118 DOI: 10.1016/j.ijbiomac.2024.135253] [Received: 05/22/2024] [Revised: 08/29/2024] [Accepted: 08/30/2024] [Indexed: 09/09/2024]
Abstract
The rise of antimicrobial resistance has positioned ESKAPE pathogens as a serious global health threat, primarily due to the limitations and frequent failures of current treatment options. This growing risk has spurred the scientific community to seek innovative antibiotic therapies and improved oversight strategies. This review aims to provide a comprehensive overview of the origins and resistance mechanisms of ESKAPE pathogens, while also exploring next-generation treatment strategies for these infections. In addition, it will address both traditional and novel approaches to combating antibiotic resistance, offering insights into potential new therapeutic avenues. Emerging research underscores the urgency of developing new antimicrobial agents and strategies to overcome resistance, highlighting the need for novel drug classes and combination therapies. Advances in genomic technologies and a deeper understanding of microbial pathogenesis are crucial in identifying effective treatments. Integrating precision medicine and personalized approaches could enhance therapeutic efficacy. The review also emphasizes the importance of global collaboration in surveillance and stewardship, as well as policy reforms, enhanced diagnostic tools, and public awareness initiatives, to address resistance on a worldwide scale.
Affiliation(s)
- Anamika Singh
  - Department of Biophysics, All India Institute of Medical Sciences, New Delhi 110029, India
- Mansi Tanwar
  - Department of Biophysics, All India Institute of Medical Sciences, New Delhi 110029, India
- T P Singh
  - Department of Biophysics, All India Institute of Medical Sciences, New Delhi 110029, India
- Sujata Sharma
  - Department of Biophysics, All India Institute of Medical Sciences, New Delhi 110029, India
- Pradeep Sharma
  - Department of Biophysics, All India Institute of Medical Sciences, New Delhi 110029, India
2
Shi C, Gao T, Lyu W, Qiang B, Chen Y, Chen Q, Zhang L, Liu Z. Deep-Learning-Driven Discovery of SN3-1, a Potent NLRP3 Inhibitor with Therapeutic Potential for Inflammatory Diseases. J Med Chem 2024; 67:17833-17854. [PMID: 39302813 DOI: 10.1021/acs.jmedchem.4c01857] [Indexed: 09/22/2024]
Abstract
The NLRP3 inflammasome plays a central role in the pathogenesis of various intractable human diseases, making it an urgent target for therapeutic intervention. Here, we report the development of SN3-1, a novel orally potent NLRP3 inhibitor, designed through a lead compound strategy centered on deep-learning-based molecular generative models. Our strategy enables rapid fragment enumeration and takes into account the synthetic accessibility of the compounds, thereby significantly enhancing the optimization of lead compounds and facilitating the discovery of potent inhibitors. X-ray crystallography provided insights into the SN3-1 inhibitory mechanism. SN3-1 has shown a favorable safety profile in both acute and chronic toxicity assessments and exhibits robust pharmacokinetic properties. Furthermore, SN3-1 demonstrated significant therapeutic efficacy in various disease models characterized by NLRP3 activation. This study introduces a potent candidate for developing NLRP3 inhibitors and significantly expands the repertoire of tools available for the discovery of novel inhibitors.
Affiliation(s)
- Cheng Shi
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Tongfei Gao
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Weiping Lyu
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Bo Qiang
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Yanming Chen
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Qixuan Chen
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Liangren Zhang
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Zhenming Liu
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
3
He J, Tibo A, Janet JP, Nittinger E, Tyrchan C, Czechtizky W, Engkvist O. Evaluation of reinforcement learning in transformer-based molecular design. J Cheminform 2024; 16:95. [PMID: 39118113 PMCID: PMC11312936 DOI: 10.1186/s13321-024-00887-0] [Received: 03/15/2024] [Accepted: 07/21/2024] [Indexed: 08/10/2024]
Abstract
Designing compounds with a range of desirable properties is a fundamental challenge in drug discovery. In pre-clinical early drug discovery, novel compounds are often designed based on an already existing promising starting compound through structural modifications for further property optimization. Recently, transformer-based deep learning models have been explored for the task of molecular optimization by training on pairs of similar molecules. This provides a starting point for generating molecules similar to a given input molecule, but has limited flexibility regarding user-defined property profiles. Here, we evaluate the effect of reinforcement learning on transformer-based molecular generative models. The generative model can be considered as a pre-trained model with knowledge of the chemical space close to an input compound, while reinforcement learning can be viewed as a tuning phase, steering the model towards chemical space with user-specific desirable properties. The evaluation of two distinct tasks (molecular optimization and scaffold discovery) suggests that reinforcement learning could guide the transformer-based generative model towards the generation of more compounds of interest. Additionally, the impact of pre-trained models, learning steps and learning rates is investigated. Scientific contribution: Our study investigates the effect of reinforcement learning on a transformer-based generative model initially trained for generating molecules similar to starting molecules. The reinforcement learning framework is applied to facilitate multiparameter optimisation of starting molecules. This approach allows for more flexibility in optimizing user-specific property profiles and helps find more ideas of interest.
Affiliation(s)
- Jiazhen He
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Alessandro Tibo
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Jon Paul Janet
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Eva Nittinger
  - Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Christian Tyrchan
  - Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Werngard Czechtizky
  - Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Ola Engkvist
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
  - Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, Sweden
4
Wang X, Xu K, Zeng X, Linghu K, Zhao B, Yu S, Wang K, Yu S, Zhao X, Zeng W, Wang K, Zhou J. Machine learning-assisted substrate binding pocket engineering based on structural information. Brief Bioinform 2024; 25:bbae381. [PMID: 39101501 PMCID: PMC11299021 DOI: 10.1093/bib/bbae381] [Received: 02/26/2024] [Revised: 05/25/2024] [Accepted: 07/23/2024] [Indexed: 08/06/2024]
Abstract
Engineering enzyme-substrate binding pockets is the most efficient approach for modifying catalytic activity, but is limited if the substrate binding sites are indistinct. Here, we developed a 3D convolutional neural network for predicting protein-ligand binding sites. The network integrates DenseNet, UNet, and self-attention for extracting features and recovering sample size. We attempted to enlarge the dataset by data augmentation, and on the SC6K, COACH420, and BU48 validation datasets the model achieved success rates of 48.4%, 35.5%, and 43.6% at a precision of ≥50%, and 52%, 47.6%, and 58.1% when the distance between the predicted and real centers is ≤4 Å. The substrate binding sites of Klebsiella variicola acid phosphatase (KvAP) and Bacillus anthracis proline 4-hydroxylase (BaP4H) were predicted using DUnet, which performed competitively: 53.8% and 56% of the predicted binding sites critically affected the catalysis of KvAP and BaP4H, respectively. Virtual saturation mutagenesis was applied based on the predicted binding sites of KvAP, and the 10 top-ranked single mutations contributing to stronger enzyme-substrate binding varied as the predicted sites differed. The advantage of DUnet for predicting key residues responsible for enzyme activity further promoted the success rate of virtual mutagenesis. This study highlighted the significance of correctly predicting key binding sites for enzyme engineering.
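As an illustration of the center-distance criterion mentioned in the abstract above (a prediction counts as a success when the predicted and real pocket centers lie within 4 Å of each other), the following minimal Python sketch computes such a success rate. It is not the authors' code: the function name `center_distance_success_rate` and the coordinates are hypothetical, and the sketch assumes one predicted center per pocket, given in ångströms.

```python
import math


def center_distance_success_rate(predicted_centers, true_centers, threshold=4.0):
    """Return the fraction of pockets whose predicted center lies within
    `threshold` angstroms of the true center (hypothetical helper)."""
    if len(predicted_centers) != len(true_centers):
        raise ValueError("need exactly one predicted center per true center")
    if not predicted_centers:
        return 0.0
    hits = sum(
        1
        for pred, true in zip(predicted_centers, true_centers)
        if math.dist(pred, true) <= threshold  # Euclidean distance in 3D
    )
    return hits / len(predicted_centers)


# Made-up coordinates, for illustration only.
predicted = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (1.0, 2.0, 2.0)]
actual = [(3.0, 0.0, 0.0), (10.0, 5.0, 0.0), (1.0, 2.0, 2.0)]
rate = center_distance_success_rate(predicted, actual)
print(f"success rate: {rate:.3f}")
```

In this toy input the three center distances are 3 Å, 5 Å, and 0 Å, so two of the three predictions fall within the threshold.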
Affiliation(s)
- Xinglong Wang
  - School of Food Science and Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
  - Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology and School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
  - Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Kangjie Xu
  - Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Xuan Zeng
  - Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), School of Internet of Things Engineering, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Kai Linghu
  - Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Beichen Zhao
  - Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Shangyang Yu
  - Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Kun Wang
  - Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Shuyao Yu
  - Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Xinyi Zhao
  - Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Weizhu Zeng
  - Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Kai Wang
  - Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), School of Internet of Things Engineering, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Jingwen Zhou
  - Engineering Research Center of Ministry of Education on Food Synthetic Biotechnology and School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
  - Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
  - Jiangsu Province Engineering Research Center of Food Synthetic Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
5
Gim M, Park J, Park S, Lee S, Baek S, Lee J, Nguyen NQ, Kang J. MolPLA: a molecular pretraining framework for learning cores, R-groups and their linker joints. Bioinformatics 2024; 40:i369-i380. [PMID: 38940143 PMCID: PMC11211832 DOI: 10.1093/bioinformatics/btae256] [Indexed: 06/29/2024]
Abstract
MOTIVATION: Molecular core structures and R-groups are essential concepts in drug development. Integration of these concepts with conventional graph pre-training approaches can promote deeper understanding of molecules. We propose MolPLA, a novel pre-training framework that employs masked graph contrastive learning to understand the underlying decomposable parts of molecules that implicate their core structure and peripheral R-groups. Furthermore, we formulate an additional framework that grants MolPLA the ability to help chemists find replaceable R-groups in lead optimization scenarios. RESULTS: Experimental results on molecular property prediction show that MolPLA exhibits predictability comparable to current state-of-the-art models. Qualitative analysis indicates that MolPLA is capable of distinguishing core and R-group sub-structures, identifying decomposable regions in molecules and contributing to lead optimization scenarios by rationally suggesting R-group replacements given various query core templates. AVAILABILITY AND IMPLEMENTATION: The code implementation for MolPLA and its pre-trained model checkpoint are available at https://github.com/dmis-lab/MolPLA.
Affiliation(s)
- Mogan Gim
  - Department of Computer Science, Korea University, Seoul 02841, Republic of Korea
- Jueon Park
  - Department of Computer Science, Korea University, Seoul 02841, Republic of Korea
- Soyon Park
  - Department of Computer Science, Korea University, Seoul 02841, Republic of Korea
- Sanghoon Lee
  - Department of Computer Science, Korea University, Seoul 02841, Republic of Korea
  - AIGEN Sciences, Seoul 04778, Republic of Korea
- Seungheun Baek
  - Department of Computer Science, Korea University, Seoul 02841, Republic of Korea
- Junhyun Lee
  - Department of Computer Science, Korea University, Seoul 02841, Republic of Korea
- Ngoc-Quang Nguyen
  - Department of Computer Science, Korea University, Seoul 02841, Republic of Korea
- Jaewoo Kang
  - Department of Computer Science, Korea University, Seoul 02841, Republic of Korea
  - AIGEN Sciences, Seoul 04778, Republic of Korea
6
Jin H, Merz KM. LigandDiff: de Novo Ligand Design for 3D Transition Metal Complexes with Diffusion Models. J Chem Theory Comput 2024; 20:4377-4384. [PMID: 38743854 PMCID: PMC11137811 DOI: 10.1021/acs.jctc.4c00232] [Received: 02/23/2024] [Revised: 05/06/2024] [Accepted: 05/07/2024] [Indexed: 05/16/2024]
Abstract
Transition metal complexes are a class of compounds with varied and versatile properties, making them of great technological importance. Their applications cover a wide range of fields, either as metallodrugs in medicine or as materials, catalysts, batteries, solar cells, etc. The demand for the novel design of transition metal complexes with new properties remains of great interest. However, the traditional high-throughput screening approach is inherently expensive and laborious since it depends on human expertise. Here, we present LigandDiff, a generative model for the de novo design of novel transition metal complexes. Unlike the existing methods that simply extract and combine ligands with the metal to get new complexes, LigandDiff aims at designing configurationally novel ligands from scratch, which opens new pathways for the discovery of organometallic complexes. Moreover, it overcomes the limitations of current methods, where the diversity of new complexes highly relies on the diversity of available ligands, while LigandDiff can design numerous novel ligands without human intervention. Our results indicate that LigandDiff designs unique and novel ligands under different contexts, and these generated ligands are synthetically accessible. Moreover, LigandDiff shows good transferability by generating successful ligands for any transition metal complex.
Affiliation(s)
- Hongni Jin
  - Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
- Kenneth M. Merz
  - Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
7
Ju W, Fang Z, Gu Y, Liu Z, Long Q, Qiao Z, Qin Y, Shen J, Sun F, Xiao Z, Yang J, Yuan J, Zhao Y, Wang Y, Luo X, Zhang M. A Comprehensive Survey on Deep Graph Representation Learning. Neural Netw 2024; 173:106207. [PMID: 38442651 DOI: 10.1016/j.neunet.2024.106207] [Received: 08/28/2023] [Revised: 01/23/2024] [Accepted: 02/21/2024] [Indexed: 03/07/2024]
Abstract
Graph representation learning aims to effectively encode high-dimensional sparse graph-structured data into low-dimensional dense vectors, which is a fundamental task that has been widely studied in a range of fields, including machine learning and data mining. Classic graph embedding methods follow the basic idea that the embedding vectors of interconnected nodes in the graph can still maintain a relatively close distance, thereby preserving the structural information between the nodes in the graph. However, this is sub-optimal because: (i) traditional methods have limited model capacity, which limits the learning performance; (ii) existing techniques typically rely on unsupervised learning strategies and fail to couple with the latest learning paradigms; (iii) representation learning and downstream tasks depend on each other and should be jointly enhanced. With the remarkable success of deep learning, deep graph representation learning has shown great potential and advantages over shallow (traditional) methods, and a large number of deep graph representation learning techniques, especially graph neural networks, have been proposed in the past decade. In this survey, we comprehensively review current deep graph representation learning algorithms by proposing a new taxonomy of existing state-of-the-art literature. Specifically, we systematically summarize the essential components of graph representation learning and categorize existing approaches by graph neural network architecture and by the most recent advanced learning paradigms. Moreover, this survey also presents the practical and promising applications of deep graph representation learning. Last but not least, we state new perspectives and suggest challenging directions which deserve further investigation in the future.
Affiliation(s)
- Wei Ju
  - School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
- Zheng Fang
  - School of Intelligence Science and Technology, Peking University, Beijing, 100871, China
- Yiyang Gu
  - School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
- Zequn Liu
  - School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
- Qingqing Long
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100086, China
- Ziyue Qiao
  - Artificial Intelligence Thrust, The Hong Kong University of Science and Technology, Guangzhou, 511453, China
- Yifang Qin
  - School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
- Jianhao Shen
  - School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
- Fang Sun
  - Department of Computer Science, University of California, Los Angeles, 90095, USA
- Zhiping Xiao
  - Department of Computer Science, University of California, Los Angeles, 90095, USA
- Junwei Yang
  - School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
- Jingyang Yuan
  - School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
- Yusheng Zhao
  - School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
- Yifan Wang
  - School of Information Technology & Management, University of International Business and Economics, Beijing, 100029, China
- Xiao Luo
  - Department of Computer Science, University of California, Los Angeles, 90095, USA
- Ming Zhang
  - School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
8
Xie J, Chen S, Lei J, Yang Y. DiffDec: Structure-Aware Scaffold Decoration with an End-to-End Diffusion Model. J Chem Inf Model 2024; 64:2554-2564. [PMID: 38267393 DOI: 10.1021/acs.jcim.3c01466] [Indexed: 01/26/2024]
Abstract
In molecular optimization, one popular approach is R-group decoration on molecular scaffolds, and many efforts have been made to generate R-groups based on deep generative models. However, these methods mostly use information on known binding ligands, without fully utilizing target structure information. In this study, we proposed a new method, DiffDec, that incorporates 3D pocket constraints through a modified diffusion technique for optimizing molecules via molecular scaffold decoration. For end-to-end generation of R-groups with different sizes, we designed a novel fake atom mechanism. Analysis of bond angles and dihedral angles showed that DiffDec can generate structure-aware R-groups with realistic geometric substructures, and it can simultaneously generate multiple R-groups for one scaffold on different growth anchors. The growth anchors could be provided by users or automatically determined by our model. DiffDec achieved R-group recovery rates of 69.67% and 45.34% in the single and multiple R-group decoration tasks, respectively, and these values were significantly higher than those of competing methods (37.33% and 26.85%). According to the molecular docking study, our decorated molecules obtained a better average binding affinity than those of baseline methods. The docking pose analysis revealed that DiffDec could decorate scaffolds with R-groups that exhibited improved binding affinities and more favorable interactions with the pocket. These results demonstrated the potential and applicability of DiffDec in real-world scaffold decoration for molecular optimization.
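To make the notion of an R-group recovery rate concrete, here is a minimal Python sketch. It rests on an assumed definition that the abstract above does not spell out: a test case counts as recovered when its reference R-group appears among the R-groups generated for the same scaffold. The function name `recovery_rate`, the case identifiers, and the SMILES-like strings are all hypothetical, and the strings are assumed to be already canonicalized so that plain string equality is meaningful.

```python
def recovery_rate(generated_by_case, reference_by_case):
    """Fraction of test cases whose reference R-group is found among the
    generated candidates for that case (assumed definition, see lead-in)."""
    if not reference_by_case:
        return 0.0
    recovered = sum(
        1
        for case_id, reference in reference_by_case.items()
        # A case with no generated candidates simply counts as not recovered.
        if reference in generated_by_case.get(case_id, set())
    )
    return recovered / len(reference_by_case)


# Made-up data: three scaffolds, each with a set of generated R-groups.
generated = {
    "scaffold_a": {"*C", "*CC", "*O"},
    "scaffold_b": {"*N", "*F"},
    "scaffold_c": {"*Cl"},
}
references = {
    "scaffold_a": "*CC",  # present in the generated set
    "scaffold_b": "*Br",  # absent from the generated set
    "scaffold_c": "*Cl",  # present in the generated set
}
rate = recovery_rate(generated, references)
print(f"recovery rate: {rate:.2%}")
```

With real molecules the comparison would normally go through a cheminformatics toolkit's canonicalization rather than raw strings; that step is left out here to keep the sketch dependency-free.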
Affiliation(s)
- Junjie Xie
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
  - AixplorerBio Inc., Jiaxing 314031, China
- Sheng Chen
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
  - AixplorerBio Inc., Jiaxing 314031, China
- Jinping Lei
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, China
- Yuedong Yang
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
9
Zhang H, Huang J, Xie J, Huang W, Yang Y, Xu M, Lei J, Chen H. GRELinker: A Graph-Based Generative Model for Molecular Linker Design with Reinforcement and Curriculum Learning. J Chem Inf Model 2024; 64:666-676. [PMID: 38241022 DOI: 10.1021/acs.jcim.3c01700] [Indexed: 02/13/2024]
Abstract
Fragment-based drug discovery (FBDD) is widely used in drug design. One useful strategy in FBDD is designing linkers for linking fragments to optimize their molecular properties. In the current study, we present a novel generative fragment linking model, GRELinker, which utilizes a gated-graph neural network combined with reinforcement and curriculum learning to generate molecules with desirable attributes. The model has been shown to be efficient in multiple tasks, including controlling log P, optimizing synthesizability or predicted bioactivity of compounds, and generating molecules with high 3D similarity but low 2D similarity to the lead compound. Specifically, our model outperforms the previously reported reinforcement learning (RL) built-in method DRlinker on these benchmark tasks. Moreover, GRELinker has been successfully used in an actual FBDD case to generate optimized molecules with enhanced affinities by employing the docking score as the scoring function in RL. Besides, the implementation of curriculum learning in our framework enables the generation of structurally complex linkers more efficiently. These results demonstrate the benefits and feasibility of GRELinker in linker design for molecular optimization and drug discovery.
Affiliation(s)
- Hao Zhang
  - School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
- Jinchao Huang
  - School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
- Junjie Xie
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
- Weifeng Huang
  - School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
- Yuedong Yang
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
- Mingyuan Xu
  - Guangzhou National Laboratory, Guangzhou International Bio Island, No. 9 Xin Dao Huan Bei Road, Guangzhou 510005, China
- Jinping Lei
  - School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
- Hongming Chen
  - Guangzhou National Laboratory, Guangzhou International Bio Island, No. 9 Xin Dao Huan Bei Road, Guangzhou 510005, China
10
Gangwal A, Ansari A, Ahmad I, Azad AK, Kumarasamy V, Subramaniyan V, Wong LS. Generative artificial intelligence in drug discovery: basic framework, recent advances, challenges, and opportunities. Front Pharmacol 2024; 15:1331062. [PMID: 38384298 PMCID: PMC10879372 DOI: 10.3389/fphar.2024.1331062] [Received: 11/07/2023] [Accepted: 01/17/2024] [Indexed: 02/23/2024]
Abstract
There are two main ways to discover or design small drug molecules. The first involves fine-tuning existing molecules or commercially successful drugs through quantitative structure-activity relationships and virtual screening. The second approach involves generating new molecules through de novo drug design or inverse quantitative structure-activity relationship. Both methods aim to get a drug molecule with the best pharmacokinetic and pharmacodynamic profiles. However, bringing a new drug to market is an expensive and time-consuming endeavor, with the average cost being estimated at around $2.5 billion. One of the biggest challenges is screening the vast number of potential drug candidates to find one that is both safe and effective. The development of artificial intelligence in recent years has been phenomenal, ushering in a revolution in many fields. The field of pharmaceutical sciences has also significantly benefited from multiple applications of artificial intelligence, especially drug discovery projects. Artificial intelligence models are finding use in molecular property prediction, molecule generation, virtual screening, synthesis planning, repurposing, among others. Lately, generative artificial intelligence has gained popularity across domains for its ability to generate entirely new data, such as images, sentences, audios, videos, novel chemical molecules, etc. Generative artificial intelligence has also delivered promising results in drug discovery and development. This review article delves into the fundamentals and framework of various generative artificial intelligence models in the context of drug discovery via de novo drug design approach. Various basic and advanced models have been discussed, along with their recent applications. 
The review also explores recent examples and advances in the generative artificial intelligence approach, as well as the challenges and ongoing efforts to fully harness the potential of generative artificial intelligence in generating novel drug molecules in a faster and more affordable manner. Some clinical-level assets generated from generative artificial intelligence have also been discussed in this review to show the ever-increasing application of artificial intelligence in drug discovery through commercial partnerships.
Affiliation(s)
- Amit Gangwal
  - Department of Natural Product Chemistry, Shri Vile Parle Kelavani Mandal’s Institute of Pharmacy, Dhule, Maharashtra, India
- Azim Ansari
  - Computer Aided Drug Design Center, Shri Vile Parle Kelavani Mandal’s Institute of Pharmacy, Dhule, Maharashtra, India
- Iqrar Ahmad
  - Department of Pharmaceutical Chemistry, Prof. Ravindra Nikam College of Pharmacy, Dhule, India
- Abul Kalam Azad
  - Faculty of Pharmacy, University College of MAIWP International, Batu Caves, Malaysia
- Vinoth Kumarasamy
  - Department of Parasitology and Medical Entomology, Faculty of Medicine, Universiti Kebangsaan Malaysia, Cheras, Malaysia
- Vetriselvan Subramaniyan
  - Pharmacology Unit, Jeffrey Cheah School of Medicine and Health Sciences, Monash University Malaysia, Selangor, Malaysia
  - School of Bioengineering and Biosciences, Lovely Professional University, Phagwara, Punjab, India
- Ling Shing Wong
  - Faculty of Health and Life Sciences, INTI International University, Nilai, Malaysia
11
Nitulescu GM. Techniques and Strategies in Drug Design and Discovery. Int J Mol Sci 2024; 25:1364. [PMID: 38338643 PMCID: PMC10855429 DOI: 10.3390/ijms25031364] [Received: 01/09/2024] [Accepted: 01/18/2024] [Indexed: 02/12/2024]
Abstract
The process of drug discovery constitutes a highly intricate and formidable undertaking, encompassing the identification and advancement of novel therapeutic entities [...].
Affiliation(s)
- George Mihai Nitulescu
- Faculty of Pharmacy, "Carol Davila" University of Medicine and Pharmacy, 6 Traian Vuia Street, 020956 Bucharest, Romania
| |
|
12
|
Xu C, Liu R, Huang S, Li W, Li Z, Luo HB. 3D-SMGE: a pipeline for scaffold-based molecular generation and evaluation. Brief Bioinform 2023; 24:bbad327. [PMID: 37756591 DOI: 10.1093/bib/bbad327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 08/19/2023] [Accepted: 08/30/2023] [Indexed: 09/29/2023] Open
Abstract
In drug discovery, a key problem is how to improve biological activity and ADMET properties starting from a specific structure, a task known as structural optimization. Using a deep generative model to generate molecules with desired drug-like properties from a starting scaffold provides a powerful tool to accelerate structural optimization. However, existing generative models struggle to extract molecular features efficiently in 3D space to generate drug-like 3D molecules. Moreover, most existing ADMET prediction models predict different properties with a single model, which can reduce prediction accuracy on some datasets. To generate molecules effectively from a specific scaffold and provide a basis for structural optimization, the 3D-SMGE (3-Dimensional Scaffold-based Molecular Generation and Evaluation) pipeline, consisting of molecular generation and ADMET property prediction, is presented. For molecular generation, we proposed 3D-SMG, a novel deep generative model for the end-to-end design of 3D molecules. In the 3D-SMG model, we designed the cross-aggregated continuous-filter convolution (ca-cfconv), which achieves efficient and low-cost 3D spatial feature extraction while ensuring invariance under rotation of the atomic coordinates. 3D-SMG was shown to generate valid, unique and novel molecules with high drug-likeness. In addition, the proposed data-adaptive multi-model ADMET prediction method outperformed or matched the best evaluation metrics on 24 out of 27 ADMET benchmark datasets. 3D-SMGE is anticipated to emerge as a powerful tool for hit-to-lead structural optimization and to accelerate the drug discovery process.
Affiliation(s)
- Chao Xu
- Key Laboratory of Tropical Biological Resources of Ministry of Education, School of Pharmaceutical Sciences, Hainan University, Haikou 570228, Hainan, P.R. China
| | - Runduo Liu
- School of Pharmaceutical Sciences, Sun Yat-Sen University, Guangzhou, 510000, Guangdong, P.R. China
| | - Shuheng Huang
- Key Laboratory of Tropical Biological Resources of Ministry of Education, School of Pharmaceutical Sciences, Hainan University, Haikou 570228, Hainan, P.R. China
| | - Wenchao Li
- School of Pharmaceutical Sciences, Sun Yat-Sen University, Guangzhou, 510000, Guangdong, P.R. China
| | - Zhe Li
- School of Pharmaceutical Sciences, Sun Yat-Sen University, Guangzhou, 510000, Guangdong, P.R. China
| | - Hai-Bin Luo
- Key Laboratory of Tropical Biological Resources of Ministry of Education, School of Pharmaceutical Sciences, Hainan University, Haikou 570228, Hainan, P.R. China
| |
|
13
|
Li B, Ran T, Chen H. 3D based generative PROTAC linker design with reinforcement learning. Brief Bioinform 2023; 24:bbad323. [PMID: 37670499 DOI: 10.1093/bib/bbad323] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/05/2023] [Revised: 08/06/2023] [Accepted: 08/20/2023] [Indexed: 09/07/2023] Open
Abstract
Proteolysis targeting chimera (PROTAC) has emerged as an effective modality to selectively degrade disease-related proteins by harnessing the ubiquitin-proteasome system. PROTACs are hetero-bifunctional: a linker joins a warhead, which binds a protein of interest (POI) and confers specificity, to an E3-ligand, which binds an E3 ubiquitin ligase. This can trigger ubiquitination of the POI and its transport to the proteasome, followed by degradation. Rational PROTAC linker design is challenging because of the relatively large molecular weight and the complexity of maintaining the binding modes of the warhead and E3-ligand in their respective binding pockets. Conventional linker generation methods can only generate linkers as 1D SMILES or 2D graphs, without taking into account ternary structure information. Here we propose a novel 3D linker generative model, PROTAC-INVENT, which generates not only the SMILES of a PROTAC but also its putative 3D binding conformation together with the target protein and the E3 ligase. The model is trained jointly with a reinforcement learning (RL) approach to bias the generation of PROTAC structures toward pre-defined 2D and 3D based properties. Examples were provided to demonstrate the utility of the model for generating reasonable 3D conformations of PROTACs. In addition, our results show that the associated workflow for 3D PROTAC conformation generation can also be used as an efficient docking protocol for PROTACs.
Affiliation(s)
- Baiqing Li
- Guangzhou Laboratory, Guangzhou 510005, Guangdong Province, China
| | - Ting Ran
- Guangzhou Laboratory, Guangzhou 510005, Guangdong Province, China
| | - Hongming Chen
- Guangzhou Laboratory, Guangzhou 510005, Guangdong Province, China
| |
|
14
|
Gu Y, Li J, Kang H, Zhang B, Zheng S. Employing Molecular Conformations for Ligand-Based Virtual Screening with Equivariant Graph Neural Network and Deep Multiple Instance Learning. Molecules 2023; 28:5982. [PMID: 37630234 PMCID: PMC10459669 DOI: 10.3390/molecules28165982] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2023] [Revised: 07/27/2023] [Accepted: 08/03/2023] [Indexed: 08/27/2023] Open
Abstract
Ligand-based virtual screening (LBVS) is a promising approach for rapid and low-cost screening of potentially bioactive molecules in the early stage of drug discovery. Compared with traditional similarity-based machine learning methods, deep learning frameworks for LBVS can more effectively extract high-order molecule structure representations from molecular fingerprints or structures. However, the 3D conformation of a molecule, which largely influences its bioactivity and physical properties, has rarely been considered in previous deep learning-based LBVS methods. Moreover, a relevant bioactivity benchmark dataset is still lacking. To address these issues, we introduce a novel end-to-end deep learning architecture trained from molecular conformers for LBVS. We first extracted molecule conformers from multiple public molecular bioactivity datasets and consolidated them into a large-scale bioactivity benchmark dataset, which in total includes millions of endpoints and molecules corresponding to 954 targets. Then, we devised a deep learning-based LBVS method called EquiVS to learn molecule representations from conformers for bioactivity prediction. Specifically, a graph convolutional network (GCN) and an equivariant graph neural network (EGNN) are sequentially stacked to learn high-order molecule-level and conformer-level representations, followed by attention-based deep multiple-instance learning (MIL) to aggregate these representations and then predict the potential bioactivity of the query molecule on a given target. We conducted various experiments to validate the data quality of our benchmark dataset, and confirmed that EquiVS achieved better performance than 10 traditional machine learning or deep learning-based LBVS methods. Further ablation studies demonstrate the significant contribution of molecular conformation to bioactivity prediction, as well as the reasonableness and non-redundancy of the deep learning architecture in EquiVS. Finally, a model interpretation case study on CDK2 shows the potential of EquiVS in optimal conformer discovery. Overall, the study shows that our proposed benchmark dataset and EquiVS method have promising prospects in virtual screening applications.
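The attention-based multiple-instance aggregation mentioned in this abstract can be illustrated with a small sketch. It is not the EquiVS implementation: the function names, the two-dimensional toy embeddings, and the linear (unlearned) attention vector are all assumptions made for illustration. Each conformer embedding is scored, the scores are turned into softmax weights, and the weighted sum gives one molecule-level vector.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_mil_pool(
    instances: List[List[float]], attn_vector: List[float]
) -> Tuple[List[float], List[float]]:
    """Pool a bag of instance embeddings into one vector using attention weights.

    The score of each instance is a plain dot product with `attn_vector`
    (a hypothetical stand-in for a learned attention network).
    """
    scores = [sum(w * x for w, x in zip(attn_vector, inst)) for inst in instances]
    weights = softmax(scores)
    dim = len(instances[0])
    bag = [sum(wt * inst[d] for wt, inst in zip(weights, instances)) for d in range(dim)]
    return bag, weights


# Three made-up conformer embeddings for a single molecule.
conformer_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attention_vector = [2.0, 0.0]  # assumed values; would be learned in practice

bag_embedding, weights = attention_mil_pool(conformer_embeddings, attention_vector)
```

The weights sum to one, so conformers with higher scores dominate the pooled vector, and inspecting the weights is one way such a model can point to the most influential conformer.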
Affiliation(s)
- Yaowen Gu
- Institute of Medical Information (IMI), Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing 100020, China
- Department of Chemistry, New York University, New York, NY 10027, USA
| | - Jiao Li
- Institute of Medical Information (IMI), Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing 100020, China
| | - Hongyu Kang
- Institute of Medical Information (IMI), Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing 100020, China
- Department of Biomedical Engineering, School of Life Science, Beijing Institute of Technology, Beijing 100081, China
| | - Bowen Zhang
- Beijing StoneWise Technology Co., Ltd., Beijing 100080, China
| | - Si Zheng
- Institute of Medical Information (IMI), Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing 100020, China
- Institute for Artificial Intelligence, Department of Computer Science and Technology, BNRist, Tsinghua University, Beijing 100084, China
| |
|
15
|
Tang M, Li B, Chen H. Application of message passing neural networks for molecular property prediction. Curr Opin Struct Biol 2023; 81:102616. [PMID: 37267824 DOI: 10.1016/j.sbi.2023.102616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 04/28/2023] [Accepted: 05/04/2023] [Indexed: 06/04/2023]
Abstract
Accurate molecular property prediction, as one of the classical cheminformatics topics, plays a prominent role in the fields of computer-aided drug design. For instance, property prediction models can be used to quickly screen large molecular libraries to find lead compounds. Message-passing neural networks (MPNNs), a sub-class of Graph neural networks (GNNs), have recently been demonstrated to outperform other deep learning methods on a variety of tasks, including the prediction of molecular characteristics. In this survey, we provide a brief review of the MPNN models and their applications on molecular property prediction.
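As a rough illustration of the message-passing idea surveyed here, the sketch below runs one round of sum-aggregation message passing on a toy three-atom graph and then sums the node states into a molecule-level value. The function name `message_passing_step`, the scalar atom features, and the update rule (own feature plus the sum of neighbor features) are illustrative assumptions, not taken from the survey; real MPNNs use learned message and update functions over feature vectors.

```python
from typing import Dict, List


def message_passing_step(
    features: Dict[int, float], neighbors: Dict[int, List[int]]
) -> Dict[int, float]:
    """One round of message passing: each node adds the sum of its neighbors' features."""
    updated = {}
    for node, value in features.items():
        # Message: sum of neighbor features (hypothetical, unlearned message function).
        message = sum(features[n] for n in neighbors[node])
        # Update: combine the node's own state with the aggregated message.
        updated[node] = value + message
    return updated


# Toy graph: a three-atom chain indexed 0-1-2, with made-up scalar atom features.
atom_features = {0: 1.0, 1: 1.0, 2: 2.0}
bonds = {0: [1], 1: [0, 2], 2: [1]}

hidden = message_passing_step(atom_features, bonds)
# Readout: sum the node states into a single molecule-level value.
molecule_embedding = sum(hidden.values())
```

Repeating the step lets information travel further along the bond graph: after k rounds, each atom's state depends on atoms up to k bonds away.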
Affiliation(s)
- Miru Tang
- Guangzhou Laboratory, Guangzhou, 510005, Guangdong Province, China; Bioland Laboratory (Guangzhou Regenerative Medicine and Health-Guangdong Laboratory), Guangzhou, 510530, China; State Key Laboratory of Respiratory Disease, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, 510530, China
| | - Baiqing Li
- Guangzhou Laboratory, Guangzhou, 510005, Guangdong Province, China
| | - Hongming Chen
- Guangzhou Laboratory, Guangzhou, 510005, Guangdong Province, China.
| |
|
16
|
Varikoti RA, Schultz KJ, Kombala CJ, Kruel A, Brandvold KR, Zhou M, Kumar N. Integrated data-driven and experimental approaches to accelerate lead optimization targeting SARS-CoV-2 main protease. J Comput Aided Mol Des 2023:10.1007/s10822-023-00509-1. [PMID: 37314632 DOI: 10.1007/s10822-023-00509-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Accepted: 05/23/2023] [Indexed: 06/15/2023]
Abstract
Identification of potential therapeutic candidates can be expedited by integrating computational modeling with domain-aware machine learning (ML) models, followed by experimental validation in an iterative manner. Generative deep learning models can generate thousands of new candidates; however, their physicochemical and biochemical properties are typically not fully optimized. Using our recently developed deep learning models and a scaffold as a starting point, we generated tens of thousands of compounds for SARS-CoV-2 Mpro that preserve the core scaffold. We applied several computational tools, such as structural alert and toxicity analysis, high-throughput virtual screening, ML-based 3D quantitative structure-activity relationships, multi-parameter optimization, and graph neural networks, to the generated candidates to predict biological activity and binding affinity in advance. As a result of these combined computational endeavors, eight promising candidates were singled out and put through experimental testing using Native Mass Spectrometry and FRET-based functional assays. Two of the tested compounds, with quinazoline-2-thiol and acetylpiperidine core moieties, showed IC50 values in the low micromolar range: [Formula: see text] μM and 3.41±0.0015 μM, respectively. Molecular dynamics simulations further highlight that binding of these compounds results in allosteric modulations within the chain B and the interface domains of the Mpro. Our integrated approach provides a platform for data driven lead optimization with rapid characterization and experimental validation in a closed loop that could be applied to other potential protein targets.
Affiliation(s)
- Rohith Anand Varikoti
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA, 99352, USA
| | - Katherine J Schultz
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA, 99352, USA
| | - Chathuri J Kombala
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA, 99352, USA
| | - Agustin Kruel
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA, 99352, USA
| | - Kristoffer R Brandvold
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA, 99352, USA
| | - Mowei Zhou
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA, 99352, USA
| | - Neeraj Kumar
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA, 99352, USA.
| |
|
17
|
Wills S, Sanchez-Garcia R, Dudgeon T, Roughley SD, Merritt A, Hubbard RE, Davidson J, von Delft F, Deane CM. Fragment Merging Using a Graph Database Samples Different Catalogue Space than Similarity Search. J Chem Inf Model 2023. [PMID: 37229647 DOI: 10.1021/acs.jcim.3c00276] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 05/27/2023]
Abstract
Fragment merging is a promising approach to progressing fragments directly to on-scale potency: each designed compound incorporates the structural motifs of overlapping fragments in a way that ensures compounds recapitulate multiple high-quality interactions. Searching commercial catalogues provides one useful way to quickly and cheaply identify such merges and circumvents the challenge of synthetic accessibility, provided they can be readily identified. Here, we demonstrate that the Fragment Network, a graph database that provides a novel way to explore the chemical space surrounding fragment hits, is well-suited to this challenge. We use an iteration of the database containing >120 million catalogue compounds to find fragment merges for four crystallographic screening campaigns and contrast the results with a traditional fingerprint-based similarity search. The two approaches identify complementary sets of merges that recapitulate the observed fragment-protein interactions but lie in different regions of chemical space. We further show our methodology is an effective route to achieving on-scale potency by retrospective analyses for two different targets; in analyses of public COVID Moonshot and Mycobacterium tuberculosis EthR inhibitors, potential inhibitors with micromolar IC50 values were identified. This work demonstrates the use of the Fragment Network to increase the yield of fragment merges beyond that of a classical catalogue search.
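For readers unfamiliar with the baseline this abstract compares against, the sketch below shows a minimal fingerprint-based similarity search. It assumes the Tanimoto coefficient as the similarity metric, which is a common choice but is not named in the abstract; the fingerprints are made-up sets of on-bit indices, and the compound names, function names, and threshold are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient between two fingerprints stored as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_search(
    query: Set[int], catalogue: Dict[str, Set[int]], threshold: float
) -> List[Tuple[str, float]]:
    """Return catalogue entries at or above the threshold, most similar first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in catalogue.items()]
    hits = [hit for hit in scored if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Made-up fingerprints; real ones would come from a cheminformatics toolkit.
query_fp = {1, 2, 3, 4}
catalogue = {
    "cmpd_A": {1, 2, 3, 4, 5},  # shares 4 of 5 bits with the query
    "cmpd_B": {1, 2, 7, 8},     # shares 2 of 6 bits with the query
    "cmpd_C": {2, 3, 4},        # shares 3 of 4 bits with the query
}

hits = similarity_search(query_fp, catalogue, threshold=0.5)
```

A search of this kind ranks compounds by overall bit overlap with a single query, which is one reason a graph-based search over fragment substructures can surface merges from different regions of chemical space.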
Affiliation(s)
- Stephanie Wills
- Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
- Centre for Medicines Discovery, University of Oxford, Oxford OX3 7DQ, United Kingdom
| | - Ruben Sanchez-Garcia
- Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
- Centre for Medicines Discovery, University of Oxford, Oxford OX3 7DQ, United Kingdom
| | - Tim Dudgeon
- Informatics Matters, Ltd., Perch Coworking, Franklins House, Bicester OX26 6JU, United Kingdom
| | - Stephen D Roughley
- Vernalis (R&D) Limited, Granta Park, Great Abington, Cambridge CB21 6GB, United Kingdom
| | - Andy Merritt
- LifeArc, Lynton House, 7-12 Tavistock Square, London WC1H 9LT, United Kingdom
| | - Roderick E Hubbard
- Vernalis (R&D) Limited, Granta Park, Great Abington, Cambridge CB21 6GB, United Kingdom
| | - James Davidson
- Vernalis (R&D) Limited, Granta Park, Great Abington, Cambridge CB21 6GB, United Kingdom
| | - Frank von Delft
- Centre for Medicines Discovery, University of Oxford, Oxford OX3 7DQ, United Kingdom
- Diamond Light Source, Didcot OX11 0DE, United Kingdom
- Research Complex at Harwell, Harwell Science and Innovation Campus, Didcot OX11 0FA, United Kingdom
- Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg 2006, South Africa
| | - Charlotte M Deane
- Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
| |
|
18
|
Seo S, Lim J, Kim WY. Molecular Generative Model via Retrosynthetically Prepared Chemical Building Block Assembly. Adv Sci (Weinh) 2023; 10:e2206674. [PMID: 36596675 PMCID: PMC10015872 DOI: 10.1002/advs.202206674] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/14/2022] [Indexed: 06/17/2023]
Abstract
Deep generative models are attracting attention as a smart molecular design strategy. However, previous models often render molecules with low synthesizability, hindering their real-world applications. Here, a novel graph-based conditional generative model is proposed that builds molecules in an auto-regressive fashion by assembling retrosynthetically prepared chemical building blocks until the target properties are achieved. This strategy improves the synthesizability and property control of the resulting molecules and also helps learn how to select appropriate building blocks and bind them together to achieve target properties. By applying a negative sampling method to the selection process of building blocks, this model overcame a critical limitation of previous fragment-based models, which can only use molecules from the training set during generation. As a result, the model works equally well with unseen building blocks without sacrificing computational efficiency. It is demonstrated that the model can generate potential inhibitors with high docking scores against the 3CL protease of SARS-CoV-2.
Affiliation(s)
- Seonghwan Seo
- HITS Incorporation, 124 Teheran-ro, Gangnam-gu, Seoul 06234, Republic of Korea
- Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
| | - Jaechang Lim
- HITS Incorporation, 124 Teheran-ro, Gangnam-gu, Seoul 06234, Republic of Korea
| | - Woo Youn Kim
- HITS Incorporation, 124 Teheran-ro, Gangnam-gu, Seoul 06234, Republic of Korea
- Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
- AI Institute, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
| |
|
19
|
Tingle B, Tang KG, Castanon M, Gutierrez JJ, Khurelbaatar M, Dandarchuluun C, Moroz YS, Irwin JJ. ZINC-22─A Free Multi-Billion-Scale Database of Tangible Compounds for Ligand Discovery. J Chem Inf Model 2023; 63:1166-1176. [PMID: 36790087 PMCID: PMC9976280 DOI: 10.1021/acs.jcim.2c01253] [Citation(s) in RCA: 37] [Impact Index Per Article: 37.0] [Received: 10/07/2022] [Indexed: 02/16/2023]
Abstract
Purchasable chemical space has grown rapidly into the tens of billions of molecules, providing unprecedented opportunities for ligand discovery but straining the tools that might exploit these molecules at scale. We have therefore developed ZINC-22, a database of commercially accessible small molecules derived from multi-billion-scale make-on-demand libraries. The new database and tools enable analog searching in this vast new space via a facile GUI, CartBlanche, drawing on similarity methods that scale sublinearly in the number of molecules. The new library also uses data organization methods, enabling rapid lookup of molecules and their physical properties, including conformations, partial atomic charges, cLogP values, and solvation energies, all crucial for molecule docking, which had become slow with older database organizations in previous versions of ZINC. As the libraries have continued to grow, we have been interested in finding whether molecular diversity has suffered, for instance, because certain scaffolds have come to dominate via easy analoging. This has not occurred thus far, and chemical diversity continues to grow with database size, with a log increase in Bemis-Murcko scaffolds for every two-log unit increase in database size. Most new scaffolds come from compounds with the highest heavy atom count. Finally, we consider the implications for databases like ZINC as the libraries grow toward and beyond the trillion-molecule range. ZINC is freely available to everyone and may be accessed at cartblanche22.docking.org, via Globus, and in the Amazon AWS and Oracle OCI clouds.
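The scaling statement in this abstract (one log unit more Bemis-Murcko scaffolds for every two log units of database growth) corresponds to a power law with exponent 1/2, that is, scaffold count growing roughly as the square root of database size. The sketch below works through that arithmetic; the function name and the example size ratios are illustrative assumptions, and only the exponent comes from the abstract.

```python
import math


def scaffold_growth_factor(size_ratio: float, exponent: float = 0.5) -> float:
    """Factor by which scaffold count grows when the database grows by `size_ratio`.

    The default exponent of 0.5 encodes "one log of scaffolds per two logs of size".
    """
    return size_ratio ** exponent


# A database 100 times larger (two log units) implies about 10 times more scaffolds.
factor_100x = scaffold_growth_factor(100.0)

# A database 10,000 times larger (four log units) implies about 100 times more scaffolds.
factor_10000x = scaffold_growth_factor(10_000.0)

# Recover the exponent from the two log changes as a consistency check.
implied_exponent = math.log10(factor_100x) / math.log10(100.0)
```

Read this way, diversity keeps growing with library size, but each new scaffold costs progressively more added compounds.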
Affiliation(s)
- Benjamin I. Tingle
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
| | - Khanh G. Tang
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
| | - Mar Castanon
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
| | - John J. Gutierrez
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
| | - Munkhzul Khurelbaatar
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
| | - Chinzorig Dandarchuluun
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
| | - Yurii S. Moroz
- Taras Shevchenko National University of Kyïv, 60 Volodymyrska Street, Kyïv 01601, Ukraine
- Chemspace LLC, 85 Chervonotkatska Street, Kyïv 02094, Ukraine
| | - John J. Irwin
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
| |
|
20
|
Liu X, Ye K, van Vlijmen HWT, IJzerman AP, van Westen GJP. DrugEx v3: scaffold-constrained drug design with graph transformer-based reinforcement learning. J Cheminform 2023; 15:24. [PMID: 36803659 PMCID: PMC9940339 DOI: 10.1186/s13321-023-00694-z] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 04/09/2022] [Accepted: 02/06/2023] [Indexed: 02/22/2023] Open
Abstract
Rational drug design often starts from specific scaffolds to which side chains/substituents are added or modified due to the large drug-like chemical space available to search for novel drug-like molecules. With the rapid growth of deep learning in drug discovery, a variety of effective approaches have been developed for de novo drug design. In previous work we proposed a method named DrugEx, which can be applied in polypharmacology based on multi-objective deep reinforcement learning. However, the previous version is trained under fixed objectives and does not allow users to input any prior information (i.e. a desired scaffold). In order to improve the general applicability, we updated DrugEx to design drug molecules based on scaffolds which consist of multiple fragments provided by users. Here, a Transformer model was employed to generate molecular structures. The Transformer is a multi-head self-attention deep learning model containing an encoder to receive scaffolds as input and a decoder to generate molecules as output. In order to deal with the graph representation of molecules a novel positional encoding for each atom and bond based on an adjacency matrix was proposed, extending the architecture of the Transformer. The graph Transformer model contains growing and connecting procedures for molecule generation starting from a given scaffold based on fragments. Moreover, the generator was trained under a reinforcement learning framework to increase the number of desired ligands. As a proof of concept, the method was applied to design ligands for the adenosine A2A receptor (A2AAR) and compared with SMILES-based methods. The results show that 100% of the generated molecules are valid and most of them had a high predicted affinity value towards A2AAR with given scaffolds.
Collapse
Affiliation(s)
- Xuhan Liu
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands
| | - Kai Ye
- School of Electrics and Information Engineering, Xi’an Jiaotong University, 28 Xianning W Rd, Xi’an, China
| | - Herman W. T. van Vlijmen
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands
- Janssen Pharmaceutica NV, Turnhoutseweg 30, B-2340 Beerse, Belgium
| | - Adriaan P. IJzerman
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands
| | - Gerard J. P. van Westen
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands
| |
|
21
|
Talat A, Khan AU. Artificial intelligence as a smart approach to develop antimicrobial drug molecules: A paradigm to combat drug-resistant infections. Drug Discov Today 2023; 28:103491. [PMID: 36646245 DOI: 10.1016/j.drudis.2023.103491] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 08/31/2022] [Revised: 01/01/2023] [Accepted: 01/05/2023] [Indexed: 01/15/2023]
Abstract
Antimicrobial resistance (AMR) is a silent pandemic with the third highest global mortality. The antibiotic development pipeline is scarce even though AMR has escalated uncontrollably. Artificial intelligence (AI) is a revolutionary approach, accelerating drug discovery because of its fast pace, cost efficiency, lower labor requirements, and fewer chances of failure. AI has been used to discover several beta-lactamase inhibitors and antibiotic alternatives from antimicrobial peptides (AMPs), nonribosomal peptides, bacteriocins, and marine natural products. The significant recent increase in the use of AI platforms by pharmaceutical companies could result in the discovery of efficient antibiotic alternatives with lower chances of resistance generation.
Affiliation(s)
- Absar Talat
- Medical Microbiology and Molecular Biology Laboratory, Interdisciplinary Biotechnology Unit, Aligarh Muslim University, Aligarh, India
| | - Asad U Khan
- Medical Microbiology and Molecular Biology Laboratory, Interdisciplinary Biotechnology Unit, Aligarh Muslim University, Aligarh, India.
| |
|
22
|
Liao Z, Xie L, Mamitsuka H, Zhu S. Sc2Mol: a scaffold-based two-step molecule generator with variational autoencoder and transformer. Bioinformatics 2023; 39:btac814. [PMID: 36576008 PMCID: PMC9835482 DOI: 10.1093/bioinformatics/btac814] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Revised: 10/31/2022] [Accepted: 12/27/2022] [Indexed: 12/29/2022] Open
Abstract
MOTIVATION: Finding molecules with desired pharmaceutical properties is crucial in drug discovery. Generative models can be an efficient tool to find desired molecules through the distribution learned by the model to approximate given training data. Existing generative models (i) do not consider backbone structures (scaffolds), resulting in inefficiency or (ii) need prior patterns for scaffolds, causing bias. Scaffolds are reasonable to use, and it is imperative to design a generative model without any prior scaffold patterns. RESULTS: We propose a generative model-based molecule generator, Sc2Mol, without any prior scaffold patterns. Sc2Mol uses SMILES strings for molecules. It consists of two steps: scaffold generation and scaffold decoration, which are carried out by a variational autoencoder and a transformer, respectively. The two steps are powerful for implementing random molecule generation and scaffold optimization. Our empirical evaluation using drug-like molecule datasets confirmed the success of our model in distribution learning and molecule optimization. Also, our model could automatically learn the rules to transform coarse scaffolds into sophisticated drug candidates. These rules were consistent with those for current lead optimization. AVAILABILITY AND IMPLEMENTATION: The code is available at https://github.com/zhiruiliao/Sc2Mol. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zhirui Liao
- School of Computer Science, Fudan University, Shanghai 200433, China
| | - Lei Xie
- Department of Computer Science, Hunter College, The City University of New York, New York, NY 10065, USA
| | - Hiroshi Mamitsuka
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto Prefecture 611-0011, Japan
- Department of Computer Science, Aalto University, Espoo 00076, Finland
| | - Shanfeng Zhu
- Institute of Science and Technology for Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China
- Shanghai Qi Zhi Institute, Shanghai 200030, China
- Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Fudan University, Ministry of Education, Shanghai 200433, China
- Shanghai Key Lab of Intelligent Information Processing and Shanghai Institute of Artificial Intelligence Algorithm, Fudan University, Shanghai 200433, China
- Zhangjiang Fudan International Innovation Center, Shanghai 200433, China
- Institute of Artificial Intelligence Biomedicine, Nanjing University, Nanjing, Jiangsu 210031, China
| |
|
23
|
Noguchi S, Inoue J. Exploration of Chemical Space Guided by PixelCNN for Fragment-Based De Novo Drug Discovery. J Chem Inf Model 2022; 62:5988-6001. [PMID: 36454646 DOI: 10.1021/acs.jcim.2c01345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2022]
Abstract
We report a novel framework for achieving fragment-based molecular design using pixel convolutional neural network (PixelCNN) combined with the simplified molecular input line entry system (SMILES) as molecular representation. While a widely used recurrent neural network (RNN) assumes monotonically decaying correlations in strings, PixelCNN captures a periodicity among characters of SMILES. Thus, PixelCNN provides us with a novel solution for the analysis of chemical space by extracting the periodicity of molecular structures that will be buried in SMILES. Moreover, this characteristic enables us to generate molecules by combining several simple building blocks, such as a benzene ring and side-chain structures, which contributes to the effective exploration of chemical space by step-by-step searching for molecules from a target fragment. In conclusion, PixelCNN could be a powerful approach focusing on the periodicity of molecules to explore chemical space for the fragment-based molecular design.
Affiliation(s)
- Satoshi Noguchi
  - Department of Advanced Interdisciplinary Studies, The University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo 153-8904, Japan
- Junya Inoue
  - Institute for Industrial Science, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-0082, Japan
  - Department of Materials Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo, Tokyo 113-8656, Japan
  - Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo 153-8904, Japan
24
Gider V, Budak C. Instruction of molecular structure similarity and scaffolds of drugs under investigation in ebola virus treatment by atom-pair and graph network: A combination of favipiravir and molnupiravir. Comput Biol Chem 2022; 101:107778. [DOI: 10.1016/j.compbiolchem.2022.107778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2022] [Revised: 10/06/2022] [Accepted: 10/07/2022] [Indexed: 11/26/2022]
25
Xu T, Wang M, Liu X, Feng D, Zhu Y, Fan Z, Rao S, Lu J. A Scaffold-based Deep Generative Model Considering Molecular Stereochemical Information. Mol Inform 2022; 41:e2200088. [PMID: 36031563 DOI: 10.1002/minf.202200088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/13/2022]
Abstract
Designing molecules with specific scaffolds can facilitate the discovery and optimization of lead compounds. Some scaffold-based molecular generation models have been developed using deep-learning methods based on specific scaffolds, although incorporating scaffold generalization is expected to achieve scaffold hopping. Moreover, most of the existing models focus on the 2D shape of the scaffold and overlook the stereochemical properties of the compound, especially for natural products. In this study, we optimized the scaffold-based molecular generation model designed by Lim et al. (Chemical Science 2020, 11, 1153-1164). Real-time ultrafast shape recognition with pharmacophore constraints (USRCAT) was introduced into the model to search for molecules similar to the 3D conformation and pharmacophore of the input scaffold sourced from the training set; the searched molecules were then used as new scaffolds to execute scaffold hopping. The optimized model could generate new molecules with the same chirality as the input scaffold. Furthermore, the probability distribution of the molecular structure and various physicochemical properties were analyzed to evaluate the model's generation capability. We thus believe that the optimized model can provide a basis for medicinal chemists to explore a wider chemical space toward optimization of the lead compounds and to screen the virtual compound library.
Affiliation(s)
- Tianxu Xu
  - Key Laboratory of Molecular Pharmacology and Drug Evaluation, Ministry of Education, Collaborative Innovation Center of Advanced Drug Delivery System and Biotech Drugs in Universities of Shandong, School of Pharmacy, Yantai University, No. 30, Qingquan Road, Laishan District, Yantai, 264005, China
- Minjun Wang
  - Key Laboratory of Molecular Pharmacology and Drug Evaluation, Ministry of Education, Collaborative Innovation Center of Advanced Drug Delivery System and Biotech Drugs in Universities of Shandong, School of Pharmacy, Yantai University, No. 30, Qingquan Road, Laishan District, Yantai, 264005, China
- Xiaoqian Liu
  - Key Laboratory of Molecular Pharmacology and Drug Evaluation, Ministry of Education, Collaborative Innovation Center of Advanced Drug Delivery System and Biotech Drugs in Universities of Shandong, School of Pharmacy, Yantai University, No. 30, Qingquan Road, Laishan District, Yantai, 264005, China
- Dawei Feng
  - Key Laboratory of Molecular Pharmacology and Drug Evaluation, Ministry of Education, Collaborative Innovation Center of Advanced Drug Delivery System and Biotech Drugs in Universities of Shandong, School of Pharmacy, Yantai University, No. 30, Qingquan Road, Laishan District, Yantai, 264005, China
- Yanjuan Zhu
  - Key Laboratory of Molecular Pharmacology and Drug Evaluation, Ministry of Education, Collaborative Innovation Center of Advanced Drug Delivery System and Biotech Drugs in Universities of Shandong, School of Pharmacy, Yantai University, No. 30, Qingquan Road, Laishan District, Yantai, 264005, China
- Zhe Fan
  - Key Laboratory of Molecular Pharmacology and Drug Evaluation, Ministry of Education, Collaborative Innovation Center of Advanced Drug Delivery System and Biotech Drugs in Universities of Shandong, School of Pharmacy, Yantai University, No. 30, Qingquan Road, Laishan District, Yantai, 264005, China
- Shurong Rao
  - Key Laboratory of Molecular Pharmacology and Drug Evaluation, Ministry of Education, Collaborative Innovation Center of Advanced Drug Delivery System and Biotech Drugs in Universities of Shandong, School of Pharmacy, Yantai University, No. 30, Qingquan Road, Laishan District, Yantai, 264005, China
- Jing Lu
  - Key Laboratory of Molecular Pharmacology and Drug Evaluation, Ministry of Education, Collaborative Innovation Center of Advanced Drug Delivery System and Biotech Drugs in Universities of Shandong, School of Pharmacy, Yantai University, No. 30, Qingquan Road, Laishan District, Yantai, 264005, China
26
Hormazabal RS, Kang JW, Park K, Yang DR. Not from Scratch: Predicting Thermophysical Properties through Model-Based Transfer Learning Using Graph Convolutional Networks. J Chem Inf Model 2022; 62:5411-5424. [PMID: 36315416 DOI: 10.1021/acs.jcim.2c00846] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
In this study, a framework for the prediction of thermophysical properties based on transfer learning from existing estimation models is explored. The predictive capabilities of conventional group-contribution methods and traditional machine-learning approaches rely heavily on the availability of experimental datasets and their uncertainty. Through the use of a pretraining scheme, which leverages the knowledge established by other estimation methods, improved prediction models for thermophysical properties can be obtained after fine-tuning networks with more accurate experimental data. As our experiments show, for the case of critical properties of compounds, this pipeline not only improves the performance of the models on commonly found organic structures but can also help these models generalize to less explored areas of chemical space, where experimental data is scarce, such as inorganics and heavier organic compounds. Transfer learning from estimation models data also allows for graph-based deep learning models to create more flexible molecular features over a bigger chemical space, which leads to improved predictive capabilities and can give insights into the relationship between molecular structures and thermophysical properties. The generated molecular features can discriminate behavior discrepancy between isomers without the need of additional parameters. Also, this approach shows better robustness to outliers in experimental datasets.
Affiliation(s)
- Rodrigo S Hormazabal
  - Department of Chemical and Biological Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul 02841, Republic of Korea
- Jeong Won Kang
  - Department of Chemical and Biological Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul 02841, Republic of Korea
- Kiho Park
  - School of Chemical Engineering, Chonnam National University, 77 Yongbong-ro, Buk-gu, Gwangju 61186, Republic of Korea
- Dae Ryook Yang
  - Department of Chemical and Biological Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul 02841, Republic of Korea
27
Zhang Y, Luo M, Wu P, Wu S, Lee TY, Bai C. Application of Computational Biology and Artificial Intelligence in Drug Design. Int J Mol Sci 2022; 23:13568. [PMID: 36362355 PMCID: PMC9658956 DOI: 10.3390/ijms232113568] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/08/2022] [Revised: 10/29/2022] [Accepted: 11/03/2022] [Indexed: 08/24/2023] Open
Abstract
Traditional drug design requires a great amount of research time and developmental expense. Booming computational approaches, including computational biology, computer-aided drug design, and artificial intelligence, have the potential to improve the efficiency of drug discovery by minimizing the time and financial cost. In recent years, computational approaches have been widely used to improve the efficacy and effectiveness of the drug discovery pipeline, leading to the approval of many new drugs for marketing. The present review emphasizes the applications of these indispensable computational approaches in aiding target identification, lead discovery, and lead optimization. Some challenges of using these approaches for drug design are also discussed. Moreover, we propose a methodology for integrating various computational techniques into new drug discovery and design.
Affiliation(s)
- Yue Zhang
  - School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
  - School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
  - Warshel Institute for Computational Biology, Shenzhen 518172, China
- Mengqi Luo
  - School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
  - South China Hospital, Health Science Center, Shenzhen University, Shenzhen 518116, China
- Peng Wu
  - School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen 518055, China
- Song Wu
  - South China Hospital, Health Science Center, Shenzhen University, Shenzhen 518116, China
- Tzong-Yi Lee
  - School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
  - Warshel Institute for Computational Biology, Shenzhen 518172, China
- Chen Bai
  - School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
  - Warshel Institute for Computational Biology, Shenzhen 518172, China
28
A pocket-based 3D molecule generative model fueled by experimental electron density. Sci Rep 2022; 12:15100. [PMID: 36068257 PMCID: PMC9448726 DOI: 10.1038/s41598-022-19363-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/17/2022] [Accepted: 08/29/2022] [Indexed: 11/08/2022] Open
Abstract
We report for the first time the use of experimental electron density (ED) as training data for the generation of drug-like three-dimensional molecules based on the structure of a target protein pocket. Similar to a structural biologist building molecules based on their ED, our model functions with two main components: a generative adversarial network (GAN) to generate the ligand ED in the input pocket and an ED interpretation module for molecule generation. The model was tested on three targets: a kinase (hematopoietic progenitor kinase 1), a protease (SARS-CoV-2 main protease), and a nuclear receptor (vitamin D receptor), and evaluated with a reference dataset composed of over 8000 compounds that have their activities reported in the literature. The evaluation considered the chemical validity, chemical space distribution-based diversity, and similarity with reference active compounds concerning the molecular structure and pocket-binding mode. Our model can generate molecules with similar structures to classical active compounds and novel compounds sharing similar binding modes with active compounds, making it a promising tool for library generation supporting high-throughput virtual screening. The ligand ED generated can also be used to support fragment-based drug design. Our model is available as an online service to academic users via https://edmg.stonewise.cn/#/create.
29
Liu Z, Du J, Lin Z, Li Z, Liu B, Cui Z, Fang J, Xie L. DenovoProfiling: A webserver for de novo generated molecule library profiling. Comput Struct Biotechnol J 2022; 20:4082-4097. [PMID: 36016718 PMCID: PMC9379519 DOI: 10.1016/j.csbj.2022.07.045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2022] [Revised: 07/25/2022] [Accepted: 07/25/2022] [Indexed: 01/10/2023] Open
Abstract
Various deep learning-based architectures for molecular generation have been proposed for de novo drug design. The flourishing of de novo molecular generation methods and applications has created great demand for visualization and functional profiling of de novo generated molecules. The increasing number of publicly available chemogenomic databases lays a good foundation and creates good opportunities for comprehensive profiling of the de novo library. In this paper, we present DenovoProfiling, a webserver dedicated to de novo library visualization and functional profiling. Currently, DenovoProfiling contains six modules: (1) identification & visualization module for visualizing chemical structures and identifying reported structures, (2) chemical space module for chemical space exploration using similarity maps, principal components analysis (PCA), drug-like properties distribution, and scaffold-based clustering, (3) ADMET prediction module for predicting the ADMET properties of the de novo molecules, (4) molecular alignment module for three-dimensional molecular shape analysis, (5) drugs mapping module for identifying structurally similar drugs, and (6) target & pathway module for identifying the reported targets and corresponding functional pathways. DenovoProfiling could provide structural identification, chemical space exploration, drug mapping, and target & pathway information. The comprehensively annotated information could give users a clear picture of their de novo library and could guide the further selection of candidates for chemical synthesis and biological confirmation. DenovoProfiling is freely available at http://denovoprofiling.xielab.net.
Key Words
- DDR1, Discoidin domain receptor 1
- De novo drug design
- De novo molecule library
- Deep learning
- FBDD, Fragment-based drug design
- FDR, False discovery rate
- GAN, Generative adversarial networks
- HTS, High throughput screening
- LSTM, Long short-term memory
- Library profiling
- PCA, Principal components analysis
- RNN, Recurrent neural networks
- SCA, Scaffold-based classification approach
- VAE, Variational autoencoders
Affiliation(s)
- Zhihong Liu
  - School of Public Health, Xinxiang Medical University, Xinxiang, China
  - Guangdong Provincial Key Laboratory of Microbial Culture Collection and Application, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China
- Jiewen Du
  - Beijing Jingpai Technology Co., Ltd., 1500-1, Hailong Building Z-Park, Beijing 100090, China
- Ziying Lin
  - Guangdong Provincial Key Laboratory of Microbial Culture Collection and Application, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China
- Ze Li
  - School of Public Health, Xinxiang Medical University, Xinxiang, China
- Bingdong Liu
  - Guangdong Provincial Key Laboratory of Microbial Culture Collection and Application, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China
- Zongbin Cui
  - Guangdong Provincial Key Laboratory of Microbial Culture Collection and Application, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China
- Jiansong Fang
  - Science and Technology Innovation Center, Guangzhou University of Chinese Medicine, Guangzhou, China
  - Corresponding authors at: School of Public Health, Xinxiang Medical University, Xinxiang, China (L. Xie). Science and Technology Innovation Center, Guangzhou University of Chinese Medicine, Guangzhou, China (J. Fang).
- Liwei Xie
  - School of Public Health, Xinxiang Medical University, Xinxiang, China
  - Guangdong Provincial Key Laboratory of Microbial Culture Collection and Application, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China
  - Zhujiang Hospital, Southern Medical University, Guangzhou, China
  - Corresponding authors at: School of Public Health, Xinxiang Medical University, Xinxiang, China (L. Xie). Science and Technology Innovation Center, Guangzhou University of Chinese Medicine, Guangzhou, China (J. Fang).
30
Zhang H, Saravanan KM, Yang Y, Wei Y, Yi P, Zhang JZH. Generating and screening de novo compounds against given targets using ultrafast deep learning models as core components. Brief Bioinform 2022; 23:6611918. [PMID: 35724626 DOI: 10.1093/bib/bbac226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2022] [Revised: 04/27/2022] [Accepted: 05/14/2022] [Indexed: 11/13/2022] Open
Abstract
Deep learning is an artificial intelligence technique in which models express geometric transformations over multiple levels. This method has shown great promise in various fields, including drug development. The availability of public structure databases prompted the researchers to use generative artificial intelligence models to narrow down their search of the chemical space, a novel approach to chemogenomics and de novo drug development. In this study, we developed a strategy that combined an accelerated LSTM_Chem (long short-term memory for de novo compounds generation), dense fully convolutional neural network (DFCNN), and docking to generate a large number of de novo small molecular chemical compounds for given targets. To demonstrate its efficacy and applicability, six important targets that account for various human disorders were used as test examples. Moreover, using the M protease as a proof-of-concept example, we find that iteratively training with previously selected candidates can significantly increase the chance of obtaining novel compounds with higher and higher predicted binding affinities. In addition, we also check the potential benefit of obtaining reliable final de novo compounds with the help of MD simulation and metadynamics simulation. The generation of de novo compounds and the discovery of binders against various targets proposed here would be a practical and effective approach. Assessing the efficacy of these top de novo compounds with biochemical studies is promising to promote related drug development.
Affiliation(s)
- Haiping Zhang
  - Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
- Konda Mani Saravanan
  - Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai, 600073, Tamil Nadu, India
- Yang Yang
  - Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for infectious disease, State Key Discipline of Infectious Disease, Shenzhen Third People's Hospital, Second Hospital Affiliated to Southern University of Science and Technology, Shenzhen, China
- Yanjie Wei
  - Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, PR China 518055
- Pan Yi
  - Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, PR China 518055
- John Z H Zhang
  - Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
31
Hadfield TE, Imrie F, Merritt A, Birchall K, Deane CM. Incorporating Target-Specific Pharmacophoric Information into Deep Generative Models for Fragment Elaboration. J Chem Inf Model 2022; 62:2280-2292. [PMID: 35499971 PMCID: PMC9131447 DOI: 10.1021/acs.jcim.1c01311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
Abstract
Despite recent interest in deep generative models for scaffold elaboration, their applicability to fragment-to-lead campaigns has so far been limited. This is primarily due to their inability to account for local protein structure or a user's design hypothesis. We propose a novel method for fragment elaboration, STRIFE, that overcomes these issues. STRIFE takes as input fragment hotspot maps (FHMs) extracted from a protein target and processes them to provide meaningful and interpretable structural information to its generative model, which in turn is able to rapidly generate elaborations with complementary pharmacophores to the protein. In a large-scale evaluation, STRIFE outperforms existing, structure-unaware, fragment elaboration methods in proposing highly ligand-efficient elaborations. In addition to automatically extracting pharmacophoric information from a protein target's FHM, STRIFE optionally allows the user to specify their own design hypotheses.
Affiliation(s)
- Thomas E Hadfield
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
- Fergus Imrie
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
- Andy Merritt
  - LifeArc, SBC Open Innovation Campus, Stevenage SG1 2FX, United Kingdom
- Kristian Birchall
  - LifeArc, SBC Open Innovation Campus, Stevenage SG1 2FX, United Kingdom
- Charlotte M Deane
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
32
Nag S, Baidya ATK, Mandal A, Mathew AT, Das B, Devi B, Kumar R. Deep learning tools for advancing drug discovery and development. 3 Biotech 2022; 12:110. [PMID: 35433167 PMCID: PMC8994527 DOI: 10.1007/s13205-022-03165-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 11/06/2021] [Accepted: 03/18/2022] [Indexed: 12/26/2022] Open
Abstract
A few decades ago, drug discovery and development were limited to a bunch of medicinal chemists working in a lab with an enormous amount of testing, validations, and synthetic procedures, all contributing to considerable investments in time and wealth to get one drug out into the clinics. The advancements in computational techniques combined with a boom in multi-omics data led to the development of various bioinformatics/pharmacoinformatics/cheminformatics tools that have helped speed up the drug development process. But with the advent of artificial intelligence (AI), machine learning (ML) and deep learning (DL), the conventional drug discovery process has been further rationalized. Extensive biological data in the form of big data present in various databases across the globe acts as the raw materials for the ML/DL-based approaches and helps in accurate identifications of patterns and models which can be used to identify therapeutically active molecules with much fewer investments on time, workforce and wealth. In this review, we have begun by introducing the general concepts in the drug discovery pipeline, followed by an outline of the fields in the drug discovery process where ML/DL can be utilized. We have also introduced ML and DL along with their applications, various learning methods, and training models used to develop the ML/DL-based algorithms. Furthermore, we have summarized various DL-based tools existing in the public domain with their application in the drug discovery paradigm which includes DL tools for identification of drug targets and drug-target interaction such as DeepCPI, DeepDTA, WideDTA, PADME, DeepAffinity, and DeepPocket. Additionally, we have discussed various DL-based models used in protein structure prediction, de novo design of new chemical scaffolds, virtual screening of chemical libraries for hit identification, absorption, distribution, metabolism, excretion, and toxicity (ADMET) prediction, metabolite prediction, clinical trial design, and oral bioavailability prediction. In the end, we have tried to shed light on some of the successful ML/DL-based models used in the drug discovery and development pipeline while also discussing the current challenges and prospects of the application of DL tools in drug discovery and development. We believe that this review will be useful for medicinal and computational chemists searching for DL tools for use in their drug discovery projects.
Affiliation(s)
- Sagorika Nag
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Anurag T. K. Baidya
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Abhimanyu Mandal
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Alen T. Mathew
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Bhanuranjan Das
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Bharti Devi
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Rajnish Kumar
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
33
Martinelli DD. Generative machine learning for de novo drug discovery: A systematic review. Comput Biol Med 2022; 145:105403. [PMID: 35339849 DOI: 10.1016/j.compbiomed.2022.105403] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 01/14/2022] [Revised: 03/10/2022] [Accepted: 03/11/2022] [Indexed: 02/08/2023]
Abstract
Recent research on artificial intelligence indicates that machine learning algorithms can auto-generate novel drug-like molecules. Generative models have revolutionized de novo drug discovery, rendering the explorative process more efficient. Several model frameworks and input formats have been proposed to enhance the performance of intelligent algorithms in generative molecular design. In this systematic literature review of experimental articles and reviews over the last five years, machine learning models, challenges associated with computational molecule design along with proposed solutions, and molecular encoding methods are discussed. A query-based search of the PubMed, ScienceDirect, Springer, Wiley Online Library, arXiv, MDPI, bioRxiv, and IEEE Xplore databases yielded 87 studies. Twelve additional studies were identified via citation searching. Of the articles in which machine learning was implemented, six prominent algorithms were identified: long short-term memory recurrent neural networks (LSTM-RNNs), variational autoencoders (VAEs), generative adversarial networks (GANs), adversarial autoencoders (AAEs), evolutionary algorithms, and gated recurrent unit (GRU-RNNs). Furthermore, eight central challenges were designated: homogeneity of generated molecular libraries, deficient synthesizability, limited assay data, model interpretability, incapacity for multi-property optimization, incomparability, restricted molecule size, and uncertainty in model evaluation. Molecules were encoded either as strings, which were occasionally augmented using randomization, as 2D graphs, or as 3D graphs. Statistical analysis and visualization are performed to illustrate how approaches to machine learning in de novo drug design have evolved over the past five years. Finally, future opportunities and reservations are discussed.
34
Bilodeau C, Jin W, Jaakkola T, Barzilay R, Jensen KF. Generative models for molecular discovery: Recent advances and challenges. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1608] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/15/2022]
Affiliation(s)
- Camille Bilodeau
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Wengong Jin
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Tommi Jaakkola
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Regina Barzilay
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Klavs F. Jensen
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
| |
Collapse
|
35
|
Polanski J. Unsupervised Learning in Drug Design from Self-Organization to Deep Chemistry. Int J Mol Sci 2022; 23:2797. [PMID: 35269939 PMCID: PMC8910896 DOI: 10.3390/ijms23052797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2022] [Revised: 02/27/2022] [Accepted: 02/27/2022] [Indexed: 12/10/2022]
Abstract
The availability of computers has brought novel prospects in drug design. Neural networks (NN) were an early tool that cheminformatics tested for converting data into drugs. However, the initial interest faded for almost two decades. The recent success of Deep Learning (DL) has inspired a renaissance of neural networks for their potential application in deep chemistry. DL targets direct data analysis without any human intervention. Although back-propagation NN is the main algorithm in the DL that is currently being used, unsupervised learning can be even more efficient. We review self-organizing maps (SOM) in mapping molecular representations from the 1990s to the current deep chemistry. We discovered the enormous efficiency of SOM not only for features that could be expected by humans, but also for those that are not trivial to human chemists. We reviewed the DL projects in the current literature, especially unsupervised architectures. DL appears to be efficient in pattern recognition (Deep Face) or chess (Deep Blue). However, an efficient deep chemistry is still a matter for the future. This is because the availability of measured property data in chemistry is still limited.
Affiliation(s)
- Jaroslaw Polanski
- Institute of Chemistry, Faculty of Science and Technology, University of Silesia, Szkolna 9, 40-006 Katowice, Poland
36
Kaitoh K, Yamanishi Y. Scaffold-Retained Structure Generator to Exhaustively Create Molecules in an Arbitrary Chemical Space. J Chem Inf Model 2022; 62:2212-2225. [DOI: 10.1021/acs.jcim.1c01130] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/23/2022]
Affiliation(s)
- Kazuma Kaitoh
- Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
- Yoshihiro Yamanishi
- Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
37
Hadfield TE, Deane CM. AI in 3D compound design. Curr Opin Struct Biol 2022; 73:102326. [PMID: 35101671 DOI: 10.1016/j.sbi.2021.102326] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/14/2021] [Revised: 11/22/2021] [Accepted: 12/13/2021] [Indexed: 11/18/2022]
Abstract
The success of Artificial Intelligence (AI) across a wide range of domains has fuelled significant interest in its application to designing novel compounds and screening compounds against a specific target. However, many existing AI methods either do not account for the 3D structure of the target at all or struggle to capture meaningful spatial information from the target. In this Opinion, we highlight a range of recent structure-aware approaches which utilise deep learning for compound design and virtual screening. We discuss how such methods can be better integrated into existing drug discovery pipelines by facilitating the design of compounds which conform to a specified design hypothesis and by uncovering key protein-ligand interactions which can be used to aid molecule design.
Affiliation(s)
- Thomas E Hadfield
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
- Charlotte M Deane
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
38
Wang J, Zhang Y, Nie W, Luo Y, Deng L. Computational anti-COVID-19 drug design: progress and challenges. Brief Bioinform 2022; 23:bbab484. [PMID: 34850817 PMCID: PMC8690229 DOI: 10.1093/bib/bbab484] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/02/2021] [Revised: 10/21/2021] [Accepted: 10/25/2021] [Indexed: 12/14/2022]
Abstract
Vaccines have made gratifying progress in preventing the 2019 coronavirus disease (COVID-19) pandemic. However, the emergence of variants, especially the latest delta variant, has brought considerable challenges to human health. Hence, the development of robust therapeutic approaches, such as anti-COVID-19 drug design, could aid in managing the pandemic more efficiently. Some drug design strategies have been successfully applied during the COVID-19 pandemic to create and validate related lead drugs. The computational drug design methods used for COVID-19 can be roughly divided into (i) structure-based approaches and (ii) artificial intelligence (AI)-based approaches. Structure-based approaches investigate different molecular fragments and functional groups through lead drugs and apply relevant tools to produce antiviral drugs. AI-based approaches usually use end-to-end learning to explore a larger biochemical space to design antiviral drugs. This review provides an overview of the two design strategies of anti-COVID-19 drugs, the advantages and disadvantages of these strategies and discussions of future developments.
Affiliation(s)
- Jinxian Wang
- School of Computer Science and Engineering, Central South University, 410075, Changsha, China
- Ying Zhang
- Department of Pharmacy, Heilongjiang Province Land Reclamation Headquarters General Hospital, 150001, Harbin, China
- Wenjuan Nie
- School of Computer Science and Engineering, Central South University, 410075, Changsha, China
- Yi Luo
- School of Science, The University of Auckland, Auckland 1010, Auckland, New Zealand
- Lei Deng
- School of Computer Science and Engineering, Central South University, 410075, Changsha, China
39
Abstract
Machine learning (ML) has revolutionised the field of structure-based drug design (SBDD) in recent years. During the training stage, ML techniques typically analyse large amounts of experimentally determined data to create predictive models that inform the drug discovery process. Deep learning (DL) is a subfield of ML that relies on multiple layers of a neural network to extract significantly more complex patterns from experimental data, and it has recently become a popular choice in SBDD. This review provides a thorough summary of recent DL trends in SBDD, with a particular focus on de novo drug design, binding site prediction, and binding affinity prediction of small molecules.
40
Abstract
Artificial intelligence (AI) tools find increasing application in drug discovery, supporting every stage of the Design-Make-Test-Analyse (DMTA) cycle. The main focus of this chapter is their application in molecular generation with the aid of deep neural networks (DNNs). We present a historical overview of the main advances in the field. We analyze the concepts of distribution and goal-directed learning and then highlight some of the recent applications of generative models in drug design, with a focus on research work from the biopharmaceutical industry. We present in more detail REINVENT, an open-source software package developed within our group at AstraZeneca that serves as the main platform for AI molecular design support in a number of the company's medicinal chemistry projects, and we also demonstrate some of our work in library design. Finally, we present some of the main challenges in the application of AI in drug discovery and different approaches to responding to these challenges, which define areas for current and future work.
41
Synthesis and Evaluation of the Antibacterial and Antioxidant Activities of Some Novel Chloroquinoline Analogs. J CHEM-NY 2021. [DOI: 10.1155/2021/2408006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
The quinoline heterocycle is a useful scaffold for developing bioactive molecules used as anticancer, antimalarial, and antimicrobial agents. Inspired by their numerous biological activities, an attempt was made to synthesize a series of novel 7-chloroquinoline derivatives, including 2,7-dichloroquinoline-3-carbonitrile (5), 2,7-dichloroquinoline-3-carboxamide (6), 7-chloro-2-methoxyquinoline-3-carbaldehyde (7), 7-chloro-2-ethoxyquinoline-3-carbaldehyde (8), and 2-chloroquinoline-3-carbonitrile (12) by application of the Vilsmeier–Haack reaction and aromatic nucleophilic substitution of 2,7-dichloroquinoline-3-carbaldehyde. The carbaldehyde functional group was transformed into a nitrile using POCl3 and NaN3, which was subsequently converted to the amide using CH3CO2H and H2SO4. The synthesized compounds were screened for antibacterial activity against Staphylococcus aureus, Escherichia coli, Pseudomonas aeruginosa, and Streptococcus pyogenes. Compounds 6 and 8 showed good activity against E. coli with inhibition zones of 11.00 ± 0.04 and 12.00 ± 0.00 mm, respectively. Compound 5 had good activity against S. aureus and P. aeruginosa with an inhibition zone of 11.00 ± 0.03 mm relative to standard amoxicillin (18 ± 0.00 mm). Compound 7 displayed good activity against S. pyogenes with an inhibition zone of 11.00 ± 0.02 mm. The radical scavenging activity of these compounds was evaluated using 1,1-diphenyl-2-picrylhydrazyl (DPPH), and compounds 5 and 6 displayed the strongest antioxidant activity with IC50 values of 2.17 and 0.31 µg/mL, respectively, relative to ascorbic acid (2.41 µg/mL). A molecular docking study of the synthesized compounds was conducted to investigate their binding pattern with topoisomerase IIβ and E. coli DNA gyrase B. Compounds 6 (−6.4 kcal/mol) and 8 (−6.6 kcal/mol) exhibited better binding affinity in the in silico molecular docking against E. coli DNA gyrase. The synthesized compounds were also found to have minimum binding energies ranging from −6.9 to −7.3 kcal/mol against topoisomerase IIβ. The SwissADME predictions showed that the synthesized compounds 5–8 and 12 satisfy Lipinski’s rule of five with zero violations. The ProTox-II organ toxicity predictions revealed that all the synthesized compounds were inactive in hepatotoxicity, immunotoxicity, mutagenicity, and cytotoxicity. The findings of the in vitro antibacterial and molecular docking analyses suggest that compound 8 might be considered a hit compound for further analysis as an antibacterial and anticancer drug. The radical scavenging activity displayed by compounds 5 and 6 suggests that these compounds could serve as radical scavengers.
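For readers unfamiliar with the Lipinski check mentioned in this abstract, the following is a minimal sketch of how rule-of-five violations can be counted once the four descriptors are known. It is not the SwissADME implementation; the `Descriptors` class, the `lipinski_violations` function, and the example descriptor values are hypothetical and are not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    mol_weight: float       # molecular weight in Da
    log_p: float            # octanol-water partition coefficient (log P)
    h_bond_donors: int      # number of hydrogen-bond donors
    h_bond_acceptors: int   # number of hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> list[str]:
    """Return the list of Lipinski rule-of-five criteria that are violated."""
    violations = []
    if d.mol_weight > 500:
        violations.append("molecular weight > 500 Da")
    if d.log_p > 5:
        violations.append("log P > 5")
    if d.h_bond_donors > 5:
        violations.append("more than 5 hydrogen-bond donors")
    if d.h_bond_acceptors > 10:
        violations.append("more than 10 hydrogen-bond acceptors")
    return violations


# Illustrative values only; these are assumptions, not measured or reported data.
small_molecule = Descriptors(mol_weight=240.0, log_p=2.5,
                             h_bond_donors=1, h_bond_acceptors=3)
large_molecule = Descriptors(mol_weight=650.0, log_p=6.2,
                             h_bond_donors=6, h_bond_acceptors=12)

print(len(lipinski_violations(small_molecule)))  # 0 violations
print(len(lipinski_violations(large_molecule)))  # 4 violations
```

In practice the descriptors would come from a cheminformatics tool rather than being typed in by hand; the sketch only illustrates what "zero violations" means.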
42
Zhang J, Zhang J, Liu Q, Fan XX, Leung ELH, Yao XJ, Liu L. Resistance looms for KRAS G12C inhibitors and rational tackling strategies. Pharmacol Ther 2021; 229:108050. [PMID: 34864132 DOI: 10.1016/j.pharmthera.2021.108050] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 10/07/2021] [Revised: 11/28/2021] [Accepted: 11/29/2021] [Indexed: 12/13/2022]
Abstract
KRAS mutations are among the most frequent activating alterations in carcinoma. Recent efforts have produced a revolutionary class of KRAS G12C inhibitors that exhibit conspicuous clinical responses across multiple tumor types, providing new impetus for drug development and culminating in sotorasib, which achieves an approximately 6-month median progression-free survival in KRAS G12C-driven lung cancer. However, diverse genomic and histological mechanisms conferring resistance to KRAS G12C inhibitors may limit their clinical efficacy. Herein, we first briefly discuss the emerging resistance to KRAS G12C inhibitors, focusing on their clinical trials. We then comprehensively interrogate and underscore our current understanding of resistance mechanisms and the necessity of incorporating genomic analyses into clinical investigation to further decipher these mechanisms. Finally, we highlight the future role of novel treatment strategies, especially the rational identification of targeted combinatorial approaches for tackling drug resistance, and propose our views on applying robust biomarkers to precisely guide combination medication regimens.
Affiliation(s)
- Junmin Zhang
- State Key Laboratory of Quality Research in Chinese Medicine, Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, Macau University of Science and Technology, Macau (SAR), China; School of Pharmacy, State Key Laboratory of Applied Organic Chemistry, College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou 730000, China
- Juanhong Zhang
- State Key Laboratory of Quality Research in Chinese Medicine, Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, Macau University of Science and Technology, Macau (SAR), China; School of Pharmacy, State Key Laboratory of Applied Organic Chemistry, College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou 730000, China; College of Life Science, Northwest Normal University, Lanzhou 730070, China
- Qing Liu
- State Key Laboratory of Quality Research in Chinese Medicine, Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, Macau University of Science and Technology, Macau (SAR), China
- Xing-Xing Fan
- State Key Laboratory of Quality Research in Chinese Medicine, Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, Macau University of Science and Technology, Macau (SAR), China
- Elaine Lai-Han Leung
- State Key Laboratory of Quality Research in Chinese Medicine, Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, Macau University of Science and Technology, Macau (SAR), China
- Xiao-Jun Yao
- State Key Laboratory of Quality Research in Chinese Medicine, Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, Macau University of Science and Technology, Macau (SAR), China
- Liang Liu
- State Key Laboratory of Quality Research in Chinese Medicine, Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, Macau University of Science and Technology, Macau (SAR), China
43
Zheng S, Lei Z, Ai H, Chen H, Deng D, Yang Y. Deep scaffold hopping with multimodal transformer neural networks. J Cheminform 2021; 13:87. [PMID: 34774103 PMCID: PMC8590293 DOI: 10.1186/s13321-021-00565-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 06/07/2021] [Accepted: 10/31/2021] [Indexed: 11/10/2022]
Abstract
Scaffold hopping is a central task of modern medicinal chemistry for rational drug design, which aims to design molecules with novel scaffolds that share similar target biological activities with known hit molecules. Traditionally, scaffold hopping depends on searching databases of available compounds, which cannot exploit the vast chemical space. In this study, we have re-formulated this task as a supervised molecule-to-molecule translation to generate hopped molecules novel in 2D structure but similar in 3D structure, inspired by the fact that candidate compounds bind with their targets through 3D conformations. To efficiently train the model, we curated over 50 thousand pairs of molecules with increased bioactivity, similar 3D structure, but different 2D structure from a public bioactivity database, spanning 40 kinases commonly investigated by medicinal chemists. Moreover, we have designed a multimodal molecular transformer architecture by integrating molecular 3D conformer information through a spatial graph neural network and protein sequence information through a Transformer. The trained DeepHop model was shown to generate around 70% of molecules with improved bioactivity together with high 3D similarity but low 2D scaffold similarity to the template molecules. This ratio was 1.9 times higher than that of other state-of-the-art deep learning methods and of rule- and virtual screening-based methods. Furthermore, we demonstrated that the model could generalize to new target proteins through fine-tuning with a small set of active compounds. Case studies have also shown the advantages and usefulness of DeepHop in practical scaffold hopping scenarios.
Affiliation(s)
- Shuangjia Zheng
- School of Data and Computer Science, Sun Yat-Sen University, China, 132 East Circle at University City, Guangzhou, 510006, China
- Zengrong Lei
- Fermion Technology Co., Ltd, 1088 Newport East Road, Guangzhou, 510335, China
- Haitao Ai
- Fermion Technology Co., Ltd, 1088 Newport East Road, Guangzhou, 510335, China
- Hongming Chen
- Centre of Chemistry and Chemical Biology, Guangzhou Regenerative Medicine and Health Guangdong Laboratory, Guangzhou, 510530, China
- Daiguo Deng
- Fermion Technology Co., Ltd, 1088 Newport East Road, Guangzhou, 510335, China
- Yuedong Yang
- School of Data and Computer Science, Sun Yat-Sen University, China, 132 East Circle at University City, Guangzhou, 510006, China
44
Joshi RP, Gebauer NWA, Bontha M, Khazaieli M, James RM, Brown JB, Kumar N. 3D-Scaffold: A Deep Learning Framework to Generate 3D Coordinates of Drug-like Molecules with Desired Scaffolds. J Phys Chem B 2021; 125:12166-12176. [PMID: 34662142 DOI: 10.1021/acs.jpcb.1c06437] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 01/20/2023]
Abstract
The prerequisite of therapeutic drug design and discovery is to identify novel molecules and develop lead candidates with desired biophysical and biochemical properties. Deep generative models have demonstrated their ability to find such molecules by exploring a huge chemical space efficiently. An effective way to generate new molecules with desired target properties is to constrain the critical functional groups or the core scaffolds in the generation process. To this end, we developed a domain-aware generative framework called 3D-Scaffold that takes 3D coordinates of the desired scaffold as input and generates 3D coordinates of novel therapeutic candidates as output while always preserving the desired scaffolds in the generated structures. We demonstrated that our framework generates predominantly valid, unique, novel, and experimentally synthesizable molecules that have drug-like properties similar to those of the molecules in the training set. Using domain-specific data sets, we generate covalent and noncovalent antiviral inhibitors targeting viral proteins. To measure the success of our framework in generating therapeutic candidates, generated structures were subjected to high-throughput virtual screening via docking simulations, which showed favorable interactions against the SARS-CoV-2 main protease (Mpro) and nonstructural protein endoribonuclease (NSP15) targets. Most importantly, our deep learning model performs well with relatively small 3D structural training data and quickly learns to generalize to new scaffolds, highlighting its potential application to other domains for generating target-specific candidates.
Affiliation(s)
- Rajendra P Joshi
- Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Niklas W A Gebauer
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany; BASLEARN - TU Berlin/BASF Joint Lab for Machine Learning, Technische Universität Berlin, 10587 Berlin, Germany; Berlin Institute for the Foundations of Learning and Data, 10587 Berlin, Germany
- Mridula Bontha
- Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Mercedeh Khazaieli
- Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Rhema M James
- Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- James B Brown
- Environmental Genomics & Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, California 94710, United States
- Neeraj Kumar
- Pacific Northwest National Laboratory, Richland, Washington 99352, United States
45
Imrie F, Hadfield TE, Bradley AR, Deane CM. Deep generative design with 3D pharmacophoric constraints. Chem Sci 2021; 12:14577-14589. [PMID: 34881010 PMCID: PMC8580048 DOI: 10.1039/d1sc02436a] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 05/03/2021] [Accepted: 10/18/2021] [Indexed: 12/30/2022]
Abstract
Generative models have increasingly been proposed as a solution to the molecular design problem. However, it has proved challenging to control the design process or incorporate prior knowledge, limiting their practical use in drug discovery. In particular, generative methods have made limited use of three-dimensional (3D) structural information even though this is critical to binding. This work describes a method to incorporate such information and demonstrates the benefit of doing so. We combine an existing graph-based deep generative model, DeLinker, with a convolutional neural network to utilise physically-meaningful 3D representations of molecules and target pharmacophores. We apply our model, DEVELOP, to both linker and R-group design, demonstrating its suitability for both hit-to-lead and lead optimisation. The 3D pharmacophoric information results in improved generation and allows greater control of the design process. In multiple large-scale evaluations, we show that including 3D pharmacophoric constraints results in substantial improvements in the quality of generated molecules. On a challenging test set derived from PDBbind, our model improves the proportion of generated molecules with high 3D similarity to the original molecule by over 300%. In addition, DEVELOP recovers 10× more of the original molecules compared to the baseline DeLinker method. Our approach is general-purpose, readily modifiable to alternate 3D representations, and can be incorporated into other generative frameworks. Code is available at https://github.com/oxpig/DEVELOP.
Affiliation(s)
- Fergus Imrie
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
- Thomas E Hadfield
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
- Anthony R Bradley
- Exscientia Ltd, The Schrödinger Building, Oxford Science Park, Oxford OX4 4GE, UK
- Charlotte M Deane
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
46
Joshi RP, Kumar N. Artificial Intelligence for Autonomous Molecular Design: A Perspective. Molecules 2021; 26:6761. [PMID: 34833853 PMCID: PMC8619999 DOI: 10.3390/molecules26226761] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/16/2021] [Revised: 10/23/2021] [Accepted: 10/29/2021] [Indexed: 11/23/2022]
Abstract
Domain-aware artificial intelligence has been increasingly adopted in recent years to expedite molecular design in various applications, including drug design and discovery. Recent advances in areas such as physics-informed machine learning and reasoning, software engineering, high-end hardware development, and computing infrastructures are providing opportunities to build scalable and explainable AI molecular discovery systems. This could improve a design hypothesis through feedback analysis, data integration that can provide a basis for the introduction of end-to-end automation for compound discovery and optimization, and enable more intelligent searches of chemical space. Several state-of-the-art ML architectures are predominantly and independently used for predicting the properties of small molecules, their high throughput synthesis, and screening, iteratively identifying and optimizing lead therapeutic candidates. However, such deep learning and ML approaches also raise considerable conceptual, technical, scalability, and end-to-end error quantification challenges, as well as skepticism about the current AI hype to build automated tools. To this end, synergistically and intelligently using these individual components along with robust quantum physics-based molecular representation and data generation tools in a closed-loop holds enormous promise for accelerated therapeutic design to critically analyze the opportunities and challenges for their more widespread application. This article aims to identify the most recent technology and breakthrough achieved by each of the components and discusses how such autonomous AI and ML workflows can be integrated to radically accelerate the protein target or disease model-based probe design that can be iteratively validated experimentally. Taken together, this could significantly reduce the timeline for end-to-end therapeutic discovery and optimization upon the arrival of any novel zoonotic transmission event. 
Our article serves as a guide for medicinal, computational chemistry and biology, analytical chemistry, and the ML community to practice autonomous molecular design in precision medicine and drug discovery.
Affiliation(s)
- Neeraj Kumar
- Computational Biology Group, Biological Science Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, USA
47
Deep Learning Applied to Ligand-Based De Novo Drug Design. Methods in Molecular Biology (Clifton, N.J.) 2021; 2390:273-299. [PMID: 34731474 DOI: 10.1007/978-1-0716-1787-8_12] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 10/19/2022]
Abstract
In recent years, the application of deep generative models to suggest virtual compounds has become a new and powerful tool in drug discovery projects. The idea behind this review is to offer an updated view on de novo design approaches based on artificial intelligence (AI) algorithms, with a particular focus on ligand-based methods. We start this review with a brief overview of the most relevant de novo design approaches developed before the use of AI techniques. We then describe the neural network architectures most commonly employed today in ligand-based de novo design, together with an up-to-date list of more than 100 deep generative models found in the literature (2017-2020). To show how deep generative approaches are applied in a drug discovery context, we report all currently available studies in which generated compounds have been synthesized and their biological activity tested. Finally, we discuss what we envisage as beneficial future directions for further application of deep generative models in de novo drug design.
48
Li Y, Pei J, Lai L. Structure-based de novo drug design using 3D deep generative models. Chem Sci 2021; 12:13664-13675. [PMID: 34760151 PMCID: PMC8549794 DOI: 10.1039/d1sc04444c] [Citation(s) in RCA: 52] [Impact Index Per Article: 17.3] [Received: 08/13/2021] [Accepted: 09/09/2021] [Indexed: 12/14/2022]
Abstract
Deep generative models are attracting much attention in the field of de novo molecule design. Compared to traditional methods, deep generative models can be trained in a fully data-driven way with little requirement for expert knowledge. Although many models have been developed to generate 1D and 2D molecular structures, 3D molecule generation is less explored, and the direct design of drug-like molecules inside target binding sites remains challenging. In this work, we introduce DeepLigBuilder, a novel deep learning-based method for de novo drug design that generates 3D molecular structures in the binding sites of target proteins. We first developed Ligand Neural Network (L-Net), a novel graph generative model for the end-to-end design of chemically and conformationally valid 3D molecules with high drug-likeness. Then, we combined L-Net with Monte Carlo tree search to perform structure-based de novo drug design tasks. In the case study of inhibitor design for the main protease of SARS-CoV-2, DeepLigBuilder suggested a list of drug-like compounds with novel chemical structures, high predicted affinity, and similar binding features to those of known inhibitors. The current version of L-Net was trained on drug-like compounds from ChEMBL, which could be easily extended to other molecular datasets with desired properties based on users' demands and applied in functional molecule generation. Merging deep generative models with atomic-level interaction evaluation, DeepLigBuilder provides a state-of-the-art model for structure-based de novo drug design and lead optimization. DeepLigBuilder, a novel deep generative model for structure-based de novo drug design, directly generates 3D structures of drug-like compounds in the target binding site.
Affiliation(s)
- Yibo Li
- Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Jianfeng Pei
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Luhua Lai
- Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China; Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China; BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
49
Fialková V, Zhao J, Papadopoulos K, Engkvist O, Bjerrum EJ, Kogej T, Patronov A. LibINVENT: Reaction-based Generative Scaffold Decoration for in Silico Library Design. J Chem Inf Model 2021; 62:2046-2063. [PMID: 34460269 DOI: 10.1021/acs.jcim.1c00469] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Indexed: 11/28/2022]
Abstract
Because of the strong relationship between the desired molecular activity and its structural core, the screening of focused, core-sharing chemical libraries is a key step in lead optimization. Despite the plethora of current research focused on in silico methods for molecule generation, to our knowledge, no tool capable of designing such libraries has been proposed. In this work, we present a novel tool for de novo drug design called LibINVENT. It is capable of rapidly proposing chemical libraries of compounds sharing the same core while maximizing a range of desirable properties. To further help the process of designing focused libraries, the user can list specific chemical reactions that can be used for the library creation. LibINVENT is therefore a flexible tool for generating virtual chemical libraries for lead optimization in a broad range of scenarios. Additionally, the shared core ensures that the compounds in the library are similar, possess desirable properties, and can also be synthesized under the same or similar conditions. The LibINVENT code is freely available in our public repository at https://github.com/MolecularAI/Lib-INVENT. The code necessary for data preprocessing is further available at: https://github.com/MolecularAI/Lib-INVENT-dataset.
Affiliation(s)
- Vendy Fialková
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 43183, Sweden
- Jiaxi Zhao
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 43183, Sweden; Department of Pharmaceutical Biosciences, Uppsala University, Uppsala 75237, Sweden
- Kostas Papadopoulos
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 43183, Sweden
- Ola Engkvist
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 43183, Sweden; Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg 41756, Sweden
- Thierry Kogej
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 43183, Sweden
- Atanas Patronov
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 43183, Sweden
50
|
Yu L, Su Y, Liu Y, Zeng X. Review of unsupervised pretraining strategies for molecules representation. Brief Funct Genomics 2021; 20:323-332. [PMID: 34342611 DOI: 10.1093/bfgp/elab036] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 06/15/2021] [Revised: 07/07/2021] [Accepted: 07/08/2021] [Indexed: 11/14/2022] Open
Abstract
In recent years, computer-assisted techniques have made great progress in the field of drug discovery. Yet the problem of limited labeled data remains challenging and restricts the performance of these techniques in specific tasks, such as molecular property prediction, compound-protein interaction, and de novo molecular generation. One effective solution is to utilize the experience and knowledge gained from other tasks to cope with related pursuits. Unsupervised pretraining is promising because it can leverage a vast number of unlabeled molecules and acquire a more informative molecular representation for downstream tasks. In particular, models trained on large-scale unlabeled molecules can capture generalizable features, and this ability can be employed to improve performance on specific downstream tasks. Many relevant pretraining methods have recently been proposed. Here, we provide an overview of molecular unsupervised pretraining and related applications in drug discovery. Challenges and possible solutions are also summarized.