1
Lu X, Xie L, Xu L, Mao R, Xu X, Chang S. Multimodal fused deep learning for drug property prediction: Integrating chemical language and molecular graph. Comput Struct Biotechnol J 2024; 23:1666-1679. [PMID: 38680871 PMCID: PMC11046066 DOI: 10.1016/j.csbj.2024.04.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 04/01/2024] [Accepted: 04/10/2024] [Indexed: 05/01/2024]
Abstract
Accurately predicting molecular properties is a challenging but essential task in drug discovery. Recently, many mono-modal deep learning methods have been successfully applied to molecular property prediction. However, mono-modal learning is inherently limited as it relies solely on a single modality of molecular representation, which restricts a comprehensive understanding of drug molecules. To overcome the limitations, we propose a multimodal fused deep learning (MMFDL) model to leverage information from different molecular representations. Specifically, we construct a triple-modal learning model by employing Transformer-Encoder, Bidirectional Gated Recurrent Unit (BiGRU), and graph convolutional network (GCN) to process three modalities of information from chemical language and molecular graph: SMILES-encoded vectors, ECFP fingerprints, and molecular graphs, respectively. We evaluate the proposed triple-modal model using five fusion approaches on six molecule datasets, including Delaney, Llinas2020, Lipophilicity, SAMPL, BACE, and pKa from DataWarrior. The results show that the MMFDL model achieves the highest Pearson coefficients, and stable distribution of Pearson coefficients in the random splitting test, outperforming mono-modal models in accuracy and reliability. Furthermore, we validate the generalization ability of our model in the prediction of binding constants for protein-ligand complex molecules, and assess the resilience capability against noise. Through analysis of feature distributions in chemical space and the assigned contribution of each modal model, we demonstrate that the MMFDL model shows the ability to acquire complementary information by using proper models and suitable fusion approaches. By leveraging diverse sources of bioinformatics information, multimodal deep learning models hold the potential for successful drug discovery.
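As a rough illustration of the evaluation described in this abstract, the sketch below combines per-modality predictions and scores the result with the Pearson coefficient. The abstract does not say which five fusion approaches were used, so the weighted average here is only an assumed stand-in, and the function names, weights, and numbers are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def weighted_fusion(predictions, weights):
    """Late fusion (assumed): weighted average of per-modality predictions."""
    total = sum(weights)
    n_samples = len(predictions[0])
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(n_samples)
    ]

# Hypothetical predictions from three modality-specific models.
targets = [0.5, 1.2, 2.1, 3.3]
smiles_pred = [0.7, 1.0, 2.4, 3.0]
fingerprint_pred = [0.4, 1.5, 1.9, 3.6]
graph_pred = [0.6, 1.1, 2.2, 3.1]

fused = weighted_fusion([smiles_pred, fingerprint_pred, graph_pred], [0.4, 0.3, 0.3])
score = pearson(targets, fused)
```

In practice the fusion weights would be learned or tuned on validation data rather than fixed by hand.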
Affiliation(s)
- Xiaohua Lu
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Liangxu Xie
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Lei Xu
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Rongzhi Mao
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Xiaojun Xu
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Shan Chang
  - Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China
2
Liang L, Liu Z, Yang X, Zhang Y, Liu H, Chen Y. Prediction of blood-brain barrier permeability using machine learning approaches based on various molecular representation. Mol Inform 2024:e202300327. [PMID: 38864837 DOI: 10.1002/minf.202300327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Revised: 03/18/2024] [Accepted: 04/18/2024] [Indexed: 06/13/2024]
Abstract
The assessment of compound blood-brain barrier (BBB) permeability poses a significant challenge in the discovery of drugs targeting the central nervous system. Conventional experimental approaches to measure BBB permeability are labor-intensive, cost-ineffective, and time-consuming. In this study, we constructed six machine learning classification models by combining various machine learning algorithms and molecular representations. The model based on ExtraTree algorithm and random partitioning strategy obtains the best prediction result, with AUC value of 0.932±0.004 and balanced accuracy (BA) of 0.837±0.010 for the test set. We employed the SHAP method to identify important features associated with BBB permeability. In addition, matched molecular pair (MMP) analysis and representative substructure derivation method were utilized to uncover the transformation rules and distinctive structural features of BBB permeable compounds. The machine learning models proposed in this work can serve as an effective tool for assessing BBB permeability in the drug discovery for central nervous system disease.
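For readers unfamiliar with the balanced accuracy (BA) metric reported above, the following minimal sketch computes it as the mean of sensitivity and specificity for binary labels. The function name and the label vectors are illustrative assumptions, not data from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return (sensitivity + specificity) / 2

# Hypothetical labels: 1 = BBB permeable, 0 = not permeable.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
ba = balanced_accuracy(y_true, y_pred)  # 0.625, versus plain accuracy 4/6
```

Unlike plain accuracy, BA weights both classes equally, which matters when permeable and non-permeable compounds are unevenly represented.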
Affiliation(s)
- Li Liang
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
- Zhiwen Liu
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
- Xinyi Yang
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
- Yanmin Zhang
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
- Haichun Liu
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
- Yadong Chen
  - Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
3
Jardim C, de Waal A, Fabris-Rotelli I, Rad NN, Mazarura J, Sherry D. Feature engineered embeddings for classification of molecular data. Comput Biol Chem 2024; 110:108056. [PMID: 38796282 DOI: 10.1016/j.compbiolchem.2024.108056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Revised: 03/19/2024] [Accepted: 03/20/2024] [Indexed: 05/28/2024]
Abstract
The classification of molecules is of particular importance to the drug discovery process and several other use cases. Data in this domain can be partitioned into structural and sequence/text data. Several techniques such as deep learning are able to classify molecules and predict their functions using both types of data. Molecular structure and encoded chemical information are sufficient to classify a characteristic of a molecule. However, the use of a molecule's structural information typically requires large amounts of computational power with deep learning models that take a long time to train. In this study, we present an alternative approach to molecule classification that addresses the limitations of other techniques. This approach uses natural language processing techniques in the form of count vectorisation, term frequency-inverse document frequency, word2vec and Latent Dirichlet Allocation to feature engineer molecular text data. Through this approach, we aim to make a robust and easily reproducible embedding that is fast to implement and solely dependent on chemical (text) data such as the sequence of a protein. Further, we investigate the usefulness of these embeddings for machine learning models. We apply the techniques to two different types of molecular text data: FASTA sequence data and Simplified Molecular Input Line Entry Specification data. We show that these embeddings provide excellent performance for classification.
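To make the text-based feature engineering described above concrete, here is a minimal sketch of term frequency-inverse document frequency applied to character n-grams of SMILES strings. The tokenisation into bigrams, the unsmoothed IDF formula, and the example strings are assumptions for illustration; the abstract does not specify these details.

```python
import math
from collections import Counter

def char_ngrams(text, n=2):
    """Split a string into overlapping character n-grams."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def tfidf(docs, n=2):
    """Return one {ngram: weight} dict per document (unsmoothed TF-IDF)."""
    counts = [Counter(char_ngrams(d, n)) for d in docs]
    # Document frequency: number of documents that contain each n-gram.
    df = Counter(g for c in counts for g in c)
    total_docs = len(docs)
    vectors = []
    for c in counts:
        length = sum(c.values())
        vectors.append({
            g: (k / length) * math.log(total_docs / df[g])
            for g, k in c.items()
        })
    return vectors

# Illustrative SMILES strings (ethanol, ethylamine, butane), not from the paper.
smiles = ["CCO", "CCN", "CCCC"]
vectors = tfidf(smiles)
```

N-grams shared by every molecule (here "CC") receive zero weight, while rarer fragments such as "CO" are emphasised, which is the property that makes such vectors usable as classifier inputs.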
4
Chen Z, Wang R, Guo J, Wang X. The role and future prospects of artificial intelligence algorithms in peptide drug development. Biomed Pharmacother 2024; 175:116709. [PMID: 38713945 DOI: 10.1016/j.biopha.2024.116709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2024] [Revised: 05/01/2024] [Accepted: 05/02/2024] [Indexed: 05/09/2024]
Abstract
Peptide medications have become better known in recent years due to their many benefits, including low side effects, high biological activity, specificity, and effectiveness. Over 100 peptide medications have been introduced to the market to treat a variety of illnesses. Most of these peptide medications are developed on the basis of endogenous peptides or natural peptides, which frequently require expensive, time-consuming, and extensive tests to confirm. As artificial intelligence advances quickly, it is now possible to build machine learning or deep learning models that screen a large number of candidate sequences for therapeutic peptides. Therapeutic peptides, such as those with antibacterial or anticancer properties, have been developed by the application of artificial intelligence algorithms. The process of finding and developing peptide drugs is outlined in this review, along with a few related cases that were helped by AI and conventional methods. These resources will open up new avenues for peptide drug development and discovery, helping to meet the pressing needs of clinical patients for disease treatment. Although peptide drugs are a new class of biopharmaceuticals distinct from chemical and small-molecule drugs, their clinical purpose and value cannot be ignored. However, traditional peptide drug research and development has a long development cycle and high investment, and the creation of peptide medications will be substantially accelerated by the AI-assisted (AI+) mode, offering a new boost for combating diseases.
Affiliation(s)
- Zhiheng Chen
  - School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China.
- Ruoxi Wang
  - School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China.
- Junqi Guo
  - School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China.
- Xiaogang Wang
  - Guangdong Provincial Key Laboratory of Bone and Joint Degenerative Diseases, The Third Affiliated Hospital of Southern Medical University, Guangzhou, Guangdong 510630, China.
5
Li Y, Liu B, Deng J, Guo Y, Du H. Image-based molecular representation learning for drug development: a survey. Brief Bioinform 2024; 25:bbae294. [PMID: 38920347 PMCID: PMC11200195 DOI: 10.1093/bib/bbae294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Revised: 05/19/2024] [Accepted: 06/08/2024] [Indexed: 06/27/2024]
Abstract
Artificial intelligence (AI) powered drug development has received remarkable attention in recent years. It addresses the limitations of traditional experimental methods that are costly and time-consuming. While there have been many surveys attempting to summarize related research, they only focus on general AI or specific aspects such as natural language processing and graph neural networks. Considering the rapid advances in computer vision, using the molecular image to enable AI appears to be a more intuitive and effective approach since each chemical substance has a unique visual representation. In this paper, we provide the first survey on image-based molecular representation for drug development. The survey proposes a taxonomy based on the learning paradigms in computer vision and reviews a large number of corresponding papers, highlighting the contributions of molecular visual representation in drug development. We also discuss the applications, limitations and future directions in the field. We hope this survey could offer valuable insight into the use of image-based molecular representation learning in the context of drug development.
Affiliation(s)
- Yue Li
  - Division of Gastroenterology, Dongzhimen Hospital, Beijing University of Chinese Medicine, No. 5 Haiyun Warehouse, 100700, Beijing, China
- Bingyan Liu
  - School of Computer Science, Beijing University of Posts and Telecommunications, No.10 Xituchen Street, 100876, Beijing, China
- Jinyan Deng
  - Division of Gastroenterology, Dongzhimen Hospital, Beijing University of Chinese Medicine, No. 5 Haiyun Warehouse, 100700, Beijing, China
- Yi Guo
  - Division of Gastroenterology, Dongzhimen Hospital, Beijing University of Chinese Medicine, No. 5 Haiyun Warehouse, 100700, Beijing, China
- Hongbo Du
  - Division of Gastroenterology, Dongzhimen Hospital, Beijing University of Chinese Medicine, No. 5 Haiyun Warehouse, 100700, Beijing, China
  - Institute of Liver Disease, Beijing University of Chinese Medicine, No. 5 Haiyun Warehouse, 100700, Beijing, China
6
Abbas MKG, Rassam A, Karamshahi F, Abunora R, Abouseada M. The Role of AI in Drug Discovery. Chembiochem 2024:e202300816. [PMID: 38735845 DOI: 10.1002/cbic.202300816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Revised: 05/09/2024] [Accepted: 05/10/2024] [Indexed: 05/14/2024]
Abstract
The emergence of Artificial Intelligence (AI) in drug discovery marks a pivotal shift in pharmaceutical research, blending sophisticated computational techniques with conventional scientific exploration to break through enduring obstacles. This review paper elucidates the multifaceted applications of AI across various stages of drug development, highlighting significant advancements and methodologies. It delves into AI's instrumental role in drug design, polypharmacology, chemical synthesis, drug repurposing, and the prediction of drug properties such as toxicity, bioactivity, and physicochemical characteristics. Despite AI's promising advancements, the paper also addresses the challenges and limitations encountered in the field, including data quality, generalizability, computational demands, and ethical considerations. By offering a comprehensive overview of AI's role in drug discovery, this paper underscores the technology's potential to significantly enhance drug development, while also acknowledging the hurdles that must be overcome to fully realize its benefits.
Affiliation(s)
- M K G Abbas
  - Center for Advanced Materials, Qatar University, P.O. Box, 2713, Doha, Qatar
- Abrar Rassam
  - Secondary Education, Educational Sciences, Qatar University, P.O. Box, 2713, Doha, Qatar
- Fatima Karamshahi
  - Department of Chemistry and Earth Sciences, Qatar University, P.O. Box, 2713, Doha, Qatar
- Rehab Abunora
  - Faculty of Medicine, General Medicine and Surgery, Helwan University, Cairo, Egypt
- Maha Abouseada
  - Department of Chemistry and Earth Sciences, Qatar University, P.O. Box, 2713, Doha, Qatar
7
Zhang M, Lin S, Han L, Zhang J, Liu S, Yang X, Wang R, Yang X, Yi Y. Safety and efficacy evaluation of halicin as an effective drug for inhibiting intestinal infections. Front Pharmacol 2024; 15:1389293. [PMID: 38783954 PMCID: PMC11111955 DOI: 10.3389/fphar.2024.1389293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Accepted: 04/24/2024] [Indexed: 05/25/2024]
Abstract
Halicin, the first antibacterial agent discovered by artificial intelligence, exerts broad-spectrum antibacterial effects and has a unique structure. Our study found that halicin had a good inhibitory effect on clinical isolates of drug-resistant strains and Clostridium perfringens (C. perfringens). The safety of halicin was evaluated by acute oral toxicity, genotoxicity and subchronic toxicity studies. The results of acute toxicity test indicated that halicin, as a low-toxicity compound, had an LD50 of 2018.3 mg/kg. The results of sperm malformation, bone marrow chromosome aberration and cell micronucleus tests showed that halicin had no obvious genotoxicity. However, the results of the 90-day subchronic toxicity test indicated that the test rats exhibited weight loss and slight renal inflammation at a high dose of 201.8 mg/kg. Teratogenicity of zebrafish embryos showed that halicin had no significant teratogenicity. Analysis of intestinal microbiota showed that halicin had a significant effect on the intestinal microbial composition, but caused a faster recovery. Furthermore, drug metabolism experiments showed that halicin was poorly absorbed and quickly eliminated in vivo. Our study found that halicin had a good therapeutic effect on intestinal infection model of C. perfringens. These results show the feasibility of developing oral halicin as a clinical candidate drug for treating intestinal infections.
Affiliation(s)
- Maolu Zhang
  - State Key Laboratory of Biobased Material and Green Papermaking (LBMP), Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong, China
  - Shandong Provincial Animal and Poultry Green Health Products Creation Engineering Laboratory, Institute of Poultry Science, Shandong Academy of Agricultural Science, Jinan, Shandong, China
- Shuqian Lin
  - Shandong Provincial Animal and Poultry Green Health Products Creation Engineering Laboratory, Institute of Poultry Science, Shandong Academy of Agricultural Science, Jinan, Shandong, China
- Lianquan Han
  - State Key Laboratory of Biobased Material and Green Papermaking (LBMP), Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong, China
- Jiaming Zhang
  - Shandong Provincial Animal and Poultry Green Health Products Creation Engineering Laboratory, Institute of Poultry Science, Shandong Academy of Agricultural Science, Jinan, Shandong, China
- Shaoning Liu
  - Animal Products Quality and Safety Center of Shandong Province, Jinan, Shandong, China
- Xiuzhen Yang
  - Animal Products Quality and Safety Center of Shandong Province, Jinan, Shandong, China
- Ruiming Wang
  - State Key Laboratory of Biobased Material and Green Papermaking (LBMP), Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong, China
- Xiaohui Yang
  - State Key Laboratory of Biobased Material and Green Papermaking (LBMP), Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong, China
- Yunpeng Yi
  - Shandong Provincial Animal and Poultry Green Health Products Creation Engineering Laboratory, Institute of Poultry Science, Shandong Academy of Agricultural Science, Jinan, Shandong, China
8
Long TZ, Jiang DJ, Shi SH, Deng YC, Wang WX, Cao DS. Enhancing Multi-species Liver Microsomal Stability Prediction through Artificial Intelligence. J Chem Inf Model 2024; 64:3222-3236. [PMID: 38498003 DOI: 10.1021/acs.jcim.4c00159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
Liver microsomal stability, a crucial aspect of metabolic stability, significantly impacts practical drug discovery. However, current models for predicting liver microsomal stability are based on limited molecular information from a single species. To address this limitation, we constructed the largest public database of compounds from three common species: human, rat, and mouse. Subsequently, we developed a series of classification models using both traditional descriptor-based and classic graph-based machine learning (ML) algorithms. Remarkably, the best-performing models for the three species achieved Matthews correlation coefficients (MCCs) of 0.616, 0.603, and 0.574, respectively, on the test set. Furthermore, through the construction of consensus models based on these individual models, we have demonstrated their superior predictive performance in comparison with the existing models of the same type. To explore the similarities and differences in the properties of liver microsomal stability among multispecies molecules, we conducted preliminary interpretative explorations using the Shapley additive explanations (SHAP) and atom heatmap approaches for the models and misclassified molecules. Additionally, we further investigated representative structural modifications and substructures that decrease the liver microsomal stability in different species using the matched molecule pair analysis (MMPA) method and substructure extraction techniques. The established prediction models, along with insightful interpretation information regarding liver microsomal stability, will significantly contribute to enhancing the efficiency of exploring practical drugs for development.
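The Matthews correlation coefficient (MCC) quoted above can be computed directly from the binary confusion matrix. The sketch below is a plain-Python illustration with hypothetical labels; it is not the authors' evaluation code, and returning 0.0 for a zero denominator is a common convention adopted here as an assumption.

```python
import math

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # assumed convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom

# Hypothetical labels: 1 = stable, 0 = unstable.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
value = mcc(y_true, y_pred)  # tp=3, tn=3, fp=1, fn=1 gives 0.5
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction, so values near 0.6 indicate a moderately strong agreement between predicted and observed classes.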
Affiliation(s)
- Teng-Zhi Long
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
- De-Jun Jiang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
- Shao-Hua Shi
  - Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Kowloon, Hong Kong SAR 999077, P. R. China
- You-Chao Deng
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
- Wen-Xuan Wang
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
- Dong-Sheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
  - Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Kowloon, Hong Kong SAR 999077, P. R. China
  - Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008, Hunan, P. R. China
9
Zhang C, Xie L, Lu X, Mao R, Xu L, Xu X. Developing an Improved Cycle Architecture for AI-Based Generation of New Structures Aimed at Drug Discovery. Molecules 2024; 29:1499. [PMID: 38611779 PMCID: PMC11013495 DOI: 10.3390/molecules29071499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Revised: 03/18/2024] [Accepted: 03/21/2024] [Indexed: 04/14/2024]
Abstract
Drug discovery involves a crucial step of optimizing molecules with the desired structural groups. In the domain of computer-aided drug discovery, deep learning has emerged as a prominent technique in molecular modeling. Deep generative models, based on deep learning, play a crucial role in generating novel molecules when optimizing molecules. However, many existing molecular generative models have limitations as they solely process input information in a forward way. To overcome this limitation, we propose an improved generative model called BD-CycleGAN, which incorporates BiLSTM (bidirectional long short-term memory) and Mol-CycleGAN (molecular cycle generative adversarial network) to preserve the information of molecular input. To evaluate the proposed model, we assess its performance by analyzing the structural distribution and evaluation matrices of generated molecules in the process of structural transformation. The results demonstrate that the BD-CycleGAN model achieves a higher success rate and exhibits increased diversity in molecular generation. Furthermore, we demonstrate its application in molecular docking, where it successfully increases the docking score for the generated molecules. The proposed BD-CycleGAN architecture harnesses the power of deep learning to facilitate the generation of molecules with desired structural features, thus offering promising advancements in the field of drug discovery processes.
Affiliation(s)
- Lei Xu
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Xiaojun Xu
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
10
Qi X, Zhao Y, Qi Z, Hou S, Chen J. Machine Learning Empowering Drug Discovery: Applications, Opportunities and Challenges. Molecules 2024; 29:903. [PMID: 38398653 PMCID: PMC10892089 DOI: 10.3390/molecules29040903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Revised: 02/08/2024] [Accepted: 02/14/2024] [Indexed: 02/25/2024]
Abstract
Drug discovery plays a critical role in advancing human health by developing new medications and treatments to combat diseases. How to accelerate the pace and reduce the costs of new drug discovery has long been a key concern for the pharmaceutical industry. Fortunately, by leveraging advanced algorithms, computational power and biological big data, artificial intelligence (AI) technology, especially machine learning (ML), holds the promise of making the hunt for new drugs more efficient. Recently, the Transformer-based models that have achieved revolutionary breakthroughs in natural language processing have sparked a new era of their applications in drug discovery. Herein, we introduce the latest applications of ML in drug discovery, highlight the potential of advanced Transformer-based ML models, and discuss the future prospects and challenges in the field.
Affiliation(s)
- Xin Qi
  - School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou 215011, China
- Yuanchun Zhao
  - School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou 215011, China
- Zhuang Qi
  - School of Software, Shandong University, Jinan 250101, China
- Siyu Hou
  - School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou 215011, China
- Jiajia Chen
  - School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou 215011, China
11
Alkubaisi BO, Aljobowry R, Ali SM, Sultan S, Zaraei SO, Ravi A, Al-Tel TH, El-Gamal MI. The latest perspectives of small molecules FMS kinase inhibitors. Eur J Med Chem 2023; 261:115796. [PMID: 37708796 DOI: 10.1016/j.ejmech.2023.115796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 09/03/2023] [Accepted: 09/04/2023] [Indexed: 09/16/2023]
Abstract
FMS kinase is a type III tyrosine kinase receptor that plays a central role in the pathophysiology and management of several diseases, including a range of cancer types, inflammatory disorders, neurodegenerative disorders, and bone disorders among others. In this review, the pathophysiological pathways of FMS kinase in different diseases and the recent developments of its monoclonal antibodies and inhibitors during the last five years are discussed. The biological and biochemical features of these inhibitors, including binding interactions, structure-activity relationships (SAR), selectivity, and potencies are discussed. The focus of this article is on the compounds that are promising leads and undergoing advanced clinical investigations, as well as on those that received FDA approval. In this article, we attempt to classify the reviewed FMS inhibitors according to their core chemical structure including pyridine, pyrrolopyridine, pyrazolopyridine, quinoline, and pyrimidine derivatives.
Affiliation(s)
- Bilal O Alkubaisi
  - Research Institute for Medical and Health Sciences, University of Sharjah, Sharjah, 27272, United Arab Emirates
- Raya Aljobowry
  - College of Pharmacy, University of Sharjah, Sharjah, 27272, United Arab Emirates
- Salma M Ali
  - College of Pharmacy, University of Sharjah, Sharjah, 27272, United Arab Emirates
- Sara Sultan
  - College of Pharmacy, University of Sharjah, Sharjah, 27272, United Arab Emirates
- Seyed-Omar Zaraei
  - Research Institute for Medical and Health Sciences, University of Sharjah, Sharjah, 27272, United Arab Emirates
- Anil Ravi
  - Research Institute for Medical and Health Sciences, University of Sharjah, Sharjah, 27272, United Arab Emirates
- Taleb H Al-Tel
  - Research Institute for Medical and Health Sciences, University of Sharjah, Sharjah, 27272, United Arab Emirates; College of Pharmacy, University of Sharjah, Sharjah, 27272, United Arab Emirates.
- Mohammed I El-Gamal
  - Research Institute for Medical and Health Sciences, University of Sharjah, Sharjah, 27272, United Arab Emirates; College of Pharmacy, University of Sharjah, Sharjah, 27272, United Arab Emirates; Faculty of Pharmacy, Mansoura University, Mansoura, 35516, Egypt.
12
Wang Y, Xia Y, Yan J, Yuan Y, Shen HB, Pan X. ZeroBind: a protein-specific zero-shot predictor with subgraph matching for drug-target interactions. Nat Commun 2023; 14:7861. [PMID: 38030641 PMCID: PMC10687269 DOI: 10.1038/s41467-023-43597-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2023] [Accepted: 11/13/2023] [Indexed: 12/01/2023]
Abstract
Existing drug-target interaction (DTI) prediction methods generally fail to generalize well to novel (unseen) proteins and drugs. In this study, we propose a protein-specific meta-learning framework ZeroBind with subgraph matching for predicting protein-drug interactions from their structures. During the meta-training process, ZeroBind formulates training a protein-specific model, which is also considered a learning task, and each task uses graph neural networks (GNNs) to learn the protein graph embedding and the molecular graph embedding. Inspired by the fact that molecules bind to a binding pocket in proteins instead of the whole protein, ZeroBind introduces a weakly supervised subgraph information bottleneck (SIB) module to recognize the maximally informative and compressive subgraphs in protein graphs as potential binding pockets. In addition, ZeroBind trains the models of individual proteins as multiple tasks, whose importance is automatically learned with a task adaptive self-attention module to make final predictions. The results show that ZeroBind achieves superior performance on DTI prediction over existing methods, especially for those unseen proteins and drugs, and performs well after fine-tuning for those proteins or drugs with a few known binding partners.
Collapse
Affiliation(s)
- Yuxuan Wang
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
| | - Ying Xia
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
- Junchi Yan
- Department of Computer Science and Engineering, and MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, 200240, China
- Ye Yuan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
- Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
- Xiaoyong Pan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China.
13
Dong J, Wu Z, Xu H, Ouyang D. FormulationAI: a novel web-based platform for drug formulation design driven by artificial intelligence. Brief Bioinform 2023; 25:bbad419. [PMID: 37991246 PMCID: PMC10783856 DOI: 10.1093/bib/bbad419] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/05/2023] [Revised: 10/13/2023] [Accepted: 10/31/2023] [Indexed: 11/23/2023] Open
Abstract
Today, the pharmaceutical industry faces great pressure to adopt more efficient and systematic approaches to drug discovery and development. However, conventional formulation studies still rely heavily on personal experience and trial-and-error experiments, resulting in a labor-intensive, tedious and costly pipeline. Thus, intelligent and efficient methods for formulation development are needed to keep pace with the progress of the pharmaceutical industry. Here, we developed a comprehensive web-based platform (FormulationAI) for in silico formulation design. First, the most comprehensive datasets of six widely used drug formulation systems in the pharmaceutical industry were collected over 10 years, including cyclodextrin formulation, solid dispersion, phospholipid complex, nanocrystals, self-emulsifying and liposome systems. Then, intelligent prediction and evaluation of 16 important properties from the six systems were investigated and implemented through systematic study and comparison of different AI algorithms and molecular representations. Finally, an efficient prediction platform was established and validated, which enables formulation design from just basic information on drugs and excipients. FormulationAI is the first freely available comprehensive web-based platform, providing a powerful solution to assist formulation design in the pharmaceutical industry. It is available at https://formulationai.computpharm.org/.
Affiliation(s)
- Jie Dong
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, China
- Institute of Chinese Medical Sciences (ICMS), State Key Laboratory of Quality Research in Chinese Medicine, University of Macau, Macau, China
- Zheng Wu
- Institute of Chinese Medical Sciences (ICMS), State Key Laboratory of Quality Research in Chinese Medicine, University of Macau, Macau, China
- Huanle Xu
- Faculty of Science and Technology, University of Macau, Macau, China
- Defang Ouyang
- Institute of Chinese Medical Sciences (ICMS), State Key Laboratory of Quality Research in Chinese Medicine, University of Macau, Macau, China
14
Zhang Y, Liu C, Liu M, Liu T, Lin H, Huang CB, Ning L. Attention is all you need: utilizing attention in AI-enabled drug discovery. Brief Bioinform 2023; 25:bbad467. [PMID: 38189543 PMCID: PMC10772984 DOI: 10.1093/bib/bbad467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2023] [Revised: 11/03/2023] [Accepted: 11/25/2023] [Indexed: 01/09/2024] Open
Abstract
Recently, attention mechanism and derived models have gained significant traction in drug development due to their outstanding performance and interpretability in handling complex data structures. This review offers an in-depth exploration of the principles underlying attention-based models and their advantages in drug discovery. We further elaborate on their applications in various aspects of drug development, from molecular screening and target binding to property prediction and molecule generation. Finally, we discuss the current challenges faced in the application of attention mechanisms and Artificial Intelligence technologies, including data quality, model interpretability and computational resource constraints, along with future directions for research. Given the accelerating pace of technological advancement, we believe that attention-based models will have an increasingly prominent role in future drug discovery. We anticipate that these models will usher in revolutionary breakthroughs in the pharmaceutical domain, significantly accelerating the pace of drug development.
Affiliation(s)
- Yang Zhang
- Innovative Institute of Chinese Medicine and Pharmacy, Academy for Interdiscipline, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Caiqi Liu
- Department of Gastrointestinal Medical Oncology, Harbin Medical University Cancer Hospital, No.150 Haping Road, Nangang District, Harbin, Heilongjiang 150081, China
- Key Laboratory of Molecular Oncology of Heilongjiang Province, No.150 Haping Road, Nangang District, Harbin, Heilongjiang 150081, China
- Mujiexin Liu
- Chongqing Key Laboratory of Sichuan-Chongqing Co-construction for Diagnosis and Treatment of Infectious Diseases Integrated Traditional Chinese and Western Medicine, College of Medical Technology, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Tianyuan Liu
- Graduate School of Science and Technology, University of Tsukuba, Tsukuba, Japan
- Hao Lin
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Cheng-Bing Huang
- School of Computer Science and Technology, Aba Teachers University, Aba, China
- Lin Ning
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China
- School of Healthcare Technology, Chengdu Neusoft University, Chengdu 611844, China
15
Wang Z, Zhong H, Zhang J, Pan P, Wang D, Liu H, Yao X, Hou T, Kang Y. Small-Molecule Conformer Generators: Evaluation of Traditional Methods and AI Models on High-Quality Data Sets. J Chem Inf Model 2023; 63:6525-6536. [PMID: 37883143 DOI: 10.1021/acs.jcim.3c01519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2023]
Abstract
Small-molecule conformer generation (SMCG) is an extremely important task in both ligand- and structure-based computer-aided drug design, especially during the hit discovery phase. Recently, a multitude of artificial intelligence (AI) models tailored for SMCG have emerged. Despite developers typically furnishing performance evaluation data upon releasing their AI models, a comprehensive and equitable performance comparison between AI models and conventional methods is still lacking. In this study, we curated a new benchmarking data set comprising 3354 high-quality ligand bioactive conformations. Subsequently, we conducted a systematic assessment of the performance of four widely adopted traditional methods (i.e., ConfGenX, Conformator, OMEGA, and RDKit ETKDG) and five AI models (i.e., ConfGF, DMCG, GeoDiff, GeoMol, and torsional diffusion) in the tasks of reproducing bioactive and low-energy conformations of small molecules. In the former task, the AI models have no advantage, particularly with a maximum ensemble size of 1. Even the best-performing AI model GeoMol is still worse than any of the tested traditional methods. Conversely, in the latter task, the torsional diffusion model shows obvious advantages, surpassing the best-performing traditional method ConfGenX by 26.09 and 12.97% on the COV-R and COV-P metrics, respectively. Furthermore, the influence of force field-based fine-tuning on the quality of the generated conformers was also discussed. Finally, a user-friendly Web server called fastSMCG was developed to enable researchers to rapidly and flexibly generate small-molecule conformers using both traditional and AI methods. We anticipate that our work will offer valuable practical assistance to the scientific community in this field.
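The COV-R and COV-P metrics named in this abstract are not defined there. As a minimal sketch, assuming the definitions commonly used in the conformer-generation literature (coverage recall is the share of reference conformers matched by at least one generated conformer within an RMSD threshold; coverage precision is the share of generated conformers that match at least one reference), they could be computed as below. The function name `coverage`, the 1.25 Å threshold, and the RMSD values are illustrative assumptions, not data from the paper.

```python
from typing import List, Tuple


def coverage(rmsd: List[List[float]], threshold: float) -> Tuple[float, float]:
    """Return (COV-R, COV-P) for an RMSD matrix.

    rmsd[i][j] is the RMSD between reference conformer i and generated
    conformer j. A pair counts as a match when its RMSD is below `threshold`
    (strict inequality is an assumption of this sketch).
    """
    n_ref = len(rmsd)
    n_gen = len(rmsd[0])
    # COV-R: fraction of reference conformers covered by some generated conformer.
    cov_r = sum(1 for i in range(n_ref) if min(rmsd[i]) < threshold) / n_ref
    # COV-P: fraction of generated conformers that are close to some reference.
    cov_p = sum(
        1
        for j in range(n_gen)
        if min(rmsd[i][j] for i in range(n_ref)) < threshold
    ) / n_gen
    return cov_r, cov_p


# Hypothetical RMSD values (in angstroms): 2 reference x 3 generated conformers.
rmsd_matrix = [
    [0.4, 1.9, 2.2],
    [1.6, 1.4, 2.8],
]
cov_r, cov_p = coverage(rmsd_matrix, threshold=1.25)
print(f"COV-R = {cov_r:.2f}, COV-P = {cov_p:.2f}")
```

Read this way, a high COV-R rewards diverse ensembles that reach every reference conformation, while a high COV-P penalizes ensembles padded with implausible conformers.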
Affiliation(s)
- Zhe Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Haiyang Zhong
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Jintu Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Peichen Pan
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Dong Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Huanxiang Liu
- Faculty of Applied Science, Macao Polytechnic University, Macao SAR 999078, China
- Xiaojun Yao
- State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao SAR 999078, China
- Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yu Kang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
16
Schweizer L. Drug Intelligence Science (DIS®): Pioneering a high-resolution translational platform to enhance the probability of success for drug discovery and development. Drug Discov Today 2023; 28:103795. [PMID: 37805064 DOI: 10.1016/j.drudis.2023.103795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 09/26/2023] [Accepted: 10/02/2023] [Indexed: 10/09/2023]
Abstract
Translational research has a crucial role in bridging the gap between basic biology discoveries and their clinical applications. Deep scientific understanding and advanced technology platforms are both crucial for translational research. Here, I describe a novel integrated Drug Intelligence Science (DIS®) translational platform that combines single cell technology with artificial intelligence (AI) and machine learning (ML) to gain insights into high-resolution cell biology, thus enabling the discovery of disease-relevant targets, high-quality drug candidates, and predictive biomarkers. The innovative DIS® approach has the potential to provide unprecedented mechanistic understanding of human diseases and enable in-depth pharmacological profiling of drug candidates to increase the probability of success (POS) in drug discovery and development.
Affiliation(s)
- Liang Schweizer
- HiFiBiO Therapeutics, 237 Putnam Avenue, Cambridge, MA 02139, USA.
17
Yin X, Wang X, Li Y, Wang J, Wang Y, Deng Y, Hou T, Liu H, Luo P, Yao X. CODD-Pred: A Web Server for Efficient Target Identification and Bioactivity Prediction of Small Molecules. J Chem Inf Model 2023; 63:6169-6176. [PMID: 37820365 DOI: 10.1021/acs.jcim.3c00685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/13/2023]
Abstract
Target identification and bioactivity prediction are critical steps in the drug discovery process. Here we introduce CODD-Pred (COmprehensive Drug Design Predictor), an online web server with well-curated data sets from the GOSTAR database, which is designed with a dual purpose of predicting potential protein drug targets and computing bioactivity values of small molecules. We first designed a double molecular graph perception (DMGP) framework for target prediction based on a large library of 646 498 small molecules interacting with 640 human targets. The framework achieved a top-5 accuracy of over 80% for hitting at least one target on both external validation sets. Additionally, its performance on the external validation set comprising 200 molecules surpassed that of four existing target prediction servers. Second, we collected 56 targets closely related to the occurrence and development of cancer, metabolic diseases, and inflammatory immune diseases and developed a multi-model self-validation activity prediction (MSAP) framework that enables accurate bioactivity quantification predictions for small-molecule ligands of these 56 targets. CODD-Pred is a handy tool for rapid evaluation and optimization of small molecules with specific target activity. CODD-Pred is freely accessible at http://codd.iddd.group/.
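The "top-5 accuracy for hitting at least one target" reported in this abstract can be read as a hit rate over ranked predictions. The sketch below is one plausible reading, not the authors' evaluation code: a molecule counts as a hit when any of its known targets appears among the first k predicted targets. The function name `top_k_hit_rate` and the target lists are hypothetical.

```python
from typing import List, Set


def top_k_hit_rate(
    ranked_predictions: List[List[str]],
    true_targets: List[Set[str]],
    k: int = 5,
) -> float:
    """Fraction of molecules with at least one true target in the top-k predictions."""
    hits = 0
    for ranked, truth in zip(ranked_predictions, true_targets):
        # A hit requires a non-empty overlap between the top-k list and the truth set.
        if set(ranked[:k]) & truth:
            hits += 1
    return hits / len(ranked_predictions)


# Hypothetical ranked target predictions for three molecules (best first).
ranked = [
    ["EGFR", "HER2", "VEGFR2", "ABL1", "SRC", "KIT"],
    ["DRD2", "HTR2A", "ADRB1", "ADRB2", "CHRM1", "HRH1"],
    ["PPARG", "ESR1", "AR", "NR3C1", "RXRA", "PTGS2"],
]
# Hypothetical known targets for the same molecules.
truth = [{"SRC"}, {"HRH1"}, {"ESR1", "PTGS2"}]

rate = top_k_hit_rate(ranked, truth, k=5)
print(f"top-5 hit rate: {rate:.3f}")
```

In this toy case the second molecule is a miss because its only known target is ranked sixth, which shows how sensitive the metric is to the choice of k.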
Affiliation(s)
- Xiaodan Yin
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, 999078, China
- Carbon-Silicon AI Technology Co., Ltd, Zhejiang, Hangzhou 310018, China
- Xiaorui Wang
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, 999078, China
- Carbon-Silicon AI Technology Co., Ltd, Zhejiang, Hangzhou 310018, China
- Yuquan Li
- College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou, 730000, China
- Jike Wang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058, China
- Yuwei Wang
- College of Pharmacy, Shaanxi University of Chinese Medicine, Xianyang, 712000, China
- Yafeng Deng
- Carbon-Silicon AI Technology Co., Ltd, Zhejiang, Hangzhou 310018, China
- Tingjun Hou
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058, China
- Huanxiang Liu
- Faculty of Applied Sciences, Macao Polytechnic University, Macao, 999078, China
- Pei Luo
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, 999078, China
- Xiaojun Yao
- Faculty of Applied Sciences, Macao Polytechnic University, Macao, 999078, China
18
Abouelwafa M, Ibrahim TM, El-Hadidi MS, Mahnashi MH, Owaidah AY, Saeedi NH, Attia HG, Georrge JJ, Mostafa A. Using CADD tools to inhibit the overexpressed genes FAP, FN1, and MMP1 by repurposing ginsenoside C and Rg1 as a treatment for oral cancer. Front Mol Biosci 2023; 10:1248885. [PMID: 37936719 PMCID: PMC10627001 DOI: 10.3389/fmolb.2023.1248885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 08/11/2023] [Indexed: 11/09/2023] Open
Abstract
Oral cancer is one of the most common cancer types. Many factors can express certain genes that cause the proliferation of oral tissues. Overexpressed genes were detected in oral cancer patients; three were highly impacted. FAP, FN1, and MMP1 were the targeted genes that showed inhibition results in silico by ginsenoside C and Rg1. Approved drugs were retrieved from the DrugBank database. The docking scores show an excellent interaction between the ligands and the targeted macromolecules. Further molecular dynamics simulations showed the binding stability of the proposed natural products. This work recommends repurposing ginsenoside C and Rg1 as potential binders for the selected targets and endorses future experimental validation for the treatment of oral cancer.
Affiliation(s)
- Manal Abouelwafa
- Department of Bioinformatics, Christ College, Rajkot, Gujarat, India
- Tamer M. Ibrahim
- Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Kafrelsheikh University, Kafrelsheikh, Egypt
- Bioinformatics Group, Center for Informatics Sciences, School of Information Technology and Computer Science, Nile University, Giza, Egypt
- Mohamed S. El-Hadidi
- Bioinformatics Group, Center for Informatics Sciences, School of Information Technology and Computer Science, Nile University, Giza, Egypt
- Mater H. Mahnashi
- Department of Pharmaceutical Chemistry, College of Pharmacy, Najran University, Najran, Saudi Arabia
- Amani Y. Owaidah
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Imam Abdulrahman bin Faisal University, Dammam, Saudi Arabia
- Nizar H. Saeedi
- Department of Medical Laboratory Technology, Faculty of Applied Medical Sciences, University of Tabuk, Tabuk, Saudi Arabia
- Hany G. Attia
- Department of Pharmacognosy, College of Pharmacy, Najran University, Najran, Saudi Arabia
- John J. Georrge
- Department of Bioinformatics, University of North Bengal, West Bengal, India
- Amany Mostafa
- Nanomedicine and Tissue Engineering Laboratory, Medical Research Centre of Excellence, National Research Centre (NRC), Cairo, Egypt
19
Deng J, Yang Z, Wang H, Ojima I, Samaras D, Wang F. A systematic study of key elements underlying molecular property prediction. Nat Commun 2023; 14:6395. [PMID: 37833262 PMCID: PMC10575948 DOI: 10.1038/s41467-023-41948-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/24/2022] [Accepted: 09/18/2023] [Indexed: 10/15/2023] Open
Abstract
Artificial intelligence (AI) has been widely applied in drug discovery, with molecular property prediction as a major task. Despite booming techniques in molecular representation learning, key elements underlying molecular property prediction remain largely unexplored, which impedes further advancements in this field. Herein, we conduct an extensive evaluation of representative models using various representations on the MoleculeNet datasets, a suite of opioids-related datasets and two additional activity datasets from the literature. To investigate the predictive power in low-data and high-data space, a series of descriptor datasets of varying sizes are also assembled to evaluate the models. In total, we have trained 62,820 models, including 50,220 models on fixed representations, 4,200 models on SMILES sequences and 8,400 models on molecular graphs. Based on extensive experimentation and rigorous comparison, we show that representation learning models exhibit limited performance in molecular property prediction in most datasets. Besides, multiple key elements underlying molecular property prediction can affect the evaluation results. Furthermore, we show that activity cliffs can significantly impact model prediction. Finally, we explore potential reasons why representation learning models can fail and show that dataset size is essential for representation learning models to excel.
Affiliation(s)
- Jianyuan Deng
- Stony Brook University, Department of Biomedical Informatics, Stony Brook, NY, 11794, USA
- Zhibo Yang
- Stony Brook University, Department of Computer Science, Stony Brook, NY, 11794, USA
- Hehe Wang
- Stony Brook University, Department of Chemistry, Stony Brook, NY, 11794, USA
- Iwao Ojima
- Stony Brook University, Department of Chemistry, Stony Brook, NY, 11794, USA
- Dimitris Samaras
- Stony Brook University, Department of Computer Science, Stony Brook, NY, 11794, USA
- Fusheng Wang
- Stony Brook University, Department of Biomedical Informatics, Stony Brook, NY, 11794, USA.
- Stony Brook University, Department of Computer Science, Stony Brook, NY, 11794, USA.
20
Kırboğa KK, Abbasi S, Küçüksille EU. Explainability and white box in drug discovery. Chem Biol Drug Des 2023; 102:217-233. [PMID: 37105727 DOI: 10.1111/cbdd.14262] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/12/2022] [Revised: 03/24/2023] [Accepted: 04/12/2023] [Indexed: 04/29/2023]
Abstract
Recently, artificial intelligence (AI) techniques have been increasingly used to overcome the challenges in drug discovery. Although traditional AI techniques generally achieve high accuracy, their decision processes and patterns can be difficult to explain. This makes it hard to understand and interpret the outputs of algorithms used in drug discovery. Explainable AI (XAI) techniques make the causes and consequences of the decision process easier to understand, which can help further improve the drug discovery process and support the right decisions. To address this issue, Explainable Artificial Intelligence (XAI) emerged as a process and method that securely captures the results and outputs of machine learning (ML) and deep learning (DL) algorithms. Techniques such as SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) have made the drug targeting phase clearer and more understandable. XAI methods are expected to reduce time and cost in future computational drug discovery studies. This review provides a comprehensive overview of XAI-based drug discovery and development prediction. It also covers XAI mechanisms intended to increase confidence in AI and modeling methods. The limitations and future directions of XAI in drug discovery are also discussed.
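SHAP, mentioned in this abstract, attributes a prediction to input features using Shapley values. The sketch below computes exact Shapley values by enumerating feature coalitions for a tiny toy model. It illustrates the underlying formula only and does not use or reproduce the SHAP library, which relies on approximations for realistic models. The toy model, the zero baseline, and the names `shapley_values` and `toy_value` are assumptions made for illustration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Set


def shapley_values(value_fn: Callable[[Set[int]], float], n_features: int) -> List[float]:
    """Exact Shapley value of each feature by enumerating all coalitions.

    value_fn(S) returns the model output when only features in S are "present".
    The cost grows exponentially with n_features, so this only suits toy cases.
    """
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [p for p in range(n_features) if p != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley average.
            weight = (
                factorial(size) * factorial(n_features - size - 1) / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def toy_value(coalition: Set[int]) -> float:
    """Toy model f(x) = 2*x0 + 3*x1 + x0*x2 at x = (1, 1, 1); absent features are set to 0."""
    x = [1.0 if j in coalition else 0.0 for j in range(3)]
    return 2 * x[0] + 3 * x[1] + x[0] * x[2]


phi = shapley_values(toy_value, 3)
print([round(p, 3) for p in phi])
```

The attributions sum to the difference between the full prediction and the baseline prediction, and the interaction term `x0*x2` is split evenly between the two features involved.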
Affiliation(s)
- Kevser Kübra Kırboğa
- Bioengineering Department, Bilecik Seyh Edebali University, Bilecik, Turkey
- Informatics Institute, Istanbul Technical University, Maslak, Turkey
- Sumra Abbasi
- Department of Biological Sciences, National of Medical Sciences, Rawalpindi, Pakistan
- Ecir Uğur Küçüksille
- Department of Computer Engineering, Süleyman Demirel University, Isparta, Turkey
21
Qureshi R, Irfan M, Gondal TM, Khan S, Wu J, Hadi MU, Heymach J, Le X, Yan H, Alam T. AI in drug discovery and its clinical relevance. Heliyon 2023; 9:e17575. [PMID: 37396052 PMCID: PMC10302550 DOI: 10.1016/j.heliyon.2023.e17575] [Citation(s) in RCA: 28] [Impact Index Per Article: 28.0] [Received: 01/10/2023] [Revised: 06/17/2023] [Accepted: 06/21/2023] [Indexed: 07/04/2023] Open
Abstract
The COVID-19 pandemic has emphasized the need for novel drug discovery processes. However, the journey from conceptualizing a drug to its eventual implementation in clinical settings is a long, complex, and expensive process, with many potential points of failure. Over the past decade, a vast growth in medical information has coincided with advances in computational hardware (cloud computing, GPUs, and TPUs) and the rise of deep learning. Medical data generated from large molecular screening profiles, personal health or pathology records, and public health organizations could benefit from analysis by Artificial Intelligence (AI) approaches to speed up and prevent failures in the drug discovery pipeline. We present applications of AI at various stages of drug discovery pipelines, including the inherently computational approaches of de novo design and prediction of a drug's likely properties. Open-source databases and AI-based software tools that facilitate drug design are discussed along with their associated problems of molecule representation, data collection, complexity, labeling, and disparities among labels. How contemporary AI methods, such as graph neural networks, reinforcement learning, and generative models, along with structure-based methods (i.e., molecular dynamics simulations and molecular docking), can contribute to drug discovery applications and analysis of drug responses is also explored. Finally, recent developments and investments in AI-based start-up companies for biotechnology, drug design and their current progress, hopes and promotions are discussed in this article.
Affiliation(s)
- Rizwan Qureshi
- College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Department of Imaging Physics, MD Anderson Cancer Center, The University of Texas, Houston, USA
- Muhammad Irfan
- Faculty of Electrical Engineering, Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Swabi, Pakistan
- Sheheryar Khan
- School of Professional Education & Executive Development, The Hong Kong Polytechnic University, Hong Kong
- Jia Wu
- Department of Imaging Physics, MD Anderson Cancer Center, The University of Texas, Houston, USA
- John Heymach
- Department of Thoracic Head and Neck Medical Oncology, Division of Cancer Medicine, The University of Texas, MD Anderson Cancer Center, Houston, USA
- Xiuning Le
- Department of Thoracic Head and Neck Medical Oncology, Division of Cancer Medicine, The University of Texas, MD Anderson Cancer Center, Houston, USA
- Hong Yan
- Department of Electrical Engineering, City University of Hong Kong, Kowloon, Hong Kong
- Tanvir Alam
- College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
22
Smer-Barreto V, Quintanilla A, Elliott RJR, Dawson JC, Sun J, Campa VM, Lorente-Macías Á, Unciti-Broceta A, Carragher NO, Acosta JC, Oyarzún DA. Discovery of senolytics using machine learning. Nat Commun 2023; 14:3445. [PMID: 37301862 PMCID: PMC10257182 DOI: 10.1038/s41467-023-39120-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 10/05/2022] [Accepted: 05/31/2023] [Indexed: 06/12/2023] Open
Abstract
Cellular senescence is a stress response involved in ageing and diverse disease processes including cancer, type-2 diabetes, osteoarthritis and viral infection. Despite growing interest in targeted elimination of senescent cells, only few senolytics are known due to the lack of well-characterised molecular targets. Here, we report the discovery of three senolytics using cost-effective machine learning algorithms trained solely on published data. We computationally screened various chemical libraries and validated the senolytic action of ginkgetin, periplocin and oleandrin in human cell lines under various modalities of senescence. The compounds have potency comparable to known senolytics, and we show that oleandrin has improved potency over its target as compared to best-in-class alternatives. Our approach led to several hundred-fold reduction in drug screening costs and demonstrates that artificial intelligence can take maximum advantage of small and heterogeneous drug screening data, paving the way for new open science approaches to early-stage drug discovery.
Affiliation(s)
- Vanessa Smer-Barreto
- Cancer Research UK Edinburgh Centre, MRC Institute of Genetics and Cancer, University of Edinburgh, Crewe Road, Edinburgh, EH4 2XR, UK.
- Andrea Quintanilla
- Instituto de Biomedicina y Biotecnología de Cantabria (IBBTEC), CSIC-Universidad de Cantabria-SODERCAN. C/ Albert Einstein 22, Santander, 39011, Spain
- Richard J R Elliott
- Cancer Research UK Edinburgh Centre, MRC Institute of Genetics and Cancer, University of Edinburgh, Crewe Road, Edinburgh, EH4 2XR, UK
- John C Dawson
- Cancer Research UK Edinburgh Centre, MRC Institute of Genetics and Cancer, University of Edinburgh, Crewe Road, Edinburgh, EH4 2XR, UK
- Jiugeng Sun
- School of Informatics, University of Edinburgh, 10 Crichton St, Edinburgh, EH8 9AB, UK
- Víctor M Campa
- Instituto de Biomedicina y Biotecnología de Cantabria (IBBTEC), CSIC-Universidad de Cantabria-SODERCAN. C/ Albert Einstein 22, Santander, 39011, Spain
- Álvaro Lorente-Macías
- Cancer Research UK Edinburgh Centre, MRC Institute of Genetics and Cancer, University of Edinburgh, Crewe Road, Edinburgh, EH4 2XR, UK
- Asier Unciti-Broceta
- Cancer Research UK Edinburgh Centre, MRC Institute of Genetics and Cancer, University of Edinburgh, Crewe Road, Edinburgh, EH4 2XR, UK
- Neil O Carragher
- Cancer Research UK Edinburgh Centre, MRC Institute of Genetics and Cancer, University of Edinburgh, Crewe Road, Edinburgh, EH4 2XR, UK
- Juan Carlos Acosta
- Cancer Research UK Edinburgh Centre, MRC Institute of Genetics and Cancer, University of Edinburgh, Crewe Road, Edinburgh, EH4 2XR, UK.
- Instituto de Biomedicina y Biotecnología de Cantabria (IBBTEC), CSIC-Universidad de Cantabria-SODERCAN. C/ Albert Einstein 22, Santander, 39011, Spain.
- Diego A Oyarzún
- School of Informatics, University of Edinburgh, 10 Crichton St, Edinburgh, EH8 9AB, UK.
- School of Biological Sciences, University of Edinburgh, Max Born Crescent, Edinburgh, EH9 3BF, UK.
- The Alan Turing Institute, 96 Euston Road, London, NW1 2DB, UK.
23
Kang H, Hou L, Gu Y, Lu X, Li J, Li Q. Drug-disease association prediction with literature based multi-feature fusion. Front Pharmacol 2023; 14:1205144. [PMID: 37284317 PMCID: PMC10239876 DOI: 10.3389/fphar.2023.1205144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Accepted: 05/09/2023] [Indexed: 06/08/2023] Open
Abstract
Introduction: Exploring the potential efficacy of a drug is a valid approach for drug development with shorter development times and lower costs. Recently, several computational drug repositioning methods have been introduced to learn multi-features for potential association prediction. However, fully leveraging the vast amount of information in the scientific literature to enhance drug-disease association prediction is a great challenge. Methods: We constructed a drug-disease association prediction method called Literature Based Multi-Feature Fusion (LBMFF), which effectively integrated known drugs, diseases, side effects and target associations from public databases as well as literature semantic features. Specifically, a pre-training and fine-tuning BERT model was introduced to extract literature semantic information for similarity assessment. Then, we revealed drug and disease embeddings from the constructed fusion similarity matrix by a graph convolutional network with an attention mechanism. Results: LBMFF achieved superior performance in drug-disease association prediction with an AUC value of 0.8818 and an AUPR value of 0.5916. Discussion: LBMFF achieved relative improvements of 31.67% and 16.09%, respectively, over the second-best results, compared to single feature methods and seven existing state-of-the-art prediction methods on the same test datasets. Meanwhile, case studies have verified that LBMFF can discover new associations to accelerate drug development. The proposed benchmark dataset and source code are available at: https://github.com/kang-hongyu/LBMFF.
Affiliation(s)
- Hongyu Kang
  - Department of Biomedical Engineering, School of Life Science, Beijing Institute of Technology, Beijing, China
  - Institute of Medical Information, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Li Hou
  - Institute of Medical Information, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yaowen Gu
  - Institute of Medical Information, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Xiao Lu
  - Department of Biomedical Engineering, School of Life Science, Beijing Institute of Technology, Beijing, China
- Jiao Li
  - Institute of Medical Information, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Qin Li
  - Department of Biomedical Engineering, School of Life Science, Beijing Institute of Technology, Beijing, China
|
24
|
Gircha AI, Boev AS, Avchaciov K, Fedichev PO, Fedorov AK. Hybrid quantum-classical machine learning for generative chemistry and drug design. Sci Rep 2023; 13:8250. [PMID: 37217521 DOI: 10.1038/s41598-023-32703-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Accepted: 03/31/2023] [Indexed: 05/24/2023] Open
Abstract
Deep generative chemistry models emerge as powerful tools to expedite drug discovery. However, the immense size and complexity of the structural space of all possible drug-like molecules pose significant obstacles, which could be overcome with hybrid architectures combining quantum computers with deep classical networks. As the first step toward this goal, we built a compact discrete variational autoencoder (DVAE) with a Restricted Boltzmann Machine (RBM) of reduced size in its latent layer. The size of the proposed model was small enough to fit on a state-of-the-art D-Wave quantum annealer and allowed training on a subset of the ChEMBL dataset of biologically active compounds. Finally, we generated 2331 novel chemical structures with medicinal chemistry and synthetic accessibility properties in the ranges typical for molecules from ChEMBL. The presented results demonstrate the feasibility of using already existing or soon-to-be-available quantum computing devices as testbeds for future drug discovery applications.
Affiliation(s)
- A I Gircha
  - Russian Quantum Center, Skolkovo, Moscow, 121205, Russia
- A S Boev
  - Russian Quantum Center, Skolkovo, Moscow, 121205, Russia
- K Avchaciov
  - Gero PTE. LTD., 133 Cecil Street #14-01 Keck Seng Tower, Singapore, 069535, Singapore
- P O Fedichev
  - Gero PTE. LTD., 133 Cecil Street #14-01 Keck Seng Tower, Singapore, 069535, Singapore
- A K Fedorov
  - Russian Quantum Center, Skolkovo, Moscow, 121205, Russia
|
25
|
Luo Y, Wang P, Mou M, Zheng H, Hong J, Tao L, Zhu F. A novel strategy for designing the magic shotguns for distantly related target pairs. Brief Bioinform 2023; 24:6984790. [PMID: 36631399 DOI: 10.1093/bib/bbac621] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/02/2022] [Revised: 11/09/2022] [Accepted: 12/17/2022] [Indexed: 01/13/2023] Open
Abstract
Owing to its promise for improving drug efficacy, polypharmacology has emerged as a new theme in drug discovery for complex diseases. In the discovery of novel multi-target drugs (MTDs), in silico strategies have become essential because of their high throughput and low cost. However, current research mostly targets typical, closely related target pairs. Because of the intricate pathogenesis networks of complex diseases, many distantly related targets are found to play crucial roles in synergistic treatment. Therefore, an innovative method for developing drugs that simultaneously act on distantly related target pairs is of utmost importance. At the same time, reducing the false discovery rate in the design of MTDs remains a daunting technical difficulty. In this research, effective small-molecule clustering in the positive dataset, together with a putative negative dataset generation strategy, was adopted during model construction. In a comprehensive assessment on 10 target pairs with hierarchical similarity levels, the proposed strategy successfully reduced the false discovery rate. Constructed model types with much smaller numbers of inhibitor molecules gained considerable yields and showed better false-hit controllability than before. To further evaluate the generalization ability, an in-depth assessment of high-throughput virtual screening on the ChEMBL database was conducted. As a result, this novel strategy hierarchically improved the enrichment factors for each target pair (especially for distantly related or unrelated target pairs), in line with the target pairs' similarity levels.
Affiliation(s)
- Yongchao Luo
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Panpan Wang
  - College of Chemistry and Pharmaceutical Engineering, Huanghuai University, Zhumadian 463000, China
- Minjie Mou
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Hanqi Zheng
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Jiajun Hong
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Lin Tao
  - Key Laboratory of Elemene Class Anti-Cancer Chinese Medicine of Zhejiang Province, School of Medicine, Hangzhou Normal University, Hangzhou 310036, China
- Feng Zhu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
|
26
|
Wang Y, Huang M, Deng H, Li W, Wu Z, Tang Y, Liu G. Identification of vital chemical information via visualization of graph neural networks. Brief Bioinform 2023; 24:6936421. [PMID: 36537081 DOI: 10.1093/bib/bbac577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2022] [Revised: 11/02/2022] [Accepted: 11/25/2022] [Indexed: 12/24/2022] Open
Abstract
Qualitative or quantitative prediction models of structure-activity relationships based on graph neural networks (GNNs) are prevalent in drug discovery applications and commonly have excellent predictive power. However, the network information flows of GNNs are highly complex and poorly interpretable. Unfortunately, there are relatively few studies on GNN attributions, and their development in drug research is still at an early stage. In this work, we adopted several advanced attribution techniques for different GNN frameworks and applied them to explain multiple drug molecule property prediction tasks, enabling the identification and visualization of vital chemical information in the networks. Additionally, we evaluated them quantitatively with attribution metrics such as accuracy, sparsity, fidelity and infidelity, stability and sensitivity; discussed their applicability and limitations; and provided an open-source benchmark platform for researchers. The results showed that all attribution techniques were effective, while those directly related to the predicted labels, such as integrated gradients, tended to have better attribution performance. The attribution techniques we have implemented could be used directly for the vast majority of chemical GNN interpretation tasks.
Affiliation(s)
- Yimeng Wang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Mengting Huang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Hua Deng
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Weihua Li
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Zengrui Wu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Yun Tang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Guixia Liu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
|
27
|
Long TZ, Shi SH, Liu S, Lu AP, Liu ZQ, Li M, Hou TJ, Cao DS. Structural Analysis and Prediction of Hematotoxicity Using Deep Learning Approaches. J Chem Inf Model 2023; 63:111-125. [PMID: 36472475 DOI: 10.1021/acs.jcim.2c01088] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 12/12/2022]
Abstract
Hematotoxicity has become a serious but overlooked toxicity in drug discovery. However, only a few in silico models have been reported for the prediction of hematotoxicity. In this study, we constructed a high-quality dataset comprising 759 hematotoxic compounds and 1623 nonhematotoxic compounds and then established a series of classification models based on a combination of seven machine learning (ML) algorithms and nine molecular representations. The results based on two data partitioning strategies and applicability domain (AD) analysis illustrate that the best prediction model, based on Attentive FP, yielded a balanced accuracy (BA) of 72.6% and an area under the receiver operating characteristic curve (AUC) value of 76.8% for the validation set, and a BA of 69.2% and an AUC of 75.9% for the test set. In addition, compared with existing filtering rules and models, our model achieved the highest BA value of 67.5% for the external validation set. Additionally, the Shapley additive explanations (SHAP) and atom heatmap approaches were utilized to discover the important features and structural fragments related to hematotoxicity, which could offer helpful tips to detect undesired positive substances. Furthermore, matched molecular pair analysis (MMPA) and a representative substructure derivation technique were employed to further characterize and investigate the transformation principles and distinctive structural features of hematotoxic chemicals. We believe that the novel graph-based deep learning algorithms and insightful interpretation presented in this study can be used as a trustworthy and effective tool to assess hematotoxicity in the development of new drugs.
Affiliation(s)
- Teng-Zhi Long
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
- Shao-Hua Shi
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
  - Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, 0000, P. R. China
- Shao Liu
  - Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008, Hunan, P. R. China
- Ai-Ping Lu
  - Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, 0000, P. R. China
- Zhao-Qian Liu
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
- Min Li
  - School of Computer Science and Engineering, Central South University, Changsha 410083, P. R. China
- Ting-Jun Hou
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
- Dong-Sheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
  - Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, 0000, P. R. China
  - Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008, Hunan, P. R. China
|
28
|
Dong X, Wong R, Lyu W, Abell-Hart K, Deng J, Liu Y, Hajagos JG, Rosenthal RN, Chen C, Wang F. An integrated LSTM-HeteroRGNN model for interpretable opioid overdose risk prediction. Artif Intell Med 2023; 135:102439. [PMID: 36628797 PMCID: PMC9630306 DOI: 10.1016/j.artmed.2022.102439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2022] [Revised: 09/27/2022] [Accepted: 10/28/2022] [Indexed: 11/06/2022]
Abstract
Opioid overdose (OD) has become a leading cause of accidental death in the United States, and overdose deaths reached a record high during the COVID-19 pandemic. Combating the opioid crisis requires targeting high-need populations by identifying individuals at risk of OD. While deep learning emerges as a powerful method for building predictive models using large scale electronic health records (EHR), it is challenged by the complex intrinsic relationships among EHR data. Further, its utility is limited by the lack of clinically meaningful explainability, which is necessary for making informed clinical or policy decisions using such models. In this paper, we present LIGHTED, an integrated deep learning model combining long short term memory (LSTM) and graph neural networks (GNN) to predict patients' OD risk. The LIGHTED model can incorporate the temporal effects of disease progression and the knowledge learned from interactions among clinical features. We evaluated the model using Cerner's Health Facts database with over 5 million patients. Our experiments demonstrated that the model outperforms traditional machine learning methods and other deep learning models. We also proposed a novel interpretability method by exploiting embeddings provided by GNNs to cluster patients and EHR features respectively, and conducted qualitative feature cluster analysis for clinical interpretations. Our study shows that LIGHTED can take advantage of longitudinal EHR data and the intrinsic graph structure of EHRs among patients to provide effective and interpretable OD risk predictions that may potentially improve clinical decision support.
Affiliation(s)
- Xinyu Dong
  - Department of Computer Science, Stony Brook University, Stony Brook, NY, United States of America
- Rachel Wong
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, United States of America
- Weimin Lyu
  - Department of Computer Science, Stony Brook University, Stony Brook, NY, United States of America
- Kayley Abell-Hart
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, United States of America
- Jianyuan Deng
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, United States of America
- Yinan Liu
  - Department of Computer Science, Stony Brook University, Stony Brook, NY, United States of America
- Janos G. Hajagos
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, United States of America
- Richard N. Rosenthal
  - Department of Psychiatry, Renaissance School of Medicine at Stony Brook University, Stony Brook, NY, United States of America
- Chao Chen
  - Department of Computer Science, Stony Brook University, Stony Brook, NY, United States of America
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, United States of America
- Fusheng Wang
  - Department of Computer Science, Stony Brook University, Stony Brook, NY, United States of America
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, United States of America
|
29
|
Wan H, Liu Q, Ju Y. Utilize a few features to classify presynaptic and postsynaptic neurotoxins. Comput Biol Med 2023; 152:106380. [PMID: 36473343 DOI: 10.1016/j.compbiomed.2022.106380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2022] [Revised: 10/21/2022] [Accepted: 11/28/2022] [Indexed: 12/02/2022]
Abstract
Neurotoxins are a class of proteins that have a significant damaging effect on nerve tissue. Neurotoxins are classified into presynaptic neurotoxins and postsynaptic neurotoxins, and accurate identification of neurotoxins plays a key role in drug development. In this study, 90 presynaptic neurotoxins and 165 postsynaptic neurotoxins were classified. The features of the presynaptic and postsynaptic neurotoxin sequences were extracted using the AutoProp feature extraction method, and feature selection was performed using the maximum relevance maximum distance (MRMD) program. Finally, only two features were retained to achieve 84.7% classification accuracy. Moreover, it was found that the two retained features were present in the conserved sites and motifs of presynaptic neurotoxins and could represent the critical structures of presynaptic neurotoxins. This method demonstrates that using a few key features to classify proteins can effectively identify critical protein structures.
Affiliation(s)
- Hao Wan
  - Institute of Advanced Cross-field Science, College of Life Science, Qingdao University, Qingdao, China
- Qing Liu
  - Department of Anesthesiology, Hospital (T.C.M) Affiliated to Southwest Medical University, Luzhou, China
- Ying Ju
  - School of Informatics, Xiamen University, Xiamen, China
|
30
|
Nguyen MT, Nguyen T, Tran T. Learning to discover medicines. Int J Data Sci Anal 2022; 16:1-16. [PMID: 36440369 PMCID: PMC9676887 DOI: 10.1007/s41060-022-00371-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Accepted: 11/05/2022] [Indexed: 11/19/2022]
Abstract
Discovering new medicines is the hallmark of the human endeavor to live a better and longer life. Yet the pace of discovery has slowed down as we need to venture into more wildly unexplored biomedical space to find one that matches today's high standard. Modern AI, enabled by powerful computing, large biomedical databases, and breakthroughs in deep learning, offers new hope to break this loop, as AI is rapidly maturing and ready to make a huge impact in the area. In this paper, we review recent advances in AI methodologies that aim to crack this challenge. We organize the vast and rapidly growing literature on AI for drug discovery into three relatively stable sub-areas: (a) representation learning over molecular sequences and geometric graphs; (b) data-driven reasoning where we predict molecular properties and their binding, optimize existing compounds, generate de novo molecules, and plan the synthesis of target molecules; and (c) knowledge-based reasoning where we discuss the construction and reasoning over biomedical knowledge graphs. We will also identify open challenges and chart possible research directions for the years to come.
Affiliation(s)
- Minh-Tri Nguyen
  - Applied Artificial Intelligence Institute, Deakin University, Burwood, VIC Australia
- Thin Nguyen
  - Applied Artificial Intelligence Institute, Deakin University, Burwood, VIC Australia
- Truyen Tran
  - Applied Artificial Intelligence Institute, Deakin University, Burwood, VIC Australia
|
31
|
Is the reductionist paradox an Achilles Heel of drug discovery? J Comput Aided Mol Des 2022; 36:329-338. [PMID: 35861913 DOI: 10.1007/s10822-022-00457-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/27/2022] [Accepted: 05/02/2022] [Indexed: 10/17/2022]
|