1
Li J, Zhang J, Guo R, Dai J, Niu Z, Wang Y, Wang T, Jiang X, Hu W. Progress of machine learning in the application of small molecule druggability prediction. Eur J Med Chem 2025; 285:117269. [PMID: 39808972 DOI: 10.1016/j.ejmech.2025.117269] [Received: 10/18/2024] [Revised: 01/07/2025] [Accepted: 01/08/2025] [Indexed: 01/16/2025]
Abstract
Machine learning (ML) has become an important tool for predicting the pharmaceutical properties of small molecules. Recent advancements in ML algorithms enable the rapid and accurate evaluation of solubility, activity, toxicity, pharmacokinetics, and other molecular properties through ML-based models. By virtually screening drug targets and elucidating drug-target protein interactions, researchers can make preliminary evaluations of the activity and safety of compounds from ultra-large compound libraries, thereby accelerating the screening process for lead compounds. Moreover, ML leverages existing experimental data to train and generate new datasets, addressing the challenge of limited compound and protein target data. This review provided a concise overview of ML applications in predicting small molecule properties, focusing on model construction principles, molecular feature selection, and other essential aspects. It also discussed the potential applications of ML in the screening of pharmaceutical small molecules.
Affiliation(s)
- Junyao Li
- School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou, China; School of Life Sciences, Huaiyin Normal University, Huaian, 223300, China; Institute of Translational Medicine, School of Medicine, Yangzhou University, Yangzhou, 225009, China
- Jianmei Zhang
- School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou, China
- Rui Guo
- School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou, China; Institute of Translational Medicine, School of Medicine, Yangzhou University, Yangzhou, 225009, China
- Jiawei Dai
- Institute of Translational Medicine, School of Medicine, Yangzhou University, Yangzhou, 225009, China
- Zhiqiang Niu
- Institute of Translational Medicine, School of Medicine, Yangzhou University, Yangzhou, 225009, China
- Yan Wang
- School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou, China
- Taoyun Wang
- School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou, China.
- Xiaojian Jiang
- School of Life Sciences, Huaiyin Normal University, Huaian, 223300, China.
- Weicheng Hu
- Institute of Translational Medicine, School of Medicine, Yangzhou University, Yangzhou, 225009, China.
2
Yang S, Bai M, Liu W, Li W, Zhong Z, Kwok LY, Dong G, Sun Z. Predicting Lactobacillus delbrueckii subsp. bulgaricus-Streptococcus thermophilus interactions based on a highly accurate semi-supervised learning method. Sci China Life Sci 2025; 68:558-574. [PMID: 39417929 DOI: 10.1007/s11427-023-2569-7] [Received: 12/18/2023] [Accepted: 03/15/2024] [Indexed: 10/19/2024]
Abstract
Lactobacillus delbrueckii subsp. bulgaricus (L. bulgaricus) and Streptococcus thermophilus (S. thermophilus) are commonly used starters in milk fermentation. Fermentation experiments revealed that L. bulgaricus-S. thermophilus interactions (LbStI) substantially impact dairy product quality and production. Because traditional wet-lab experiments are time-consuming and labor-intensive for screening interaction combinations, an artificial intelligence-based method for screening interactive starter combinations is necessary. However, in current research on artificial intelligence-based interaction prediction in bioinformatics, most successful models adopt supervised learning methods, and there is a lack of research on interaction prediction with only a small number of labeled samples. Hence, this study aimed to develop a semi-supervised learning framework for predicting LbStI using genomic data from 362 isolates (181 per species). The framework consisted of a two-part model: a co-clustering prediction model (based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) dataset) and a Laplacian regularized least squares prediction model (based on K-mer analysis and gene composition of all isolates datasets). To enhance accuracy, we integrated the separate outcomes produced by each component of the two-part model to generate the final LbStI prediction results. Validation through milk fermentation experiments confirmed a high precision rate of 85% (17/20; validated with 20 randomly selected combinations of expected interacting isolates). Our data suggest that the biosynthetic pathways of cysteine, riboflavin, teichoic acid, and exopolysaccharides, as well as the ATP-binding cassette transport systems, contribute to the mutualistic relationship between these starter bacteria during milk fermentation. However, this finding requires further experimental verification. The presented model and data are valuable resources for academics and industry professionals interested in screening dairy starter cultures and understanding their interactions.
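The K-mer analysis mentioned in this abstract can be illustrated with a minimal sketch. It assumes that k-mers are counted with overlapping windows over a nucleotide string and normalized to relative frequencies; the function name `kmer_profile`, the choice of k = 3, and the toy sequence are illustrative assumptions, not details taken from the study.

```python
from collections import Counter


def kmer_profile(sequence, k=3):
    """Return the relative frequency of each overlapping k-mer in `sequence`."""
    seq = sequence.upper()
    if k <= 0 or len(seq) < k:
        return {}
    # Slide a window of width k over the sequence, one position at a time.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


# Toy genome fragment (hypothetical, for illustration only).
profile = kmer_profile("ATGATGCA", k=3)
for kmer, freq in sorted(profile.items()):
    print(kmer, round(freq, 3))
```

Profiles like this give every isolate a fixed-length numeric description, which is the kind of input a similarity-based model such as regularized least squares can work with.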
Affiliation(s)
- Shujuan Yang
- Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Collaborative Innovative Center for Lactic Acid Bacteria and Fermented Dairy Products, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Mei Bai
- Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Collaborative Innovative Center for Lactic Acid Bacteria and Fermented Dairy Products, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Weichi Liu
- College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application of Agriculture and Animal Husbandry, Hohhot, 010018, China
- Weicheng Li
- Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Collaborative Innovative Center for Lactic Acid Bacteria and Fermented Dairy Products, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Zhi Zhong
- Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Collaborative Innovative Center for Lactic Acid Bacteria and Fermented Dairy Products, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Lai-Yu Kwok
- Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Collaborative Innovative Center for Lactic Acid Bacteria and Fermented Dairy Products, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China
- Gaifang Dong
- College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, China.
- Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application of Agriculture and Animal Husbandry, Hohhot, 010018, China.
- Zhihong Sun
- Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China.
- Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, 010018, China.
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, China.
- Collaborative Innovative Center for Lactic Acid Bacteria and Fermented Dairy Products, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, 010018, China.
3
Hu J, Hu S, Xia M, Zheng K, Zhang X. Drug-target binding affinity prediction based on power graph and word2vec. BMC Med Genomics 2025; 18:9. [PMID: 39806396 PMCID: PMC11730168 DOI: 10.1186/s12920-024-02073-5] [Received: 10/02/2023] [Accepted: 12/13/2024] [Indexed: 01/16/2025] Open
Abstract
BACKGROUND: Drugs and protein targets affect the physiological functions and metabolic effects of the body through binding interactions, and accurate prediction of drug-protein target interactions is crucial for drug development. To shorten the drug development cycle and reduce costs, machine learning methods are playing an increasingly important role in the field of drug-target interactions. RESULTS: Compared with other methods, regression-based drug-target affinity is more representative of binding ability. Accurate prediction of drug-target affinity can effectively reduce the time and cost of drug repurposing and new drug development. In this paper, a drug-target affinity prediction model (WPGraphDTA) based on power graph and word2vec is proposed. CONCLUSIONS: In this model, the drug molecular features in the power graph module are extracted by a graph neural network, and the protein features are obtained by the Word2vec method. After feature fusion, they are passed through three fully connected layers to obtain the predicted drug-target affinity. We conducted experiments on the Davis and KIBA datasets, and the experimental results showed that WPGraphDTA exhibited good prediction performance.
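The final stage described in this abstract (fusing drug and protein features, then applying three dense layers) can be sketched in plain Python. This is not the WPGraphDTA implementation: fusion by concatenation, the layer sizes, the ReLU activations, the random untrained weights, and all function names are assumptions chosen only to show the data flow.

```python
import random


def dense(x, weights, bias, relu=True):
    """One dense layer: y = W x + b, optionally followed by ReLU."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]
    return [max(0.0, v) for v in out] if relu else out


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained, illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict_affinity(drug_vec, protein_vec, layers):
    """Fuse the two feature vectors by concatenation, then apply the dense stack."""
    h = list(drug_vec) + list(protein_vec)
    for i, (weights, bias) in enumerate(layers):
        # No activation on the last layer: it emits the regression value.
        h = dense(h, weights, bias, relu=(i < len(layers) - 1))
    return h[0]


rng = random.Random(0)
drug_vec = [0.2, 0.7, 0.1, 0.5]   # stand-in for graph-network drug features
protein_vec = [0.9, 0.3, 0.4]     # stand-in for Word2vec protein features
layers = [init_layer(7, 8, rng), init_layer(8, 4, rng), init_layer(4, 1, rng)]
score = predict_affinity(drug_vec, protein_vec, layers)
print(score)
```

In a trained model the weights would be learned by minimizing a regression loss against measured affinities; here they are random, so the printed score carries no meaning.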
Affiliation(s)
- Jing Hu
- School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, 430065, Hubei, China.
- Hubei Province Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, Wuhan, China.
- Institute of Big Data Science and Engineering, Wuhan University of Science and Technology, Wuhan, Hubei, China.
- Shuo Hu
- School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, 430065, Hubei, China
- Minghao Xia
- School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, 430065, Hubei, China
- Kangxing Zheng
- School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, 430065, Hubei, China
- Xiaolong Zhang
- School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, 430065, Hubei, China.
- Hubei Province Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, Wuhan, China.
- Institute of Big Data Science and Engineering, Wuhan University of Science and Technology, Wuhan, Hubei, China.
4
Wang J, He R, Wang X, Li H, Lu Y. MCF-DTI: Multi-Scale Convolutional Local-Global Feature Fusion for Drug-Target Interaction Prediction. Molecules 2025; 30:274. [PMID: 39860144 PMCID: PMC11767603 DOI: 10.3390/molecules30020274] [Received: 11/26/2024] [Revised: 12/21/2024] [Accepted: 01/10/2025] [Indexed: 01/27/2025] Open
Abstract
Predicting drug-target interactions (DTIs) is a crucial step in the development of new drugs and drug repurposing. In this paper, we propose a novel drug-target prediction model called MCF-DTI. The model utilizes the SMILES representation of drugs and the sequence features of targets, employing a multi-scale convolutional neural network (MSCNN) with parallel shared-weight modules to extract features from the drug side. For the target side, it combines MSCNN with Transformer modules to capture both local and global features effectively. The extracted features are then weighted and fused, enabling comprehensive feature representation to enhance the predictive power of the model. Experimental results on the Davis dataset demonstrate that MCF-DTI achieves an AUC of 0.9746 and an AUPR of 0.9542, outperforming other state-of-the-art models. Our case study demonstrates that our model effectively validated several known drug-target relationships in lung cancer and predicted the therapeutic potential of certain preclinical compounds in treating lung cancer. These findings contribute valuable insights for subsequent drug repurposing efforts and novel drug development.
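For readers unfamiliar with the AUC figure quoted in this abstract, the following minimal sketch computes the area under the ROC curve as the probability that a randomly chosen interacting pair is scored above a randomly chosen non-interacting pair, with ties counted as one half. The labels and scores are invented toy values, and the function name `roc_auc` is illustrative; the sketch does not reproduce the paper's evaluation pipeline and does not cover AUPR.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy drug-target pairs: 1 = interacting, 0 = non-interacting (invented values).
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.35, 0.1, 0.8]
auc = roc_auc(labels, scores)
print(round(auc, 4))
```

The pairwise form is quadratic in the number of examples, which is fine for a demonstration; library implementations typically use a sort-based computation instead.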
Affiliation(s)
- Jihong Wang
- School of Computer, Guangdong University of Education, Guangzhou 510310, China
- Ruijia He
- School of Computer, Guangdong University of Education, Guangzhou 510310, China
- Xiaodan Wang
- School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Zhongshan 528458, China
- Hongjian Li
- School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Zhongshan 528458, China
- Yulei Lu
- School of Computer, Guangdong University of Education, Guangzhou 510310, China
5
Mishra VP, Singh YN, Khan F, Dutta MK. SeqDPI: A 1D-CNN approach for predicting binding affinity of kinase inhibitors. J Comput Chem 2025; 46:e27518. [PMID: 39644133 DOI: 10.1002/jcc.27518] [Received: 03/12/2024] [Revised: 08/26/2024] [Accepted: 10/13/2024] [Indexed: 12/09/2024]
Abstract
Predicting drug-target binding affinity is highly relevant to modern drug discovery and drug repositioning, which help doctors arrive at new drugs or reuse existing drugs against new target proteins. In silico models that use advanced deep learning techniques can further assist these prediction tasks by identifying the most promising drug-target pairs. With this in mind, this study develops a deep learning-based algorithmic framework to support drug-target interaction prediction. The proposed SeqDPI model extracts the relevant drug and protein features from the one-dimensional sequential representation of the dataset using optimized CNNs that apply convolutions over amino acid subsequences of varying length to capture hidden patterns. The convolved drug-protein features are then fed to an L2-penalized feed-forward neural network, which matches the local residue patterns in protein classes with molecular fingerprints of drugs to predict the binding strength for all drug-target pairs. The proposed model reduces the convolution strain typically encountered in existing in silico models that rely on complex 3D structures of drug-protein datasets. The results show that SeqDPI achieves a mean squared error (MSE) of 0.167 across cross-validation folds, outperforming baseline models such as KronRLS (0.406), SimBoost (0.226), and DeepPS (0.214). Additionally, SeqDPI attains a high concordance index (CI) of 0.9114 on the benchmark KIBA dataset, demonstrating its statistical significance and computational efficiency compared to existing methods. These results indicate that SeqDPI predicts binding affinities accurately while working with simpler one-dimensional data, making it a robust and computationally cost-effective solution for drug-target interaction prediction.
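The two metrics reported in this abstract can be made concrete with a short sketch. It assumes that CI denotes the concordance index, as is usual in binding-affinity benchmarks: the fraction of pairs with different true affinities whose predicted order matches the true order, with tied predictions counted as one half. The affinity values below are invented, and the function names are illustrative.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order."""
    concordant, comparable = 0.0, 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # equal true affinities: pair is not comparable
            comparable += 1
            diff_true = y_true[i] - y_true[j]
            diff_pred = y_pred[i] - y_pred[j]
            if diff_pred == 0:
                concordant += 0.5  # tied prediction counts as half
            elif (diff_true > 0) == (diff_pred > 0):
                concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Invented affinities for four drug-target pairs.
y_true = [5.0, 6.2, 7.1, 8.4]
y_pred = [5.3, 6.0, 7.5, 7.4]
mse = mean_squared_error(y_true, y_pred)
ci = concordance_index(y_true, y_pred)
print(round(mse, 4), round(ci, 4))
```

MSE penalizes the size of each error, while CI looks only at ranking, so a model can do well on one and poorly on the other; this is why affinity papers usually report both.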
Affiliation(s)
- Vinay Priy Mishra
- Centre for Advanced Studies, Dr. A.P.J. Abdul Kalam Technical University, Lucknow, India
- Yogendra Narain Singh
- Department of Computer Science & Engineering, Institute of Engineering and Technology, Lucknow, India
- Feroz Khan
- Technology Dissemination & Computational Biology Division, CSIR-Central Institute of Medicinal and Aromatic Plants, Lucknow, India
6
Xie Y, Wang X, Wang P, Bi X. A pseudo-label supervised graph fusion attention network for drug–target interaction prediction. Expert Syst Appl 2025; 259:125264. [DOI: 10.1016/j.eswa.2024.125264] [Indexed: 01/03/2025]
7
Singh PK, Sachan K, Khandelwal V, Singh S, Singh S. Role of Artificial Intelligence in Drug Discovery to Revolutionize the Pharmaceutical Industry: Resources, Methods and Applications. Recent Pat Biotechnol 2025; 19:35-52. [PMID: 39840410 DOI: 10.2174/0118722083297406240313090140] [Received: 12/07/2023] [Revised: 02/22/2024] [Accepted: 02/28/2024] [Indexed: 01/23/2025]
Abstract
Traditional drug discovery methods such as wet-lab testing, validation, and synthetic techniques are time-consuming and expensive. Artificial intelligence (AI) approaches have progressed to the point where they can have a significant impact on the drug discovery process. Using massive volumes of open data, AI methods are revolutionizing the pharmaceutical industry. In the last few decades, many AI-based models have been developed and implemented in many areas of the drug development process. These models have been used as a supplement to conventional research to uncover superior pharmaceuticals expeditiously. In the pharmaceutical industry, AI was used mostly for reverse engineering of existing patents and the invention of new synthesis pathways. Its use extends from drug research and development to drug repurposing and to productivity gains in the pharmaceutical business through clinical trials. This article examines AI's numerous potential uses. We have discussed how AI can be put to use in the pharmaceutical sector, specifically for predicting a drug's toxicity, bioactivity, and physicochemical characteristics, among other things. We have also discussed its application to a variety of problems, including de novo drug discovery, target structure prediction, interaction prediction, and binding affinity prediction. AI for predicting drug interactions and for nanomedicines was also considered.
Affiliation(s)
- Pranjal Kumar Singh
- Department of Pharmacy, Kalka Institute for Research and Advanced Studies, Meerut, Uttar Pradesh, India
- Kapil Sachan
- KIET School of Pharmacy, KIET Group of Institutions, Ghaziabad, Uttar Pradesh, India
- Vishal Khandelwal
- Department of Biotechnology, GLA University, Mathura, Uttar Pradesh, India
- Sumita Singh
- Faculty of Pharmacy, Swami Vivekanand Subharti University, Meerut, Uttar Pradesh, India
- Smita Singh
- SRM Modinagar College of Pharmacy, SRM Institute of Science and Technology, Delhi NCR Campus, Modinagar, Ghaziabad, Uttar Pradesh, India
8
Lee J, Kim D, Jun DW, Kim Y. Multimodal Fusion-Based Lightweight Model for Enhanced Generalization in Drug-Target Interaction Prediction. J Chem Inf Model 2024; 64:9215-9226. [PMID: 39626073 DOI: 10.1021/acs.jcim.4c01397] [Indexed: 12/24/2024]
Abstract
Predicting drug-target interactions (DTIs) with precision is a crucial challenge in the quest for efficient and cost-effective drug discovery. Existing DTI prediction models often require significant computational resources because of the intricate and exceptionally lengthy protein target sequences. This study introduces MMF-DTI, a lightweight model that uses multimodal fusion, to improve the generalizability of DTI predictions without sacrificing computational efficiency. The MMF-DTI model combines four distinct modalities: molecular sequence, molecular properties, target sequence, and target function description. This approach is noteworthy because it is the first to use natural language-based target function descriptions in predicting DTIs. The effectiveness of MMF-DTI has been confirmed through benchmark data sets, demonstrating its comparable performance in terms of generalizability, especially in scenarios with limited information about the drug or target. Remarkably, MMF-DTI accomplishes this using only half of the parameters and 17% of the VRAM compared with previous state-of-the-art models. This allows it to function even in constrained computational environments. The combination of performance and efficiency highlights the potential of multimodal data fusion in improving the overall applicability of models, providing promising opportunities for future drug discovery endeavors.
Affiliation(s)
- Jonghyun Lee
- Institute of Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Dokyoon Kim
- Institute of Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Dae Won Jun
- Department of Medical and Digital Engineering, Hanyang University College of Engineering, Seoul 04763, Republic of Korea
- Department of Internal Medicine, Hanyang University College of Medicine, Seoul 04763, Republic of Korea
- Yun Kim
- College of Pharmacy, Daegu Catholic University, Gyeongsan 38430, Republic of Korea
9
Wang M, Wang J, Ji J, Ma C, Wang H, He J, Song Y, Zhang X, Cao Y, Dai Y, Hua M, Qin R, Li K, Cao L. Improving compound-protein interaction prediction by focusing on intra-modality and inter-modality dynamics with a multimodal tensor fusion strategy. Comput Struct Biotechnol J 2024; 23:3714-3729. [PMID: 39525082 PMCID: PMC11544084 DOI: 10.1016/j.csbj.2024.10.004] [Received: 06/13/2024] [Revised: 10/01/2024] [Accepted: 10/01/2024] [Indexed: 11/16/2024] Open
Abstract
Identifying novel compound-protein interactions (CPIs) plays a pivotal role in target identification and drug discovery. Although the recent multimodal methods have achieved outstanding advances in CPI prediction, they fail to effectively learn both intra-modality and inter-modality dynamics, which limits their prediction performance. To address the limitation, we propose a novel multimodal tensor fusion CPI prediction framework, named MMTF-CPI, which contains three unimodal learning modules for structure, heterogeneous network and transcriptional profiling modalities, a tensor fusion module and a prediction module. MMTF-CPI is capable of focusing on both intra-modality and inter-modality dynamics with the tensor fusion module. We demonstrated that MMTF-CPI is superior to multiple state-of-the-art multimodal methods across seven datasets. The prediction performance of MMTF-CPI is significantly improved with the tensor fusion module compared to other fusion methods. Moreover, our case studies confirmed the practical value of MMTF-CPI in target identification. Via MMTF-CPI, we also discovered several candidate compounds for the therapy of breast cancer and non-small cell lung cancer.
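The abstract does not spell out how the tensor fusion module works. One common formulation in multimodal learning, shown here purely as an assumption, appends a constant 1 to each unimodal embedding and takes the outer product, so the fused vector contains unimodal, bimodal, and trimodal interaction terms. The three toy embeddings and the function name `tensor_fusion` are illustrative and are not taken from MMTF-CPI.

```python
from itertools import product


def tensor_fusion(*embeddings):
    """Flattened outer product of the embeddings, each extended with a constant 1."""
    extended = [list(e) + [1.0] for e in embeddings]
    fused = []
    # Each combination picks one entry per modality; the appended 1 lets a
    # modality "drop out" of a term, which preserves lower-order interactions.
    for combo in product(*extended):
        value = 1.0
        for c in combo:
            value *= c
        fused.append(value)
    return fused


# Toy embeddings standing in for the three modalities named in the abstract.
structure = [0.5, 2.0]
network = [3.0]
transcription = [4.0, 0.25]
fused = tensor_fusion(structure, network, transcription)
print(len(fused), fused[:6])
```

The fused length is the product of (dimension + 1) over all modalities, here 3 x 2 x 3 = 18, so in practice the unimodal embeddings are kept small before fusion.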
Affiliation(s)
- Meng Wang
- Department of Biostatistics, Harbin Medical University, Harbin 150081, China
- Jianmin Wang
- Department of Integrative Biotechnology, Yonsei University, Incheon 21983, South Korea
- Jianxin Ji
- Department of Biostatistics, Harbin Medical University, Harbin 150081, China
- Chenjing Ma
- Department of Biostatistics, Harbin Medical University, Harbin 150081, China
- Hesong Wang
- Department of Biostatistics, Harbin Medical University, Harbin 150081, China
- Jia He
- Department of Biostatistics, Harbin Medical University, Harbin 150081, China
- Yongzhen Song
- Department of Biostatistics, Harbin Medical University, Harbin 150081, China
- Xuan Zhang
- Department of Biostatistics, Harbin Medical University, Harbin 150081, China
- Yong Cao
- Department of Biostatistics, Harbin Medical University, Harbin 150081, China
- Yanyan Dai
- Department of Biostatistics, Harbin Medical University, Harbin 150081, China
- Menglei Hua
- Department of Biostatistics, Harbin Medical University, Harbin 150081, China
- Ruihao Qin
- Department of Biostatistics, Harbin Medical University, Harbin 150081, China
- Kang Li
- Department of Biostatistics, Harbin Medical University, Harbin 150081, China
- Lei Cao
- Department of Biostatistics, Harbin Medical University, Harbin 150081, China
10
Odugbemi AI, Nyirenda C, Christoffels A, Egieyeh SA. Artificial intelligence in antidiabetic drug discovery: The advances in QSAR and the prediction of α-glucosidase inhibitors. Comput Struct Biotechnol J 2024; 23:2964-2977. [PMID: 39148608 PMCID: PMC11326494 DOI: 10.1016/j.csbj.2024.07.003] [Received: 04/16/2024] [Revised: 07/03/2024] [Accepted: 07/03/2024] [Indexed: 08/17/2024] Open
Abstract
Artificial Intelligence is transforming drug discovery, particularly in the hit identification phase of therapeutic compounds. One tool that has been instrumental in this transformation is Quantitative Structure-Activity Relationship (QSAR) analysis. This computer-aided drug design tool uses machine learning to predict the biological activity of new compounds based on the numerical representation of chemical structures against various biological targets. With diabetes mellitus becoming a significant health challenge in recent times, there is intense research interest in modulating antidiabetic drug targets. α-Glucosidase is an antidiabetic target that has gained attention due to its ability to suppress postprandial hyperglycaemia, a key contributor to diabetic complications. This review explored a detailed approach to developing QSAR models, focusing on strategies for generating input variables (molecular descriptors) and computational approaches ranging from classical machine learning algorithms to modern deep learning algorithms. We also highlighted studies that have used these approaches to develop predictive models for α-glucosidase inhibitors to modulate this critical antidiabetic drug target.
Affiliation(s)
- Adeshina I Odugbemi
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville, Cape Town 7535, South Africa
- School of Pharmacy, University of the Western Cape, Bellville, Cape Town 7535, South Africa
- National Institute for Theoretical and Computational Sciences (NITheCS), South Africa
- Clement Nyirenda
- Department of Computer Science, University of the Western Cape, Cape Town 7535, South Africa
- Alan Christoffels
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville, Cape Town 7535, South Africa
- Africa Centres for Disease Control and Prevention, African Union, Addis Ababa, Ethiopia
- Samuel A Egieyeh
- School of Pharmacy, University of the Western Cape, Bellville, Cape Town 7535, South Africa
- National Institute for Theoretical and Computational Sciences (NITheCS), South Africa
11
Liu D, Song T, Wang S. MM-DRPNet: A multimodal dynamic radial partitioning network for enhanced protein-ligand binding affinity prediction. Comput Struct Biotechnol J 2024; 23:4396-4405. [PMID: 39737077 PMCID: PMC11683220 DOI: 10.1016/j.csbj.2024.11.050] [Received: 10/16/2024] [Revised: 11/23/2024] [Accepted: 11/30/2024] [Indexed: 01/01/2025] Open
Abstract
Accurate prediction of drug-target binding affinity remains a fundamental challenge in contemporary drug discovery. Despite significant advances in computational methods for protein-ligand binding affinity prediction, current approaches still face substantial limitations in prediction accuracy. Moreover, the prevalent methodologies often overlook critical three-dimensional (3D) structural information, thereby constraining their practical utility in computer-aided drug design (CADD). Here we present MM-DRPNet, a multimodal deep learning framework that enhances binding affinity prediction by integrating protein-ligand structural information with interaction features and physicochemical properties. The core innovation lies in our dynamic radial partitioning (DRP) algorithm, which adaptively segments 3D space based on complex-specific interaction patterns, surpassing traditional fixed partitioning methods in capturing spatial interactions. MM-DRPNet further incorporates molecular topological features to comprehensively model both structural and spatial relationships. Extensive evaluations on benchmark datasets demonstrate that MM-DRPNet significantly outperforms state-of-the-art methods across multiple metrics, with ablation studies confirming the substantial contribution of each architectural component. Source code for MM-DRPNet is freely available for download at https://github.com/Bigrock-dd/MMDRPv1.
Affiliation(s)
- Dayan Liu
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266580, Shandong, China
- Tao Song
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266580, Shandong, China
- Shudong Wang
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266580, Shandong, China
12
|
Desai S, Wilson J, Ji C, Sautner J, Prussia AJ, Demchuk E, Mumtaz MM, Ruiz P. The Role of Simulation Science in Public Health at the Agency for Toxic Substances and Disease Registry: An Overview and Analysis of the Last Decade. TOXICS 2024; 12:811. [PMID: 39590991 PMCID: PMC11598116 DOI: 10.3390/toxics12110811] [Received: 09/23/2024] [Revised: 10/31/2024] [Accepted: 11/07/2024] [Indexed: 11/28/2024]
Abstract
Environmental exposures are ubiquitous and play a significant, and sometimes understated, role in public health as they can lead to the development of various chronic and infectious diseases. In an ideal world, there would be sufficient experimental data to determine the health effects of exposure to priority environmental contaminants. However, this is not the case, as emerging chemicals are continuously added to this list, furthering the data gaps. Recently, simulation science has evolved and can provide appropriate solutions using a multitude of computational methods and tools. In its quest to protect communities across the country from environmental health threats, ATSDR employs a variety of simulation science tools such as Physiologically Based Pharmacokinetic (PBPK) modeling, Quantitative Structure-Activity Relationship (QSAR) modeling, and benchmark dose (BMD) modeling, among others. ATSDR's use of such tools has enabled the agency to evaluate exposures in a timely, efficient, and effective manner. ATSDR's work in simulation science has also had a notable impact beyond the agency, as evidenced by external researchers' widespread appraisal and adaptation of the agency's methodology. ATSDR continues to advance simulation science tools and their applications by collaborating with researchers within and outside the agency, including other federal/state agencies, NGOs, the private sector, and academia.
Affiliation(s)
- Siddhi Desai
  - Oak Ridge Institute for Science and Education, Oak Ridge, TN 37830, USA
  - Office of Innovation and Analytics, Agency for Toxic Substances and Disease Registry, Atlanta, GA 30329, USA
- Jewell Wilson
  - Office of Innovation and Analytics, Agency for Toxic Substances and Disease Registry, Atlanta, GA 30329, USA
- Chao Ji
  - Office of Innovation and Analytics, Agency for Toxic Substances and Disease Registry, Atlanta, GA 30329, USA
- Jason Sautner
  - Office of Innovation and Analytics, Agency for Toxic Substances and Disease Registry, Atlanta, GA 30329, USA
- Andrew J. Prussia
  - Office of Innovation and Analytics, Agency for Toxic Substances and Disease Registry, Atlanta, GA 30329, USA
- Eugene Demchuk
  - Office of Innovation and Analytics, Agency for Toxic Substances and Disease Registry, Atlanta, GA 30329, USA
- M. Moiz Mumtaz
  - Office of Associate Director for Science, Agency for Toxic Substances and Disease Registry, Atlanta, GA 30329, USA
- Patricia Ruiz
  - Office of Innovation and Analytics, Agency for Toxic Substances and Disease Registry, Atlanta, GA 30329, USA
|
13
|
Zhang M, Hong Y, Shen L, Xu S, Xu Y, Zhang X, Liu J, Liu X. A heterogeneous graph neural network with automatic discovery of effective metapaths for drug–target interaction prediction. FUTURE GENERATION COMPUTER SYSTEMS 2024; 160:283-294. [DOI: 10.1016/j.future.2024.05.054] [Indexed: 01/03/2025]
|
14
|
Tao W, Lin X, Liu Y, Zeng L, Ma T, Cheng N, Jiang J, Zeng X, Yuan S. Bridging chemical structure and conceptual knowledge enables accurate prediction of compound-protein interaction. BMC Biol 2024; 22:248. [PMID: 39468510 PMCID: PMC11520867 DOI: 10.1186/s12915-024-02049-y] [Received: 07/15/2024] [Accepted: 10/17/2024] [Indexed: 10/30/2024]
Abstract
BACKGROUND Accurate prediction of compound-protein interaction (CPI) plays a crucial role in drug discovery. Existing data-driven methods aim to learn from the chemical structures of compounds and proteins yet ignore conceptual knowledge, that is, the interrelationships among the fundamental elements in the biomedical knowledge graph (KG). Knowledge graphs provide a comprehensive view of entities and relationships beyond individual compounds and proteins. They encompass a wealth of information such as pathways, diseases, and biological processes, offering a richer context for CPI prediction. This contextual information can be used to identify indirect interactions, infer potential relationships, and improve prediction accuracy. In real-world applications, the prevalence of knowledge-missing compounds and proteins is a critical barrier to injecting knowledge into data-driven models. RESULTS Here, we propose BEACON, a data and knowledge dual-driven framework that bridges chemical structure and conceptual knowledge for CPI prediction. BEACON learns consistent representations by maximizing the mutual information between chemical structure and conceptual knowledge and predicts the missing representations by minimizing their conditional entropy. BEACON achieves state-of-the-art performance on multiple datasets compared to competing methods, notably with 5.1% and 6.6% performance gains on the BIOSNAP and DrugBank datasets, respectively. Moreover, BEACON is the only approach capable of effectively predicting knowledge representations for knowledge-lacking compounds and proteins. CONCLUSIONS Overall, our work provides a general approach for directly injecting conceptual knowledge to enhance the performance of CPI prediction.
Affiliation(s)
- Wen Tao
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, Hunan, China
- Xuan Lin
  - School of Computer Science, Xiangtan University, Xiangtan, 411105, Hunan, China
  - Laboratory of Intelligent Computing and Information Processing, Ministry of Education (Xiangtan University), Xiangtan, 411105, Hunan, China
- Yuansheng Liu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, Hunan, China
  - Laboratory of Intelligent Computing and Information Processing, Ministry of Education (Xiangtan University), Xiangtan, 411105, Hunan, China
- Li Zeng
  - Department of AIDD, Shanghai Yuyao Biotechnology Co., Ltd., Shanghai, 201109, Shanghai, China
- Tengfei Ma
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, Hunan, China
- Ning Cheng
  - School of Informatics, Hunan University of Chinese Medicine, Changsha, 410208, Hunan, China
- Jing Jiang
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, Hunan, China
- Xiangxiang Zeng
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, Hunan, China
- Sisi Yuan
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, 28223, NC, USA
|
15
|
Zhu Y, Zhang Y, Li X, Wang L. 3MTox: A motif-level graph-based multi-view chemical language model for toxicity identification with deep interpretation. JOURNAL OF HAZARDOUS MATERIALS 2024; 476:135114. [PMID: 38986414 DOI: 10.1016/j.jhazmat.2024.135114] [Received: 04/27/2024] [Revised: 06/24/2024] [Accepted: 07/04/2024] [Indexed: 07/12/2024]
Abstract
Toxicity identification plays a key role in maintaining human health, as it can alert humans to the potential hazards caused by long-term exposure to a wide variety of chemical compounds. Experimental methods for determining toxicity are time-consuming and costly, while computational methods offer an alternative for the early identification of toxicity. For example, some classical ML and DL methods demonstrate excellent performance in toxicity prediction. However, these methods also have some defects, such as over-reliance on artificial features and a tendency to overfit. Proposing novel models with superior prediction performance is still an urgent task. In this study, we propose a motif-level graph-based multi-view pretraining language model, called 3MTox, for toxicity identification. The 3MTox model uses Bidirectional Encoder Representations from Transformers (BERT) as the backbone framework and a motif graph as input. The results of extensive experiments showed that our 3MTox model achieved state-of-the-art performance on toxicity benchmark datasets and outperformed the baseline models considered. In addition, the interpretability of the model ensures that it can quickly and accurately identify toxicity sites in a given molecule, thereby contributing to the determination of the status of toxicity and associated analyses. We think that the 3MTox model is among the most promising tools that are currently available for toxicity identification.
Affiliation(s)
- Yingying Zhu
  - Guangdong Provincial Key Laboratory of Fermentation and Enzyme Engineering, Joint International Research Laboratory of Synthetic Biology and Medicine, Ministry of Education, Guangdong Provincial Engineering and Technology Research Center of Biopharmaceuticals, School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China
- Yanhong Zhang
  - Guangdong Provincial Key Laboratory of Fermentation and Enzyme Engineering, Joint International Research Laboratory of Synthetic Biology and Medicine, Ministry of Education, Guangdong Provincial Engineering and Technology Research Center of Biopharmaceuticals, School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China
- Xinze Li
  - Guangdong Provincial Key Laboratory of Fermentation and Enzyme Engineering, Joint International Research Laboratory of Synthetic Biology and Medicine, Ministry of Education, Guangdong Provincial Engineering and Technology Research Center of Biopharmaceuticals, School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China
- Ling Wang
  - Guangdong Provincial Key Laboratory of Fermentation and Enzyme Engineering, Joint International Research Laboratory of Synthetic Biology and Medicine, Ministry of Education, Guangdong Provincial Engineering and Technology Research Center of Biopharmaceuticals, School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China
|
16
|
Ahmed KT, Ansari MI, Zhang W. DTI-LM: language model powered drug-target interaction prediction. Bioinformatics 2024; 40:btae533. [PMID: 39221997 PMCID: PMC11520403 DOI: 10.1093/bioinformatics/btae533] [Received: 01/01/2024] [Revised: 08/05/2024] [Accepted: 08/29/2024] [Indexed: 09/04/2024]
Abstract
MOTIVATION The identification and understanding of drug-target interactions (DTIs) play a pivotal role in the drug discovery and development process. Sequence representations of drugs and proteins in computational models offer advantages such as their widespread availability, easier input quality control, and reduced computational resource requirements. These make them efficient and accessible tools for various computational biology and drug discovery applications. Many sequence-based DTI prediction methods have been developed over the years. Despite the advancement in methodology, cold start DTI prediction involving an unknown drug or protein remains a challenging task, particularly for sequence-based models. Introducing DTI-LM, a novel framework leveraging advanced pretrained language models, we harness their exceptional context-capturing abilities along with neighborhood information to predict DTIs. DTI-LM is specifically designed to rely solely on sequence representations for drugs and proteins, aiming to bridge the gap between warm start and cold start predictions. RESULTS Large-scale experiments on four datasets show that DTI-LM can achieve state-of-the-art performance on DTI predictions. Notably, it excels in overcoming the common challenges faced by sequence-based models in cold start predictions for proteins, yielding impressive results. The incorporation of neighborhood information through a graph attention network further enhances prediction accuracy. Nevertheless, a disparity persists between cold start predictions for proteins and drugs. A detailed examination of DTI-LM reveals that language models exhibit contrasting capabilities in capturing similarities between drugs and proteins. AVAILABILITY AND IMPLEMENTATION Source code is available at: https://github.com/compbiolabucf/DTI-LM.
Affiliation(s)
- Khandakar Tanvir Ahmed
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
  - Genomics and Bioinformatics Cluster, University of Central Florida, Orlando, FL 32816, United States
- Md Istiaq Ansari
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
  - Genomics and Bioinformatics Cluster, University of Central Florida, Orlando, FL 32816, United States
- Wei Zhang
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
  - Genomics and Bioinformatics Cluster, University of Central Florida, Orlando, FL 32816, United States
|
17
|
Aliev TA, Lavrentev FV, Dyakonov AV, Diveev DA, Shilovskikh VV, Skorb EV. Electrochemical platform for detecting Escherichia coli bacteria using machine learning methods. Biosens Bioelectron 2024; 259:116377. [PMID: 38776798 DOI: 10.1016/j.bios.2024.116377] [Received: 02/29/2024] [Revised: 04/24/2024] [Accepted: 05/08/2024] [Indexed: 05/25/2024]
Abstract
We present an electrochemical platform designed to reduce the detection time for Escherichia coli bacteria from 24-48 h to 30 min. The approach is based on a system that includes a gallium-indium (eGaIn) alloy to provide conductivity and a hydrogel system to preserve bacteria and their metabolic species during the analysis. The work is dedicated to accurate and fast detection of Escherichia coli bacteria in different environments with the support of machine learning methods. Electrochemical data obtained during the analysis are processed via a multilayer perceptron model to identify, i.e., predict, the bacterial concentration in the samples. The approach identifies bacteria effectively in the range of 10^2-10^9 colony-forming units per ml with an average accuracy of 97%. The proposed bioelectrochemical system combined with a machine learning model is promising for food analysis, agriculture, and biomedicine.
Affiliation(s)
- Timur A Aliev
  - Infochemistry Scientific Center, ITMO University, 9 Lomonosova Street, Saint-Petersburg, 191002, Russia
- Filipp V Lavrentev
  - Infochemistry Scientific Center, ITMO University, 9 Lomonosova Street, Saint-Petersburg, 191002, Russia
- Alexandr V Dyakonov
  - Infochemistry Scientific Center, ITMO University, 9 Lomonosova Street, Saint-Petersburg, 191002, Russia
- Daniil A Diveev
  - Infochemistry Scientific Center, ITMO University, 9 Lomonosova Street, Saint-Petersburg, 191002, Russia
- Vladimir V Shilovskikh
  - Infochemistry Scientific Center, ITMO University, 9 Lomonosova Street, Saint-Petersburg, 191002, Russia; Saint Petersburg State University, Universitetskaya Embankment 7-9, Saint-Petersburg, 199034, Russia
- Ekaterina V Skorb
  - Infochemistry Scientific Center, ITMO University, 9 Lomonosova Street, Saint-Petersburg, 191002, Russia
|
18
|
Cheng X, Yang X, Guan Y, Feng Y. ERT-GFAN: A multimodal drug-target interaction prediction model based on molecular biology and knowledge-enhanced attention mechanism. Comput Biol Med 2024; 180:109012. [PMID: 39153394 DOI: 10.1016/j.compbiomed.2024.109012] [Received: 03/24/2024] [Revised: 08/06/2024] [Accepted: 08/07/2024] [Indexed: 08/19/2024]
Abstract
In drug discovery, precisely identifying drug-target interactions is crucial for finding new drugs and understanding drug mechanisms. Evolving heterogeneous drug/target data present challenges in obtaining multimodal representations for drug-target interaction (DTI) prediction. To deal with this, we propose 'ERT-GFAN', a multimodal drug-target interaction prediction model inspired by molecular biology. First, it integrates bio-inspired principles to obtain structure features of drugs and targets using Extended Connectivity Fingerprints (ECFP). Simultaneously, the knowledge graph embedding model RotatE is employed to discover the interaction features of drug-target pairs. Subsequently, a Transformer is used to refine the contextual neighborhood features from the obtained structure and interaction features, and multimodal high-dimensional fusion features of the three-modal information are constructed. Finally, the DTI prediction results are output by integrating the multimodal fusion features into a graphical high-dimensional fusion feature attention network (GFAN) using our innovative multimodal high-dimensional fusion feature attention. This multimodal approach offers a comprehensive understanding of drug-target interactions, addressing challenges in complex knowledge graphs. By combining structure features, interaction features, and contextual neighborhood features, 'ERT-GFAN' excels in predicting DTI. Empirical evaluations on three datasets demonstrate our method's superior performance, with AUC of 0.9739, 0.9862, and 0.9667, AUPR of 0.9598, 0.9789, and 0.9750, and Mean Reciprocal Rank (MRR) of 0.7386, 0.7035, and 0.7133. Ablation studies show over a 5% improvement in predictive performance compared to baseline unimodal and bimodal models. These results, along with detailed case studies, highlight the efficacy and robustness of our approach.
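As a reading aid for the AUC and MRR figures quoted in this abstract, here is a minimal pure-Python sketch of how the two metrics are commonly defined. It is not the authors' evaluation code; the function names `mean_reciprocal_rank` and `pairwise_auc` and the toy inputs are hypothetical and chosen only for illustration.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]], targets: Sequence[Hashable]
) -> float:
    """Average of 1/rank of the true item in each ranked candidate list.

    A query whose true item is absent from its list contributes 0.
    """
    if not rankings or len(rankings) != len(targets):
        raise ValueError("need one target per non-empty list of rankings")
    total = 0.0
    for ranked, truth in zip(rankings, targets):
        for position, candidate in enumerate(ranked, start=1):
            if candidate == truth:
                total += 1.0 / position
                break
    return total / len(rankings)


def pairwise_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical toy data: three queries, each with a ranked target list.
    toy_rankings = [["T1", "T2", "T3"], ["T2", "T1"], ["T4", "T5"]]
    toy_truth = ["T1", "T1", "T9"]
    print(mean_reciprocal_rank(toy_rankings, toy_truth))
    print(pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

The quadratic pairwise loop is chosen for readability; for large test sets a rank-based implementation from an established library would be the more practical choice.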
Affiliation(s)
- Xiaoqing Cheng
  - College of Computer Science and Technology, Qingdao University, Qingdao, 266071, China
- Xixin Yang
  - College of Computer Science and Technology, Qingdao University, Qingdao, 266071, China; School of Automation, Qingdao University, Qingdao, 266071, China
- Yuanlin Guan
  - School of Mechanical and Automotive Engineering, Qingdao University of Technology, Qingdao, 266071, China; Key Lab of Industrial Fluid Energy Conservation and Pollution Control, Ministry of Education, Qingdao University of Technology, Qingdao, 266071, China
- Yihan Feng
  - College of Computer Science and Technology, Qingdao University, Qingdao, 266071, China
|
19
|
Ahmad F, Muhmood T. Clinical translation of nanomedicine with integrated digital medicine and machine learning interventions. Colloids Surf B Biointerfaces 2024; 241:114041. [PMID: 38897022 DOI: 10.1016/j.colsurfb.2024.114041] [Received: 02/01/2024] [Revised: 06/11/2024] [Accepted: 06/13/2024] [Indexed: 06/21/2024]
Abstract
Nanomaterial-based therapeutics are transforming disease prevention, diagnosis, and treatment as nanotechnology grows more sophisticated at a breakneck pace, but very few reach the clinic because of inconsistencies in preclinical studies followed by regulatory hindrances. To tackle this, integrating nanomedicine discovery with digital medicine provides technologies that serve as tools for measuring specific biological activity. Redundancies in nanomedicine discovery can thus be overcome through on-site data acquisition and analytics that integrate intelligent sensors with artificial intelligence (AI) or machine learning (ML). Integrated AI/ML wearable sensors directly gather clinically relevant biochemical information from the subject's body and process the data so that physicians can make the right clinical decision(s) in a time- and cost-effective way. This review summarizes insights and recommends the infusion of actionable, big-data-computation-enabled sensors into the burgeoning field of nanomedicine in academia, research institutes, and the pharmaceutical industry, with potential for clinical translation. Furthermore, it discusses the many blind spots present in modern clinically relevant computation, one of which could prevent ML-guided, low-cost nanomedicine development from being successfully translated into the clinic.
Affiliation(s)
- Farooq Ahmad
  - State Key Laboratory of Chemistry and Utilization of Carbon Based Energy Resources, College of Chemistry, Xinjiang University, Urumqi 830017, China
- Tahir Muhmood
  - International Iberian Nanotechnology Laboratory (INL), Avenida Mestre José Veiga, Braga 4715-330, Portugal
|
20
|
Zhao Z, Zhao L, Kong C, Zhou J, Zhou F. A review of biophysical strategies to investigate protein-ligand binding: What have we employed? Int J Biol Macromol 2024; 276:133973. [PMID: 39032877 DOI: 10.1016/j.ijbiomac.2024.133973] [Received: 04/30/2024] [Revised: 07/15/2024] [Accepted: 07/16/2024] [Indexed: 07/23/2024]
Abstract
Protein-ligand binding occurs frequently in living organisms and plays a crucial role in the execution of the functions of proteins and drugs. It is also an indispensable part of drug discovery and screening. While the methods for investigating protein-ligand binding are diverse, each has its own objectives, strengths, and limitations, which all influence the choice of method. Many studies concentrate on one or a few specific methods, suggesting that comprehensive summaries are lacking. Therefore, in this review, these methods are comprehensively summarized and discussed in detail: prediction and simulation methods, thermal and thermodynamic methods, spectroscopic methods, methods of determining three-dimensional structures of the complex, mass spectrometry-based methods, and others. It is also important to integrate these methods based on the specific objectives of the research. With the aim of advancing pharmaceutical research, this review seeks to deepen the understanding of the protein-ligand binding process.
Affiliation(s)
- Zhen Zhao
  - Beijing Key Laboratory of Functional Food from Plant Resources, College of Food Science and Nutritional Engineering, China Agricultural University, 17 Tsinghua East Road, Beijing 100083, China
- Liang Zhao
  - Beijing Engineering and Technology Research Center of Food Additives, School of Food and Health, Beijing Technology and Business University, 11 Fucheng Road, Beijing 100048, China
- Chenxi Kong
  - Beijing Key Laboratory of Functional Food from Plant Resources, College of Food Science and Nutritional Engineering, China Agricultural University, 17 Tsinghua East Road, Beijing 100083, China
- Jingxuan Zhou
  - Beijing Key Laboratory of Functional Food from Plant Resources, College of Food Science and Nutritional Engineering, China Agricultural University, 17 Tsinghua East Road, Beijing 100083, China
- Feng Zhou
  - Beijing Key Laboratory of Functional Food from Plant Resources, College of Food Science and Nutritional Engineering, China Agricultural University, 17 Tsinghua East Road, Beijing 100083, China
|
21
|
Zheng Y, Ma Y, Xiong Q, Zhu K, Weng N, Zhu Q. The role of artificial intelligence in the development of anticancer therapeutics from natural polyphenols: Current advances and future prospects. Pharmacol Res 2024; 208:107381. [PMID: 39218422 DOI: 10.1016/j.phrs.2024.107381] [Received: 06/11/2024] [Revised: 08/06/2024] [Accepted: 08/26/2024] [Indexed: 09/04/2024]
Abstract
Natural polyphenols, abundant in the human diet, are derived from a wide variety of sources. Numerous preclinical studies have demonstrated their significant anticancer properties against various malignancies, making them valuable resources for drug development. However, traditional experimental methods for developing anticancer therapies from natural polyphenols are time-consuming and labor-intensive. Recently, artificial intelligence has shown promising advancements in drug discovery. Integrating AI technologies into the development process for natural polyphenols can substantially reduce development time and enhance efficiency. In this study, we review the crucial roles of natural polyphenols in anticancer treatment and explore the potential of AI technologies to aid in drug development. Specifically, we discuss the application of AI in key stages such as drug structure prediction, virtual drug screening, prediction of biological activity, and drug-target protein interaction, highlighting the potential to revolutionize the development of natural polyphenol-based anticancer therapies.
Affiliation(s)
- Ying Zheng
  - Division of Abdominal Tumor Multimodality Treatment, Cancer Center, West China Hospital, Sichuan University, No.37 Guoxue Alley, Chengdu, Sichuan 610041, China
- Yifei Ma
  - Division of Abdominal Tumor Multimodality Treatment, Cancer Center, West China Hospital, Sichuan University, No.37 Guoxue Alley, Chengdu, Sichuan 610041, China
- Qunli Xiong
  - Division of Abdominal Tumor Multimodality Treatment, Cancer Center, West China Hospital, Sichuan University, No.37 Guoxue Alley, Chengdu, Sichuan 610041, China
- Kai Zhu
  - Department of Medical Oncology, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fujian 350011, PR China
- Ningna Weng
  - Department of Medical Oncology, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fujian 350011, PR China
- Qing Zhu
  - Division of Abdominal Tumor Multimodality Treatment, Cancer Center, West China Hospital, Sichuan University, No.37 Guoxue Alley, Chengdu, Sichuan 610041, China
|
22
|
Duo L, Liu Y, Ren J, Tang B, Hirst JD. Artificial intelligence for small molecule anticancer drug discovery. Expert Opin Drug Discov 2024; 19:933-948. [PMID: 39074493 DOI: 10.1080/17460441.2024.2367014] [Received: 04/22/2024] [Accepted: 06/07/2024] [Indexed: 07/31/2024]
Abstract
INTRODUCTION The transition from conventional cytotoxic chemotherapy to targeted cancer therapy with small-molecule anticancer drugs has enhanced treatment outcomes. This approach, which now dominates cancer treatment, has its advantages. Despite the regulatory approval of several targeted molecules for clinical use, challenges such as low response rates and drug resistance still persist. Conventional drug discovery methods are costly and time-consuming, necessitating more efficient approaches. The rise of artificial intelligence (AI) and access to large-scale datasets have revolutionized the field of small-molecule cancer drug discovery. Machine learning (ML), particularly deep learning (DL) techniques, enables the rapid identification and development of novel anticancer agents by analyzing vast amounts of genomic, proteomic, and imaging data to uncover hidden patterns and relationships. AREA COVERED In this review, the authors explore the important landmarks in the history of AI-driven drug discovery. They also highlight various applications in small-molecule cancer drug discovery, outline the challenges faced, and provide insights for future research. EXPERT OPINION The advent of big data has allowed AI to penetrate and enable innovations in almost every stage of medicine discovery, transforming the landscape of oncology research through the development of state-of-the-art algorithms and models. Despite challenges in data quality, model interpretability, and technical limitations, advancements promise breakthroughs in personalized and precision oncology, revolutionizing future cancer management.
Affiliation(s)
- Lihui Duo
  - Faculty of Science and Engineering, University of Nottingham Ningbo China, Ningbo, China
- Yu Liu
  - Faculty of Science and Engineering, University of Nottingham Ningbo China, Ningbo, China
- Jianfeng Ren
  - Faculty of Science and Engineering, University of Nottingham Ningbo China, Ningbo, China
- Bencan Tang
  - Faculty of Science and Engineering, University of Nottingham Ningbo China, Ningbo, China
- Jonathan D Hirst
  - School of Chemistry, University of Nottingham University Park, Nottingham, UK
|
23
|
Chen Y, Liang X, Du W, Liang Y, Wong G, Chen L. Drug-Target Interaction Prediction Based on an Interactive Inference Network. Int J Mol Sci 2024; 25:7753. [PMID: 39062996 PMCID: PMC11277210 DOI: 10.3390/ijms25147753] [Received: 05/16/2024] [Revised: 06/25/2024] [Accepted: 06/27/2024] [Indexed: 07/28/2024]
Abstract
Drug-target interactions underlie the actions of chemical substances in medicine. Moreover, drug repurposing can expand use profiles while reducing costs and development time by exploiting potential multi-functional pharmacological properties based upon additional target interactions. Nonetheless, drug repurposing relies on the accurate identification and validation of drug-target interactions (DTIs). In this study, a novel drug-target interaction prediction model was developed. The model, based on an interactive inference network, contains embedding, encoding, interaction, feature extraction, and output layers. In addition, this study used Morgan and PubChem molecular fingerprints as additional information for drug encoding. The interaction layer in our model simulates the drug-target interaction process, which assists in understanding the interaction by representing the interaction space. Our method achieves high levels of predictive performance, as well as interpretability of drug-target interactions. Additionally, we predicted and validated 22 Alzheimer's disease-related targets, suggesting our model is robust and effective and thus may be beneficial for drug repurposing.
Affiliation(s)
- Yuqi Chen
  - College of Mathematics and Computer, Shantou University, Shantou 515063, China; (Y.C.); (X.L.)
- Xiaomin Liang
  - College of Mathematics and Computer, Shantou University, Shantou 515063, China; (Y.C.); (X.L.)
- Wei Du
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China; (W.D.); (Y.L.)
- Yanchun Liang
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China; (W.D.); (Y.L.)
- Garry Wong
  - Faculty of Health Sciences, University of Macau, Taipa, Macau SAR 999078, China
- Liang Chen
  - College of Mathematics and Computer, Shantou University, Shantou 515063, China; (Y.C.); (X.L.)
|
24
|
Abou Hajal A, Al Meslamani AZ. Overcoming barriers to machine learning applications in toxicity prediction. Expert Opin Drug Metab Toxicol 2024; 20:549-553. [PMID: 38088128 DOI: 10.1080/17425255.2023.2294939] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/17/2023] [Accepted: 12/11/2023] [Indexed: 07/25/2024]
Affiliation(s)
- Abdallah Abou Hajal
- College of Pharmacy, Al Ain University, Abu Dhabi, United Arab Emirates
- AAU Health and Biomedical Research Center, Al Ain University, Abu Dhabi, United Arab Emirates
- Ahmad Z Al Meslamani
- College of Pharmacy, Al Ain University, Abu Dhabi, United Arab Emirates
- AAU Health and Biomedical Research Center, Al Ain University, Abu Dhabi, United Arab Emirates
|
25
|
Abubakar ML, Kapoor N, Sharma A, Gambhir L, Jasuja ND, Sharma G. Artificial Intelligence in Drug Identification and Validation: A Scoping Review. Drug Res (Stuttg) 2024; 74:208-219. [PMID: 38830370 DOI: 10.1055/a-2306-8311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2024]
Abstract
The end-to-end process in the discovery of drugs involves therapeutic candidate identification, validation of identified targets, identification of hit compound series, lead identification and optimization, characterization, and formulation and development. The process is lengthy, expensive, tedious, and inefficient, with a large attrition rate for novel drug discovery. Today, the pharmaceutical industry is focused on improving the drug discovery process. Finding and selecting acceptable drug candidates effectively can significantly impact the price and profitability of new medications. Aside from the cost, there is a need to reduce the end-to-end process time, limiting the number of experiments at various stages. To achieve this, artificial intelligence (AI) has been utilized at various stages of drug discovery. The present study aims to identify the recent work that has developed AI-based models at various stages of drug discovery, identify the stages that need more concern, present the taxonomy of AI methods in drug discovery, and provide research opportunities. From January 2016 to September 1, 2023, the study identified all publications that were cited in the electronic databases including Scopus, NCBI PubMed, MEDLINE, Anthropology Plus, Embase, APA PsycInfo, SOCIndex, and CINAHL. Data were extracted using a standardized form, and possible research prospects are presented based on the analysis of the extracted data.
Affiliation(s)
- Neha Kapoor
- School of Applied Sciences, Suresh Gyan Vihar University, Jaipur, Rajasthan, India
- Asha Sharma
- Department of Zoology, Swargiya P. N. K. S. Govt. PG College, Dausa, Rajasthan, India
- Lokesh Gambhir
- School of Basic and Applied Sciences, Shri Guru Ram Rai University, Dehradun, Uttarakhand, India
- Gaurav Sharma
- School of Applied Sciences, Suresh Gyan Vihar University, Jaipur, Rajasthan, India
|
26
|
Yang Z, Liu J, Yang F, Zhang X, Zhang Q, Zhu X, Jiang P. Advancing Drug-Target Interaction prediction with BERT and subsequence embedding. Comput Biol Chem 2024; 110:108058. [PMID: 38593480 DOI: 10.1016/j.compbiolchem.2024.108058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 02/01/2024] [Accepted: 03/12/2024] [Indexed: 04/11/2024]
Abstract
Exploring the relationship between proteins and drugs plays a significant role in discovering new synthetic drugs. Drug-Target Interaction (DTI) prediction is a fundamental task in the relationship between proteins and drugs. Unlike encoding proteins by amino acids, we use amino acid subsequences to encode proteins, which better simulates the biological process of DTI. For this research purpose, we proposed a novel deep learning framework based on Bidirectional Encoder Representations from Transformers (BERT), which integrates high-frequency subsequence embedding and transfer learning methods to complete the DTI prediction task. As the first key module, subsequence embedding makes it possible to explore the functional interaction units from drug and protein sequences and thus contributes to finding DTI modules. As the second key module, transfer learning helps the model learn the common DTI features from protein and drug sequences in a large dataset. Overall, the BERT-based model can learn two kinds of features through the multi-head self-attention mechanism: internal features of each sequence and interaction features of both proteins and drugs. Compared with other methods, BERT-based methods enable more DTI-related features to be discovered by means of attention scores associated with tokenized protein/drug subsequences. We conducted extensive experiments for the DTI prediction task on three different benchmark datasets. The experimental results show that the model achieves average prediction metrics higher than most baseline methods. In order to verify the importance of transfer learning, we conducted an ablation study on the datasets, and the results show the superiority of transfer learning. In addition, we tested the scalability of the model on datasets containing unseen drugs and proteins, and the results show that its scalability is acceptable.
Affiliation(s)
- Zhihui Yang
- Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, 430072, Hubei province, China
- Juan Liu
- Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, 430072, Hubei province, China
- Feng Yang
- Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, 430072, Hubei province, China
- Xiaolei Zhang
- Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, 430072, Hubei province, China
- Qiang Zhang
- Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, 430072, Hubei province, China
- Xuekai Zhu
- Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, 430072, Hubei province, China
- Peng Jiang
- Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, 430072, Hubei province, China
|
27
|
Zhang R, Yuan R, Tian B. PointGAT: A Quantum Chemical Property Prediction Model Integrating Graph Attention and 3D Geometry. J Chem Theory Comput 2024; 20:4115-4128. [PMID: 38727259 DOI: 10.1021/acs.jctc.3c01420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Predicting quantum chemical properties is a fundamental challenge for computational chemistry. While the development of graph neural networks has advanced molecular representation learning and property prediction, their performance could be further enhanced by incorporating three-dimensional (3D) structural geometry into two-dimensional (2D) molecular graph representation. In this study, we introduce the PointGAT model for quantum molecular property prediction, which integrates 3D molecular coordinates with graph-attention modeling. Comparison with other current models in molecular prediction tasks showed that PointGAT could provide higher predictive accuracy in various benchmark data sets from MoleculeNet, including ESOL, FreeSolv, Lipop, HIV, and 6 out of 12 tasks of the QM9 data set. To further examine PointGAT prediction of quantum mechanical (QM) energies, we constructed a C10 data set comprising 11,841 charged and chiral carbocation intermediates with QM energies calculated at the DM21/6-31G*//B3LYP/6-31G* levels. Notably, PointGAT achieved an R² value of 0.950 and an MAE of 1.616 kcal/mol, outperforming even the best-performing graph neural network model with a reduction of 0.216 kcal/mol in MAE and an improvement of 0.050 in R². Additional ablation studies indicated that incorporating molecular geometry into the model resulted in markedly higher predictive accuracy, reducing the MAE value from 1.802 to 1.616 kcal/mol. Moreover, visualization of PointGAT atomic attention weights suggested its predictions were interpretable. Findings in this study support the application of PointGAT as a powerful and versatile tool for quantum chemical property prediction that can facilitate high-accuracy modeling for fundamental exploration of chemical space as well as drug design and molecular engineering.
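As a minimal illustration of the two metrics quoted in this abstract, the sketch below computes the mean absolute error (MAE) and the coefficient of determination (R²) for paired reference and predicted values. It is not taken from the PointGAT code; the function names and the toy energies are hypothetical and chosen only to show the arithmetic.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between reference and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all reference values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical reference and predicted energies (kcal/mol), for illustration only.
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
mae = mean_absolute_error(reference, predicted)
r2 = r_squared(reference, predicted)
```

A lower MAE and a higher R² both indicate closer agreement with the reference values, which is the sense in which the abstract compares PointGAT with the baseline graph neural network.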
Affiliation(s)
- Rong Zhang
- MOE Key Laboratory of Bioinformatics, State Key Laboratory of Molecular Oncology, School of Pharmaceutical Sciences, Tsinghua University, Beijing 100084, China
- Rongqing Yuan
- Department of Chemistry, Tsinghua University, Beijing 100084, China
- Boxue Tian
- MOE Key Laboratory of Bioinformatics, State Key Laboratory of Molecular Oncology, School of Pharmaceutical Sciences, Tsinghua University, Beijing 100084, China
|
28
|
Goles M, Daza A, Cabas-Mora G, Sarmiento-Varón L, Sepúlveda-Yañez J, Anvari-Kazemabad H, Davari MD, Uribe-Paredes R, Olivera-Nappa Á, Navarrete MA, Medina-Ortiz D. Peptide-based drug discovery through artificial intelligence: towards an autonomous design of therapeutic peptides. Brief Bioinform 2024; 25:bbae275. [PMID: 38856172 PMCID: PMC11163380 DOI: 10.1093/bib/bbae275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 04/23/2024] [Accepted: 06/04/2024] [Indexed: 06/11/2024] Open
Abstract
With their diverse biological activities, peptides are promising candidates for therapeutic applications, showing antimicrobial, antitumour and hormonal signalling capabilities. Despite their advantages, therapeutic peptides face challenges such as short half-life, limited oral bioavailability and susceptibility to plasma degradation. The rise of computational tools and artificial intelligence (AI) in peptide research has spurred the development of advanced methodologies and databases that are pivotal in the exploration of these complex macromolecules. This perspective delves into integrating AI in peptide development, encompassing classifier methods, predictive systems and the avant-garde design facilitated by deep-generative models like generative adversarial networks and variational autoencoders. There are still challenges, such as the need for processing optimization and careful validation of predictive models. This work outlines traditional strategies for machine learning model construction and training techniques and proposes a comprehensive AI-assisted peptide design and validation pipeline. The evolving landscape of peptide design using AI is emphasized, showcasing the practicality of these methods in expediting the development and discovery of novel peptides within the context of peptide-based drug discovery.
Affiliation(s)
- Montserrat Goles
- Departamento de Ingeniería en Computación, Universidad de Magallanes, Av. Pdte. Manuel Bulnes 01855, 6210427, Punta Arenas, Chile
- Departamento de Ingeniería Química, Biotecnología y Materiales, Universidad de Chile, Beauchef 851, 8370456, Santiago, Chile
- Anamaría Daza
- Centre for Biotechnology and Bioengineering, CeBiB, Universidad de Chile, Beauchef 851, 8370456, Santiago, Chile
- Gabriel Cabas-Mora
- Departamento de Ingeniería en Computación, Universidad de Magallanes, Av. Pdte. Manuel Bulnes 01855, 6210427, Punta Arenas, Chile
- Lindybeth Sarmiento-Varón
- Centro Asistencial de Docencia e Investigación, CADI, Universidad de Magallanes, Av. Los Flamencos 01364, 6210005, Punta Arenas, Chile
- Julieta Sepúlveda-Yañez
- Facultad de Ciencias de la Salud, Universidad de Magallanes, Av. Pdte. Manuel Bulnes 01855, 6210427, Punta Arenas, Chile
- Hoda Anvari-Kazemabad
- Departamento de Ingeniería en Computación, Universidad de Magallanes, Av. Pdte. Manuel Bulnes 01855, 6210427, Punta Arenas, Chile
- Mehdi D Davari
- Department of Bioorganic Chemistry, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120, Halle, Germany
- Roberto Uribe-Paredes
- Departamento de Ingeniería en Computación, Universidad de Magallanes, Av. Pdte. Manuel Bulnes 01855, 6210427, Punta Arenas, Chile
- Álvaro Olivera-Nappa
- Centre for Biotechnology and Bioengineering, CeBiB, Universidad de Chile, Beauchef 851, 8370456, Santiago, Chile
- Marcelo A Navarrete
- Centro Asistencial de Docencia e Investigación, CADI, Universidad de Magallanes, Av. Los Flamencos 01364, 6210005, Punta Arenas, Chile
- Escuela de Medicina, Universidad de Magallanes, Av. Pdte. Manuel Bulnes 01855, 6210427, Punta Arenas, Chile
- David Medina-Ortiz
- Departamento de Ingeniería en Computación, Universidad de Magallanes, Av. Pdte. Manuel Bulnes 01855, 6210427, Punta Arenas, Chile
- Centre for Biotechnology and Bioengineering, CeBiB, Universidad de Chile, Beauchef 851, 8370456, Santiago, Chile
|
29
|
Baul S, Tanvir Ahmed K, Jiang Q, Wang G, Li Q, Yong J, Zhang W. Integrating spatial transcriptomics and bulk RNA-seq: predicting gene expression with enhanced resolution through graph attention networks. Brief Bioinform 2024; 25:bbae316. [PMID: 38960406 PMCID: PMC11221891 DOI: 10.1093/bib/bbae316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Revised: 06/04/2024] [Accepted: 06/17/2024] [Indexed: 07/05/2024] Open
Abstract
Spatial transcriptomics data play a crucial role in cancer research, providing a nuanced understanding of the spatial organization of gene expression within tumor tissues. Unraveling the spatial dynamics of gene expression can unveil key insights into tumor heterogeneity and aid in identifying potential therapeutic targets. However, in many large-scale cancer studies, spatial transcriptomics data are limited, with bulk RNA-seq and corresponding Whole Slide Image (WSI) data being more common (e.g. TCGA project). To address this gap, there is a critical need to develop methodologies that can estimate gene expression at near-cell (spot) level resolution from existing WSI and bulk RNA-seq data. This approach is essential for reanalyzing expansive cohort studies and uncovering novel biomarkers that have been overlooked in the initial assessments. In this study, we present STGAT (Spatial Transcriptomics Graph Attention Network), a novel approach leveraging Graph Attention Networks (GAT) to discern spatial dependencies among spots. Trained on spatial transcriptomics data, STGAT is designed to estimate gene expression profiles at spot-level resolution and predict whether each spot represents tumor or non-tumor tissue, especially in patient samples where only WSI and bulk RNA-seq data are available. Comprehensive tests on two breast cancer spatial transcriptomics datasets demonstrated that STGAT outperformed existing methods in accurately predicting gene expression. Further analyses using the TCGA breast cancer dataset revealed that gene expression estimated from tumor-only spots (predicted by STGAT) provides more accurate molecular signatures for breast cancer sub-type and tumor stage prediction, and also leads to improved patient survival and disease-free analysis. Availability: Code is available at https://github.com/compbiolabucf/STGAT.
Affiliation(s)
- Sudipto Baul
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
- Khandakar Tanvir Ahmed
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
- Qibing Jiang
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
- Guangyu Wang
- Houston Methodist Research Institute, Weill Cornell Medical College, Houston, TX 77030, United States
- Qian Li
- Department of Biostatistics, St. Jude Children’s Research Hospital, Memphis, TN 38105, United States
- Jeongsik Yong
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN 55455, United States
- Wei Zhang
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
|
30
|
Yao R, Shen Z, Xu X, Ling G, Xiang R, Song T, Zhai F, Zhai Y. Knowledge mapping of graph neural networks for drug discovery: a bibliometric and visualized analysis. Front Pharmacol 2024; 15:1393415. [PMID: 38799167 PMCID: PMC11116974 DOI: 10.3389/fphar.2024.1393415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Accepted: 04/12/2024] [Indexed: 05/29/2024] Open
Abstract
Introduction: In recent years, graph neural networks have been extensively applied to drug discovery research. Although researchers have made significant progress in this field, bibliometric studies of it remain scarce. The purpose of this study is to conduct a comprehensive bibliometric analysis of graph neural network applications in drug discovery in order to identify current research hotspots and trends, as well as serve as a reference for future research. Methods: Publications from 2017 to 2023 about the application of graph neural networks in drug discovery were collected from the Web of Science Core Collection. Bibliometrix, VOSviewer, and CiteSpace were mainly used for bibliometric studies. Results and Discussion: A total of 652 papers from 48 countries/regions were included. Research interest in this field is continuously increasing. China and the United States have a significant advantage in terms of funding, the number of publications, and collaborations with other institutions and countries. Although some cooperation networks have been formed in this field, extensive worldwide cooperation still needs to be strengthened. The results of the keyword analysis clarified that graph neural networks have primarily been applied to drug-target interaction, drug repurposing, and drug-drug interaction, while graph convolutional neural networks and their related optimization methods are currently the core algorithms in this field. Data availability and ethical supervision, balancing computing resources, and developing novel graph neural network models with better interpretability are the key technical issues currently faced. This paper analyzes the current state, hot spots, and trends of graph neural network applications in drug discovery through bibliometric approaches, as well as the current issues and challenges in this field. These findings provide researchers with valuable insights on the current status and future directions of this field.
Affiliation(s)
- Fei Zhai
- Faculty of Medical Device, Shenyang Pharmaceutical University, Shenyang, China
- Yuxuan Zhai
- Faculty of Medical Device, Shenyang Pharmaceutical University, Shenyang, China
|
31
|
Kroll A, Ranjan S, Lercher MJ. A multimodal Transformer Network for protein-small molecule interactions enhances predictions of kinase inhibition and enzyme-substrate relationships. PLoS Comput Biol 2024; 20:e1012100. [PMID: 38768223 PMCID: PMC11142704 DOI: 10.1371/journal.pcbi.1012100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 05/31/2024] [Accepted: 04/24/2024] [Indexed: 05/22/2024] Open
Abstract
The activities of most enzymes and drugs depend on interactions between proteins and small molecules. Accurate prediction of these interactions could greatly accelerate pharmaceutical and biotechnological research. Current machine learning models designed for this task have a limited ability to generalize beyond the proteins used for training. This limitation is likely due to a lack of information exchange between the protein and the small molecule during the generation of the required numerical representations. Here, we introduce ProSmith, a machine learning framework that employs a multimodal Transformer Network to simultaneously process protein amino acid sequences and small molecule strings in the same input. This approach facilitates the exchange of all relevant information between the two molecule types during the computation of their numerical representations, allowing the model to account for their structural and functional interactions. Our final model combines gradient boosting predictions based on the resulting multimodal Transformer Network with independent predictions based on separate deep learning representations of the proteins and small molecules. The resulting predictions outperform recently published state-of-the-art models for predicting protein-small molecule interactions across three diverse tasks: predicting kinase inhibitions; inferring potential substrates for enzymes; and predicting Michaelis constants KM. The Python code provided can be used to easily implement and improve machine learning predictions involving arbitrary protein-small molecule interactions.
Affiliation(s)
- Alexander Kroll
- Institute for Computer Science and Department of Biology, Heinrich Heine University, Düsseldorf, Germany
- Sahasra Ranjan
- Department of Computer Science and Engineering, Indian Institute of Technology Bombay, Powai, Mumbai, India
- Martin J. Lercher
- Institute for Computer Science and Department of Biology, Heinrich Heine University, Düsseldorf, Germany
|
32
|
Svensson E, Hoedt PJ, Hochreiter S, Klambauer G. HyperPCM: Robust Task-Conditioned Modeling of Drug-Target Interactions. J Chem Inf Model 2024; 64:2539-2553. [PMID: 38185877 PMCID: PMC11005051 DOI: 10.1021/acs.jcim.3c01417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Revised: 11/27/2023] [Accepted: 11/27/2023] [Indexed: 01/09/2024]
Abstract
A central problem in drug discovery is to identify the interactions between drug-like compounds and protein targets. Over the past few decades, various quantitative structure-activity relationship (QSAR) and proteo-chemometric (PCM) approaches have been developed to model and predict these interactions. While QSAR approaches solely utilize representations of the drug compound, PCM methods incorporate both representations of the protein target and the drug compound, enabling them to achieve above-chance predictive accuracy on previously unseen protein targets. Both QSAR and PCM approaches have recently been improved by machine learning and deep neural networks, which allow the development of drug-target interaction prediction models from measurement data. However, deep neural networks typically require large amounts of training data and cannot robustly adapt to new tasks, such as predicting interaction for unseen protein targets at inference time. In this work, we propose to use HyperNetworks to efficiently transfer information between tasks during inference and thus to accurately predict drug-target interactions on unseen protein targets. Our HyperPCM method reaches state-of-the-art performance compared to previous methods on multiple well-known benchmarks, including Davis, DUD-E, and a ChEMBL-derived data set, and particularly excels at zero-shot inference involving unseen protein targets. Our method, as well as reproducible data preparation, is available at https://github.com/ml-jku/hyper-dti.
Affiliation(s)
- Emma Svensson
- ELLIS Unit Linz & Institute for Machine Learning, Johannes Kepler University, Linz 4040, Austria
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, 431 83, Sweden
- Pieter-Jan Hoedt
- ELLIS Unit Linz & Institute for Machine Learning, Johannes Kepler University, Linz 4040, Austria
- Sepp Hochreiter
- ELLIS Unit Linz & Institute for Machine Learning, Johannes Kepler University, Linz 4040, Austria
- Institute of Advanced Research in Artificial Intelligence (IARAI), Vienna 1030, Austria
- Günter Klambauer
- ELLIS Unit Linz & Institute for Machine Learning, Johannes Kepler University, Linz 4040, Austria
|
33
|
Agu PC, Obulose CN. Piquing artificial intelligence towards drug discovery: Tools, techniques, and applications. Drug Dev Res 2024; 85:e22159. [PMID: 38375772 DOI: 10.1002/ddr.22159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 01/12/2024] [Accepted: 01/29/2024] [Indexed: 02/21/2024]
Abstract
The purpose of this study was to discuss how artificial intelligence (AI) methods have affected the field of drug development. It looks at how AI models and data resources are reshaping the drug development process by offering more affordable and expedient options to conventional approaches. The paper opens with an overview of well-known information sources for drug development. The discussion then moves on to molecular representation techniques that make it possible to convert data into representations that computers can understand. The paper also gives a general overview of the algorithms used in the creation of drug discovery models based on AI. In particular, the paper looks at how AI algorithms might be used to forecast drug toxicity, drug bioactivity, and drug physicochemical properties. De novo drug design, binding affinity prediction, and other AI-based models for drug-target interaction were covered in deeper detail. Modern applications of AI in nanomedicine design and pharmacological synergism/antagonism prediction were also covered. The potential advantages of AI in drug development are highlighted as the evaluation comes to a close. It underlines how AI may greatly speed up and improve the efficiency of drug discovery, resulting in the creation of new and better medicines. To fully realize the promise of AI in drug discovery, the review acknowledges the difficulties that come with its uses in this field and advocates for more study and development.
Affiliation(s)
- Peter Chinedu Agu
- Department of Biochemistry, College of Science, Evangel University, Akaeze, Ebonyi State, Nigeria
- Chidiebere Nwiboko Obulose
- Department of Computer Sciences, Our Savior Institute of Science, Agriculture, and Technology (OSISATECH Polytechnic), Enugu, Nigeria
|
34
|
Wang M, Wang J, Rong Z, Wang L, Xu Z, Zhang L, He J, Li S, Cao L, Hou Y, Li K. A bidirectional interpretable compound-protein interaction prediction framework based on cross attention. Comput Biol Med 2024; 172:108239. [PMID: 38460309 DOI: 10.1016/j.compbiomed.2024.108239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 02/25/2024] [Accepted: 02/26/2024] [Indexed: 03/11/2024]
Abstract
The identification of compound-protein interactions (CPIs) plays a vital role in drug discovery. However, the huge cost and labor-intensive nature of in vitro and in vivo experiments make it urgent for researchers to develop novel CPI prediction methods. Although emerging deep learning methods have achieved promising performance in CPI prediction, they also face ongoing challenges: (i) providing bidirectional interpretability from both the chemical and biological perspective for the prediction results; (ii) comprehensively evaluating model generalization performance; (iii) demonstrating the practical applicability of these models. To overcome the challenges posed by current deep learning methods, we propose a cross multi-head attention oriented bidirectional interpretable CPI prediction model (CmhAttCPI). First, CmhAttCPI takes molecular graphs and protein sequences as inputs, utilizing the GCW module to learn atom features and the CNN module to learn residue features, respectively. Second, the model applies a cross multi-head attention module to compute attention weights for atoms and residues. Finally, CmhAttCPI employs a fully connected neural network to predict scores for CPIs. We evaluated the performance of CmhAttCPI on balanced datasets and imbalanced datasets. The results consistently show that CmhAttCPI outperforms multiple state-of-the-art methods. We constructed three scenarios based on compound and protein clustering and comprehensively evaluated the model generalization ability within these scenarios. The results demonstrate that the generalization ability of CmhAttCPI surpasses that of other models. Besides, the visualizations of attention weights reveal that CmhAttCPI provides chemical and biological interpretation for CPI prediction. Moreover, case studies confirm the practical applicability of CmhAttCPI in discovering anticancer candidates.
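To make the idea of attention weights between atoms and residues concrete, here is a minimal single-head sketch of scaled dot-product cross attention in plain Python. It is an illustration only and not the CmhAttCPI implementation: it omits the learned projections and multiple heads, and the function names and toy feature vectors are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(
    queries: List[Vector], keys: List[Vector], values: List[Vector]
) -> Tuple[List[Vector], List[Vector]]:
    """Each query attends over all keys; returns (outputs, attention weights)."""
    dim = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Scaled dot product between this query and every key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors.
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights


# Hypothetical 2-dimensional features for two atoms and three residues.
atoms = [[1.0, 0.0], [0.0, 1.0]]
residues = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]

# Atoms attend over residues, then residues attend over atoms.
atom_context, atom_to_residue = cross_attention(atoms, residues, residues)
residue_context, residue_to_atom = cross_attention(residues, atoms, atoms)
```

Running attention in both directions yields one weight matrix per direction (atoms over residues and residues over atoms), which is the kind of quantity that can be visualized from both the chemical and the biological side.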
Affiliation(s)
- Meng Wang
- School of Public Health, Harbin Medical University, Harbin, 150081, China
- Jianmin Wang
- School of Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon, 21983, Republic of Korea
- Zhiwei Rong
- School of Public Health, Peking University, Beijing, 100871, China
- Liuying Wang
- School of Public Health, Harbin Medical University, Harbin, 150081, China
- Zhenyi Xu
- School of Public Health, Harbin Medical University, Harbin, 150081, China
- Liuchao Zhang
- School of Public Health, Harbin Medical University, Harbin, 150081, China
- Jia He
- School of Public Health, Harbin Medical University, Harbin, 150081, China
- Shuang Li
- School of Public Health, Harbin Medical University, Harbin, 150081, China
- Lei Cao
- School of Public Health, Harbin Medical University, Harbin, 150081, China
- Yan Hou
- School of Public Health, Peking University, Beijing, 100871, China
- Kang Li
- School of Public Health, Harbin Medical University, Harbin, 150081, China
|
35
|
Ghandikota SK, Jegga AG. Application of artificial intelligence and machine learning in drug repurposing. Prog Mol Biol Transl Sci 2024; 205:171-211. [PMID: 38789178 DOI: 10.1016/bs.pmbts.2024.03.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2024]
Abstract
The purpose of drug repurposing is to leverage previously approved drugs for a particular disease indication and apply them to another disease. It can be seen as a faster and more cost-effective approach to drug discovery and a powerful tool for achieving precision medicine. In addition, drug repurposing can be used to identify therapeutic candidates for rare diseases and phenotypic conditions with limited information on disease biology. Machine learning and artificial intelligence (AI) methodologies have enabled the construction of effective, data-driven repurposing pipelines by integrating and analyzing large-scale biomedical data. Recent technological advances, especially in heterogeneous network mining and natural language processing, have opened up exciting new opportunities and analytical strategies for drug repurposing. In this review, we first introduce the challenges in repurposing approaches and highlight some success stories, including those during the COVID-19 pandemic. Next, we review some existing computational frameworks in the literature, organized on the basis of the type of biomedical input data analyzed and the computational algorithms involved. In conclusion, we outline some exciting new directions that drug repurposing research may take, as pioneered by the generative AI revolution.
Affiliation(s)
- Sudhir K Ghandikota
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
| | - Anil G Jegga
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, United States.
| |
|
36
|
Zhao W, Yu Y, Liu G, Liang Y, Xu D, Feng X, Guan R. MSI-DTI: predicting drug-target interaction based on multi-source information and multi-head self-attention. Brief Bioinform 2024; 25:bbae238. [PMID: 38762789 PMCID: PMC11102638 DOI: 10.1093/bib/bbae238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 04/09/2024] [Accepted: 05/03/2024] [Indexed: 05/20/2024] Open
Abstract
Identifying drug-target interactions (DTIs) is important in drug discovery and development, playing a crucial role in areas such as virtual screening, drug repurposing and identification of potential drug side effects. However, existing methods commonly exploit only a single type of feature from drugs and targets, and therefore suffer from challenges such as high sparsity and cold-start problems. We propose a novel framework called MSI-DTI (Multi-Source Information-based Drug-Target Interaction Prediction) to enhance prediction performance, which obtains feature representations from different views by integrating biometric features and knowledge graph representations from multi-source information. Our approach involves constructing a Drug-Target Knowledge Graph (DTKG), obtaining multiple feature representations from diverse information sources for SMILES sequences and amino acid sequences, incorporating network features from DTKG and performing an effective multi-source information fusion. Subsequently, we employ a multi-head self-attention mechanism coupled with residual connections to capture higher-order interaction information between sparse features while preserving lower-order information. Experimental results on DTKG and two benchmark datasets demonstrate that MSI-DTI outperforms several state-of-the-art DTI prediction methods, yielding more accurate and robust predictions. The source code and datasets are publicly accessible at https://github.com/KEAML-JLU/MSI-DTI.
Affiliation(s)
- Wenchuan Zhao
- Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, China
| | - Yufeng Yu
- Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, China
| | - Guosheng Liu
- Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, China
| | - Yanchun Liang
- Zhuhai Laboratory of the Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, Zhuhai College of Science and Technology, Zhuhai 519041, China
| | - Dong Xu
- Department of Computer Science, Informatics Institute, and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
| | - Xiaoyue Feng
- Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, China
| | - Renchu Guan
- Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, China
| |
|
37
|
Zhang Y, Li S, Meng K, Sun S. Machine Learning for Sequence and Structure-Based Protein-Ligand Interaction Prediction. J Chem Inf Model 2024; 64:1456-1472. [PMID: 38385768 DOI: 10.1021/acs.jcim.3c01841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]
Abstract
Developing new drugs is expensive and time-consuming. Accurately predicting the interaction between drugs and targets is likely to change how drugs are discovered. Machine learning-based protein-ligand interaction prediction has demonstrated significant potential. This paper examines computational methods that study protein-ligand interactions from sequence and structure. It starts with an overview of the data sets used in this area and of the various approaches for representing proteins and ligands. Sequence-based and structure-based classification criteria are then used to categorize and summarize both the classical machine learning models and the deep learning models employed in protein-ligand interaction studies. The evaluation methods and interpretability of these models are also discussed, and the diverse applications of protein-ligand interaction models in drug research are presented. Lastly, the current challenges and future directions in this field are addressed.
Affiliation(s)
- Yunjiang Zhang
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
| | - Shuyuan Li
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
| | - Kong Meng
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
| | - Shaorui Sun
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, P. R. China
| |
|
38
|
Sun Y, Li YY, Leung CK, Hu P. iNGNN-DTI: prediction of drug-target interaction with interpretable nested graph neural network and pretrained molecule models. Bioinformatics 2024; 40:btae135. [PMID: 38449285 PMCID: PMC10957515 DOI: 10.1093/bioinformatics/btae135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2023] [Revised: 12/31/2023] [Accepted: 03/05/2024] [Indexed: 03/08/2024] Open
Abstract
MOTIVATION: Drug-target interaction (DTI) prediction aims to identify interactions between drugs and protein targets. Deep learning can automatically learn discriminative features from drug and protein target representations for DTI prediction, but challenges remain and the problem is still open. Existing approaches encode drugs and targets into features using deep learning models, but they often lack explanations for the underlying interactions. Moreover, limited labeled DTIs in the chemical space can hinder model generalization. RESULTS: We propose an interpretable nested graph neural network for DTI prediction (iNGNN-DTI) using pre-trained molecule and protein models. The analysis is conducted on graph data representing drugs and targets by using a specific type of nested graph neural network, in which the target graphs are created based on 3D structures using AlphaFold2. This architecture is highly expressive in capturing substructures of the graph data. We use a cross-attention module to capture interaction information between the substructures of drugs and targets. To improve feature representations, we integrate features learned by models that are pre-trained on large unlabeled small molecule and protein datasets, respectively. We evaluate our model on three benchmark datasets, and it shows a consistent improvement over all baseline models on all datasets. We also run an experiment with previously unseen drugs or targets in the test set, and our model outperforms all of the baselines. Furthermore, iNGNN-DTI can provide more insights into the interaction by visualizing the weights learned by the cross-attention module. AVAILABILITY AND IMPLEMENTATION: The source code of the algorithm is available at https://github.com/syan1992/iNGNN-DTI.
Affiliation(s)
- Yan Sun
- Department of Biochemistry, Western University, London, ON, N6G 2V4, Canada
- Department of Computer Science, University of Manitoba, Winnipeg, MB, R3T 2N2, Canada
- Department of Computer Science, Western University, London, ON, N6G 2V4, Canada
| | - Yan Yi Li
- Division of Biostatistics, University of Toronto, Toronto, ON, M5T 3M7, Canada
| | - Carson K Leung
- Department of Computer Science, University of Manitoba, Winnipeg, MB, R3T 2N2, Canada
| | - Pingzhao Hu
- Department of Biochemistry, Western University, London, ON, N6G 2V4, Canada
- Department of Computer Science, University of Manitoba, Winnipeg, MB, R3T 2N2, Canada
- Department of Computer Science, Western University, London, ON, N6G 2V4, Canada
- Division of Biostatistics, University of Toronto, Toronto, ON, M5T 3M7, Canada
- Department of Oncology, Western University, London, ON, N6G 2V4, Canada
- Department of Epidemiology and Biostatistics, Western University, London, ON, N6G 2V4, Canada
- The Children’s Health Research Institute, Lawson Health Research Institute, London, ON, N6A 4V2, Canada
| |
|
39
|
Huang D, Ye X, Sakurai T. Multi-party collaborative drug discovery via federated learning. Comput Biol Med 2024; 171:108181. [PMID: 38428094 DOI: 10.1016/j.compbiomed.2024.108181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Revised: 01/28/2024] [Accepted: 02/18/2024] [Indexed: 03/03/2024]
Abstract
In the field of drug discovery and pharmacology research, precise and rapid prediction of drug-target binding affinity (DTA) and drug-drug interaction (DDI) are essential for drug efficacy and safety. However, pharmacological data are often distributed across different institutions. Moreover, due to concerns regarding data privacy and intellectual property, the sharing of pharmacological data is often restricted. It is difficult for institutions to achieve the desired performance by solely utilizing their data. This urgent challenge calls for a solution that not only enhances collaboration between multiple institutions to improve prediction accuracy but also safeguards data privacy. In this study, we propose a novel federated learning (FL) framework to advance the prediction of DTA and DDI, namely FL-DTA and FL-DDI. The proposed framework enables multiple institutions to collaboratively train a predictive model without the need to share their local data. Moreover, to ensure data privacy, we employ secure multi-party computation (MPC) during the federated learning model aggregation phase. We evaluated the proposed method on two DTA and one DDI benchmark datasets and compared them with centralized learning and local learning. The experimental results indicate that the proposed method performs closely to centralized learning, and significantly outperforms local learning. Moreover, the proposed framework ensures data security while promoting collaboration among institutions, thereby accelerating the drug discovery process.
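The aggregation step that a federated learning framework of this kind relies on can be illustrated with plain federated averaging. The sketch below is not the authors' FL-DTA/FL-DDI code: the function name `federated_average` and the flat list-of-floats parameter format are assumptions made for illustration, and the secure multi-party computation layer mentioned in the abstract is left out.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine locally trained parameter vectors into one global vector.

    Each client contributes in proportion to its number of local training
    samples; only parameters, never raw data, reach the aggregator.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


# Two hypothetical institutions with 1 and 3 local samples respectively.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a full round, each institution would train locally starting from `global_weights`, send back its updated parameters, and the aggregation would repeat.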
Affiliation(s)
- Dong Huang
- Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
| | - Xiucai Ye
- Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan.
| | - Tetsuya Sakurai
- Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
| |
|
40
|
Jeong E, Hong H, Lee YA, Kim KS. Potential Rheumatoid Arthritis-Associated Interstitial Lung Disease Treatment and Computational Approach for Future Drug Development. Int J Mol Sci 2024; 25:2682. [PMID: 38473928 PMCID: PMC11154459 DOI: 10.3390/ijms25052682] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 02/19/2024] [Accepted: 02/20/2024] [Indexed: 03/14/2024] Open
Abstract
Rheumatoid arthritis (RA) is a systemic autoimmune disease characterized by swelling in at least one joint. Owing to an overactive immune response, extra-articular manifestations are observed in certain cases, with interstitial lung disease (ILD) being the most common. Rheumatoid arthritis-associated interstitial lung disease (RA-ILD) is characterized by chronic inflammation of the interstitial space, which causes fibrosis and the scarring of lung tissue. Controlling inflammation and pulmonary fibrosis in RA-ILD is important because they are associated with high morbidity and mortality. Pirfenidone and nintedanib are specific drugs against idiopathic pulmonary fibrosis and showed efficacy against RA-ILD in several clinical trials. Immunosuppressants and disease-modifying antirheumatic drugs (DMARDs) with anti-fibrotic effects have also been used to treat RA-ILD. Immunosuppressants moderate the overexpression of cytokines and immune cells to reduce pulmonary damage and slow the progression of fibrosis. DMARDs with mild anti-fibrotic effects target specific fibrotic pathways to regulate fibrogenic cellular activity, extracellular matrix homeostasis, and oxidative stress levels. Therefore, specific medications are required to effectively treat RA-ILD. In this review, the commonly used RA-ILD treatments are discussed based on their molecular mechanisms and clinical trial results. In addition, a computational approach is proposed to develop specific drugs for RA-ILD.
Affiliation(s)
- Eunji Jeong
- Department of Medicine, College of Medicine, Kyung Hee University, Seoul 02447, Republic of Korea
| | - Hyunseok Hong
- Yale College, Yale University, New Haven, CT 06520, USA
- Department of Clinical Pharmacology and Therapeutics, School of Medicine, Kyung Hee University, Seoul 02447, Republic of Korea
| | - Yeon-Ah Lee
- Division of Rheumatology, Department of Internal Medicine, Kyung Hee University Hospital, Seoul 02447, Republic of Korea
| | - Kyoung-Soo Kim
- Department of Medicine, College of Medicine, Kyung Hee University, Seoul 02447, Republic of Korea
- Department of Clinical Pharmacology and Therapeutics, School of Medicine, Kyung Hee University, Seoul 02447, Republic of Korea
- East-West Bone & Joint Disease Research Institute, Kyung Hee University Hospital at Gangdong, Seoul 05278, Republic of Korea
| |
|
41
|
Tian C, Wang L, Cui Z, Wu H. GTAMP-DTA: Graph transformer combined with attention mechanism for drug-target binding affinity prediction. Comput Biol Chem 2024; 108:107982. [PMID: 38039800 DOI: 10.1016/j.compbiolchem.2023.107982] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 10/21/2023] [Accepted: 11/07/2023] [Indexed: 12/03/2023]
Abstract
Drug-target affinity (DTA) prediction is critical to the success of drug development. While numerous machine learning methods have been developed for this task, the accuracy and reliability of predictions still need to improve. Missing structural or other information can introduce considerable bias into drug-target binding prediction. In addition, current methods focus only on simulating individual non-covalent interactions between drugs and proteins, thereby neglecting the intricate interplay among different drugs and their interactions with proteins. GTAMP-DTA combines special attention mechanisms, assigning each atom or amino acid an attention vector. Interactions between drug forms and protein forms are considered to capture information about their interactions. A fusion transformer is used to learn protein representations from raw amino acid sequences, which are then merged with molecular map features extracted from SMILES. To address the lack of labeled data, a self-supervised pre-trained embedding is introduced that uses pre-trained transformers to encode drug and protein attributes. Experimental results demonstrate that our model outperforms state-of-the-art methods on both the Davis and KIBA datasets. Additionally, the model's performance is evaluated using three distinct pooling layers (max-pooling, mean-pooling, sum-pooling) along with variations of the attention mechanism. GTAMP-DTA shows significant performance improvements compared to other methods.
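The three pooling layers named in this abstract differ only in how per-atom (or per-residue) feature vectors are reduced to one fixed-size vector. The following is a minimal sketch of that reduction, not code from GTAMP-DTA: the function names and the list-of-lists feature format are assumptions chosen for illustration.

```python
from typing import List, Sequence

Features = Sequence[Sequence[float]]  # one feature vector per atom or residue


def max_pool(node_features: Features) -> List[float]:
    """Keep the largest value of each feature across all nodes."""
    return [max(column) for column in zip(*node_features)]


def mean_pool(node_features: Features) -> List[float]:
    """Average each feature across all nodes."""
    n = len(node_features)
    return [sum(column) / n for column in zip(*node_features)]


def sum_pool(node_features: Features) -> List[float]:
    """Add up each feature across all nodes (sensitive to molecule size)."""
    return [sum(column) for column in zip(*node_features)]


# A hypothetical molecule with two atoms and two features per atom.
atoms = [[1.0, 2.0], [3.0, 0.0]]
pooled = {
    "max": max_pool(atoms),
    "mean": mean_pool(atoms),
    "sum": sum_pool(atoms),
}
```

Sum-pooling grows with the number of atoms while mean- and max-pooling do not, which is one reason the choice of pooling layer can change affinity predictions for molecules of different sizes.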
Affiliation(s)
- Chuangchuang Tian
- College of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
| | - Luping Wang
- College of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
| | - Zhiming Cui
- College of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
| | - Hongjie Wu
- College of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China; Suzhou Smart City Research Institute, Suzhou University of Science and Technology, Suzhou 215009, China.
| |
|
42
|
Dehghan A, Abbasi K, Razzaghi P, Banadkuki H, Gharaghani S. CCL-DTI: contributing the contrastive loss in drug-target interaction prediction. BMC Bioinformatics 2024; 25:48. [PMID: 38291364 PMCID: PMC11264960 DOI: 10.1186/s12859-024-05671-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Accepted: 01/22/2024] [Indexed: 02/01/2024] Open
Abstract
BACKGROUND: Drug-target interaction (DTI) prediction takes a drug molecule and a protein sequence as inputs and predicts the binding affinity value. In recent years, deep learning-based models have received growing attention. These methods have two modules: the feature extraction module and the task prediction module. In most deep learning-based approaches, a simple task prediction loss (i.e., categorical cross entropy for the classification task and mean squared error for the regression task) is used to train the model. In machine learning, contrastive-based loss functions have been developed to learn a more discriminative feature space. In a deep learning-based model, extracting a more discriminative feature space improves the performance of the task prediction module. RESULTS: In this paper, we use multimodal knowledge as input and propose an attention-based fusion technique to combine this knowledge. We also investigate how using a contrastive loss function alongside the task prediction loss can help the approach learn a more powerful model. Four contrastive loss functions are considered: (1) max-margin contrastive loss function, (2) triplet loss function, (3) Multi-class N-pair Loss Objective, and (4) NT-Xent loss function. The proposed model is evaluated using four well-known datasets: Wang et al. dataset, Luo's dataset, Davis, and KIBA datasets. CONCLUSIONS: Accordingly, after reviewing the state-of-the-art methods, we developed a multimodal feature extraction network by combining protein sequences and drug molecules, along with protein-protein interaction networks and drug-drug interaction networks. The results show it performs significantly better than the comparable state-of-the-art approaches.
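Two of the four contrastive objectives listed in this abstract have compact textbook forms, which may help show what "a more discriminative feature space" means in practice. The sketch below uses the standard definitions of the max-margin contrastive loss and the triplet loss on plain Python lists. It is not taken from CCL-DTI, and the function names, the Euclidean distance, and the default margin of 1.0 are illustrative assumptions.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def euclidean(u: Vector, v: Vector) -> float:
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def max_margin_contrastive_loss(
    u: Vector, v: Vector, same_class: bool, margin: float = 1.0
) -> float:
    """Pull same-class pairs together; push other pairs at least `margin` apart."""
    d = euclidean(u, v)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


def triplet_loss(
    anchor: Vector, positive: Vector, negative: Vector, margin: float = 1.0
) -> float:
    """Require the negative to be at least `margin` farther than the positive."""
    return max(
        0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin
    )


# Hypothetical embeddings: the far negative already satisfies the margin,
# while the near negative is still penalized.
loss_far = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
loss_near = triplet_loss([0.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

In a setup like the one described, a term of this kind would be added to the task prediction loss (cross entropy or mean squared error) rather than replacing it.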
Affiliation(s)
- Alireza Dehghan
- Department of Bioinformatics, Kish International Campus, University of Tehran, Kish, 1417614411, Iran
| | - Karim Abbasi
- Laboratory of System Biology, Bioinformatics and Artificial Intelligence in Medicine (LBB&AI), Faculty of Mathematics and Computer Science, Kharazmi University, Tehran, 1417614411, Iran
| | - Parvin Razzaghi
- Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, 4513766731, Iran.
| | - Hossein Banadkuki
- Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, 1417614411, Iran
| | - Sajjad Gharaghani
- Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, 1417614411, Iran.
| |
|
43
|
Zhang C, Zang T, Zhao T. KGE-UNIT: toward the unification of molecular interactions prediction based on knowledge graph and multi-task learning on drug discovery. Brief Bioinform 2024; 25:bbae043. [PMID: 38348746 PMCID: PMC10939374 DOI: 10.1093/bib/bbae043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Revised: 12/29/2023] [Accepted: 01/23/2024] [Indexed: 02/15/2024] Open
Abstract
The prediction of molecular interactions is vital for drug discovery. Existing methods often focus on individual prediction tasks and overlook the relationships between them. Additionally, certain tasks encounter limitations due to insufficient data availability, resulting in limited performance. To overcome these limitations, we propose KGE-UNIT, a unified framework that combines knowledge graph embedding (KGE) and multi-task learning, for simultaneous prediction of drug-target interactions (DTIs) and drug-drug interactions (DDIs) and enhancing the performance of each task, even when data availability is limited. Via KGE, we extract heterogeneous features from the drug knowledge graph to enhance the structural features of drug and protein nodes, thereby improving the quality of features. Additionally, employing multi-task learning, we introduce an innovative predictor that comprises the task-aware Convolutional Neural Network-based (CNN-based) encoder and the task-aware attention decoder, which can better fuse multimodal features, capture the contextual interactions of molecular tasks and enhance task awareness, leading to improved performance. Experiments on two imbalanced datasets for DTIs and DDIs demonstrate the superiority of KGE-UNIT, achieving high areas under the receiver operating characteristic curve (AUROCs) (0.942, 0.987) and areas under the precision-recall curve (AUPRs) (0.930, 0.980) for DTIs and high AUROCs (0.975, 0.989) and AUPRs (0.966, 0.988) for DDIs. Notably, on the LUO dataset where the data were more limited, KGE-UNIT exhibited a more pronounced improvement, with increases of 4.32% in AUROC and 3.56% in AUPR for DTIs and 6.56% in AUROC and 8.17% in AUPR for DDIs. The scalability of KGE-UNIT is demonstrated through its extension to protein-protein interaction prediction, and ablation studies and case studies further validate its effectiveness.
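The AUROC values reported in this abstract can be read as the probability that a randomly chosen true interaction receives a higher score than a randomly chosen non-interaction. The sketch below computes AUROC directly from that pairwise definition. It is a generic illustration rather than the evaluation code used for KGE-UNIT, and the function name `auroc` and the toy labels and scores are assumptions.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise ranking (ties count as half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical drug-target pairs: 1 = known interaction, 0 = no interaction.
example = auroc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6])
```

The quadratic pairwise loop is fine for a toy example; sorting-based implementations compute the same quantity more efficiently on large benchmark sets.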
Affiliation(s)
- Chengcheng Zhang
- Department of Computer Science, Harbin Institute of Technology, Harbin, 150001, China
| | - Tianyi Zang
- Department of Computer Science, Harbin Institute of Technology, Harbin, 150001, China
| | - Tianyi Zhao
- School of Medicine and Health, Harbin Institute of Technology, Harbin, 150001, China
| |
|
44
|
Xu W, Yang X, Guan Y, Cheng X, Wang Y. Integrative approach for predicting drug-target interactions via matrix factorization and broad learning systems. Mathematical Biosciences and Engineering: MBE 2024; 21:2608-2625. [PMID: 38454698 DOI: 10.3934/mbe.2024115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2024]
Abstract
In the drug discovery process, time and cost are the most typical problems resulting from the experimental screening of drug-target interactions (DTIs). To address these limitations, many computational methods have been developed to achieve more accurate predictions. However, identifying DTIs mostly relies on separate learning tasks with drug and target features that neglect the interaction representation between drugs and targets. In addition, the lack of these relationships may greatly impair performance on the prediction of DTIs. Aiming at capturing comprehensive drug-target representations and simplifying the network structure, we propose an integrative approach with a convolution broad learning system for DTI prediction (ConvBLS-DTI) to reduce the impact of data sparsity and incompleteness. First, given the lack of known interactions for the drug and target, the weighted K-nearest known neighbors (WKNKN) method was used as a preprocessing strategy for unknown drug-target pairs. Second, a neighborhood regularized logistic matrix factorization (NRLMF) was applied to extract features of updated drug-target interaction information, which focused more on the known interaction pair parties. Then, a broad learning network incorporating a convolutional neural network was established to predict DTIs, which can make classification more effective using a different perspective. Finally, based on the four benchmark datasets in three scenarios, the overall performance of ConvBLS-DTI exceeded that of some mainstream methods. The test results demonstrate that our model achieves improved prediction performance in terms of the area under the receiver operating characteristic curve and the area under the precision-recall curve.
Affiliation(s)
- Wanying Xu
- College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
| | - Xixin Yang
- College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
- School of Automation, Qingdao University, Qingdao 266071, China
| | - Yuanlin Guan
- Key Lab of Industrial Fluid Energy Conservation and Pollution Control, Ministry of Education, Qingdao University of Technology, Qingdao 266520, China
- School of Mechanical & Automotive Engineering, Qingdao University of Technology, Qingdao 266520, China
| | - Xiaoqing Cheng
- College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
| | - Yu Wang
- College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
| |
|
45
|
Abdul Raheem AK, Dhannoon BN. Comprehensive Review on Drug-target Interaction Prediction - Latest Developments and Overview. Curr Drug Discov Technol 2024; 21:e010923220652. [PMID: 37680152 DOI: 10.2174/1570163820666230901160043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Revised: 05/29/2023] [Accepted: 07/18/2023] [Indexed: 09/09/2023]
Abstract
Drug-target interactions (DTIs) are an important part of the drug development process. When the drug (a chemical molecule) binds to a target (proteins or nucleic acids), it modulates the biological behavior/function of the target, returning it to its normal state. Predicting DTIs plays a vital role in the drug discovery (DD) process as it has the potential to enhance efficiency and reduce costs. However, DTI prediction poses significant challenges and expenses due to the time-consuming and costly nature of experimental assays. As a result, researchers have increased their efforts to identify the association between medications and targets in the hopes of speeding up drug development and shortening the time to market. This paper provides a detailed discussion of the initial stage in drug discovery, namely drug-target interactions. It focuses on exploring the application of machine learning methods within this step. Additionally, we aim to conduct a comprehensive review of relevant papers and databases utilized in this field. Drug-target interaction prediction covers a wide range of applications: drug discovery, prediction of adverse effects and drug repositioning. The prediction of drug-target interactions can be categorized into three main computational methods: docking simulation approaches, ligand-based methods, and machine-learning techniques.
Affiliation(s)
- Ali K Abdul Raheem
- Software Department, College of Information Technology, University of Babylon, Hillah, Babil, Iraq
- University of Warith Al-Anbiyaa, Kerbala, Iraq
| | - Ban N Dhannoon
- Department of Computer Science, College of Science, Al-Nahrain University, Baghdad, Iraq
| |
|
46
|
Wang Y, Li L, Shen Y, Zhang Y, Zhang Y, Shang X. Deep Learning Integration with Phenotypic Similarities and Heterogeneous Networks for Drug-Target Interaction Prediction. 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2023:2945-2951. [DOI: 10.1109/bibm58861.2023.10385907] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/03/2025]
Affiliation(s)
- Yongtian Wang
- School of Computer Science, Northwestern Polytechnical University, Xi’an, PR China
| | - Li Li
- School of Computer Science, Northwestern Polytechnical University, Xi’an, PR China
| | - Yewei Shen
- School of Computer Science, Northwestern Polytechnical University, Xi’an, PR China
| | - Yizhuo Zhang
- School of Computer Science, Northwestern Polytechnical University, Xi’an, PR China
| | - Yuhe Zhang
- School of Computer Science, Northwestern Polytechnical University, Xi’an, PR China
| | - Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, Xi’an, PR China
| |
|
47
|
Liyaqat T, Ahmad T, Saxena C. TeM-DTBA: time-efficient drug target binding affinity prediction using multiple modalities with Lasso feature selection. J Comput Aided Mol Des 2023; 37:573-584. [PMID: 37777631 DOI: 10.1007/s10822-023-00533-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/08/2023] [Accepted: 09/07/2023] [Indexed: 10/02/2023]
Abstract
Drug discovery, especially virtual screening and drug repositioning, can be accelerated through deeper understanding and prediction of Drug Target Interactions (DTIs). The advancement of deep learning as well as the time and financial costs associated with conventional wet-lab experiments have made computational methods for DTI prediction more popular. However, the majority of these computational methods handle the DTI problem as a binary classification task, ignoring the quantitative binding affinity that determines the efficacy of drugs against their target proteins. Moreover, the memory footprint and execution time of the model are often neglected in favor of accuracy. To address these challenges, we introduce a novel method, called Time-efficient Multimodal Drug Target Binding Affinity (TeM-DTBA), which predicts the binding affinity between drugs and targets by fusing different modalities based on compound structures and target sequences. We employ the Lasso feature selection method, which lowers the dimensionality of feature vectors and speeds up the proposed model training time by more than 50%. The results from two benchmark datasets demonstrate that our method outperforms state-of-the-art methods in terms of performance. The mean squared errors of 18.8% and 23.19%, achieved on the KIBA and Davis datasets, respectively, suggest that our method is more accurate in predicting drug-target binding affinity.
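Lasso feature selection works because the L1 penalty drives the coefficients of uninformative features exactly to zero, so only features with non-zero coefficients need to be kept. The sketch below shows this with a small coordinate-descent solver in plain Python. It is not the TeM-DTBA implementation: the function names, the objective scaling (1/(2n) times the squared error plus `alpha` times the L1 norm), and the toy data are assumptions for illustration.

```python
from typing import List, Sequence


def soft_threshold(value: float, threshold: float) -> float:
    """Shrink `value` toward zero by `threshold`, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(
    X: Sequence[Sequence[float]], y: Sequence[float], alpha: float, n_iters: int = 100
) -> List[float]:
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iters):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w


def select_features(w: Sequence[float], tol: float = 1e-12) -> List[int]:
    """Return the indices of features with non-zero Lasso coefficients."""
    return [j for j, coef in enumerate(w) if abs(coef) > tol]


# Toy data: the target depends only on the first feature (y = 2 * x0).
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.0, 2.0, -2.0, -2.0]
coefficients = lasso_coordinate_descent(X, y, alpha=0.5)
kept = select_features(coefficients)
```

Here the second feature is dropped and the first coefficient is shrunk below its least-squares value, which is the usual trade-off: a larger `alpha` removes more features and shortens downstream training, at the cost of more shrinkage.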
Affiliation(s)
- Tanya Liyaqat
- Department of Computer Engineering, Jamia Millia Islamia, New Delhi, India.
| | - Tanvir Ahmad
- Department of Computer Engineering, Jamia Millia Islamia, New Delhi, India
| | - Chandni Saxena
- The Chinese University of Hong Kong, Sha Tin, SAR, China
| |
|
48
|
Hua Y, Luo L, Qiu H, Huang D, Zhao Y, Liu H, Lu T, Chen Y, Zhang Y, Jiang Y. Multimodal multi-task deep neural network framework for kinase-target prediction. Mol Divers 2023; 27:2491-2503. [PMID: 36369613 DOI: 10.1007/s11030-022-10565-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2022] [Accepted: 11/01/2022] [Indexed: 11/13/2022]
Abstract
Kinases play a significant role in various disease signaling pathways. Because the sequences of kinase family members are highly conserved, understanding the selectivity profile of kinase inhibitors remains a priority for drug discovery. Previous methods for identifying kinase selectivity rely on biochemical assays, which are very useful but limited by protein availability. A lack of kinase selectivity can be beneficial but can also cause adverse effects. With the rapid growth of kinase activity data, current computational methods can make accurate large-scale selectivity predictions. Here, we present a multimodal multi-task deep neural network model for kinase selectivity prediction that uses calculated fingerprints and physicochemical descriptors. With multimodal inputs of structural and physicochemical property information, the multi-task framework could accurately predict the kinome map for selectivity analysis. The proposed model displays better performance for kinase-target prediction based on system evaluations.
Affiliation(s)
- Yi Hua
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Lin Luo
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Haodi Qiu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Dingfang Huang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Yang Zhao
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Haichun Liu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Tao Lu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
- State Key Laboratory of Natural Medicines, China Pharmaceutical University, 24 Tongjiaxiang, Nanjing, 210009, China
| | - Yadong Chen
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Yanmin Zhang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Yulei Jiang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| |
|
49
|
Wang Y, Zhang Z, Piao C, Huang Y, Zhang Y, Zhang C, Lu YJ, Liu D. LDS-CNN: a deep learning framework for drug-target interactions prediction based on large-scale drug screening. Health Inf Sci Syst 2023; 11:42. [PMID: 37667773 PMCID: PMC10475000 DOI: 10.1007/s13755-023-00243-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/26/2023] [Accepted: 08/14/2023] [Indexed: 09/06/2023]
Abstract
Background: Drug-target interaction (DTI) is a vital drug design strategy that plays a significant role in many processes of complex diseases and cellular events. In the face of challenges such as extensive protein data and experimental costs, it is suggested to apply bioinformatics approaches to exploit potential interactions to design new targeted medications. Different data and interaction types make such studies difficult because of incompatible and heterogeneous formats. The analysis of drug-target interactions in a comprehensive and unified model is a significant challenge. Method: Here, we propose a general method for predicting interactions between small-molecule drugs and protein targets, Large-scale Drug target Screening Convolutional Neural Network (LDS-CNN), which uses unified encoding to handle the different data formats in an integrated model and to realize feature abstraction and potential object prediction. Result: On 898,412 interaction data involving 1683 small-molecule compounds and 14,350 human proteins from 8.8 billion records, the proposed method achieved an area under the curve (AUC) of 0.96, an area under the precision-recall curve (AUPRC) of 0.95, and an accuracy of 90.13%. The experimental results illustrated that the proposed method attained high accuracy on the test set, indicating its high predictive ability in drug-target interaction prediction. LDS-CNN is effective for the prediction of large-scale datasets and datasets composed of data with different formats. Conclusion: In this study, we propose a DTI prediction method to solve the problems of unified encoding of large-scale data in multiple formats. It provides a feasible way to efficiently abstract the features among different types of drug-related data, thus reducing experimental costs and time consumption. The proposed method can be used to identify potential drug targets and candidates for the treatment of complex diseases. This work provides a reference for DTI to process large-scale data and different formats with deep learning methods and provides certain suggestions for future research.
Affiliation(s)
- Yang Wang
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006 China
| | - Zuxian Zhang
- School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006 China
| | - Chenghong Piao
- The First Affiliated Hospital of Ningbo University, Ningbo, 315010 China
| | - Ying Huang
- School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006 China
| | - Yihan Zhang
- School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006 China
| | - Chi Zhang
- Shanghai Institute of Biological Products, Shanghai, 201403 China
| | - Yu-Jing Lu
- School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, 510006 China
- Smart Medical Innovation Technology Center, Guangdong University of Technology, Guangzhou, 510006 China
| | - Dongning Liu
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006 China
| |
|
50
|
Chen J, Gu Z, Lai L, Pei J. In silico protein function prediction: the rise of machine learning-based approaches. MEDICAL REVIEW (2021) 2023; 3:487-510. [PMID: 38282798 PMCID: PMC10808870 DOI: 10.1515/mr-2023-0038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 10/11/2023] [Indexed: 01/30/2024]
Abstract
Proteins function as integral actors in essential life processes, making protein research a fundamental domain with the potential to propel advances in pharmaceuticals and disease investigation. Within protein research, there is a pressing need to uncover protein functions and untangle their intricate mechanisms. Due to the exorbitant costs and limited throughput inherent in experimental investigations, computational models offer a promising alternative to accelerate protein function annotation. In recent years, protein pre-training models have exhibited noteworthy advancement across multiple prediction tasks. This advancement highlights a notable prospect for effectively tackling the intricate downstream task of protein function prediction. In this review, we elucidate the historical evolution and research paradigms of computational methods for predicting protein function. Subsequently, we summarize the progress in protein and molecule representation as well as feature extraction techniques. Furthermore, we assess the performance of machine learning-based algorithms across various objectives in protein function prediction, thereby offering a comprehensive perspective on the progress within this field.
Affiliation(s)
- Jiaxiao Chen
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
| | - Zhonghui Gu
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
| | - Luhua Lai
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Research Unit of Drug Design Method, Chinese Academy of Medical Sciences (2021RU014), Beijing, China
| | - Jianfeng Pei
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Research Unit of Drug Design Method, Chinese Academy of Medical Sciences (2021RU014), Beijing, China
| |
|