1
Abubakar ML, Kapoor N, Sharma A, Gambhir L, Jasuja ND, Sharma G. Artificial Intelligence in Drug Identification and Validation: A Scoping Review. Drug Res (Stuttg) 2024; 74:208-219. [PMID: 38830370 DOI: 10.1055/a-2306-8311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2024]
Abstract
The end-to-end process in the discovery of drugs involves therapeutic candidate identification, validation of identified targets, identification of hit compound series, lead identification and optimization, characterization, and formulation and development. The process is lengthy, expensive, tedious, and inefficient, with a large attrition rate for novel drug discovery. Today, the pharmaceutical industry is focused on improving the drug discovery process. Finding and selecting acceptable drug candidates effectively can significantly impact the price and profitability of new medications. Aside from the cost, there is a need to reduce the end-to-end process time by limiting the number of experiments at various stages. To achieve this, artificial intelligence (AI) has been utilized at various stages of drug discovery. The present study aims to identify recent work that has developed AI-based models at various stages of drug discovery, identify the stages that need more attention, present a taxonomy of AI methods in drug discovery, and provide research opportunities. The study identified all publications from January 2016 to September 1, 2023 that were cited in electronic databases including Scopus, NCBI PubMed, MEDLINE, Anthropology Plus, Embase, APA PsycInfo, SOCIndex, and CINAHL. Data were extracted using a standardized form, and possible research prospects are presented based on the analysis of the extracted data.
Affiliation(s)
- Neha Kapoor
- School of Applied Sciences, Suresh Gyan Vihar University, Jaipur, Rajasthan, India
- Asha Sharma
- Department of Zoology, Swargiya P. N. K. S. Govt. PG College, Dausa, Rajasthan, India
- Lokesh Gambhir
- School of Basic and Applied Sciences, Shri Guru Ram Rai University, Dehradun, Uttarakhand, India
- Gaurav Sharma
- School of Applied Sciences, Suresh Gyan Vihar University, Jaipur, Rajasthan, India
2
Liang Z, Lin C, Tan G, Li J, He Y, Cai S. A low-cost machine learning framework for predicting drug-drug interactions based on fusion of multiple features and a parameter self-tuning strategy. Phys Chem Chem Phys 2024; 26:6300-6315. [PMID: 38305788 DOI: 10.1039/d4cp00039k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2024]
Abstract
Poly-drug therapy is now recognized as a crucial treatment, and the analysis of drug-drug interactions (DDIs) offers substantial theoretical support and guidance for its implementation. Predicting potential DDIs using intelligent algorithms is an emerging approach in pharmacological research. However, the existing supervised models and deep learning-based techniques still have several limitations. This paper proposes a novel DDI analysis and prediction framework called the Multi-View Semi-supervised Graph-based (MVSG) framework, which provides a comprehensive judgment by integrating multiple DDI features and functions without any time-consuming training process. Unlike conventional approaches, MVSG can search for the most suitable similarity (or distance) measurement among DDI data and construct graph structures for each feature. By employing a parameter self-tuning strategy, MVSG fuses multiple graphs according to the contribution of each feature's information. To evaluate the effectiveness of our framework, real anticancer drug data were extracted from an authoritative public database, comprising 904 drugs, 7730 DDI records and 19 types of drug interactions. Validation results indicate that the prediction is more accurate when multiple features are adopted by our framework. In comparison to conventional machine learning techniques, MVSG can achieve higher performance even with less labeled data and without a training process. Finally, MVSG is employed to narrow down the search for potentially valuable combinations.
Affiliation(s)
- Zexiao Liang
- School of Integrated Circuits, Guangdong University of Technology, 100 Waihuan Xi Road, Panyu District, Guangzhou, 510006, Guangdong, China
- Canxin Lin
- School of Computer Science and Technology, Guangdong University of Technology, 100 Waihuan Xi Road, Panyu District, Guangzhou, 510006, Guangdong, China
- Guoliang Tan
- School of Automation, Guangdong University of Technology, 100 Waihuan Xi Road, Panyu District, Guangzhou, 510006, Guangdong, China
- Jianzhong Li
- School of Integrated Circuits, Guangdong University of Technology, 100 Waihuan Xi Road, Panyu District, Guangzhou, 510006, Guangdong, China
- Yan He
- School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, 100 Waihuan Xi Road, Panyu District, Guangzhou, 510006, Guangdong, China
- Shuting Cai
- School of Integrated Circuits, Guangdong University of Technology, 100 Waihuan Xi Road, Panyu District, Guangzhou, 510006, Guangdong, China
3
Lee J, Jun DW, Song I, Kim Y. DLM-DTI: a dual language model for the prediction of drug-target interaction with hint-based learning. J Cheminform 2024; 16:14. [PMID: 38297330 PMCID: PMC10832108 DOI: 10.1186/s13321-024-00808-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2023] [Accepted: 01/22/2024] [Indexed: 02/02/2024] Open
Abstract
The drug discovery process is demanding and time-consuming, and machine learning-based research is increasingly proposed to enhance efficiency. A significant challenge in this field is predicting whether a drug molecule's structure will interact with a target protein. A recent study attempted to address this challenge by utilizing an encoder that leverages prior knowledge of molecular and protein structures, resulting in notable improvements in prediction performance on the drug-target interaction task. Nonetheless, the target encoders employed in previous studies exhibit computational complexity that increases quadratically with the input length, thereby limiting their practical utility. To overcome this challenge, we adopt a hint-based learning strategy to develop a compact and efficient target encoder. With the adaptation parameter, our model can blend general knowledge and target-oriented knowledge to build features of the protein sequences. This approach yielded considerable performance enhancements and improved learning efficiency on three benchmark datasets: BIOSNAP, DAVIS, and BindingDB. Furthermore, our method requires only 7.7 GB of video RAM (VRAM) during the training phase (16.24% of the previous state-of-the-art model). This ensures the feasibility of training and inference even with constrained computational resources.
Affiliation(s)
- Jonghyun Lee
- Department of Medical and Digital Engineering, Hanyang University College of Engineering, 222, Wangsimni-ro, Seongdong-gu, Seoul, 04763, Korea
- Dae Won Jun
- Department of Medical and Digital Engineering, Hanyang University College of Engineering, 222, Wangsimni-ro, Seongdong-gu, Seoul, 04763, Korea
- Department of Internal Medicine, Hanyang University College of Medicine, 222, Wangsimni-ro, Seongdong-gu, Seoul, 04763, Korea
- Ildae Song
- Department of Pharmaceutical Science and Technology, Kyungsung University, 309, Suyeong-ro, Nam-gu, Busan, 48434, Korea
- Yun Kim
- College of Pharmacy, Daegu Catholic University, 13-13, Hayang-ro, Hayang-eup, Gyeongsan-si, 38430, Gyeongsangbuk-do, Korea
4
Wang W, Yu M, Sun B, Li J, Liu D, Zhang H, Wang X, Zhou Y. SMGCN: Multiple Similarity and Multiple Kernel Fusion Based Graph Convolutional Neural Network for Drug-Target Interactions Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:143-154. [PMID: 38051618 DOI: 10.1109/tcbb.2023.3339645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]
Abstract
Accurately identifying potential drug-target interactions (DTIs) is a critical step in accelerating drug discovery. Although many studies have been conducted over the past decades, detecting DTIs remains a highly challenging and complicated process. Therefore, we propose a novel method called SMGCN, which combines multiple similarity and multiple kernel fusion based on a Graph Convolutional Network (GCN) to predict DTIs. In order to capture the features of the network structure and fully explore direct or indirect relationships between nodes, we propose the method of multiple similarity, which combines similarity fusion matrices with Random Walk with Restart (RWR) and cosine similarity. Then, we use GCN to extract multi-layer low-dimensional embedding features. Unlike traditional GCN methods, we incorporate Multiple Kernel Learning (MKL). Finally, we use the Dual Laplace Regularized Least Squares method to predict novel DTIs through combinatorial kernels in drug and target spaces. We conduct experiments on a gold standard dataset and demonstrate the effectiveness of our proposed model in predicting DTIs by showing significant improvements in Area Under the Curve (AUC) and Area Under the Precision-Recall Curve (AUPR). In addition, our model can also discover some new DTIs, which can be verified by the KEGG BRITE Database and relevant literature.
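Note: the abstract above names Random Walk with Restart (RWR) and cosine similarity as building blocks of the multiple-similarity step. The following is only a minimal sketch of those two generic techniques, not the SMGCN implementation; the toy path graph, the restart probability of 0.5, and all function names are assumptions made for illustration.

```python
import math


def row_normalize(matrix):
    """Scale each row to sum to 1 (all-zero rows are left as zeros)."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Visiting probabilities of a walker that jumps back to `seed`
    with probability `restart` at every step (assumed value: 0.5)."""
    n = len(adj)
    trans = row_normalize(adj)
    e = [1.0 if i == seed else 0.0 for i in range(n)]
    p = e[:]
    for _ in range(max_iter):
        # p_next[j] = (1 - r) * sum_i p[i] * trans[i][j] + r * e[j]
        nxt = [
            (1 - restart) * sum(p[i] * trans[i][j] for i in range(n))
            + restart * e[j]
            for j in range(n)
        ]
        if max(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


# Toy undirected path graph 0 - 1 - 2 - 3 (hypothetical data).
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
profiles = [random_walk_with_restart(adjacency, s) for s in range(4)]
similarity = [[cosine_similarity(a, b) for b in profiles] for a in profiles]
```

In this sketch each node gets an RWR profile that reflects both direct and indirect neighbours, and the cosine similarity between profiles gives a node-by-node similarity matrix; neighbouring nodes end up more similar than distant ones.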
5
Aldahdooh J, Vähä-Koskela M, Tang J, Tanoli Z. Using BERT to identify drug-target interactions from whole PubMed. BMC Bioinformatics 2022; 23:245. [PMID: 35729494 PMCID: PMC9214985 DOI: 10.1186/s12859-022-04768-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2021] [Accepted: 06/03/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Drug-target interactions (DTIs) are critical for drug repurposing and elucidation of drug mechanisms, and are manually curated by large databases, such as ChEMBL, BindingDB, DrugBank and DrugTargetCommons. However, the number of curated articles likely constitutes only a fraction of all the articles that contain experimentally determined DTIs. Finding such articles and extracting the experimental information is a challenging task, and there is a pressing need for systematic approaches to assist the curation of DTIs. To this end, we applied Bidirectional Encoder Representations from Transformers (BERT) to identify such articles. Because DTI data depend closely on the type of assay used to generate them, we also aimed to incorporate functions to predict the assay format. RESULTS: Our novel method identified 0.6 million articles (along with drug and protein information) that were not previously included in public DTI databases. Using 10-fold cross-validation, we obtained ~99% accuracy for identifying articles containing quantitative drug-target profiles. The micro-averaged F1 score for the prediction of assay format is 88%, which leaves room for improvement in future studies. CONCLUSION: The BERT model in this study is robust, and the proposed pipeline can be used to identify previously overlooked articles containing quantitative DTIs. Overall, our method provides a significant advancement in machine-assisted DTI extraction and curation. We expect it to be a useful addition to drug mechanism discovery and repurposing.
Affiliation(s)
- Jehad Aldahdooh
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland; Doctoral Programme in Computer Science, University of Helsinki, Helsinki, Finland
- Markus Vähä-Koskela
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
- Jing Tang
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Ziaurrehman Tanoli
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland; BioICAWtech, Helsinki, Finland
6
Pu L, Singha M, Ramanujam J, Brylinski M. CancerOmicsNet: a multi-omics network-based approach to anti-cancer drug profiling. Oncotarget 2022; 13:695-706. [PMID: 35601606 PMCID: PMC9119687 DOI: 10.18632/oncotarget.28234] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 03/22/2022] [Accepted: 05/03/2022] [Indexed: 11/25/2022] Open
Abstract
Development of novel anti-cancer treatments requires not only a comprehensive knowledge of cancer processes and drug mechanisms of action, but also the ability to accurately predict the response of various cancer cell lines to therapeutics. Numerous computational methods have been developed to address this issue, including algorithms employing supervised machine learning. Nonetheless, high prediction accuracies reported for many of these techniques may result from a significant overlap among training, validation, and testing sets, making existing predictors inapplicable to new data. To address these issues, we developed CancerOmicsNet, a graph neural network with sophisticated attention propagation mechanisms to predict the therapeutic effects of kinase inhibitors across various tumors. Emphasizing the system-level complexity of cancer, CancerOmicsNet integrates multiple heterogeneous data, such as biological networks, genomics, inhibitor profiling, and gene-disease associations, into a unified graph structure. The performance of CancerOmicsNet, properly cross-validated at the tissue level, is 0.83 in terms of the area under the receiver operating characteristic curve, which is notably higher than the values measured for other approaches. CancerOmicsNet generalizes well to unseen data, i.e., it can predict therapeutic effects across a variety of cancer cell lines and inhibitors. CancerOmicsNet is freely available to the academic community at https://github.com/pulimeng/CancerOmicsNet.
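Note: the abstract above attributes inflated accuracies to overlap among training, validation, and testing sets, and reports results cross-validated at the tissue level. One generic way to enforce that kind of separation is a group-wise split, sketched below. This is an illustration only, not the CancerOmicsNet protocol; the function name, the greedy fold assignment, and the tissue labels are hypothetical.

```python
from collections import defaultdict


def group_k_fold(groups, k):
    """Yield (train_indices, test_indices) for k folds such that all samples
    sharing a group label (e.g. a tissue) fall into the same fold."""
    by_group = defaultdict(list)
    for idx, label in enumerate(groups):
        by_group[label].append(idx)
    if k > len(by_group):
        raise ValueError("k cannot exceed the number of distinct groups")

    # Greedy balancing: place the largest groups first into the smallest fold.
    folds = [[] for _ in range(k)]
    ordered = sorted(by_group, key=lambda g: (-len(by_group[g]), str(g)))
    for label in ordered:
        smallest = min(range(k), key=lambda f: len(folds[f]))
        folds[smallest].extend(by_group[label])

    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for o in range(k) if o != f for i in folds[o])
        yield train, test


# Hypothetical tissue label for each sample (cell line / inhibitor pair).
tissues = ["lung", "lung", "breast", "colon", "breast", "skin", "colon", "lung"]
splits = list(group_k_fold(tissues, k=3))
```

Because every tissue appears on only one side of each split, a model evaluated this way is always tested on tissues it has not seen during training.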
Affiliation(s)
- Limeng Pu
- Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803, USA; these authors contributed equally to this work
- Manali Singha
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA; these authors contributed equally to this work
- Jagannathan Ramanujam
- Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803, USA; Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
- Michal Brylinski
- Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803, USA; Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
7
An integrated network representation of multiple cancer-specific data for graph-based machine learning. NPJ Syst Biol Appl 2022; 8:14. [PMID: 35487924 PMCID: PMC9054771 DOI: 10.1038/s41540-022-00226-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/29/2021] [Accepted: 04/04/2022] [Indexed: 12/20/2022] Open
Abstract
Genomic profiles of cancer cells provide valuable information on genetic alterations in cancer. Several recent studies employed these data to predict the response of cancer cell lines to drug treatment. Nonetheless, due to the multifactorial phenotypes and intricate mechanisms of cancer, the accurate prediction of the effect of pharmacotherapy on a specific cell line based on the genetic information alone is problematic. Emphasizing the system-level complexity of cancer, we devised a procedure to integrate multiple heterogeneous data, including biological networks, genomics, inhibitor profiling, and gene-disease associations, into a unified graph structure. In order to construct compact, yet information-rich cancer-specific networks, we developed a novel graph reduction algorithm. Driven by not only the topological information, but also the biological knowledge, the graph reduction increases the feature-only entropy while preserving the valuable graph-feature information. Subsequent comparative benchmarking simulations employing a tissue-level cross-validation protocol demonstrate that the accuracy of a graph-based predictor of the drug efficacy is 0.68, which is notably higher than the values measured for more traditional, matrix-based techniques on the same data. Overall, the non-Euclidean representation of the cancer-specific data improves the performance of machine learning to predict the response of cancer to pharmacotherapy. The generated data are freely available to the academic community at https://osf.io/dzx7b/.
8
A Novel Deep Neural Network Technique for Drug–Target Interaction. Pharmaceutics 2022; 14:pharmaceutics14030625. [PMID: 35336000 PMCID: PMC8954728 DOI: 10.3390/pharmaceutics14030625] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/08/2022] [Revised: 03/08/2022] [Accepted: 03/08/2022] [Indexed: 01/20/2023] Open
Abstract
Drug discovery (DD) is a time-consuming and expensive process. Thus, the industry employs strategies such as drug repositioning and drug repurposing, which allow the application of already approved drugs to treat a different disease, as occurred in the first months of 2020, during the COVID-19 pandemic. The prediction of drug–target interactions (DTIs) is an essential part of the DD process because it can accelerate it and reduce the required costs. DTI predictions performed in silico have used approaches based on molecular docking simulations, including similarity-based and network- and graph-based ones. This paper presents MPS2IT-DTI, a DTI prediction model obtained from research conducted in the following steps: the definition of a new method for encoding molecule and protein sequences onto images, and the definition of a deep-learning approach based on a convolutional neural network in order to create a new method for DTI prediction. Training results obtained with the Davis and KIBA datasets show that MPS2IT-DTI is viable compared to other state-of-the-art (SOTA) approaches in terms of performance and complexity of the neural network model. With the Davis dataset, we obtained 0.876 for the concordance index and 0.276 for the mean squared error (MSE); with the KIBA dataset, we obtained 0.836 and 0.226 for the concordance index and the MSE, respectively. Moreover, the MPS2IT-DTI model represents molecule and protein sequences as images, instead of treating them as an NLP task, and as such, does not employ an embedding layer, which is present in other models.
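Note: the abstract above reports a concordance index and a mean squared error (MSE). As a reading aid, here is a minimal sketch of how these two metrics are commonly computed for regression-style affinity predictions. It is not taken from the MPS2IT-DTI code; the function names and the four affinity values are made up, and the tie-handling convention (0.5 credit for tied predictions) is an assumption.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true
    order. Pairs with equal true values are skipped; tied predictions
    count as 0.5 (an assumed convention)."""
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue
            comparable += 1
            true_diff = y_true[i] - y_true[j]
            pred_diff = y_pred[i] - y_pred[j]
            if pred_diff == 0:
                concordant += 0.5
            elif (true_diff > 0) == (pred_diff > 0):
                concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical binding affinities and model predictions.
measured = [5.0, 6.2, 7.1, 8.4]
predicted = [5.1, 6.0, 7.5, 7.4]
ci = concordance_index(measured, predicted)   # 5 of 6 pairs ordered correctly
mse = mean_squared_error(measured, predicted)
```

A concordance index of 1.0 means every pair is ranked in the right order and 0.5 corresponds to random ranking, while a lower MSE means predictions are closer to the measured values.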
9
Zheng Y, Wu Z. Cascade Deep Forest With Heterogeneous Similarity Measures for Drug-Target Interaction Prediction. Front Genet 2021; 12:702259. [PMID: 34504515 PMCID: PMC8421679 DOI: 10.3389/fgene.2021.702259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2021] [Accepted: 05/24/2021] [Indexed: 11/13/2022] Open
Abstract
Drug repositioning is a method of systematically identifying potential molecular targets that known drugs may act on. Compared with traditional methods, drug repositioning has been extensively studied due to the development of multi-omics technology and systems biology methods. Because of its biological network properties, it is possible to apply machine learning algorithms for prediction. Based on various heterogeneous network models, this paper proposes a method named THNCDF for predicting drug-target interactions. Various heterogeneous networks are integrated to build a tripartite network, and similarity calculation methods are used to obtain a similarity matrix. Then, the cascade deep forest method is used to make predictions. Results indicate that THNCDF outperforms the previously reported methods based on 10-fold cross-validation on the benchmark data sets proposed by Y. Yamanishi. The area under the precision-recall curve (AUPR) values on the Enzyme, GPCR, Ion Channel, and Nuclear Receptor data sets are 0.988, 0.980, 0.938, and 0.906, respectively. The experimental results illustrate the feasibility of this method.
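Note: the abstract above reports AUPR values. There are several ways to estimate the area under a precision-recall curve; the sketch below uses the step-wise "average precision" estimate, which is an assumption and may differ from the estimator used in the paper. The function name and the toy labels and scores are hypothetical, and tied scores are not treated specially.

```python
def average_precision(labels, scores):
    """Step-wise estimate of the area under the precision-recall curve.
    `labels` holds 1 for a true interaction and 0 otherwise."""
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")
    # Rank candidates from highest to lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            # Precision at the rank where this positive is recovered.
            precision_sum += hits / rank
    return precision_sum / total_positives


# Hypothetical drug-target pairs: known interaction flag and model score.
known = [1, 0, 1, 0]
score = [0.9, 0.8, 0.7, 0.1]
aupr = average_precision(known, score)
```

Because true interactions are typically rare among all drug-target pairs, AUPR is often reported alongside AUC: it rewards ranking the few positives near the top.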
Affiliation(s)
- Ying Zheng
- School of Computer & Communication Engineering, Changsha University of Science & Technology, Changsha, China
10
Jiang D, Ding S, Mao Z, You L, Ruan Y. Integrated analysis of potential pathways by which aloe-emodin induces the apoptosis of colon cancer cells. Cancer Cell Int 2021; 21:238. [PMID: 33902610 PMCID: PMC8077783 DOI: 10.1186/s12935-021-01942-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/23/2020] [Accepted: 04/19/2021] [Indexed: 01/22/2023] Open
Abstract
Background: Colon cancer is a malignant gastrointestinal tumour with high incidence, mortality and metastasis rates worldwide. Aloe-emodin is a monomer compound derived from hydroxyanthraquinone. Aloe-emodin produces a wide range of antitumour effects and is produced by rhubarb, aloe and other herbs. However, the mechanism by which aloe-emodin influences colon cancer is still unclear. We hope these findings will lead to the development of a new therapeutic strategy for the treatment of colon cancer in the clinic. Methods: We identified the overlapping targets of aloe-emodin and colon cancer and performed protein–protein interaction (PPI), Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses. In addition, we selected apoptosis pathways for experimental verification with cell viability, cell proliferation, caspase-3 activity, DAPI staining, cell cycle and western blotting analyses to evaluate the apoptotic effect of aloe-emodin on colon cancer cells. Results: The MTT assay and cell colony formation assay showed that aloe-emodin inhibited cell proliferation. DAPI staining confirmed that aloe-emodin induced apoptosis. Aloe-emodin upregulated the protein level of Bax and decreased the expression of Bcl-2, which activates caspase-3 and caspase-9. Furthermore, the protein expression level of cytochrome C increased in a time-dependent manner in the cytoplasm but decreased in a time-dependent manner in the mitochondria. Conclusion: These results indicate that aloe-emodin may induce the apoptosis of human colon cancer cells through mitochondria-related pathways.
Affiliation(s)
- Dongxiao Jiang
- College of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou, 310053, People's Republic of China
- Shufei Ding
- Shaoxing Hospital of Traditional Chinese Medicine, Shaoxing, 312000, People's Republic of China
- Zhujun Mao
- College of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou, 310053, People's Republic of China
- Liyan You
- College of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou, 310053, People's Republic of China
- Yeping Ruan
- College of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou, 310053, People's Republic of China