1
Dhiwar PS, Purawarga Matada GS, Pal R, Singh E, Ghara A, Maji L, Sengupta S, Andhale G. An assessment of EGFR and HER2 inhibitors with structure activity relationship of fused pyrimidine derivatives for breast cancer: a brief review. J Biomol Struct Dyn 2024; 42:1564-1581. [PMID: 37158086 DOI: 10.1080/07391102.2023.2204351] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 11/29/2022] [Accepted: 03/30/2023] [Indexed: 05/10/2023]
Abstract
Epidermal growth factor receptor (EGFR) and its subtype human epidermal growth factor receptor 2 (HER2) are activated when their endogenous ligand(s) bind to the ATP binding site of the target receptors. In breast cancer (BC), EGFR and HER2 are two proteins that are overexpressed, which leads to increased cell proliferation and decreased cell death/apoptosis. Pyrimidine is one of the most widely studied heterocyclic scaffolds for EGFR as well as HER2 inhibition. We gather some remarkable results for fused-pyrimidine derivatives from evaluations on various cancerous cell lines (in vitro) and in animals (in vivo) to highlight their potency. The heterocyclic (five-membered, six-membered, etc.) moieties coupled with the pyrimidine moiety are potent inhibitors of EGFR and HER2. Hence the structure-activity relationship (SAR) plays an important role in studying the heterocyclic moiety alongside pyrimidine and the effects of substituents and groups on the increase or decrease in cancerous activity and toxicity. A thoughtful SAR study of fused pyrimidines provides an excellent overview of the compounds with respect to efficacy and a summary of their potential as future EGFR inhibitors. Furthermore, we studied the in-silico interactions of synthesized compounds to evaluate binding affinity towards the key amino acids. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Prasad Sanjay Dhiwar
- Integrated drug discovery center, Department of Pharmaceutical Chemistry, Acharya & BM Reddy College of Pharmacy, Bengaluru, India
- Rohit Pal
- Integrated drug discovery center, Department of Pharmaceutical Chemistry, Acharya & BM Reddy College of Pharmacy, Bengaluru, India
- Ekta Singh
- Integrated drug discovery center, Department of Pharmaceutical Chemistry, Acharya & BM Reddy College of Pharmacy, Bengaluru, India
- Abhishek Ghara
- Integrated drug discovery center, Department of Pharmaceutical Chemistry, Acharya & BM Reddy College of Pharmacy, Bengaluru, India
- Lalmohan Maji
- Integrated drug discovery center, Department of Pharmaceutical Chemistry, Acharya & BM Reddy College of Pharmacy, Bengaluru, India
- Sindhuja Sengupta
- Integrated drug discovery center, Department of Pharmaceutical Chemistry, Acharya & BM Reddy College of Pharmacy, Bengaluru, India
- Ganesh Andhale
- Department of Pharmaceutical Chemistry, Alard College of Pharmacy, Pune, India
2
Wang W, Yu M, Sun B, Li J, Liu D, Zhang H, Wang X, Zhou Y. SMGCN: Multiple Similarity and Multiple Kernel Fusion Based Graph Convolutional Neural Network for Drug-Target Interactions Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:143-154. [PMID: 38051618 DOI: 10.1109/tcbb.2023.3339645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]
Abstract
Accurately identifying potential drug-target interactions (DTIs) is a critical step in accelerating drug discovery. Despite many studies that have been conducted over the past decades, detecting DTIs remains a highly challenging and complicated process. Therefore, we propose a novel method called SMGCN, which combines multiple similarity and multiple kernel fusion based on Graph Convolutional Network (GCN) to predict DTIs. In order to capture the features of the network structure and fully explore direct or indirect relationships between nodes, we propose the method of multiple similarity, which combines similarity fusion matrices with Random Walk with Restart (RWR) and cosine similarity. Then, we use GCN to extract multi-layer low-dimensional embedding features. Unlike traditional GCN methods, we incorporate Multiple Kernel Learning (MKL). Finally, we use the Dual Laplace Regularized Least Squares method to predict novel DTIs through combinatorial kernels in drug and target spaces. We conduct experiments on a golden standard dataset, and demonstrate the effectiveness of our proposed model in predicting DTIs through showing significant improvements in Area Under the Curve (AUC) and Area Under the Precision-Recall Curve (AUPR). In addition, our model can also discover some new DTIs, which can be verified by the KEGG BRITE Database and relevant literature.
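The two similarity ingredients named in this abstract, cosine similarity and Random Walk with Restart (RWR), can be illustrated with a minimal pure-Python sketch. This is not the SMGCN implementation: the function names, the restart probability of 0.5, and the toy graph are assumptions chosen for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def random_walk_with_restart(adjacency, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * e_seed until the L1 change drops below tol.

    `adjacency` is a square list-of-lists; W is its column-normalised version.
    The restart value is an illustrative assumption, not a value from the paper.
    """
    n = len(adjacency)
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    W = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1.0 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy usage: a three-node path graph 0 - 1 - 2 with the walker restarting at node 0.
path_graph = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
rwr_profile = random_walk_with_restart(path_graph, seed=0)
profile_similarity = cosine_similarity([1, 1, 0], [1, 0, 0])
```

The RWR vector gives each node a proximity score to the seed that accounts for indirect paths, which is why such profiles are often used as inputs to a similarity measure rather than the raw adjacency rows.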
3
Abdul Raheem AK, Dhannoon BN. Comprehensive Review on Drug-target Interaction Prediction - Latest Developments and Overview. Curr Drug Discov Technol 2024; 21:e010923220652. [PMID: 37680152 DOI: 10.2174/1570163820666230901160043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Revised: 05/29/2023] [Accepted: 07/18/2023] [Indexed: 09/09/2023]
Abstract
Drug-target interactions (DTIs) are an important part of the drug development process. When the drug (a chemical molecule) binds to a target (proteins or nucleic acids), it modulates the biological behavior/function of the target, returning it to its normal state. Predicting DTIs plays a vital role in the drug discovery (DD) process as it has the potential to enhance efficiency and reduce costs. However, DTI prediction poses significant challenges and expenses due to the time-consuming and costly nature of experimental assays. As a result, researchers have increased their efforts to identify the association between medications and targets in the hopes of speeding up drug development and shortening the time to market. This paper provides a detailed discussion of the initial stage in drug discovery, namely drug-target interactions. It focuses on exploring the application of machine learning methods within this step. Additionally, we aim to conduct a comprehensive review of relevant papers and databases utilized in this field. Drug-target interaction prediction covers a wide range of applications: drug discovery, prediction of adverse effects and drug repositioning. The prediction of drug-target interactions can be categorized into three main computational methods: docking simulation approaches, ligand-based methods, and machine-learning techniques.
Affiliation(s)
- Ali K Abdul Raheem
- Software Department, College of Information Technology, University of Babylon, Hillah, Babil, Iraq
- University of Warith Al-Anbiyaa, Kerbala, Iraq
- Ban N Dhannoon
- Department of Computer Science, College of Science, Al-Nahrain University, Baghdad, Iraq
4
Chen J, Zhang L, Cheng K, Jin B, Lu X, Che C. Predicting Drug-Target Interaction Via Self-Supervised Learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:2781-2789. [PMID: 35230952 DOI: 10.1109/tcbb.2022.3153963] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/14/2023]
Abstract
Recent advances in graph representation learning provide new opportunities for computational drug-target interaction (DTI) prediction. However, it still suffers from dependence on manual labels and vulnerability to attacks. Inspired by the success of self-supervised learning (SSL) algorithms, which can leverage input data itself as supervision, we propose SupDTI, an SSL-enhanced drug-target interaction prediction framework based on a heterogeneous network (i.e., drug-protein, drug-drug, and protein-protein interaction network; drug-disease, drug-side-effect, and protein-disease association network; drug-structure and protein-sequence similarity network). Specifically, SupDTI is an end-to-end learning framework consisting of five components. First, localized and globalized graph convolutions are designed to capture the nodes' information from both local and global perspectives, respectively. Then, we develop a variational autoencoder to constrain the nodes' representation to have desired statistical characteristics. Finally, a unified self-supervised learning strategy is leveraged to enhance the nodes' representation: a contrastive learning module is employed to enable the nodes' representation to fit the graph-level representation, followed by a generative learning module which further maximizes the node-level agreement across the global and local views by learning the probabilistic connectivity distribution of the original heterogeneous network. Experimental results show that our model can achieve better prediction performance than state-of-the-art methods.
5
Shen Y, Liu JX, Yin MM, Zheng CH, Gao YL. BMPMDA: Prediction of MiRNA-Disease Associations Using a Space Projection Model Based on Block Matrix. Interdiscip Sci 2023; 15:88-99. [PMID: 36335274 DOI: 10.1007/s12539-022-00542-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2022] [Revised: 10/13/2022] [Accepted: 10/14/2022] [Indexed: 11/07/2022]
Abstract
With the high-quality development of bioinformatics technology, miRNA-disease associations (MDAs) are gradually being uncovered. At present, convenient and efficient prediction methods, which avoid the resource consumption of traditional wet-lab experiments, still need to be put forward. In this study, a space projection model based on block matrix is presented for predicting MDAs (BMPMDA). Specifically, two block matrices are first composed of the known association matrix and similarity to increase comprehensiveness. For the integrity of information in the heterogeneous network, matrix completion (MC) is utilized to mine potential MDAs. Considering the neighborhood information of data points, linear neighborhood similarity (LNS) is regarded as a measure of similarity. Next, LNS is projected onto the corresponding completed association matrix to derive the projection score. Finally, the AUC and AUPR values for BMPMDA reach 0.9691 and 0.6231, respectively. Additionally, the majority of novel MDAs in three disease cases are identified in existing databases and literature. This suggests that BMPMDA can serve as a reliable prediction model for biological research.
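Several abstracts in this list report an AUC as their headline metric. As a reminder of what that number measures, here is a minimal pure-Python sketch that computes the ROC AUC as the probability that a randomly chosen positive pair is scored above a randomly chosen negative pair (ties count one half). The function name and the toy labels and scores are illustrative assumptions, not data from any of the cited papers.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 ground-truth values.
    scores: iterable of predicted association scores, same length.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positives) * len(negatives))


# Toy usage with made-up scores for four candidate associations.
example_auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
```

The quadratic loop is fine for small examples; for full association matrices a rank-based formulation is the usual choice.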
Affiliation(s)
- Yi Shen
- Qufu Normal University, Rizhao, 276800, China
- Chun-Hou Zheng
- Co-Innovation Center for Information Supply and Assurance Technology, Anhui University, Hefei, 230000, China
- Ying-Lian Gao
- Library of Qufu Normal University, Qufu Normal University, Rizhao, 276800, China
6
Huang D, He H, Ouyang J, Zhao C, Dong X, Xie J. Small molecule drug and biotech drug interaction prediction based on multi-modal representation learning. BMC Bioinformatics 2022; 23:561. [PMID: 36575376 PMCID: PMC9793529 DOI: 10.1186/s12859-022-05101-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/25/2022] [Accepted: 12/06/2022] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND: Drug-drug interactions (DDIs) occur when two or more drugs are taken simultaneously or successively. Early detection of adverse drug interactions can be essential in preventing medical errors and reducing healthcare costs. Many computational methods already predict interactions between small molecule drugs (SMDs). As the number of biotechnology drugs (BioDs) increases, so does the threat of interactions between SMDs and BioDs. However, few computational methods are available to predict their interactions. RESULTS: Considering the structural specificity and relational complexity of SMDs and BioDs, a novel multi-modal representation learning method called Multi-SBI is proposed to predict their interactions. First, multi-modal features are used to adequately represent the heterogeneous structure and complex relationships of SMDs and BioDs. Second, an undersampling method based on Positive-unlabeled learning (PU-sampling) is introduced to obtain negative samples with high confidence from the unlabeled data set. Finally, both learned representations of SMD and BioD are fed into DNN classifiers to predict their interaction events. In addition, we also conduct a retrospective analysis. CONCLUSIONS: Our proposed multi-modal representation learning method can extract drug features more comprehensively in heterogeneous drugs. In addition, PU-sampling can effectively reduce the noise in the sampling procedure. Our proposed method significantly outperforms other state-of-the-art drug interaction prediction methods. In a retrospective analysis of DrugBank 5.1.0, 14 out of the 20 predictions with the highest confidence were validated in the latest version of DrugBank 5.1.8, demonstrating that Multi-SBI is a valuable tool for predicting new drug interactions through effectively extracting and learning heterogeneous drug features.
Affiliation(s)
- Dingkai Huang
- School of Computer Engineering and Science, Shanghai University, Shanghai, 200444 China (GRID: grid.39436.3b; ISNI: 0000 0001 2323 5732)
- Hongjian He
- School of Computer Engineering and Science, Shanghai University, Shanghai, 200444 China (GRID: grid.39436.3b; ISNI: 0000 0001 2323 5732)
- Jiaming Ouyang
- School of Computer Engineering and Science, Shanghai University, Shanghai, 200444 China (GRID: grid.39436.3b; ISNI: 0000 0001 2323 5732)
- Chang Zhao
- School of Computer Engineering and Science, Shanghai University, Shanghai, 200444 China (GRID: grid.39436.3b; ISNI: 0000 0001 2323 5732)
- Xin Dong
- School of Medicine, Shanghai University, Shanghai, 200444 China (GRID: grid.39436.3b; ISNI: 0000 0001 2323 5732)
- Jiang Xie
- School of Computer Engineering and Science, Shanghai University, Shanghai, 200444 China (GRID: grid.39436.3b; ISNI: 0000 0001 2323 5732)
7
Qian Y, Ding Y, Zou Q, Guo F. Identification of drug-side effect association via restricted Boltzmann machines with penalized term. Brief Bioinform 2022; 23:6762741. [PMID: 36259601 DOI: 10.1093/bib/bbac458] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/13/2022] [Revised: 09/09/2022] [Accepted: 09/25/2022] [Indexed: 12/14/2022] Open
Abstract
In the entire life cycle of drug development, side effects are one of the major failure factors. Severe side effects of drugs that go undetected until the post-marketing stage lead to around two million patient morbidities every year in the United States. Therefore, there is an urgent need for a method to predict side effects of approved drugs and new drugs. Following this need, we present a new predictor for finding side effects of drugs. Firstly, multiple similarity matrices are constructed based on the association profile feature and drug chemical structure information. Secondly, these similarity matrices are integrated by the Centered Kernel Alignment-based Multiple Kernel Learning algorithm. Then, Weighted K nearest known neighbors is utilized to complement the adjacency matrix. Next, we construct Restricted Boltzmann machines (RBM) in drug space and side effect space, respectively, and apply a penalized maximum likelihood approach to train the model. At last, the average decision rule is adopted to integrate predictions from the RBMs. Comparison results and case studies demonstrate, with four benchmark datasets, that our method can give a more accurate and reliable prediction result.
Affiliation(s)
- Yuqing Qian
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, PR China
- Yijie Ding
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, PR China
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054, PR China
- Fei Guo
- School of Computer Science and Engineering, Central South University, Changsha 410083, PR China
8
Chen L, Lin D, Xu H, Li J, Lin L. WLLP: A weighted reconstruction-based linear label propagation algorithm for predicting potential therapeutic agents for COVID-19. Front Microbiol 2022; 13:1040252. [PMID: 36466666 PMCID: PMC9713947 DOI: 10.3389/fmicb.2022.1040252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Accepted: 10/06/2022] [Indexed: 11/18/2022] Open
Abstract
The global coronavirus disease 2019 (COVID-19) pandemic caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) has led to huge health and economic crises. However, the research required to develop new drugs and vaccines is very expensive in terms of labor, money, and time. Owing to recent advances in data science, drug-repositioning technologies have become one of the most promising strategies available for developing effective treatment options. Using the previously reported human drug virus database (HDVD), we proposed a model to predict possible drug regimens based on a weighted reconstruction-based linear label propagation algorithm (WLLP). For the drug–virus association matrix, we used the weighted K-nearest known neighbors method for preprocessing and label propagation of the network based on the linear neighborhood similarity of drugs and viruses to obtain the final prediction results. Under 10 repetitions of 10-fold cross-validation, WLLP exhibited excellent performance with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.8828 ± 0.0037 and an area under the precision-recall curve of 0.5277 ± 0.0053, outperforming the other four models used for comparison. We also predicted effective drug regimens against SARS-CoV-2, and this case study showed that WLLP can be used to suggest potential drugs for the treatment of COVID-19.
Affiliation(s)
- Langcheng Chen
- Center of Campus Network and Modern Educational Technology, Guangdong University of Technology, Guangzhou, China
- Dongying Lin
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Haojie Xu
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Jianming Li
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Lieqing Lin
- Center of Campus Network and Modern Educational Technology, Guangdong University of Technology, Guangzhou, China
- *Correspondence: Lieqing Lin
9
MSF-UBRW: An Improved Unbalanced Bi-Random Walk Method to Infer Human lncRNA-Disease Associations. Genes (Basel) 2022; 13:genes13112032. [DOI: 10.3390/genes13112032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2022] [Revised: 10/24/2022] [Accepted: 10/28/2022] [Indexed: 11/06/2022] Open
Abstract
Long-non-coding RNA (lncRNA) is a transcription product that exerts its biological functions through a variety of mechanisms. The occurrence and development of a series of human diseases are closely related to abnormal expression levels of lncRNAs. Scientists have developed many computational models to identify the lncRNA-disease associations (LDAs). However, many potential LDAs are still unknown. In this paper, a novel method, namely MSF-UBRW (multiple similarities fusion based on unbalanced bi-random walk), is designed to explore new LDAs. First, two similarities (functional similarity and Gaussian Interaction Profile kernel similarity) of lncRNAs are calculated and fused linearly, also for disease data. Then, the known association matrix is preprocessed. Next, the linear neighbor similarities of lncRNAs and diseases are calculated, respectively. After that, the potential associations are predicted based on unbalanced bi-random walk. The fusion of multiple similarities improves the prediction performance of MSF-UBRW to a large extent. Finally, the prediction ability of the MSF-UBRW algorithm is measured by two statistical methods, leave-one-out cross-validation (LOOCV) and 5-fold cross-validation (5-fold CV). The AUCs of 0.9391 in LOOCV and 0.9183 (±0.0054) in 5-fold CV confirmed the reliable prediction ability of the MSF-UBRW method. Case studies of three common diseases also show that the MSF-UBRW method can infer new LDAs effectively.
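The Gaussian Interaction Profile (GIP) kernel similarity and linear fusion step mentioned in this abstract can be sketched in a few lines of pure Python. This follows the commonly used GIP definition, K(i, j) = exp(-gamma * ||y_i - y_j||^2) with gamma equal to a bandwidth constant divided by the mean squared norm of the profiles; the bandwidth of 1.0, the fusion weight of 0.5, and the function names are assumptions for illustration and are not taken from the MSF-UBRW paper.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian Interaction Profile kernel over the rows of a 0/1 association matrix.

    profiles: list of equal-length rows, one interaction profile per entity.
    gamma_prime: bandwidth constant (1.0 here is an illustrative assumption).
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    # If no associations are known at all, fall back to gamma = 0 (all similarities 1).
    gamma = gamma_prime / mean_sq_norm if mean_sq_norm else 0.0
    kernel = []
    for i in range(n):
        row = []
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            row.append(math.exp(-gamma * sq_dist))
        kernel.append(row)
    return kernel


def fuse_linear(sim_a, sim_b, alpha=0.5):
    """Element-wise weighted sum alpha * sim_a + (1 - alpha) * sim_b."""
    return [
        [alpha * a + (1.0 - alpha) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(sim_a, sim_b)
    ]


# Toy usage: three entities (rows) against three partners (columns).
associations = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
gip = gip_kernel(associations)
# A second, hypothetical similarity matrix standing in for functional similarity.
functional = [[1.0, 0.6, 0.2], [0.6, 1.0, 0.3], [0.2, 0.3, 1.0]]
fused = fuse_linear(gip, functional, alpha=0.5)
```

Entities with identical interaction profiles get a GIP similarity of exactly 1, so fusing with a second similarity source helps separate them.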
10
Hu J, Gao J, Fang X, Liu Z, Wang F, Huang W, Wu H, Zhao G. DTSyn: a dual-transformer-based neural network to predict synergistic drug combinations. Brief Bioinform 2022; 23:6652782. [PMID: 35915050 DOI: 10.1093/bib/bbac302] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 05/04/2022] [Revised: 06/23/2022] [Accepted: 07/04/2022] [Indexed: 11/14/2022] Open
Abstract
Drug combination therapies are superior to monotherapy for cancer treatment in many ways. Identifying novel drug combinations by wet-lab screening is challenging because the enormous search space of possible drug pairs makes the process time-consuming. Thus, computational methods have been developed to predict drug pairs with potential synergistic functions. Notwithstanding the success of current models, the mechanism of drug synergy from a chemical-gene-tissue interaction perspective has been little studied, which hinders the use of current algorithms for drug mechanism study. Here, we proposed a deep neural network model termed DTSyn (Dual Transformer encoder model for drug pair Synergy prediction) based on a multi-head attention mechanism to identify novel drug combinations. We designed a fine-granularity transformer encoder to capture chemical substructure-gene and gene-gene associations and a coarse-granularity transformer encoder to extract chemical-chemical and chemical-cell line interactions. DTSyn achieved the highest receiver operating characteristic area under the curve of 0.73, 0.78, 0.82 and 0.81 on four different cross-validation tasks, outperforming all competing methods. Further, DTSyn achieved the best True Positive Rate (TPR) over five independent data sets. The ablation study showed that both transformer encoder blocks contributed to the performance of DTSyn. In addition, DTSyn can extract interactions among chemicals and cell lines, representing the potential mechanisms of drug action. By leveraging the attention mechanism and pretrained gene embeddings, DTSyn shows improved interpretability. Thus, we envision our model as a valuable tool to prioritize synergistic drug pairs with chemical and cell line gene expression profiles.
Affiliation(s)
- Jing Hu
- Baidu, Inc., 701, Na Xian Road, 201210, Shanghai, China
- Jie Gao
- Baidu, Inc., 701, Na Xian Road, 201210, Shanghai, China
- Xiaomin Fang
- Baidu, Inc., Xue Fu Road, 518000, Shenzhen, China
- Zijing Liu
- Baidu, Inc., Xue Fu Road, 518000, Shenzhen, China
- Fan Wang
- Baidu, Inc., Xue Fu Road, 518000, Shenzhen, China
- Weili Huang
- HWL Consulting LLC, 3328 Antigua Dr, 97408, Oregon, US
- Hua Wu
- Baidu, Inc., No. 10 Shangdi 10th Street, 100085, Beijing, China
- Guodong Zhao
- Baidu, Inc., 701, Na Xian Road, 201210, Shanghai, China
11
Wang Y, Wang L, Wong L, Zhao B, Su X, Li Y, You Z. RoFDT: Identification of Drug–Target Interactions from Protein Sequence and Drug Molecular Structure Using Rotation Forest. Biology 2022; 11:biology11050741. [PMID: 35625469 PMCID: PMC9138819 DOI: 10.3390/biology11050741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2022] [Revised: 05/02/2022] [Accepted: 05/06/2022] [Indexed: 11/16/2022]
Abstract
As the basis for screening drug candidates, the identification of drug–target interactions (DTIs) plays a crucial role in innovative drug research. However, due to the inherent constraints of small-scale and time-consuming wet experiments, DTI recognition is usually difficult to carry out. In the present study, we developed a computational approach called RoFDT to predict DTIs by combining feature-weighted Rotation Forest (FwRF) with protein sequences. In particular, we first encode protein sequences as numerical matrices by Position-Specific Score Matrix (PSSM), then extract their features using Pseudo Position-Specific Score Matrix (PsePSSM), combine them with drug structure information (molecular fingerprints), and finally feed them into the FwRF classifier and validate the performance of RoFDT on the Enzyme, GPCR, Ion Channel and Nuclear Receptor datasets. On these datasets, RoFDT achieved 91.68%, 84.72%, 88.11% and 78.33% accuracy, respectively. RoFDT shows excellent performance in comparison with support vector machine models and previous superior approaches. Furthermore, 7 of the top 10 DTIs by RoFDT estimate score were confirmed by the relevant database. These results demonstrate that RoFDT can be employed as a powerful predictive approach for DTIs to provide theoretical support for innovative drug discovery.
Affiliation(s)
- Ying Wang
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277160, China
- Lei Wang
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277160, China
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China
- Correspondence: (L.W.); (Z.Y.); Tel.: +86-151-0632-2257 (L.W.); +86-173-9276-3836 (Z.Y.)
- Leon Wong
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China
- Bowei Zhao
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Xiaorui Su
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Yang Li
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
- Zhuhong You
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China
- School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China
- Correspondence: (L.W.); (Z.Y.); Tel.: +86-151-0632-2257 (L.W.); +86-173-9276-3836 (Z.Y.)
12
Xie G, Jiang J, Sun Y. LDA-LNSUBRW: lncRNA-Disease Association Prediction Based on Linear Neighborhood Similarity and Unbalanced bi-Random Walk. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:989-997. [PMID: 32870798 DOI: 10.1109/tcbb.2020.3020595] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 06/11/2023]
Abstract
An increasing number of experiments show that lncRNAs are involved in many biological processes, and their mutations and disorders are associated with many diseases. However, verifying the relationships between lncRNAs and diseases is time-consuming and laborious. Searching for effective computational methods will contribute to our understanding of the underlying mechanisms of disease and to identifying biomarkers of diseases. Therefore, we proposed a method called lncRNA-disease association prediction based on linear neighborhood similarity and unbalanced bi-random walk (LDA-LNSUBRW). Given that the known lncRNA-disease associations are rare, a pretreatment step should be performed to obtain the interaction possibility of unknown cases, so as to help us predict the potential associations. In the framework of leave-one-out cross-validation (LOOCV) and fivefold cross-validation (5-fold CV), LDA-LNSUBRW achieved effective performance with AUCs of 0.8874 and 0.8632 ± 0.0051, respectively. The experimental results in this paper show that the proposed method is superior to five other state-of-the-art methods. In addition, case studies of three diseases (lung cancer, breast cancer, and osteosarcoma) were carried out to illustrate that LDA-LNSUBRW could predict the relevant lncRNAs.
13
Xie G, Li J, Gu G, Sun Y, Lin Z, Zhu Y, Wang W. BGMSDDA: a bipartite graph diffusion algorithm with multiple similarity integration for drug-disease association prediction. Mol Omics 2021; 17:997-1011. [PMID: 34610633 DOI: 10.1039/d1mo00237f] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/21/2022]
Abstract
Drug repositioning, a method that relies on the information from the original drug-disease association matrix, aims to identify new indications for existing drugs and is expected to greatly reduce the cost and time of drug development. However, most current drug repositioning methods make use of the original drug-disease association matrix directly without preconditioning. As relatively few associations between drugs and diseases have been determined from actual observations, the original drug-disease association matrix used in the prediction is sparse, which affects the performance of the prediction method. A method for mining similar features of drugs and diseases is still lacking. To solve these problems, we developed a bipartite graph diffusion algorithm with multiple similarity integration for drug-disease association prediction (BGMSDDA). First, the weighted K nearest known neighbors (WKNKN) algorithm was used to reconstruct the drug-disease association matrix. Secondly, an effective method was designed to extract similar characteristics of drugs and diseases based on integrating linear neighborhood similarity and Gaussian kernel similarity. Finally, bipartite graph diffusion was used to infer undiscovered drug-disease associations. In 10-fold cross-validation experiments, BGMSDDA showed excellent performance on two datasets, specifically with AUC values of 0.939 (Fdataset) and 0.954 (Cdataset), and AUPR values of 0.466 (Fdataset) and 0.565 (Cdataset). Furthermore, to evaluate the accuracy of the results of BGMSDDA, we conducted case studies on three medically used drugs selected from Fdataset and Cdataset and validated the predicted associated diseases of each drug against several databases. Based on the results obtained, BGMSDDA was demonstrated to be useful for predicting drug-disease associations.
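The WKNKN preprocessing step named in this abstract can be sketched as follows. This is one common formulation of weighted K nearest known neighbors, written in pure Python; the exact variant used in BGMSDDA may differ, and the function names, K = 2, decay factor 0.5, and the toy matrices are all assumptions for illustration.

```python
def _neighbour_estimates(Y, S, k, eta):
    """Estimate each row of Y from its k most similar other rows.

    Neighbour at rank r (0-based) gets weight eta**r * S[i][j]; the weighted
    sum is normalised by the plain sum of the neighbours' similarities.
    """
    n, m = len(Y), len(Y[0])
    estimates = [[0.0] * m for _ in range(n)]
    for i in range(n):
        neighbours = sorted(
            (j for j in range(n) if j != i), key=lambda j: S[i][j], reverse=True
        )[:k]
        z = sum(S[i][j] for j in neighbours)
        if z == 0:
            continue  # no usable neighbours, leave the row at zero
        for rank, j in enumerate(neighbours):
            w = (eta ** rank) * S[i][j]
            for c in range(m):
                estimates[i][c] += w * Y[j][c] / z
    return estimates


def wknkn(Y, S_row, S_col, k=2, eta=0.5):
    """Fill unknown (zero) entries of association matrix Y with neighbour-based scores."""
    row_est = _neighbour_estimates(Y, S_row, k, eta)
    Y_t = [list(col) for col in zip(*Y)]
    col_est_t = _neighbour_estimates(Y_t, S_col, k, eta)
    col_est = [list(col) for col in zip(*col_est_t)]
    # Average both views and never lower a known association.
    return [
        [max(Y[i][j], (row_est[i][j] + col_est[i][j]) / 2) for j in range(len(Y[0]))]
        for i in range(len(Y))
    ]


# Toy usage: 3 drugs x 2 diseases; drug 1 has no known associations.
Y = [[1, 0], [0, 0], [0, 1]]
S_drug = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
S_disease = [[1.0, 0.5], [0.5, 1.0]]
Y_filled = wknkn(Y, S_drug, S_disease, k=2, eta=0.5)
```

In this toy case the drug with an empty row inherits most of its score from its most similar neighbour, which is the effect the preprocessing is meant to have on sparse association matrices.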
Affiliation(s)
- Guobo Xie
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Jianming Li
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Guosheng Gu
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Yuping Sun
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Zhiyi Lin
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Yinting Zhu
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Weiming Wang
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
| |
|
14
|
Wu X, Zeng W, Lin F, Zhou X. NeuRank: learning to rank with neural networks for drug-target interaction prediction. BMC Bioinformatics 2021; 22:567. [PMID: 34836495 PMCID: PMC8620576 DOI: 10.1186/s12859-021-04476-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/18/2021] [Accepted: 11/08/2021] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Experimental verification of a drug discovery process is expensive and time-consuming. Therefore, recently, the demand to more efficiently and effectively identify drug-target interactions (DTIs) has intensified. RESULTS We treat the prediction of DTIs as a ranking problem and propose a neural network architecture, NeuRank, to address it. Also, we assume that similar drug compounds are likely to interact with similar target proteins. Thus, in our model, we add drug and target similarities, which are very effective at improving the prediction of DTIs. Then, we develop NeuRank from a point-wise to a pair-wise, and further to list-wise model. CONCLUSION Finally, results from extensive experiments on five public data sets (DrugBank, Enzymes, Ion Channels, G-Protein-Coupled Receptors, and Nuclear Receptors) show that, in identifying DTIs, our models achieve better performance than other state-of-the-art methods.
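The progression from point-wise to pair-wise to list-wise ranking described in this abstract can be illustrated with three generic loss functions. This is a sketch of standard learning-to-rank losses, not the losses NeuRank itself uses (the abstract does not specify them); all function names are illustrative.

```python
import math

def pointwise_loss(score, label):
    """Binary cross-entropy on one drug-target pair, scored independently."""
    p = 1.0 / (1.0 + math.exp(-score))
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def pairwise_loss(score_pos, score_neg):
    """Logistic loss asking a known interaction to outrank an unknown one."""
    return math.log(1.0 + math.exp(-(score_pos - score_neg)))

def softmax(values):
    """Numerically stable softmax over a list of numbers."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def listwise_loss(scores, labels):
    """ListNet-style cross-entropy between softmax(labels) and softmax(scores)."""
    target = softmax(labels)
    predicted = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))

# Toy candidate list for one drug: the first target is the known interaction.
good_ranking = listwise_loss([2.0, 0.0, 0.0], [1, 0, 0])
bad_ranking = listwise_loss([0.0, 2.0, 0.0], [1, 0, 0])
```

The point-wise loss looks at one pair at a time, the pair-wise loss only at the score difference between two candidates, and the list-wise loss at the whole candidate list for a drug, so a ranking that places the known interaction first yields the smaller list-wise loss.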
Affiliation(s)
- Xiujin Wu
- School of Informatics, Xiamen University, Xiamen, China
- Wenhua Zeng
- School of Informatics, Xiamen University, Xiamen, China
- Fan Lin
- School of Informatics, Xiamen University, Xiamen, China
- Xiuze Zhou
- Shuye Technology Co., Ltd., Hangzhou, China
|
15
|
Identification of drug-target interactions via multi-view graph regularized link propagation model. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.05.100] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 02/06/2023]
|
16
|
|
17
|
Wang J, Wang W, Yan C, Luo J, Zhang G. Predicting Drug-Disease Association Based on Ensemble Strategy. Front Genet 2021; 12:666575. [PMID: 34012464 PMCID: PMC8128144 DOI: 10.3389/fgene.2021.666575] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/10/2021] [Accepted: 03/23/2021] [Indexed: 12/29/2022] Open
Abstract
Drug repositioning is used to find new uses for existing drugs, effectively shortening the drug research and development cycle and reducing costs and risks. A new model of drug repositioning based on ensemble learning is proposed. This work develops a novel computational drug repositioning approach called CMAF to discover potential drug-disease associations. First, for new drugs and diseases or unknown drug-disease pairs, based on their known neighbor information, an association probability can be obtained by implementing the weighted K nearest known neighbors (WKNKN) method and improving the drug-disease association information. Then, a new drug similarity network and new disease similarity network can be constructed. Three prediction models are applied and ensembled to enable the final association of drug-disease pairs based on improved drug-disease association information and the constructed similarity network. The experimental results demonstrate that the developed approach outperforms recent state-of-the-art prediction models. Case studies further confirm the predictive ability of the proposed method. Our proposed method can effectively improve the prediction results.
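The WKNKN preprocessing step named in this abstract can be sketched as follows. This is one common formulation, simplified to the drug (row) side only, and is not taken from the paper: the neighbourhood size `k`, the decay factor `eta`, the function name, and the toy data are all assumptions made for illustration.

```python
def wknkn_rows(Y, S, k=2, eta=0.5):
    """Row-side weighted K nearest known neighbours estimate of matrix Y.

    Y: association matrix (rows = drugs), S: drug-drug similarity matrix.
    """
    n = len(Y)
    # "Known" rows are those with at least one recorded association.
    known = [i for i in range(n) if any(Y[i])]
    estimate = []
    for d in range(n):
        # The k most similar known rows, excluding the row itself.
        neighbours = sorted((i for i in known if i != d),
                            key=lambda i: S[d][i], reverse=True)[:k]
        z = sum(S[d][i] for i in neighbours)
        row = [0.0] * len(Y[0])
        if z > 0:
            for rank, i in enumerate(neighbours):
                # Assumption: weights decay geometrically with neighbour rank.
                w = (eta ** rank) * S[d][i]
                for j, y in enumerate(Y[i]):
                    row[j] += w * y / z
        estimate.append(row)
    # Keep known associations; fill zeros with neighbour-based likelihoods.
    return [[max(y, e) for y, e in zip(yr, er)] for yr, er in zip(Y, estimate)]

# Toy data (made-up): the third drug has no known association.
Y = [[1, 0, 1], [0, 1, 0], [0, 0, 0]]
S = [[1.0, 0.2, 0.8], [0.2, 1.0, 0.4], [0.8, 0.4, 1.0]]
filled = wknkn_rows(Y, S, k=2, eta=0.5)
```

In a full treatment the same estimate would also be computed on the disease (column) side and the two averaged before taking the element-wise maximum with the original matrix.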
Affiliation(s)
- Jianlin Wang
- School of Computer and Information Engineering, Henan University, Kaifeng, China
- Wenxiu Wang
- School of Computer and Information Engineering, Henan University, Kaifeng, China
- Chaokun Yan
- School of Computer and Information Engineering, Henan University, Kaifeng, China
- Junwei Luo
- College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, China
- Ge Zhang
- School of Computer and Information Engineering, Henan University, Kaifeng, China
|
18
|
An Ensemble Learning-Based Method for Inferring Drug-Target Interactions Combining Protein Sequences and Drug Fingerprints. BIOMED RESEARCH INTERNATIONAL 2021; 2021:9933873. [PMID: 33987446 PMCID: PMC8093043 DOI: 10.1155/2021/9933873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2021] [Revised: 04/14/2021] [Accepted: 04/16/2021] [Indexed: 11/24/2022]
Abstract
Identifying drug-target interactions is central to related areas, including drug discovery and drug repositioning. Although high-throughput biotechnologies have made tremendous progress, the indispensable clinical trials remain expensive, laborious, and intricate. Therefore, convenient and reliable computer-aided methods for inferring drug-target interactions (DTIs) have become a focus of research. In this research, we propose a novel computational model integrating a pyramid histogram of oriented gradients (PHOG), Position-Specific Scoring Matrix (PSSM), and rotation forest (RF) classifier for identifying DTIs. Specifically, protein primary sequences are first converted into PSSMs to describe the potential biological evolution information. After that, PHOG is employed to mine the highly representative features of PSSM from multiple pyramid levels, and the complete descriptors of drug-target pairs are generated by combining the molecular substructure fingerprints and PHOG features. Finally, we feed the complete descriptors into the RF classifier for effective prediction. The experiments of 5-fold cross-validation (CV) yield mean accuracies of 88.96%, 86.37%, 82.88%, and 76.92% on four gold standard data sets (enzyme, ion channel, G protein-coupled receptors (GPCRs), and nuclear receptor, respectively). Moreover, the paper also evaluates the state-of-the-art light gradient boosting machine (LGBM) and support vector machine (SVM) to further verify the performance of the proposed model. The experimental outcomes substantiate that the established model is feasible and reliable for predicting DTIs. There is an excellent prospect that our model can serve as an efficient tool for predicting DTIs on a large scale.
|
19
|
Wang A, Wang M. Drug-Target Interaction Prediction via Dual Laplacian Graph Regularized Logistic Matrix Factorization. BIOMED RESEARCH INTERNATIONAL 2021; 2021:5599263. [PMID: 33855072 PMCID: PMC8019634 DOI: 10.1155/2021/5599263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2021] [Revised: 03/06/2021] [Accepted: 03/13/2021] [Indexed: 11/18/2022]
Abstract
Drug-target interactions provide useful information for biomedical drug discovery as well as drug development. However, it is costly and time consuming to find drug-target interactions by experimental methods. As a result, developing computational approaches for this task is necessary and has practical significance. In this study, we establish a novel dual Laplacian graph regularized logistic matrix factorization model for drug-target interaction prediction, referred to as DLGrLMF briefly. Specifically, DLGrLMF regards the task of drug-target interaction prediction as a weighted logistic matrix factorization problem, in which the experimentally validated interactions are allocated with larger weights. Meanwhile, by considering that drugs with similar chemical structure should have interactions with similar targets and targets with similar genomic sequence similarity should in turn have interactions with similar drugs, the drug pairwise chemical structure similarities as well as the target pairwise genomic sequence similarities are fully exploited to serve the matrix factorization problem by using a dual Laplacian graph regularization term. In addition, we design a gradient descent algorithm to solve the resultant optimization problem. Finally, the efficacy of DLGrLMF is validated on various benchmark datasets and the experimental results demonstrate that DLGrLMF performs better than other state-of-the-art methods. Case studies are also conducted to validate that DLGrLMF can successfully predict most of the experimental validated drug-target interactions.
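A graph Laplacian regularization term of the kind this abstract describes is usually written as tr(UᵀLU), which equals half the similarity-weighted sum of squared distances between latent vectors. The sketch below checks that identity numerically on toy data; it illustrates the general technique only, and the function names, similarity values, and latent factors are made up rather than taken from DLGrLMF.

```python
def laplacian(S):
    """Unnormalised graph Laplacian L = D - S of a symmetric similarity matrix."""
    n = len(S)
    return [[(sum(S[i]) if i == j else 0.0) - S[i][j] for j in range(n)]
            for i in range(n)]

def trace_penalty(U, L):
    """tr(U^T L U) for latent factors U (one row per drug)."""
    n, r = len(U), len(U[0])
    return sum(U[i][k] * L[i][j] * U[j][k]
               for i in range(n) for j in range(n) for k in range(r))

def pairwise_penalty(U, S):
    """0.5 * sum_ij S_ij * ||u_i - u_j||^2, the same quantity written pairwise."""
    n = len(U)
    return 0.5 * sum(S[i][j] * sum((a - b) ** 2 for a, b in zip(U[i], U[j]))
                     for i in range(n) for j in range(n))

# Toy drug-drug similarity (made-up): drugs 0 and 1 are chemically similar.
S = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.2], [0.1, 0.2, 1.0]]
U_close = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # similar drugs, nearby factors
U_far = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]    # similar drugs, distant factors

L = laplacian(S)
penalty_close = trace_penalty(U_close, L)
penalty_far = trace_penalty(U_far, L)
```

The penalty is small when similar drugs have nearby latent vectors and large otherwise, which is how the regularizer encodes the assumption that similar drugs interact with similar targets. A "dual" version would add the analogous term for the target factors.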
Affiliation(s)
- Aizhen Wang
- Department of Pharmacy, The Affiliated Huai'an Hospital of Xuzhou Medical University and The Second People's Hospital of Huai'an, Huai'an 223002, China
- Minhui Wang
- Department of Pharmacy, Lianshui People's Hospital Affiliated to Kangda College, Nanjing Medical University, Huai'an 223300, China
|
20
|
Zhang W, Li Z, Guo W, Yang W, Huang F. A Fast Linear Neighborhood Similarity-Based Network Link Inference Method to Predict MicroRNA-Disease Associations. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:405-415. [PMID: 31369383 DOI: 10.1109/tcbb.2019.2931546] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Indexed: 06/10/2023]
Abstract
Increasing evidence has revealed that microRNAs (miRNAs) play critical roles in important biological processes. The identification of disease-related miRNAs is critical to understanding the molecular mechanisms of human diseases. Most existing computational methods require diverse features to predict miRNA-disease associations. However, diverse features are not available for all miRNAs or diseases. In addition, most methods cannot predict links for miRNAs or diseases without association information. In this paper, we propose a fast linear neighborhood similarity-based network link inference method, named FLNSNLI, to predict miRNA-disease associations. First, known miRNA-disease associations are formulated as a bipartite network, and miRNAs (or diseases) are expressed as association profiles. Second, miRNA-miRNA similarity and disease-disease similarity are calculated from the association profiles using the fast linear neighborhood similarity measure. Third, the label propagation algorithm is implemented on each of the two sides to score candidate miRNA-disease associations. Finally, FLNSNLI adopts a weighted average strategy and makes predictions. Moreover, we develop a link complementing approach, and extend FLNSNLI to predict links for miRNAs (or diseases) without known associations. In computational experiments, FLNSNLI achieves high accuracy and outperforms other state-of-the-art methods. More importantly, FLNSNLI requires less information but performs well. Case studies on three popular diseases show that FLNSNLI is useful for microRNA-disease association prediction.
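The label propagation step in this abstract can be sketched with the standard iterative update F ← αWF + (1 − α)Y on a row-normalised similarity graph. This is a generic illustration, not FLNSNLI itself: it uses a made-up similarity matrix in place of the linear neighborhood similarity, and the function names, `alpha`, and the iteration count are assumptions.

```python
def row_normalise(S):
    """Scale each row of a similarity matrix so that it sums to 1."""
    normalised = []
    for row in S:
        total = sum(row)
        normalised.append([v / total if total > 0 else 0.0 for v in row])
    return normalised

def label_propagation(W, Y, alpha=0.5, iterations=100):
    """Iterate F <- alpha * W F + (1 - alpha) * Y from the starting point F = Y."""
    n, m = len(Y), len(Y[0])
    F = [[float(v) for v in row] for row in Y]
    for _ in range(iterations):
        F = [[alpha * sum(W[i][k] * F[k][j] for k in range(n))
              + (1 - alpha) * Y[i][j] for j in range(m)] for i in range(n)]
    return F

# Toy data (made-up): 3 miRNAs x 2 diseases; miRNA 1 has no known association
# but is highly similar to miRNA 0.
S = [[0.0, 0.9, 0.1], [0.9, 0.0, 0.1], [0.1, 0.1, 0.0]]
Y = [[1, 0], [0, 0], [0, 1]]
F = label_propagation(row_normalise(S), Y)
```

Because miRNA 1 is most similar to miRNA 0, it inherits a higher score for the first disease than for the second. Running the same procedure on the disease side and averaging the two score matrices would mirror the weighted average strategy mentioned in the abstract.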
|
21
|
Ding Y, Tang J, Guo F. The Computational Models of Drug-target Interaction Prediction. Protein Pept Lett 2020; 27:348-358. [PMID: 30968771 DOI: 10.2174/0929866526666190410124110] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/04/2019] [Revised: 02/22/2019] [Accepted: 04/02/2019] [Indexed: 12/19/2022]
Abstract
The identification of Drug-Target Interactions (DTIs) is an important process in drug discovery and medical research. However, traditional experimental methods for DTI identification are still time-consuming, extremely expensive, and challenging. In the past ten years, various computational methods have been developed to identify potential DTIs. In this paper, the identification methods for DTIs are summarized. Moreover, several state-of-the-art computational methods are introduced, including network-based and machine learning-based methods. In particular, machine learning-based methods, including supervised and semi-supervised models, differ essentially in how they handle negative samples. Although these computational models have achieved significant improvements in the identification of DTIs, network-based and machine learning-based methods each have their own disadvantages. These computational methods are evaluated on four benchmark data sets via values of the Area Under the Precision-Recall curve (AUPR).
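AUPR, the evaluation metric named in this abstract, is often estimated by average precision, a step-wise summary of the precision-recall curve. The sketch below uses that estimate as an assumption; individual papers may instead interpolate the curve, and the function name and toy scores are illustrative.

```python
def average_precision(scores, labels):
    """Average precision: mean of the precision values at each true positive."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank candidate pairs from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_pos

# Toy predictions (made-up): two known interactions among four candidate pairs.
ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

Because known interactions are rare in DTI matrices, a metric built on precision and recall penalises false positives among the top-ranked pairs more visibly than AUC does, which is one reason it is widely reported in this literature.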
Affiliation(s)
- Yijie Ding
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China
- Jijun Tang
- Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, United States; School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Fei Guo
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
|
22
|
Chu Y, Shan X, Chen T, Jiang M, Wang Y, Wang Q, Salahub DR, Xiong Y, Wei DQ. DTI-MLCD: predicting drug-target interactions using multi-label learning with community detection method. Brief Bioinform 2020; 22:5910189. [PMID: 32964234 DOI: 10.1093/bib/bbaa205] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 05/23/2020] [Revised: 08/06/2020] [Accepted: 08/10/2020] [Indexed: 12/20/2022] Open
Abstract
Identifying drug-target interactions (DTIs) is an important step for drug discovery and drug repositioning. To reduce the experimental cost, a large number of computational approaches have been proposed for this task. The machine learning-based models, especially binary classification models, have been developed to predict whether a drug-target pair interacts or not. However, there is still much room for improvement in the performance of current methods. Multi-label learning can overcome some difficulties caused by single-label learning in order to improve the predictive performance. The key challenge faced by multi-label learning is the exponential-sized output space, and considering label correlations can help to overcome this challenge. In this paper, we facilitate multi-label classification by introducing community detection methods for DTI prediction, named DTI-MLCD. Moreover, we updated the gold standard data set by adding 15,000 more positive DTI samples in comparison to the data set, which has widely been used by most of previously published DTI prediction methods since 2008. The proposed DTI-MLCD is applied to both data sets, demonstrating its superiority over other machine learning methods and several existing methods. The data sets and source code of this study are freely available at https://github.com/a96123155/DTI-MLCD.
Affiliation(s)
- Yanyi Chu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Xiaoqi Shan
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Tianhang Chen
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Mingming Jiang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Yanjing Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Qiankun Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Yi Xiong
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Dong-Qing Wei
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
|
23
|
Identification of Drug–Target Interactions via Dual Laplacian Regularized Least Squares with Multiple Kernel Fusion. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.106254] [Citation(s) in RCA: 71] [Impact Index Per Article: 17.8] [Indexed: 12/29/2022]
|
24
|
A Novel Triple Matrix Factorization Method for Detecting Drug-Side Effect Association Based on Kernel Target Alignment. BIOMED RESEARCH INTERNATIONAL 2020; 2020:4675395. [PMID: 32596314 PMCID: PMC7275954 DOI: 10.1155/2020/4675395] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 03/15/2020] [Accepted: 04/08/2020] [Indexed: 01/01/2023]
Abstract
All drugs usually have side effects, which endanger the health of patients. To identify potential side effects of drugs, biological and pharmacological experiments are done but are expensive and time-consuming. So, computation-based methods have been developed to accurately and quickly predict side effects. To predict potential associations between drugs and side effects, we propose a novel method called the Triple Matrix Factorization- (TMF-) based model. TMF is built by the biprojection matrix and latent feature of kernels, which is based on Low Rank Approximation (LRA). LRA could construct a lower rank matrix to approximate the original matrix, which not only retains the characteristics of the original matrix but also reduces the storage space and computational complexity of the data. To fuse multivariate information, multiple kernel matrices are constructed and integrated via Kernel Target Alignment-based Multiple Kernel Learning (KTA-MKL) in drug and side effect space, respectively. Compared with other methods, our model achieves better performance on three benchmark datasets. The values of the Area Under the Precision-Recall curve (AUPR) are 0.677, 0.685, and 0.680 on three datasets, respectively.
|
25
|
Incorporating chemical sub-structures and protein evolutionary information for inferring drug-target interactions. Sci Rep 2020; 10:6641. [PMID: 32313024 PMCID: PMC7171114 DOI: 10.1038/s41598-020-62891-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/14/2020] [Accepted: 03/12/2020] [Indexed: 01/29/2023] Open
Abstract
Accumulating evidence has shown that drug-target interactions (DTIs) play a crucial role in the process of genomic drug discovery. Although biological experimental technology has made great progress, the identification of DTIs is still very time-consuming and expensive. Hence, there is an urgent need for in silico models that supplement biological experiments in predicting potential DTIs. In this work, a new model is designed to predict DTIs by incorporating chemical sub-structures and protein evolutionary information. Specifically, we first use a Position-Specific Scoring Matrix (PSSM) to convert the protein sequence into a numerical descriptor containing biological evolutionary information, then use the Discrete Cosine Transform (DCT) algorithm to extract the hidden features and integrate them with the chemical sub-structure descriptor, and finally use a Rotation Forest (RF) classifier to predict whether the drug and the target protein interact. In the 5-fold cross-validation (CV) experiment, the average accuracy of the proposed model on the benchmark datasets of Enzymes, Ion Channels, GPCRs and Nuclear Receptors reached 0.9140, 0.8919, 0.8724 and 0.8111, respectively. To fully evaluate the performance of the proposed model, we compare it with different feature extraction models, classifier models, and other state-of-the-art models. Furthermore, we also carried out case studies. As a result, 8 of the top 10 drug-target pairs with the highest prediction scores were confirmed by related databases. These results indicate that the proposed model has outstanding ability in predicting DTIs and can provide reliable candidates for biological experiments.
|
26
|
Huang F, Qiu Y, Li Q, Liu S, Ni F. Predicting Drug-Disease Associations via Multi-Task Learning Based on Collective Matrix Factorization. Front Bioeng Biotechnol 2020; 8:218. [PMID: 32373595 PMCID: PMC7179666 DOI: 10.3389/fbioe.2020.00218] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/20/2019] [Accepted: 03/04/2020] [Indexed: 12/30/2022] Open
Abstract
Identifying drug-disease associations is integral to drug development. Computationally prioritizing candidate drug-disease associations has attracted growing attention due to its contribution to reducing the cost of laboratory screening. Drug-disease associations involve different association types, such as drug indications and drug side effects. However, the existing models for predicting drug-disease associations merely concentrate on independent tasks: recommending novel indications to benefit drug repositioning, predicting potential side effects to prevent drug-induced risk, or only determining the existence of drug-disease association. They ignore crucial prior knowledge of the correlations between different association types. Since the Comparative Toxicogenomics Database (CTD) annotates the drug-disease associations as therapeutic or marker/mechanism, we consider predicting the two types of association. To this end, we propose a collective matrix factorization-based multi-task learning method (CMFMTL) in this paper. CMFMTL handles the problem as multi-task learning where each task is to predict one type of association, and two tasks complement and improve each other by capturing the relatedness between them. First, drug-disease associations are represented as a bipartite network with two types of links representing therapeutic effects and non-therapeutic effects. Then, CMFMTL, respectively, approximates the association matrix regarding each link type by matrix tri-factorization, and shares the low-dimensional latent representations for drugs and diseases in the two related tasks for the goal of collective learning. Finally, CMFMTL puts the two tasks into a unified framework and an efficient algorithm is developed to solve our proposed optimization problem. In the computational experiments, CMFMTL outperforms several state-of-the-art methods both in the two tasks. 
Moreover, case studies show that CMFMTL helps to find out novel drug-disease associations that are not included in CTD, and simultaneously predicts their association types.
Affiliation(s)
- Feng Huang
- College of Informatics, Huazhong Agricultural University, Wuhan, China
- Yang Qiu
- College of Informatics, Huazhong Agricultural University, Wuhan, China
- Qiaojun Li
- College of Informatics, Huazhong Agricultural University, Wuhan, China
- School of Electronic and Information Engineering, Henan Polytechnic Institute, Henan Nanyang, China
- Shichao Liu
- College of Informatics, Huazhong Agricultural University, Wuhan, China
- Hubei Engineering Technology Research Center of Agricultural Big Data, Wuhan, China
- Fuchuan Ni
- College of Informatics, Huazhong Agricultural University, Wuhan, China
- Hubei Engineering Technology Research Center of Agricultural Big Data, Wuhan, China
|
27
|
Pliakos K, Vens C. Drug-target interaction prediction with tree-ensemble learning and output space reconstruction. BMC Bioinformatics 2020; 21:49. [PMID: 32033537 PMCID: PMC7006075 DOI: 10.1186/s12859-020-3379-z] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 06/21/2019] [Accepted: 01/21/2020] [Indexed: 12/21/2022] Open
Abstract
Background Computational prediction of drug-target interactions (DTI) is vital for drug discovery. The experimental identification of interactions between drugs and target proteins is very onerous. Modern technologies have mitigated the problem, leveraging the development of new drugs. However, drug development remains extremely expensive and time consuming. Therefore, in silico DTI predictions based on machine learning can alleviate the burdensome task of drug development. Many machine learning approaches have been proposed over the years for DTI prediction. Nevertheless, prediction accuracy and efficiency are persisting problems that still need to be tackled. Here, we propose a new learning method which addresses DTI prediction as a multi-output prediction task by learning ensembles of multi-output bi-clustering trees (eBICT) on reconstructed networks. In our setting, the nodes of a DTI network (drugs and proteins) are represented by features (background information). The interactions between the nodes of a DTI network are modeled as an interaction matrix and compose the output space in our problem. The proposed approach integrates background information from both drug and target protein spaces into the same global network framework. Results We performed an empirical evaluation, comparing the proposed approach to state of the art DTI prediction methods and demonstrated the effectiveness of the proposed approach in different prediction settings. For evaluation purposes, we used several benchmark datasets that represent drug-protein networks. We show that output space reconstruction can boost the predictive performance of tree-ensemble learning methods, yielding more accurate DTI predictions. Conclusions We proposed a new DTI prediction method where bi-clustering trees are built on reconstructed networks. 
Building tree-ensemble learning models with output space reconstruction leads to superior prediction results, while preserving the advantages of tree-ensembles, such as scalability, interpretability and inductive setting.
Affiliation(s)
- Konstantinos Pliakos
- KU Leuven, Campus KULAK, Faculty of Medicine, Kortrijk, Belgium; ITEC, imec research group at KU Leuven, Kortrijk, Belgium
- Celine Vens
- KU Leuven, Campus KULAK, Faculty of Medicine, Kortrijk, Belgium; ITEC, imec research group at KU Leuven, Kortrijk, Belgium
|
28
|
Bagherian M, Sabeti E, Wang K, Sartor MA, Nikolovska-Coleska Z, Najarian K. Machine learning approaches and databases for prediction of drug-target interaction: a survey paper. Brief Bioinform 2020; 22:247-269. [PMID: 31950972 PMCID: PMC7820849 DOI: 10.1093/bib/bbz157] [Citation(s) in RCA: 148] [Impact Index Per Article: 37.0] [Received: 09/04/2019] [Revised: 11/01/2019] [Accepted: 11/07/2019] [Indexed: 12/12/2022] Open
Abstract
The task of predicting the interactions between drugs and targets plays a key role in the process of drug discovery. There is a need to develop novel and efficient prediction approaches in order to avoid determining drug–target interactions (DTIs) through costly, laborious, and not always deterministic experiments alone. These approaches should be capable of identifying potential DTIs in a timely manner. In this article, we describe the data required for the task of DTI prediction, followed by a comprehensive catalog of the machine learning methods and databases that have been proposed and used to predict DTIs. The advantages and disadvantages of each set of methods are also briefly discussed. Lastly, the challenges one may face in predicting DTIs using machine learning approaches are highlighted, and we conclude by shedding some light on important future research directions.
Affiliation(s)
- Maryam Bagherian
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
- Elyas Sabeti
- Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI, 48109, USA
- Kai Wang
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, 48109, USA
- Maureen A Sartor
- Department of Pathology, University of Michigan, Ann Arbor, MI, 48109, USA
- Kayvan Najarian
- Department of Electrical Engineering and Computer Science, College of Engineering, University of Michigan, Ann Arbor, MI, 48109, USA
|
29
|
Chu Y, Kaushik AC, Wang X, Wang W, Zhang Y, Shan X, Salahub DR, Xiong Y, Wei DQ. DTI-CDF: a cascade deep forest model towards the prediction of drug-target interactions based on hybrid features. Brief Bioinform 2019; 22:451-462. [PMID: 31885041 DOI: 10.1093/bib/bbz152] [Citation(s) in RCA: 100] [Impact Index Per Article: 20.0] [Received: 06/16/2019] [Revised: 11/01/2019] [Accepted: 11/04/2019] [Indexed: 12/18/2022] Open
Abstract
Drug-target interactions (DTIs) play a crucial role in target-based drug discovery and development. Computational prediction of DTIs can effectively complement experimental wet-lab techniques for the identification of DTIs, which are typically time- and resource-consuming. However, current DTI prediction approaches suffer from low precision and a high false-positive rate. In this study, we aim to improve prediction performance with a novel DTI prediction method based on a cascade deep forest (CDF) model, named DTI-CDF, which uses multiple similarity-based features between drugs and similarity-based features between target proteins, extracted from a heterogeneous graph that contains known DTIs. In the experiments, we ran five replicates of 10-fold cross-validation under three experimental settings, in which the DTI values of certain drugs (SD), targets (ST), or drug-target pairs (SP) are missing from the training sets but present in the test sets. The experimental results demonstrate that our proposed approach DTI-CDF achieves significantly higher performance than traditional ensemble learning-based methods such as random forest and XGBoost, deep neural networks, and state-of-the-art methods such as DDR. Furthermore, 1352 newly predicted DTIs are confirmed by the KEGG and DrugBank databases. The data sets and source code are freely available at https://github.com//a96123155/DTI-CDF.
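The three cross-validation settings named in this abstract (SD, ST, SP) differ in what is hidden from training: whole drugs, whole targets, or individual drug-target pairs. The sketch below shows one way such splits could be generated; the function names, the seeding, and the matrix size are illustrative assumptions and do not come from the DTI-CDF code.

```python
import random

def split_folds(items, n_folds, seed=0):
    """Shuffle items and deal them into n_folds roughly equal folds."""
    items = list(items)
    random.Random(seed).shuffle(items)
    return [items[i::n_folds] for i in range(n_folds)]

def held_out_pairs(n_drugs, n_targets, setting, n_folds=10, seed=0):
    """Return, for each fold, the (drug, target) pairs hidden from training."""
    all_pairs = [(d, t) for d in range(n_drugs) for t in range(n_targets)]
    if setting == "SP":
        # Hide individual drug-target pairs.
        return split_folds(all_pairs, n_folds, seed)
    if setting == "SD":
        # Hide every pair that involves a held-out drug.
        result = []
        for fold in split_folds(range(n_drugs), n_folds, seed):
            held = set(fold)
            result.append([(d, t) for d, t in all_pairs if d in held])
        return result
    if setting == "ST":
        # Hide every pair that involves a held-out target.
        result = []
        for fold in split_folds(range(n_targets), n_folds, seed):
            held = set(fold)
            result.append([(d, t) for d, t in all_pairs if t in held])
        return result
    raise ValueError(f"unknown setting: {setting}")

# Toy example: 10 drugs x 4 targets, 5 folds, whole drugs held out.
sd_folds = held_out_pairs(n_drugs=10, n_targets=4, setting="SD", n_folds=5)
```

Under SD and ST the model is tested on drugs or targets it has never seen with any interaction, which is typically harder than SP, where every drug and target usually still appears in training through its other pairs.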
Affiliation(s)
- Yanyi Chu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Xiangeng Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Wei Wang
- Mathematical Sciences, Shanghai Jiao Tong University
- Yufang Zhang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Yi Xiong
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
- Dong-Qing Wei
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University
|
30
|
Zhang W, Tang G, Zhou S, Niu Y. LncRNA-miRNA interaction prediction through sequence-derived linear neighborhood propagation method with information combination. BMC Genomics 2019; 20:946. [PMID: 31856716 PMCID: PMC6923828 DOI: 10.1186/s12864-019-6284-y] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Researchers discover lncRNAs can act as decoys or sponges to regulate the behavior of miRNAs. Identification of lncRNA-miRNA interactions helps to understand the functions of lncRNAs, especially their roles in complicated diseases. Computational methods can save time and reduce cost in identifying lncRNA-miRNA interactions, but there have been only a few computational methods. RESULTS In this paper, we propose a sequence-derived linear neighborhood propagation method (SLNPM) to predict lncRNA-miRNA interactions. First, we calculate the integrated lncRNA-lncRNA similarity and the integrated miRNA-miRNA similarity by combining known lncRNA-miRNA interactions, lncRNA sequences and miRNA sequences. We consider two similarity calculation strategies respectively, namely similarity-based information combination (SC) and interaction profile-based information combination (PC). Second, the integrated lncRNA similarity-based graph and the integrated miRNA similarity-based graph are respectively constructed, and the label propagation processes are implemented on two graphs to score lncRNA-miRNA pairs. Finally, the weighted averages of their outputs are adopted as final predictions. Therefore, we construct two editions of SLNPM: sequence-derived linear neighborhood propagation method based on similarity information combination (SLNPM-SC) and sequence-derived linear neighborhood propagation method based on interaction profile information combination (SLNPM-PC). The experimental results show that SLNPM-SC and SLNPM-PC predict lncRNA-miRNA interactions with higher accuracy compared with other state-of-the-art methods. The case studies demonstrate that SLNPM-SC and SLNPM-PC help to find novel lncRNA-miRNA interactions for given lncRNAs or miRNAs. CONCLUSION The study reveals that known interactions bring the most important information for lncRNA-miRNA interaction prediction, and sequences of lncRNAs (miRNAs) also provide useful information. 
In conclusion, SLNPM-SC and SLNPM-PC are promising for lncRNA-miRNA interaction prediction.
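The label propagation step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' SLNPM code: a plain row-normalized similarity matrix stands in for the linear neighborhood similarity, and the function names, the `alpha` parameter, and the toy data are hypothetical.

```python
def row_normalize(sim):
    """Scale each row of a similarity matrix to sum to 1 (all-zero rows stay zero)."""
    out = []
    for row in sim:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def label_propagation(sim, labels, alpha=0.5, iters=100):
    """Iterate F <- alpha * W F + (1 - alpha) * Y on a similarity graph.

    sim    : n x n similarity matrix between entities (e.g. lncRNAs); assumed input
    labels : n x m known 0/1 interaction matrix Y (e.g. lncRNA x miRNA)
    alpha  : share of the score taken from neighbors at each step (hypothetical default)
    """
    w = row_normalize(sim)
    n, m = len(labels), len(labels[0])
    scores = [row[:] for row in labels]
    for _ in range(iters):
        scores = [
            [
                alpha * sum(w[i][k] * scores[k][j] for k in range(n))
                + (1 - alpha) * labels[i][j]
                for j in range(m)
            ]
            for i in range(n)
        ]
    return scores


# Toy data: lncRNA 1 has no known interactions but is very similar to lncRNA 0.
toy_sim = [[0.0, 0.9, 0.1], [0.9, 0.0, 0.1], [0.1, 0.1, 0.0]]
toy_known = [[1, 0], [0, 0], [0, 1]]
toy_scores = label_propagation(toy_sim, toy_known)
```

In the toy data, lncRNA 1 ends up with a higher score for miRNA 0 than for miRNA 1 because most of its similarity mass points at lncRNA 0. Running the same routine on the miRNA similarity graph (with the label matrix transposed) and averaging the two outputs with weights would mirror the final combination step the abstract mentions.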
Affiliation(s)
- Wen Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070 China
| | - Guifeng Tang
- School of Computer Science, Wuhan University, Wuhan, 430072 China
| | - Shuang Zhou
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
| | - Yanqing Niu
- School of Mathematics and Statistics, South-Central University for Nationalities, Wuhan, 430074 China
| |
|
31
|
Zhang W, Lin W, Zhang D, Wang S, Shi J, Niu Y. Recent Advances in the Machine Learning-Based Drug-Target Interaction Prediction. Curr Drug Metab 2019; 20:194-202. [PMID: 30129407 DOI: 10.2174/1389200219666180821094047] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 10/16/2017] [Revised: 01/18/2018] [Accepted: 03/19/2018] [Indexed: 12/28/2022]
Abstract
BACKGROUND The identification of drug-target interactions is a crucial issue in drug discovery. In recent years, researchers have made great efforts on drug-target interaction prediction, and developed databases, software and computational methods. RESULTS In the paper, we review the recent advances in machine learning-based drug-target interaction prediction. First, we briefly introduce the datasets and data, and summarize features for drugs and targets which can be extracted from different data. Since drug-drug similarity and target-target similarity are important for many machine learning prediction models, we introduce how to calculate similarities based on data or features. Different machine learning-based drug-target interaction prediction methods can be proposed by using different features or information. Thus, we summarize, analyze and compare different machine learning-based prediction methods. CONCLUSION This study provides a guide to the development of computational methods for drug-target interaction prediction.
Affiliation(s)
- Wen Zhang
- School of Computer Science, Wuhan University, Wuhan 430072, China
| | - Weiran Lin
- School of Computer Science, Wuhan University, Wuhan 430072, China
| | - Ding Zhang
- School of Computer Science, Wuhan University, Wuhan 430072, China
| | - Siman Wang
- School of Computer Science, Wuhan University, Wuhan 430072, China
| | - Jingwen Shi
- School of Mathematics and Statistics, Wuhan University, Wuhan 430072, China
| | - Yanqing Niu
- School of Mathematics and Statistics, South-Central University for Nationalities, Wuhan 430074, China
| |
|
32
|
Ding Y, Tang J, Guo F. Identification of drug–target interactions via fuzzy bipartite local model. Neural Comput Appl 2019. [DOI: 10.1007/s00521-019-04569-z] [Citation(s) in RCA: 64] [Impact Index Per Article: 12.8] [Indexed: 12/31/2022]
|
33
|
Wang Y, Nie C, Zang T, Wang Y. Predicting circRNA-Disease Associations Based on circRNA Expression Similarity and Functional Similarity. Front Genet 2019; 10:832. [PMID: 31572444 PMCID: PMC6751509 DOI: 10.3389/fgene.2019.00832] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 06/24/2019] [Accepted: 08/13/2019] [Indexed: 12/19/2022] Open
Abstract
Circular RNAs (circRNAs) are a novel class of endogenous noncoding RNAs that have well-conserved sequences. Emerging evidence has shown that circRNAs can be novel biomarkers or therapeutic targets for many diseases and play an important role in the development of various pathological conditions. Therefore, identifying potential disease-related circRNAs is helpful in improving the efficiency of finding therapeutic targets for diseases. Here, we propose a computational model (PreCDA) to predict potential circRNA-disease associations. First, we calculated the circRNA expression similarity based on circRNA expression profiles. The circRNA functional similarity is calculated based on cosine similarity, and the disease similarity is used as the dimension of each circRNA vector. The associations between circRNAs and diseases are defined based on the circRNA functional similarity and expression similarity. We constructed a disease-related circRNA association network and used a graph-based recommendation algorithm (PersonalRank) to sort candidate disease-related circRNAs. As a result, PreCDA has an average area under the receiver operating characteristic curve value of 78.15% in predicting candidate disease-related circRNAs. In addition, we discuss the factors that affect the performance of this method and find some unknown circRNAs related to diseases, with several common diseases used as case studies. These results show that PreCDA has good performance in predicting potential circRNA-disease associations and is helpful for the diagnosis and treatment of human diseases.
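The graph-based ranking step named in this abstract (PersonalRank) can be illustrated with a short random-walk-with-restart sketch. It is not the PreCDA implementation: the adjacency-dictionary layout, the node names, and the `alpha` and `iters` values are assumptions made for the example.

```python
def personal_rank(graph, root, alpha=0.85, iters=50):
    """Random walk with restart from `root` on an undirected graph.

    graph : dict mapping each node to a list of its neighbors (assumed layout)
    root  : the query node, e.g. a disease
    alpha : probability of continuing the walk; 1 - alpha returns to root
    """
    rank = {node: 0.0 for node in graph}
    rank[root] = 1.0
    for _ in range(iters):
        nxt = {node: 0.0 for node in graph}
        for node, neighbors in graph.items():
            if not neighbors:
                continue
            share = alpha * rank[node] / len(neighbors)
            for nb in neighbors:
                nxt[nb] += share
        nxt[root] += 1 - alpha
        rank = nxt
    return rank


# Hypothetical toy network: diseases d1, d2 and circRNAs c1, c2, c3.
toy_graph = {
    "d1": ["c1", "c2"],
    "d2": ["c2", "c3"],
    "c1": ["d1"],
    "c2": ["d1", "d2"],
    "c3": ["d2"],
}
toy_rank = personal_rank(toy_graph, "d1")
candidates = sorted(("c1", "c2", "c3"), key=toy_rank.get, reverse=True)
```

Sorting the circRNA nodes by their score for a given disease yields the candidate list. In this toy graph, c3 (reachable from d1 only through d2) ranks below c1 and c2.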
Affiliation(s)
- Tianyi Zang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
| | - Yadong Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
| |
|
34
|
Predicting human disease-associated circRNAs based on locality-constrained linear coding. Genomics 2019; 112:1335-1342. [PMID: 31394170 DOI: 10.1016/j.ygeno.2019.08.001] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 07/08/2019] [Revised: 08/01/2019] [Accepted: 08/02/2019] [Indexed: 12/12/2022]
Abstract
Circular RNAs (circRNAs) are a new kind of endogenous non-coding RNA that continues to be discovered. A growing number of studies have shown that circRNAs are related to the occurrence and development of human diseases. Identification of circRNAs associated with diseases can contribute to understanding the pathogenesis, diagnosis and treatment of diseases. However, experimental methods of circRNA prediction remain expensive and time-consuming. Therefore, it is urgent to propose novel computational methods for the prediction of circRNA-disease associations. In this study, we develop a computational method called LLCDC that integrates the known circRNA-disease associations, circRNA semantic similarity network, disease semantic similarity network, reconstructed circRNA similarity network, and reconstructed disease similarity network to predict circRNAs related to human diseases. Specifically, the reconstructed similarity networks are obtained by using Locality-Constrained Linear Coding (LLC) on the known association matrix and the cosine similarities of circRNAs and diseases. Then, the label propagation method is applied to the similarity networks, and four relevant score matrices are respectively obtained. Finally, we use 5-fold cross validation (5-fold CV) to evaluate the performance of LLCDC, and the AUC value of the method is 0.9177, indicating that our method performs better than the other three methods. In addition, case studies on gastric cancer, breast cancer and papillary thyroid carcinoma further verify the reliability of our method in predicting disease-associated circRNAs.
|
35
|
Xia LY, Yang ZY, Zhang H, Liang Y. Improved Prediction of Drug-Target Interactions Using Self-Paced Learning with Collaborative Matrix Factorization. J Chem Inf Model 2019; 59:3340-3351. [PMID: 31260620 DOI: 10.1021/acs.jcim.9b00408] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 02/06/2023]
Abstract
Identifying drug-target interactions (DTIs) plays an important role in drug discovery, the study of drug side-effects, and drug repositioning. However, in vivo or biochemical experimental methods for identifying new DTIs are extremely expensive and time-consuming. Recently, various in silico computational methods have been developed for DTI prediction, such as ligand-based approaches and docking approaches, but these traditional computational methods have several limitations. This work utilizes a chemogenomic-based approach for efficiently identifying potential DTI candidates, namely, self-paced learning with collaborative matrix factorization based on weighted low-rank approximation (SPLCMF) for DTI prediction, which integrates multiple networks related to drugs and targets into regularized least-squares and focuses on learning a low-dimensional vector representation of features. The SPLCMF framework can select samples from easy to complex into training by using soft weighting, which is inclined to more faithfully reflect the latent importance of samples in training. Experimental results on synthetic data and five benchmark data sets show that our proposed SPLCMF outperforms other existing state-of-the-art approaches. These results indicate that our proposed SPLCMF can provide a useful tool to predict unknown DTIs, which may provide new insights into drug discovery, drug side-effect prediction, and repositioning of existing drugs.
Affiliation(s)
- Liang-Yong Xia
- Faculty of Information Technology, Macau University of Science and Technology, Macau, China 999078
| | - Zi-Yi Yang
- Faculty of Information Technology, Macau University of Science and Technology, Macau, China 999078
| | - Hui Zhang
- Faculty of Information Technology, Macau University of Science and Technology, Macau, China 999078
| | - Yong Liang
- Faculty of Information Technology, Macau University of Science and Technology, Macau, China 999078; State Key Laboratory of Quality Research in Chinese Medicines, Macau University of Science and Technology, Macau, China 999078
| |
|
36
|
Su R, Wu H, Xu B, Liu X, Wei L. Developing a Multi-Dose Computational Model for Drug-Induced Hepatotoxicity Prediction Based on Toxicogenomics Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:1231-1239. [PMID: 30040651 DOI: 10.1109/tcbb.2018.2858756] [Citation(s) in RCA: 85] [Impact Index Per Article: 17.0] [Indexed: 06/08/2023]
Abstract
Drug-induced hepatotoxicity may cause acute and chronic liver disease, leading to great concern for patient safety. It is also one of the main reasons for drug withdrawal from the market. Toxicogenomics data has been widely used in hepatotoxicity prediction. In our study, we proposed a multi-dose computational model to predict the drug-induced hepatotoxicity based on gene expression and toxicity data. The dose/concentration information after drug treatment is fully utilized in our study based on the dose-response curve, thus a more informative representative of the dose-response relationship is considered. We also proposed a new feature selection method, named MEMO, which is also one important aspect of our multi-dose model in our study, to deal with the high-dimensional toxicogenomics data. We validated the proposed model using the TG-GATEs, which is a large database recording toxicogenomics data from multiple views. The experimental results show that the drug-induced hepatotoxicity can be predicted with high accuracy and efficiency using the proposed predictive model.
|
37
|
Pan Z, Zhang H, Liang C, Li G, Xiao Q, Ding P, Luo J. Self-Weighted Multi-Kernel Multi-Label Learning for Potential miRNA-Disease Association Prediction. Molecular Therapy-Nucleic Acids 2019; 17:414-423. [PMID: 31319245 PMCID: PMC6637211 DOI: 10.1016/j.omtn.2019.06.014] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 03/28/2019] [Revised: 05/22/2019] [Accepted: 06/12/2019] [Indexed: 11/23/2022]
Abstract
Researchers have realized that microRNAs (miRNAs) play significant roles in the pathogenesis of various diseases. Although many computational models have been proposed to predict the associations between miRNAs and diseases, prediction performance could still be improved. In this paper, we propose a novel self-weighted, multi-kernel, multi-label learning (SwMKML) method to predict disease-related miRNAs. SwMKML adaptively learns two optimal kernel matrices for both miRNAs and diseases from multiple kernels constructed from known miRNA-disease associations. Moreover, the miRNA-disease associations predicted from both spaces are updated simultaneously based on a multi-label framework. Compared with four state-of-the-art computational models, SwMKML achieved the best results of 95.5%, 93.1%, and 84.1% in global leave-one-out cross-validation, 5-fold cross-validation, and overall prediction accuracy, respectively. A case study conducted on head and neck neoplasms further identified two potential prognostic biomarkers, hsa-mir-125b-1 and hsa-mir-125b-2, for the disease. SwMKML is freely available on GitHub, and we anticipate that it may become an effective tool for potential miRNA-disease association prediction.
Affiliation(s)
- Zhenxia Pan
- School of Information Science and Engineering, Shandong Normal University, Jinan 250358, China
| | - Huaxiang Zhang
- School of Information Science and Engineering, Shandong Normal University, Jinan 250358, China
| | - Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan 250358, China
| | - Guanghui Li
- School of Information Engineering, East China Jiaotong University, Nanchang 330013, China
| | - Qiu Xiao
- College of Information Science and Engineering, Hunan Normal University, Changsha 410006, China
| | - Pingjian Ding
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
| | - Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
| |
|
38
|
Zhao Q, Yu H, Ji M, Zhao Y, Chen X. Computational Model Development of Drug-Target Interaction Prediction: A Review. Curr Protein Pept Sci 2019; 20:492-494. [DOI: 10.2174/1389203720666190123164310] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 10/04/2018] [Revised: 01/16/2019] [Accepted: 01/18/2019] [Indexed: 12/14/2022]
Abstract
In the medical field, drug-target interactions are very important for the diagnosis and treatment of diseases; they can also help researchers predict links between biomolecules in the biological field, such as drug-protein and protein-target correlations. Therefore, drug-target research is a popular topic in both the biological and medical fields. However, due to the limitations of manual experiments in the laboratory, computational prediction methods for drug-target relationships are increasingly favored by researchers. In this review, we summarize several computational prediction models of drug-target connections from the past two years, and briefly introduce their advantages and shortcomings. Finally, several further interesting research directions of drug-target interactions are listed.
Affiliation(s)
- Qi Zhao
- College of Computer Science, Shenyang Aerospace University, Shenyang, 110136, China
| | - Haifan Yu
- School of Mathematics, Liaoning University, Shenyang, 110036, China
| | - Mingxuan Ji
- School of Mathematics, Liaoning University, Shenyang, 110036, China
| | - Yan Zhao
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
| | - Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
| |
|
39
|
Identification of drug-side effect association via multiple information integration with centered kernel alignment. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2018.10.028] [Citation(s) in RCA: 148] [Impact Index Per Article: 29.6] [Indexed: 01/01/2023]
|
40
|
Tang G, Shi J, Wu W, Yue X, Zhang W. Sequence-based bacterial small RNAs prediction using ensemble learning strategies. BMC Bioinformatics 2018; 19:503. [PMID: 30577759 PMCID: PMC6302447 DOI: 10.1186/s12859-018-2535-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 01/08/2023] Open
Abstract
Background Bacterial small non-coding RNAs (sRNAs) have emerged as important elements in diverse physiological processes, including growth, development, cell proliferation, differentiation, metabolic reactions and carbon metabolism, and attract great attention. Accurate prediction of sRNAs is important and challenging, and helps to explore the functions and mechanisms of sRNAs. Results In this paper, we utilize a variety of sRNA sequence-derived features to develop ensemble learning methods for sRNA prediction. First, we compile a balanced dataset and four imbalanced datasets. Then, we investigate various sRNA sequence-derived features, such as spectrum profile, mismatch profile, reverse complement k-mer and pseudo nucleotide composition. Finally, we consider two ensemble learning strategies to integrate all features for building ensemble learning models for sRNA prediction. One is the weighted average ensemble method (WAEM), which uses the linear weighted sum of outputs from the individual feature-based predictors to predict sRNAs. The other is the neural network ensemble method (NNEM), which trains a deep neural network by combining diverse features. In the computational experiments, we evaluate our methods on these five datasets by using 5-fold cross validation. WAEM and NNEM can produce better results than existing state-of-the-art sRNA prediction methods. Conclusions WAEM and NNEM have great potential for sRNA prediction, and are helpful for understanding the biological mechanisms of bacteria.
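Two of the sequence-derived features listed in this abstract, the spectrum (k-mer) profile and the reverse complement k-mer profile, can be sketched as below. This is only an illustration of the general feature definitions, not the authors' feature extraction code. The DNA alphabet ACGT, the frequency normalization, and all function names are assumptions.

```python
from collections import Counter
from itertools import product

# Assumes sequences are written in the DNA alphabet (ACGT).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of a DNA k-mer."""
    return kmer.translate(_COMPLEMENT)[::-1]


def spectrum_profile(seq, k):
    """Relative frequency of every possible k-mer in seq (sliding window, step 1)."""
    windows = max(len(seq) - k + 1, 0)
    counts = Counter(seq[i:i + k] for i in range(windows))
    total = max(windows, 1)
    return {
        "".join(p): counts["".join(p)] / total
        for p in product("ACGT", repeat=k)
    }


def revcomp_kmer_profile(seq, k):
    """Spectrum profile with each k-mer merged with its reverse complement.

    Each merged feature is keyed by the lexicographically smaller of the pair.
    """
    merged = {}
    for kmer, freq in spectrum_profile(seq, k).items():
        key = min(kmer, reverse_complement(kmer))
        merged[key] = merged.get(key, 0.0) + freq
    return merged


toy_profile = revcomp_kmer_profile("ACGTAC", 2)
```

For k = 2 the merged profile has 10 entries instead of 16, since four 2-mers are their own reverse complement and the remaining twelve collapse into six pairs. Vectors like these would then feed the individual feature-based predictors whose outputs the ensemble methods combine.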
Affiliation(s)
- Guifeng Tang
- School of Computer Science, Wuhan University, Wuhan, 430072, China
| | - Jingwen Shi
- School of Mathematics and Statistics, Wuhan University, Wuhan, 430072, China
| | - Wenjian Wu
- Electronic Information School, Wuhan University, Wuhan, 430072, China
| | - Xiang Yue
- Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, 43210, USA
| | - Wen Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
| |
|
41
|
Wang M, Tang C, Chen J. Drug-Target Interaction Prediction via Dual Laplacian Graph Regularized Matrix Completion. Biomed Research International 2018; 2018:1425608. [PMID: 30627536 PMCID: PMC6304580 DOI: 10.1155/2018/1425608] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 06/25/2018] [Revised: 09/03/2018] [Accepted: 10/24/2018] [Indexed: 01/16/2023]
Abstract
Drug-target interactions play an important role in biomedical drug discovery and development. However, it is expensive and time-consuming to accomplish this task by experimental determination. Therefore, developing computational techniques for drug-target interaction prediction is urgent and has practical significance. In this work, we propose an effective computational model of dual Laplacian graph regularized matrix completion, referred to as DLGRMC, to infer the unknown drug-target interactions. Specifically, DLGRMC transforms the task of drug-target interaction prediction into a matrix completion problem, in which the potential interactions between drugs and targets can be obtained based on the prediction scores after the matrix completion procedure. In DLGRMC, the drug pairwise chemical structure similarities and the target pairwise genomic sequence similarities are fully exploited to serve the matrix completion by using a dual Laplacian graph regularization term; i.e., drugs with similar chemical structure are more likely to have interactions with similar targets and targets with similar genomic sequence are more likely to have interactions with similar drugs. In addition, during the matrix completion process, an indicator matrix with binary values which indicates the indices of the observed drug-target interactions is deployed to preserve the experimentally confirmed interactions. Furthermore, we develop an alternative iterative strategy to solve the constrained matrix completion problem based on the Augmented Lagrange Multiplier algorithm. We evaluate DLGRMC on five benchmark datasets and the results show that DLGRMC outperforms several state-of-the-art approaches in terms of 10-fold cross validation based AUPR values and PR curves. In addition, case studies also demonstrate that DLGRMC can successfully predict most of the experimentally validated drug-target interactions.
Affiliation(s)
- Minhui Wang
- Department of Pharmacy, People's Hospital of Lian'shui County, Huai'an 223300, China
| | - Chang Tang
- School of Computer Science, China University of Geosciences, Wuhan 430074, China
| | - Jiajia Chen
- Department of Pharmacy, The Affiliated Huai'an Hospital of Xuzhou Medical University, Huai'an 223002, China
| |
|
42
|
Manifold regularized matrix factorization for drug-drug interaction prediction. J Biomed Inform 2018; 88:90-97. [DOI: 10.1016/j.jbi.2018.11.005] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 11/28/2017] [Revised: 11/03/2018] [Accepted: 11/11/2018] [Indexed: 12/20/2022]
|
43
|
Qu Y, Zhang H, Lyu C, Liang C. LLCMDA: A Novel Method for Predicting miRNA Gene and Disease Relationship Based on Locality-Constrained Linear Coding. Front Genet 2018; 9:576. [PMID: 30555511 PMCID: PMC6282048 DOI: 10.3389/fgene.2018.00576] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 09/14/2018] [Accepted: 11/08/2018] [Indexed: 01/03/2023] Open
Abstract
MiRNAs are small non-coding regulatory RNAs which are associated with multiple diseases. Increasing evidence has shown that miRNAs play important roles in various biological and physiological processes. Therefore, the identification of potential miRNA-disease associations could provide new clues to understanding the mechanism of pathogenesis. Although many traditional methods have been successfully applied to discover part of the associations, they are in general time-consuming and expensive. Consequently, computational-based methods are urgently needed to predict the potential miRNA-disease associations in a more efficient and resources-saving way. In this paper, we propose a novel method to predict miRNA-disease associations based on Locality-constrained Linear Coding (LLC). Specifically, we first reconstruct similarity networks for both miRNAs and diseases using LLC and then apply label propagation on the similarity networks to get relevant scores. To comprehensively verify the performance of the proposed method, we compare our method with several state-of-the-art methods under different evaluation metrics. Moreover, two types of case studies conducted on two common diseases further demonstrate the validity and utility of our method. Extensive experimental results indicate that our method can effectively predict potential associations between miRNAs and diseases.
Affiliation(s)
- Yu Qu
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
| | - Huaxiang Zhang
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
| | - Chen Lyu
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
| | - Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
| |
|
44
|
PWCDA: Path Weighted Method for Predicting circRNA-Disease Associations. Int J Mol Sci 2018; 19:ijms19113410. [PMID: 30384427 PMCID: PMC6274797 DOI: 10.3390/ijms19113410] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Received: 10/04/2018] [Revised: 10/25/2018] [Accepted: 10/26/2018] [Indexed: 12/22/2022] Open
Abstract
CircRNAs have particular biological structure and have proven to play important roles in diseases. It is time-consuming and costly to identify circRNA-disease associations by biological experiments. Therefore, it is appealing to develop computational methods for predicting circRNA-disease associations. In this study, we propose a new computational path weighted method for predicting circRNA-disease associations. Firstly, we calculate the functional similarity scores of diseases based on disease-related gene annotations and the semantic similarity scores of circRNAs based on circRNA-related gene ontology, respectively. To address missing similarity scores of diseases and circRNAs, we calculate the Gaussian Interaction Profile (GIP) kernel similarity scores for diseases and circRNAs, respectively, based on the circRNA-disease associations downloaded from circR2Disease database (http://bioinfo.snnu.edu.cn/CircR2Disease/). Then, we integrate disease functional similarity scores and circRNA semantic similarity scores with their related GIP kernel similarity scores to construct a heterogeneous network made up of three sub-networks: disease similarity network, circRNA similarity network and circRNA-disease association network. Finally, we compute an association score for each circRNA-disease pair based on paths connecting them in the heterogeneous network to determine whether this circRNA-disease pair is associated. We adopt leave one out cross validation (LOOCV) and five-fold cross validations to evaluate the performance of our proposed method. In addition, three common diseases, Breast Cancer, Gastric Cancer and Colorectal Cancer, are used for case studies. Experimental results illustrate the reliability and usefulness of our computational method in terms of different validation measures, which indicates PWCDA can effectively predict potential circRNA-disease associations.
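The Gaussian Interaction Profile (GIP) kernel similarity mentioned in this abstract can be sketched as follows. The abstract does not state the formula, so the code assumes the definition commonly used in the literature, K(i, j) = exp(-gamma * ||a_i - a_j||^2) with gamma set to gamma' divided by the mean squared norm of the profiles. The function name, the `gamma_prime` default, and the toy association matrix are hypothetical.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """GIP kernel similarity between rows of a 0/1 association matrix.

    profiles    : list of interaction profiles, one row per entity
                  (e.g. each circRNA's associations across all diseases)
    gamma_prime : bandwidth scale before normalization (assumed default 1.0)
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    # Normalize the bandwidth by the average squared profile norm.
    gamma = gamma_prime / mean_sq_norm if mean_sq_norm else 0.0
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Toy association matrix: rows are circRNAs, columns are diseases.
toy_assoc = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
toy_kernel = gip_kernel(toy_assoc)
```

Entities with identical interaction profiles get similarity 1, and the value decays as profiles diverge. Applying the same function to the transposed matrix gives the disease-side kernel, which the abstract describes as a fallback where functional or semantic similarity scores are missing.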
|
45
|
Yu SP, Liang C, Xiao Q, Li GH, Ding PJ, Luo JW. GLNMDA: a novel method for miRNA-disease association prediction based on global linear neighborhoods. RNA Biol 2018; 15:1215-1227. [PMID: 30244645 PMCID: PMC6284594 DOI: 10.1080/15476286.2018.1521210] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 05/21/2018] [Revised: 08/22/2018] [Accepted: 08/24/2018] [Indexed: 01/11/2023] Open
Abstract
Recently, increasing studies have shown that miRNAs are involved in the development and progression of various complex diseases. Consequently, predicting potential miRNA-disease associations makes an important contribution to understanding the pathogenesis of diseases, developing new drugs as well as designing individualized diagnostic and therapeutic approaches for different human diseases. Nonetheless, the inherent noise and incompleteness in the existing biological datasets have limited the prediction accuracy of current computational models. To solve this issue, in this paper, we propose a novel method for miRNA-disease association prediction based on global linear neighborhoods (GLNMDA). Specifically, our method obtains a new miRNA/disease similarity matrix by linearly reconstructing each miRNA/disease according to the known experimentally verified miRNA-disease associations. We then adopt label propagation to infer the potential associations between miRNAs and diseases. As a result, GLNMDA achieved reliable performance in the frameworks of both local and global LOOCV (AUCs of 0.867 and 0.929, respectively) and 5-fold cross validation (average AUC of 0.926). Case studies on five common human diseases further confirmed the utility of our method in discovering latent miRNA-disease pairs. Taken together, GLNMDA could serve as a reliable computational tool for miRNA-disease association prediction.
Affiliation(s)
- Sheng-Peng Yu
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
| | - Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
| | - Qiu Xiao
- College of Information Science and Engineering, Hunan Normal University, Changsha, China
| | - Guang-Hui Li
- School of Information Engineering, East China Jiaotong University, Nanchang, China
| | - Ping-Jian Ding
- College of Information Science and Engineering, Hunan University, Changsha, China
| | - Jia-Wei Luo
- College of Information Science and Engineering, Hunan University, Changsha, China
| |
|
46
|
Zhang W, Yue X, Huang F, Liu R, Chen Y, Ruan C. Predicting drug-disease associations and their therapeutic function based on the drug-disease association bipartite network. Methods 2018; 145:51-59. [DOI: 10.1016/j.ymeth.2018.06.001] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Received: 01/31/2018] [Revised: 05/15/2018] [Accepted: 06/01/2018] [Indexed: 02/01/2023] Open
|
47
|
Deepika SS, Geetha TV. A meta-learning framework using representation learning to predict drug-drug interaction. J Biomed Inform 2018; 84:136-147. [DOI: 10.1016/j.jbi.2018.06.015] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 01/14/2018] [Revised: 06/22/2018] [Accepted: 06/25/2018] [Indexed: 01/24/2023]
Affiliation(s)
- S S Deepika
- Department of Computer Science, Anna University, Chennai, Tamil Nadu, India
| | - T V Geetha
- Department of Computer Science, Anna University, Chennai, Tamil Nadu, India
| |
|
48
|
Qu Y, Zhang H, Liang C, Ding P, Luo J. SNMDA: A novel method for predicting microRNA-disease associations based on sparse neighbourhood. J Cell Mol Med 2018; 22:5109-5120. [PMID: 30030889 PMCID: PMC6156399 DOI: 10.1111/jcmm.13799] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 04/18/2018] [Revised: 05/25/2018] [Accepted: 06/21/2018] [Indexed: 01/05/2023] Open
Abstract
miRNAs are a class of small noncoding RNAs that are associated with a variety of complex biological processes. Increasing studies have shown that miRNAs have close relationships with many human diseases. The prediction of the associations between miRNAs and diseases has thus become a hot topic. Although traditional experimental methods are reliable, they could only identify a limited number of associations as they are time‐consuming and expensive. Consequently, great efforts have been made to effectively predict reliable disease‐related miRNAs based on computational methods. In this study, we present a novel approach to predict the potential microRNA‐disease associations based on sparse neighbourhood. Specifically, our method takes advantage of the sparsity of the miRNA‐disease association network and integrates the sparse information into the current similarity matrices for both miRNAs and diseases. To demonstrate the utility of our method, we applied global LOOCV, local LOOCV and five‐fold cross‐validation to evaluate our method, respectively. The corresponding AUCs are 0.936, 0.882 and 0.934. Three types of case studies on five common diseases further confirm the performance of our method in predicting unknown miRNA‐disease associations. Overall, results show that SNMDA can predict the potential associations between miRNAs and diseases effectively.
Affiliation(s)
- Yu Qu
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Huaxiang Zhang
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Pingjian Ding
- School of Information Science and Engineering, Hunan University, Changsha, China
- Jiawei Luo
- School of Information Science and Engineering, Hunan University, Changsha, China
49
Niu M, Li Y, Wang C, Han K. RFAmyloid: A Web Server for Predicting Amyloid Proteins. Int J Mol Sci 2018; 19:ijms19072071. [PMID: 30013015 PMCID: PMC6073578 DOI: 10.3390/ijms19072071] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 06/12/2018] [Revised: 07/10/2018] [Accepted: 07/12/2018] [Indexed: 12/22/2022]
Abstract
Amyloid is an insoluble fibrous protein, and its mis-aggregation can lead to diseases such as Alzheimer’s disease and Creutzfeldt–Jakob disease. Therefore, the identification of amyloid is essential for the discovery and understanding of disease. We established a novel random forest-based predictor called RFAmy to identify amyloid. It employs the SVMProt 188-D feature extraction method, based on protein composition and physicochemical properties, and the pse-in-one feature extraction method, based on amino acid composition, autocorrelation, pseudo amino acid composition, profile-based features and predicted structure features. In the ten-fold cross-validation test, RFAmy’s overall accuracy was 89.19% and its F-measure was 0.891. Comparison experiments with other features, classifiers, and existing methods show the effectiveness of RFAmy in predicting amyloid proteins. RFAmy can be accessed at http://server.malab.cn/RFAmyloid/.
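For readers unfamiliar with the two metrics quoted in this abstract, the sketch below shows how overall accuracy and the F-measure are derived from binary predictions. It assumes the F-measure is the usual F1 score for the positive class, which the abstract does not state explicitly. The function name and toy labels are illustrative and are not part of RFAmy.

```python
def accuracy_and_f_measure(y_true, y_pred):
    """Return (accuracy, F1) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("empty input")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return accuracy, f_measure


# Toy example (hypothetical labels: 1 = amyloid, 0 = non-amyloid).
toy_true = [1, 1, 1, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 1]
toy_accuracy, toy_f = accuracy_and_f_measure(toy_true, toy_pred)
```

In a ten-fold cross-validation setting, these counts would typically be accumulated over the held-out folds before the metrics are computed.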
Affiliation(s)
- Mengting Niu
- School of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China.
- Yanjuan Li
- School of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China.
- Chunyu Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150040, China.
- Ke Han
- School of Computer and Information Engineering, Harbin University of Commerce, Harbin 150040, China.
50
Wei L, Chen H, Su R. M6APred-EL: A Sequence-Based Predictor for Identifying N6-methyladenosine Sites Using Ensemble Learning. Mol Ther Nucleic Acids 2018; 12:635-644. [PMID: 30081234 PMCID: PMC6082921 DOI: 10.1016/j.omtn.2018.07.004] [Citation(s) in RCA: 136] [Impact Index Per Article: 22.7] [Received: 03/31/2018] [Revised: 07/03/2018] [Accepted: 07/03/2018] [Indexed: 12/28/2022]
Abstract
N6-methyladenosine (m6A) modification is the most abundant RNA methylation modification and is involved in various biological processes, such as RNA splicing and degradation. Recent studies have demonstrated the feasibility of identifying m6A peaks using high-throughput sequencing techniques. However, such techniques cannot accurately identify specific methylated sites, which is important for a better understanding of m6A functions. In this study, we develop a novel machine learning-based predictor called M6APred-EL for the identification of m6A sites. To predict m6A sites accurately within genomic sequences, we trained an ensemble of three support vector machine classifiers that exploit position-specific and physicochemical information from position-specific k-mer nucleotide propensity, physical-chemical properties, and ring-function-hydrogen-chemical properties. We compared the performance of our predictor with other state-of-the-art methods on benchmark datasets. Comparative results showed that the proposed M6APred-EL identified m6A sites more accurately. Moreover, a user-friendly web server that implements M6APred-EL is available at http://server.malab.cn/M6APred-EL/. It is expected to be a practical and effective tool for the investigation of m6A functional mechanisms.
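The idea of combining three classifiers can be illustrated with a small voting sketch. The abstract does not say how the three classifiers' outputs are combined, so simple majority voting is assumed here. The three rule-based functions are hypothetical stand-ins for trained support vector machines, and none of the names below come from M6APred-EL.

```python
def contains_gac(sequence):
    """Hypothetical stand-in classifier: votes 1 if the motif 'GAC' occurs."""
    return 1 if "GAC" in sequence else 0


def adenine_rich(sequence):
    """Hypothetical stand-in classifier: votes 1 if there are at least three 'A's."""
    return 1 if sequence.count("A") >= 3 else 0


def central_adenine(sequence):
    """Hypothetical stand-in classifier: votes 1 if the middle base is 'A'."""
    return 1 if sequence and sequence[len(sequence) // 2] == "A" else 0


def majority_vote(classifiers, sample):
    """Return 1 if more than half of the classifiers vote 1, otherwise 0."""
    positive_votes = sum(clf(sample) for clf in classifiers)
    return 1 if 2 * positive_votes > len(classifiers) else 0


ensemble = [contains_gac, adenine_rich, central_adenine]

# Toy sequences (not from any benchmark dataset).
prediction_a = majority_vote(ensemble, "GGACU")  # two of three vote 1
prediction_b = majority_vote(ensemble, "UUUUU")  # no classifier votes 1
```

With an odd number of binary classifiers, a majority vote can never tie, which is one practical reason ensembles of three are common.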
Affiliation(s)
- Leyi Wei
- School of Computer Science and Technology, Tianjin University, Tianjin, China; State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, China
- Huangrong Chen
- School of Computer Science and Technology, Tianjin University, Tianjin, China
- Ran Su
- School of Computer Software, Tianjin University, Tianjin, China; State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, China.