1
He S, Yun L, Yi H. Fusing graph transformer with multi-aggregate GCN for enhanced drug-disease associations prediction. BMC Bioinformatics 2024; 25:79. [PMID: 38378479 PMCID: PMC10877759 DOI: 10.1186/s12859-024-05705-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Accepted: 02/14/2024] [Indexed: 02/22/2024] Open
Abstract
BACKGROUND Identification of potential drug-disease associations is important both for the discovery of new indications for drugs and for the reduction of unknown adverse drug reactions. Exploring the potential links between drugs and diseases is crucial for advancing biomedical research and improving healthcare. While advanced computational techniques play a vital role in revealing the connections between drugs and diseases, current research still faces challenges in mining potential relationships between drugs and diseases from heterogeneous network data. RESULTS In this study, we propose a learning framework, termed WMAGT, that fuses Graph Transformer Networks and a multi-aggregate graph convolutional network to learn efficient heterogeneous information graph representations for drug-disease association prediction. This method extensively harnesses the capabilities of a robust graph transformer, effectively modeling the local and global interactions of nodes by integrating a graph convolutional network and a graph transformer with self-attention mechanisms in its encoder. We first integrate drug-drug, drug-disease, and disease-disease networks to construct a heterogeneous information graph. The multi-aggregate graph convolutional network and graph transformer are then used in conjunction with a neural collaborative filtering module to integrate information from different domains into a highly effective feature representation. CONCLUSIONS Rigorous cross-validation and ablation studies examined the robustness and effectiveness of the proposed method. Experimental results demonstrate that WMAGT outperforms other state-of-the-art methods in accurate drug-disease association prediction, which is beneficial for drug repositioning and drug safety research.
Affiliation(s)
- Shihui He
  - School of Information Science and Technology, Yunnan Normal University, Kunming, 650500, China
  - Engineering Research Center of Computer Vision and Intelligent Control Technology, Department of Education, Kunming, 650500, China
- Lijun Yun
  - School of Information Science and Technology, Yunnan Normal University, Kunming, 650500, China.
  - Engineering Research Center of Computer Vision and Intelligent Control Technology, Department of Education, Kunming, 650500, China.
- Haicheng Yi
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710129, China.
2
Ding Y, Zhou H, Zou Q, Yuan L. Identification of drug-side effect association via correntropy-loss based matrix factorization with neural tangent kernel. Methods 2023; 219:73-81. [PMID: 37783242 DOI: 10.1016/j.ymeth.2023.09.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Revised: 09/18/2023] [Accepted: 09/20/2023] [Indexed: 10/04/2023] Open
Abstract
Adverse drug reactions include side effects, allergic reactions, and secondary infections. Severe adverse reactions can cause cancer, deformity, or mutation. The monitoring of drug side effects is an important support for post marketing safety supervision of drugs, and an important basis for revising drug instructions. Its purpose is to timely detect and control drug safety risks. Traditional methods are time-consuming. To accelerate the discovery of side effects, we propose a machine learning based method, called correntropy-loss based matrix factorization with neural tangent kernel (CLMF-NTK), to solve the prediction of drug side effects. Our method and other computational methods are tested on three benchmark datasets, and the results show that our method achieves the best predictive performance.
Affiliation(s)
- Yijie Ding
  - Key Laboratory of Computational Science and Application of Hainan Province, Hainan Normal University, Haikou 571158, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China; School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
- Hongmei Zhou
  - Beidahuang Industry Group General Hospital, Harbin 150001, China
- Quan Zou
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China.
- Lei Yuan
  - Department of Hepatobiliary Surgery, Quzhou People's Hospital, 100# Minjiang Main Road, Quzhou 324000, China.
3
Muniyappan S, Rayan AXA, Varrieth GT. EGeRepDR: An enhanced genetic-based representation learning for drug repurposing using multiple biomedical sources. J Biomed Inform 2023; 147:104528. [PMID: 37858852 DOI: 10.1016/j.jbi.2023.104528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 09/11/2023] [Accepted: 10/16/2023] [Indexed: 10/21/2023]
Abstract
MOTIVATION Drug repurposing (DR) is an imminent approach for identifying novel therapeutic indications for the available drugs and discovering novel drugs for previously untreatable diseases. DR has recently attracted major attention in the pharmaceutical industry because launching new drugs through traditional drug development is costly and slow. The DR task depends largely on genetic information, since drugs revert the altered Gene Expression (GE) of diseases to normal. Many existing studies have not considered the importance of genetic information when predicting potential candidates. METHOD We propose a novel multimodal framework that utilizes genetic aspects of drugs and diseases, such as genes, pathways, and gene signatures or expression, to enhance the performance of DR using various data sources. First, a heterogeneous biological network (HBN) is constructed with three types of nodes, namely drug, disease, and gene, and four types of edges: similarities (drug, gene, and disease), drug-gene, gene-disease, and drug-disease. Next, a modified graph auto-encoder (GAE*) model is applied to learn the representation of drug and disease nodes using the topological structure and edge information. Second, the HBN is enhanced with information extracted from biomedical literature and ontology, using a novel semi-supervised pattern embedding-based bootstrapping model and a novel DR perspective representation learning, respectively, to improve the prediction performance. Finally, the proposed system uses a neural network model to generate the probability score of drug-disease pairs. RESULTS We demonstrate the efficiency of the proposed model on various datasets, where it achieved outstanding performance in 5-fold cross-validation (AUC = 0.99, AUPR = 0.98). Further, we validated the top-ranked potential candidates using pathway analysis and showed that the known and predicted candidates share common genes in the pathways.
Affiliation(s)
- Saranya Muniyappan
  - Computer Science and Engineering, CEG Campus, Anna University, Chennai, Tamil Nadu, India.
4
Li X, Liao M, Wang B, Zan X, Huo Y, Liu Y, Bao Z, Xu P, Liu W. A drug repurposing method based on inhibition effect on gene regulatory network. Comput Struct Biotechnol J 2023; 21:4446-4455. [PMID: 37731599 PMCID: PMC10507583 DOI: 10.1016/j.csbj.2023.09.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Revised: 09/05/2023] [Accepted: 09/07/2023] [Indexed: 09/22/2023] Open
Abstract
Numerous computational drug repurposing methods have emerged as efficient alternatives to costly and time-consuming traditional drug discovery approaches. Some of these methods are based on the assumption that the candidate drug should have a reversal effect on disease-associated genes. However, such methods are not applicable in the case that there is limited overlap between disease-related genes and drug-perturbed genes. In this study, we proposed a novel Drug Repurposing method based on the Inhibition Effect on gene regulatory network (DRIE) to identify potential drugs for cancer treatment. DRIE integrated gene expression profile and gene regulatory network to calculate inhibition score by using the shortest path in the disease-specific network. The results on eleven datasets indicated the superior performance of DRIE when compared to other state-of-the-art methods. Case studies showed that our method effectively discovered novel drug-disease associations. Our findings demonstrated that the top-ranked drug candidates had been already validated by CTD database. Additionally, it clearly identified potential agents for three cancers (colorectal, breast, and lung cancer), which was beneficial when annotating drug-disease relationships in the CTD. This study proposed a novel framework for drug repurposing, which would be helpful for drug discovery and development.
Affiliation(s)
- Xianbin Li
  - Institute of Computational Science and Technology, Guangzhou University, Guangzhou, China
  - School of Computer Science of Information Technology, Qiannan Normal University for Nationalities, Duyun, China
- Minzhen Liao
  - Institute of Computational Science and Technology, Guangzhou University, Guangzhou, China
- Bing Wang
  - School of Medicine, Southeast University, Nanjing, China
- Xiangzhen Zan
  - Institute of Computational Science and Technology, Guangzhou University, Guangzhou, China
- Yanhao Huo
  - Institute of Computational Science and Technology, Guangzhou University, Guangzhou, China
- Yue Liu
  - Institute of Computational Science and Technology, Guangzhou University, Guangzhou, China
- Zhenshen Bao
  - Institute of Computational Science and Technology, Guangzhou University, Guangzhou, China
  - School of Computer Science of Information Technology, Qiannan Normal University for Nationalities, Duyun, China
- Peng Xu
  - Institute of Computational Science and Technology, Guangzhou University, Guangzhou, China
  - School of Computer Science of Information Technology, Qiannan Normal University for Nationalities, Duyun, China
- Wenbin Liu
  - Institute of Computational Science and Technology, Guangzhou University, Guangzhou, China
5
Singh S, Kumar R, Payra S, Singh SK. Artificial Intelligence and Machine Learning in Pharmacological Research: Bridging the Gap Between Data and Drug Discovery. Cureus 2023; 15:e44359. [PMID: 37779744 PMCID: PMC10539991 DOI: 10.7759/cureus.44359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/31/2023] [Indexed: 10/03/2023] Open
Abstract
Artificial intelligence (AI) has transformed pharmacological research through machine learning, deep learning, and natural language processing. These advancements have greatly influenced drug discovery, development, and precision medicine. AI algorithms analyze vast biomedical data identifying potential drug targets, predicting efficacy, and optimizing lead compounds. AI has diverse applications in pharmacological research, including target identification, drug repurposing, virtual screening, de novo drug design, toxicity prediction, and personalized medicine. AI improves patient selection, trial design, and real-time data analysis in clinical trials, leading to enhanced safety and efficacy outcomes. Post-marketing surveillance utilizes AI-based systems to monitor adverse events, detect drug interactions, and support pharmacovigilance efforts. Machine learning models extract patterns from complex datasets, enabling accurate predictions and informed decision-making, thus accelerating drug discovery. Deep learning, specifically convolutional neural networks (CNN), excels in image analysis, aiding biomarker identification and optimizing drug formulation. Natural language processing facilitates the mining and analysis of scientific literature, unlocking valuable insights and information. However, the adoption of AI in pharmacological research raises ethical considerations. Ensuring data privacy and security, addressing algorithm bias and transparency, obtaining informed consent, and maintaining human oversight in decision-making are crucial ethical concerns. The responsible deployment of AI necessitates robust frameworks and regulations. The future of AI in pharmacological research is promising, with integration with emerging technologies like genomics, proteomics, and metabolomics offering the potential for personalized medicine and targeted therapies. 
Collaboration among academia, industry, and regulatory bodies is essential for the ethical implementation of AI in drug discovery and development. Continuous research and development in AI techniques and comprehensive training programs will empower scientists and healthcare professionals to fully exploit AI's potential, leading to improved patient outcomes and innovative pharmacological interventions.
Affiliation(s)
- Shruti Singh
  - Department of Pharmacology, All India Institute of Medical Sciences, Patna, IND
- Rajesh Kumar
  - Department of Pharmacology, All India Institute of Medical Sciences, Patna, IND
- Shuvasree Payra
  - Department of Pharmacology, All India Institute of Medical Sciences, Patna, IND
- Sunil K Singh
  - Department of Pharmacology, All India Institute of Medical Sciences, Patna, IND
6
Liu BM, Gao YL, Zhang DJ, Zhou F, Wang J, Zheng CH, Liu JX. A new framework for drug-disease association prediction combing light-gated message passing neural network and gated fusion mechanism. Brief Bioinform 2022; 23:6775584. [PMID: 36305457 DOI: 10.1093/bib/bbac457] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/26/2022] [Revised: 09/07/2022] [Accepted: 09/23/2022] [Indexed: 12/14/2022] Open
Abstract
With the development of research on the complex aetiology of many diseases, computational drug repositioning methodology has proven to be a shortcut to costly and inefficient traditional methods. Therefore, developing more promising computational methods is indispensable for finding new candidate diseases to treat with existing drugs. In this paper, a model called GLGMPNN, which integrates a new variant of message passing neural network and a novel gated fusion mechanism, is proposed for drug-disease association prediction. First, a light-gated message passing neural network (LGMPNN), including message passing, aggregation and updating, is proposed to separately extract multiple pieces of information from the similarity networks and the association network. Then, a gated fusion mechanism consisting of a forget gate and an output gate is applied to integrate the multiple pieces of information to a certain extent. The forget gate, calculated from the multiple embeddings, is built to integrate the association information into the similarity information. Furthermore, the final node representations are controlled by the output gate, which fuses the topology information of the networks and the initial similarity information. Finally, a bilinear decoder is adopted to reconstruct an adjacency matrix for drug-disease associations. Evaluated by 10-fold cross-validation, GLGMPNN achieves excellent performance compared with the current models. The subsequent case studies show that our model can effectively discover novel drug-disease associations.
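The final decoding step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a common form of bilinear decoder, in which the score of a drug-disease pair is sigmoid(d^T W s) for a drug embedding d, a disease embedding s, and a weight matrix W. The function names, the toy embeddings, and the identity weight matrix are hypothetical and are not taken from the GLGMPNN paper.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bilinear_decode(drug_emb, disease_emb, weight):
    """Reconstruct association scores as sigmoid(D @ W @ S^T).

    Assumed form of a bilinear decoder; the actual model may differ.
    """
    raw = matmul(matmul(drug_emb, weight), transpose(disease_emb))
    return [[sigmoid(v) for v in row] for row in raw]


# Toy inputs (hypothetical): 3 drugs and 2 diseases in a 2-dimensional space.
drug_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
disease_emb = [[1.0, 0.0], [0.0, -1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity, standing in for a learned matrix

predicted = bilinear_decode(drug_emb, disease_emb, weight)
for row in predicted:
    print([round(v, 3) for v in row])
```

Each row of `predicted` corresponds to a drug and each column to a disease, so the matrix has the same shape as the drug-disease adjacency matrix it is meant to reconstruct.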
Affiliation(s)
- Bao-Min Liu
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
- Ying-Lian Gao
  - Qufu Normal University Library, Qufu Normal University, Rizhao, 276826, Shandong, China
- Dai-Jun Zhang
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
- Feng Zhou
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
- Juan Wang
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
- Chun-Hou Zheng
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
- Jin-Xing Liu
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
7
Chen L, Lin D, Xu H, Li J, Lin L. WLLP: A weighted reconstruction-based linear label propagation algorithm for predicting potential therapeutic agents for COVID-19. Front Microbiol 2022; 13:1040252. [PMID: 36466666 PMCID: PMC9713947 DOI: 10.3389/fmicb.2022.1040252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Accepted: 10/06/2022] [Indexed: 11/18/2022] Open
Abstract
The global coronavirus disease 2019 (COVID-19) pandemic caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) has led to huge health and economic crises. However, the research required to develop new drugs and vaccines is very expensive in terms of labor, money, and time. Owing to recent advances in data science, drug-repositioning technologies have become one of the most promising strategies available for developing effective treatment options. Using the previously reported human drug virus database (HDVD), we proposed a model to predict possible drug regimens based on a weighted reconstruction-based linear label propagation algorithm (WLLP). For the drug–virus association matrix, we used the weighted K-nearest known neighbors method for preprocessing, and then performed label propagation on the network based on the linear neighborhood similarity of drugs and viruses to obtain the final prediction results. Under 10 repetitions of 10-fold cross-validation, WLLP exhibited excellent performance, with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.8828 ± 0.0037 and an area under the precision-recall curve of 0.5277 ± 0.0053, outperforming the other four models used for comparison. We also predicted effective drug regimens against SARS-CoV-2, and this case study showed that WLLP can be used to suggest potential drugs for the treatment of COVID-19.
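The label propagation step named in this abstract can be sketched generically as follows. This is a minimal illustration under stated assumptions, not the WLLP implementation: it uses the standard update F ← αWF + (1 − α)Y with a plain row-normalized similarity matrix W, whereas the paper derives W from linear neighborhood reconstruction and preprocesses Y with weighted K-nearest known neighbors. The function names, the parameter values, and the toy data are hypothetical.

```python
def row_normalize(sim):
    """Scale each row of a similarity matrix so that it sums to 1."""
    normalized = []
    for row in sim:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def propagate_labels(sim, labels, alpha=0.5, iterations=50):
    """Iterate F <- alpha * W @ F + (1 - alpha) * Y from F = Y.

    sim    : n x n drug similarity matrix (assumed non-negative)
    labels : n x m known drug-virus association matrix Y
    alpha  : weight of information received from neighbors
    """
    w = row_normalize(sim)
    n, m = len(labels), len(labels[0])
    f = [row[:] for row in labels]
    for _ in range(iterations):
        f = [[alpha * sum(w[i][k] * f[k][j] for k in range(n))
              + (1 - alpha) * labels[i][j]
              for j in range(m)] for i in range(n)]
    return f


# Toy data (hypothetical): drug 1 has no known associations but is very
# similar to drug 0, which is associated with virus 0.
drug_sim = [[0.0, 0.9, 0.1],
            [0.9, 0.0, 0.1],
            [0.1, 0.1, 0.0]]
known = [[1.0, 0.0],
         [0.0, 0.0],
         [0.0, 1.0]]

scores = propagate_labels(drug_sim, known)
for row in scores:
    print([round(v, 3) for v in row])
```

In this toy example the drug without known associations receives a higher score for virus 0 than for virus 1, because most of its similarity mass points to the drug linked to virus 0.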
Affiliation(s)
- Langcheng Chen
  - Center of Campus Network and Modern Educational Technology, Guangdong University of Technology, Guangzhou, China
- Dongying Lin
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Haojie Xu
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Jianming Li
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Lieqing Lin
  - Center of Campus Network and Modern Educational Technology, Guangdong University of Technology, Guangzhou, China
  - *Correspondence: Lieqing Lin
8
Song Y, Cui H, Zhang T, Yang T, Li X, Xuan P. Prediction of Drug-Related Diseases Through Integrating Pairwise Attributes and Neighbor Topological Structures. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2963-2974. [PMID: 34133286 DOI: 10.1109/tcbb.2021.3089692] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/12/2023]
Abstract
Identifying new disease indications for approved drugs can help reduce the cost and time of drug development. Most recent methods focus on exploiting the various information related to drugs and diseases to predict candidate drug-disease associations. However, previous methods failed to deeply integrate the neighborhood topological structure and the node attributes of a drug-disease node pair of interest. We propose a new prediction method, ANPred, to learn and integrate pairwise attribute information and neighbor topology information from the similarities and associations related to drugs and diseases. First, a bi-layer heterogeneous network with intra-layer and inter-layer connections is established to combine the drug similarities, the disease similarities, and the drug-disease associations. Second, the embedding of a drug-disease pair is constructed by integrating multiple biological premises about drugs and diseases. A learning framework based on multi-layer convolutional neural networks is designed to learn the attribute representation of the pair of drug and disease nodes from its embedding. Sequences of neighbor nodes are formed by random walks on the heterogeneous network. A framework based on a fully-connected autoencoder and a skip-gram module is constructed to learn the neighbor topological representations of nodes. The cross-validation results indicate that the performance of ANPred is superior to that of several state-of-the-art methods. Case studies on 5 drugs further confirm the ability of ANPred to discover potential drug-disease association candidates.
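The way neighbor sequences can be formed by random walks on a heterogeneous network is sketched below. This is a minimal illustration assuming uniform transition probabilities and an unweighted toy graph; the node names, walk length, and number of walks per node are hypothetical, and the abstract does not state how ANPred weights its walks.

```python
import random


def random_walk(adjacency, start, length, rng):
    """Return a node sequence of at most `length` nodes starting at `start`.

    At each step the next node is drawn uniformly from the neighbors of the
    current node (an assumption; weighted transitions are also common).
    """
    walk = [start]
    while len(walk) < length:
        neighbors = adjacency.get(walk[-1], [])
        if not neighbors:
            break  # dead end: stop the walk early
        walk.append(rng.choice(neighbors))
    return walk


# Toy bi-layer network (hypothetical): intra-layer edges link similar drugs
# or similar diseases, inter-layer edges are known drug-disease associations.
graph = {
    "drug:A": ["drug:B", "disease:X"],
    "drug:B": ["drug:A", "disease:Y"],
    "disease:X": ["drug:A", "disease:Y"],
    "disease:Y": ["drug:B", "disease:X"],
}

rng = random.Random(0)  # fixed seed so the walks are reproducible
walks = [random_walk(graph, node, 5, rng) for node in graph for _ in range(2)]
for walk in walks:
    print(" -> ".join(walk))
```

Sequences such as these are the kind of input a skip-gram style model consumes, treating each walk as a sentence and each node as a word.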
9
Wang H, Huang F, Xiong Z, Zhang W. A heterogeneous network-based method with attentive meta-path extraction for predicting drug-target interactions. Brief Bioinform 2022; 23:6596318. [PMID: 35641162 DOI: 10.1093/bib/bbac184] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 02/22/2022] [Revised: 04/09/2022] [Accepted: 04/23/2022] [Indexed: 11/13/2022] Open
Abstract
Predicting drug-target interactions (DTIs) is crucial at many phases of drug discovery and repositioning. Many computational methods based on heterogeneous networks (HNs) have proved their potential to predict DTIs by capturing extensive biological knowledge and semantic information from meta-paths. However, existing methods manually customize meta-paths, which is overly dependent on some specific expertise. Such strategy heavily limits the scalability and flexibility of these models, and even affects their predictive performance. To alleviate this limitation, we propose a novel HN-based method with attentive meta-path extraction for DTI prediction, named HampDTI, which is capable of automatically extracting useful meta-paths through a learnable attention mechanism instead of pre-definition based on domain knowledge. Specifically, by scoring multi-hop connections across various relations in the HN with each relation assigned an attention weight, HampDTI constructs a new trainable graph structure, called meta-path graph. Such meta-path graph implicitly measures the importance of every possible meta-path between drugs and targets. To enable HampDTI to extract more diverse meta-paths, we adopt a multi-channel mechanism to generate multiple meta-path graphs. Then, a graph neural network is deployed on the generated meta-path graphs to yield the multi-channel embeddings of drugs and targets. Finally, HampDTI fuses all embeddings from different channels for predicting DTIs. The meta-path graphs are optimized along with the model training such that HampDTI can adaptively extract valuable meta-paths for DTI prediction. The experiments on benchmark datasets not only show the superiority of HampDTI in DTI prediction over several baseline methods, but also, more importantly, demonstrate the effectiveness of the model discovering important meta-paths.
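The idea of scoring multi-hop connections with one attention weight per relation can be sketched as follows. This is an illustration in the style of graph transformer networks and only an assumption about how such a meta-path graph may be built: at each hop the relation adjacency matrices are combined with softmax attention weights, and the per-hop matrices are multiplied to score multi-hop paths. The function names, the toy relations, and the fixed logits (which would be learned in a real model) are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw attention logits into weights that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def soft_select(adjacencies, logits):
    """Attention-weighted sum of the relation adjacency matrices."""
    weights = softmax(logits)
    n = len(adjacencies[0])
    return [[sum(w * adj[i][j] for w, adj in zip(weights, adjacencies))
             for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def meta_path_graph(adjacencies, logits_per_hop):
    """Multiply one softly selected adjacency matrix per hop."""
    result = soft_select(adjacencies, logits_per_hop[0])
    for logits in logits_per_hop[1:]:
        result = matmul(result, soft_select(adjacencies, logits))
    return result


# Toy network (hypothetical) with nodes 0 = drug a, 1 = drug b, 2 = target t.
drug_drug = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]    # drug a is similar to drug b
drug_target = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]  # drug b interacts with target t
relations = [drug_drug, drug_target]

# Attention that prefers drug-drug at hop 1 and drug-target at hop 2.
focused = meta_path_graph(relations, [[10.0, -10.0], [-10.0, 10.0]])
# Uniform attention over both relations at both hops.
uniform = meta_path_graph(relations, [[0.0, 0.0], [0.0, 0.0]])
print(round(focused[0][2], 4), round(uniform[0][2], 4))
```

With focused attention the entry linking drug a to target t approaches 1, which corresponds to the meta-path drug, drug, target; with uniform attention the same entry is 0.25, since each hop gives the relevant relation a weight of 0.5.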
Affiliation(s)
- Hongzhun Wang
  - College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Wuhan, China
- Feng Huang
  - College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Wuhan, China
- Zhankun Xiong
  - College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Wuhan, China
- Wen Zhang
  - College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Wuhan, China
10
Wang L, Tan Y, Yang X, Kuang L, Ping P. Review on predicting pairwise relationships between human microbes, drugs and diseases: from biological data to computational models. Brief Bioinform 2022; 23:6553604. [PMID: 35325024 DOI: 10.1093/bib/bbac080] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 12/14/2021] [Revised: 02/14/2022] [Accepted: 02/15/2022] [Indexed: 12/11/2022] Open
Abstract
In recent years, with the rapid development of techniques in bioinformatics and life science, a considerable quantity of biomedical data has been accumulated, based on which researchers have developed various computational approaches to discover potential associations between human microbes, drugs and diseases. This paper provides a comprehensive overview of recent advances in the prediction of potential correlations between microbes, drugs and diseases, from biological data to computational models. First, we introduce in detail the widely used datasets relevant to the identification of potential relationships between microbes, drugs and diseases. Then, we divide a series of representative computational models into five major categories, namely network, matrix factorization, matrix completion, regularization and artificial neural network, for in-depth discussion and comparison. Finally, we analyse possible challenges and opportunities in this research area and outline some suggestions for further improving predictive performance.
Affiliation(s)
- Lei Wang
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Yaqin Tan
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Xiaoyu Yang
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Linai Kuang
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Pengyao Ping
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China
11
Ding Y, Lei X, Liao B, Wu FX. MLRDFM: a multi-view Laplacian regularized DeepFM model for predicting miRNA-disease associations. Brief Bioinform 2022; 23:6552270. [PMID: 35323901 DOI: 10.1093/bib/bbac079] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 11/16/2021] [Revised: 02/07/2022] [Accepted: 02/15/2022] [Indexed: 01/20/2023] Open
Abstract
MOTIVATION MicroRNAs (miRNAs), as critical regulators, are involved in various fundamental and vital biological processes, and their abnormalities are closely related to human diseases. Predicting disease-related miRNAs is beneficial to uncovering new biomarkers for the prevention, detection, prognosis, diagnosis and treatment of complex diseases. RESULTS In this study, we propose a multi-view Laplacian regularized deep factorization machine (DeepFM) model, MLRDFM, to predict novel miRNA-disease associations while improving the standard DeepFM. Specifically, MLRDFM improves DeepFM from two aspects: first, MLRDFM takes the relationships among items into consideration by regularizing their embedding features via their similarity-based Laplacians. In this study, miRNA Laplacian regularization integrates four types of miRNA similarity, while disease Laplacian regularization integrates two types of disease similarity. Second, to judiciously train our model, Laplacian eigenmaps are utilized to initialize the weights in the dense embedding layer. The experimental results on the latest HMDD v3.2 dataset show that MLRDFM improves the performance and reduces the overfitting phenomenon of DeepFM. Besides, MLRDFM is greatly superior to the state-of-the-art models in miRNA-disease association prediction in terms of different evaluation metrics with the 5-fold cross-validation. Furthermore, case studies further demonstrate the effectiveness of MLRDFM.
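The effect of regularizing embeddings with a similarity-based Laplacian can be illustrated with a small numerical check. This sketch assumes the standard graph Laplacian L = D − S and the usual penalty tr(EᵀLE), which equals ½ Σᵢⱼ Sᵢⱼ‖eᵢ − eⱼ‖² for a symmetric similarity matrix S, so similar items are pushed toward similar embeddings. How MLRDFM combines its several similarity views is not stated in the abstract and is not modeled here; the function names and the toy values are hypothetical.

```python
def laplacian(sim):
    """Graph Laplacian L = D - S, where D holds the row sums of S."""
    n = len(sim)
    return [[(sum(sim[i]) if i == j else 0.0) - sim[i][j]
             for j in range(n)] for i in range(n)]


def laplacian_penalty(emb, sim):
    """Compute tr(E^T L E) for embeddings E (one row per item)."""
    lap = laplacian(sim)
    n, d = len(emb), len(emb[0])
    return sum(emb[i][k] * lap[i][j] * emb[j][k]
               for i in range(n) for j in range(n) for k in range(d))


def pairwise_penalty(emb, sim):
    """Compute 0.5 * sum_ij S_ij * ||e_i - e_j||^2 directly."""
    n = len(emb)
    return 0.5 * sum(
        sim[i][j] * sum((a - b) ** 2 for a, b in zip(emb[i], emb[j]))
        for i in range(n) for j in range(n))


# Toy data (hypothetical): symmetric similarity between 3 items and their
# 2-dimensional embeddings.
similarity = [[0.0, 1.0, 0.2],
              [1.0, 0.0, 0.5],
              [0.2, 0.5, 0.0]]
embeddings = [[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 2.0]]

trace_form = laplacian_penalty(embeddings, similarity)
pair_form = pairwise_penalty(embeddings, similarity)
print(trace_form, pair_form)
```

Both forms give the same value, which is why adding the trace term to a loss penalizes embedding differences between items in proportion to their similarity.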
Affiliation(s)
- Yulian Ding
  - Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Drive, S7N 5A9, Saskatchewan, Canada
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, 620 West Chang'an Avenue, 710119, Shaanxi, China
- Bo Liao
  - School of Mathematics and Statistics, Hainan Normal University, 99 Longkun South Road, 571158, Hainan, China
- Fang-Xiang Wu
  - Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Drive, S7N 5A9, Saskatchewan, Canada; Department of Mechanical Engineering and Department of Computer Science, University of Saskatchewan, 57 Campus Drive, S7N5A9, Saskatchewan, Canada
12
Selvaraj N, Swaroop AK, Nidamanuri BSS, Kumar R R, Natarajan J, Selvaraj J. Network-based drug repurposing: A critical review. Curr Drug Res Rev 2022; 14:116-131. [PMID: 35156575 DOI: 10.2174/2589977514666220214120403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2021] [Revised: 11/17/2021] [Accepted: 11/30/2021] [Indexed: 11/22/2022]
Abstract
Developing a new drug for a disease is a tedious, time-consuming, complex and expensive process, and even when it is completed, the chances of success for a newly developed drug are very low. Recent reports state that repurposing existing drugs is more efficient than developing new ones: repurposing saves time, reduces expenses and offers a higher success rate. The main limitation of repurposing is the difficulty of obtaining the desired pharmacological and characteristic parameters of drugs from the vast data available about a huge number of drugs, their effects, and target mechanisms. This drawback can be mitigated by introducing computational methods of analysis. These include various types of network analysis that use biological processes and their relationships with drugs to simplify data interpretation. Data sets now available in standard and simplified forms include gene expression, drug-target interactions, protein networks, electronic health records, clinical trial results, and drug adverse event reports. Integrating various data sets and interpretation methods offers a more efficient and easier way to repurpose a specific drug for a desired target and effect. In this review, we briefly discuss various computational biological network analysis methods, such as gene regulatory networks, metabolic networks, protein-protein interaction networks, drug-target interaction networks, drug-disease association networks, drug-drug interaction networks, drug-side effect networks, integrated network-based methods, semantic link networks, and isoform-isoform networks. Along with these, we also briefly present the limitations, prediction methods, and data sets of the various biological networks used for drug repurposing.
Affiliation(s)
- Nagaraj Selvaraj
- Department of Pharmaceutical Chemistry, JSS College of Pharmacy, JSS Academy of Higher Education & Research Ooty, Nilgiris, Tamilnadu, India
- Akey Krishna Swaroop
- Department of Pharmaceutical Chemistry, JSS College of Pharmacy, JSS Academy of Higher Education & Research Ooty, Nilgiris, Tamilnadu, India
- Bala Sai Soujith Nidamanuri
- Department of Pharmaceutics, JSS College of Pharmacy, JSS Academy of Higher Education & Research Ooty, Nilgiris, Tamilnadu, India
- Rajesh Kumar R
- Department of Pharmaceutical Biotechnology, JSS College of Pharmacy, JSS Academy of Higher Education & Research Ooty, Nilgiris, Tamilnadu, India
- Jawahar Natarajan
- Department of Pharmaceutics, JSS College of Pharmacy, JSS Academy of Higher Education & Research Ooty, Nilgiris, Tamilnadu, India
- Jubie Selvaraj
- Department of Pharmaceutical Chemistry, JSS College of Pharmacy, JSS Academy of Higher Education & Research Ooty, Nilgiris, Tamilnadu, India
|
13
|
Targets preliminary screening for the fresh natural drug molecule based on Cosine-correlation and similarity-comparison of local network. J Transl Med 2022; 20:67. [PMID: 35115019 PMCID: PMC8812203 DOI: 10.1186/s12967-022-03279-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/05/2021] [Accepted: 01/24/2022] [Indexed: 11/30/2022] Open
Abstract
Background Chinese herbal medicine is made up of hundreds of natural drug molecules and has played a major role in traditional Chinese medicine (TCM) for several thousand years. Studying the targets of natural drug molecules is therefore of great significance for exploring the mechanisms by which TCM treats diseases. However, it is very difficult to determine the targets of a fresh natural drug molecule because of the complexity of the interactions between drug molecules and targets. Compared with traditional biological experiments, computational methods screen targets in less time and at lower cost, but many challenges remain, especially for molecules without social ties. Methods This study proposed a novel method based on the Cosine-correlation and Similarity-comparison of Local Network (CSLN) to perform the preliminary screening of targets for fresh natural drug molecules and assign weights to them through a trained parameter. Results The performance of CSLN is superior to that of the popular drug-target-interaction (DTI) prediction model GRGMF on the gold standard data when drug molecules are the objects of training and testing. Moreover, CSLN showed excellent target screening performance for a fresh natural drug molecule (scenario simulation) on the TCMSP (13 positive samples in the top 20), and Western blot experiments further verified the accuracy of CSLN. Conclusions In summary, the results suggest that CSLN can be used as an alternative strategy for screening targets of fresh natural drug molecules.
|
14
|
Gao CQ, Zhou YK, Xin XH, Min H, Du PF. DDA-SKF: Predicting Drug-Disease Associations Using Similarity Kernel Fusion. Front Pharmacol 2022; 12:784171. [PMID: 35095495 PMCID: PMC8792612 DOI: 10.3389/fphar.2021.784171] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/27/2021] [Accepted: 12/20/2021] [Indexed: 12/13/2022] Open
Abstract
Drug repositioning provides a promising and efficient strategy to discover potential associations between drugs and diseases. Many systematic computational drug-repositioning methods have been introduced, which are based on various similarities of drugs and diseases. In this work, we propose a new computational model, DDA-SKF (drug-disease associations prediction using similarity kernel fusion), which predicts novel drug indications by using similarity kernel fusion (SKF) and Laplacian regularized least squares (LapRLS) algorithms. DDA-SKF integrates multiple similarities of drugs and diseases. Its prediction performance is better than, or at least comparable to, that of all state-of-the-art methods. DDA-SKF can work without sufficient similarity information between drug indications, which allows us to predict new purposes for orphan drugs. The source code and benchmarking datasets are deposited in a GitHub repository (https://github.com/GCQ2119216031/DDA-SKF).
Affiliation(s)
- Chu-Qiao Gao
- College of Intelligence and Computing, Tianjin University, Tianjin, China
- Yuan-Ke Zhou
- College of Intelligence and Computing, Tianjin University, Tianjin, China
- Xiao-Hong Xin
- College of Intelligence and Computing, Tianjin University, Tianjin, China
- Hui Min
- College of Intelligence and Computing, Tianjin University, Tianjin, China
- Pu-Feng Du
- College of Intelligence and Computing, Tianjin University, Tianjin, China
|
15
|
Wang W, Wang Y, Zhang Y, Liu D, Zhang H, Wang X. PPDTS: Predicting potential drug-target interactions based on network similarity. IET Syst Biol 2021; 16:18-27. [PMID: 34783172 PMCID: PMC8849239 DOI: 10.1049/syb2.12037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2021] [Revised: 10/06/2021] [Accepted: 11/04/2021] [Indexed: 11/19/2022] Open
Abstract
Identification of drug–target interactions (DTIs) has great practical importance in the drug discovery process for known diseases. However, only a small proportion of DTIs in these databases has been verified experimentally, and predicting the interactions computationally remains challenging. As a result, effective computational models have become increasingly popular for predicting DTIs. In this work, the authors predict potential DTIs from the local structure of the drug–target association network, which differs from traditional global network similarity methods based on structure and ligand. A novel method called PPDTS is proposed to predict DTIs. First, according to the local structure of the DTI network, the known DTIs are converted into a binary network. Second, the Resource Allocation algorithm is used to obtain a drug–drug similarity network and a target–target similarity network. Third, a Collaborative Filtering algorithm is used with the known drug–target topology information to obtain similarity scores. Fourth, a linear combination of the drug–target similarity model and the target–drug similarity model is proposed to obtain the final prediction results. Finally, the experimental performance of PPDTS is shown to be higher than that of four popular network-based similarity methods, as validated on different experimental datasets. Some of the predicted results are supported by the UniProt and DrugBank databases.
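As an illustration of the second step above, the following is a minimal sketch of the classic resource-allocation index applied to a bipartite drug-target network. It is not taken from the PPDTS paper, whose exact variant may differ; the function name `resource_allocation_similarity` and the toy interaction list are hypothetical. Two drugs score higher when they share targets, and each shared target contributes less the more drugs it is connected to.

```python
from collections import defaultdict
from itertools import combinations


def resource_allocation_similarity(interactions):
    """Drug-drug similarity from known (drug, target) pairs.

    Returns {(drug_a, drug_b): score} with drug_a < drug_b, where the score
    is the sum of 1 / degree(target) over the targets both drugs share.
    """
    targets_of = defaultdict(set)  # drug -> set of its targets
    drugs_of = defaultdict(set)    # target -> set of drugs hitting it
    for drug, target in interactions:
        targets_of[drug].add(target)
        drugs_of[target].add(drug)

    scores = {}
    for a, b in combinations(sorted(targets_of), 2):
        shared = targets_of[a] & targets_of[b]
        if shared:
            # Each shared target splits one unit of resource among its drugs.
            scores[(a, b)] = sum(1.0 / len(drugs_of[t]) for t in shared)
    return scores


# Hypothetical toy network: t1 is hit by two drugs, t2 by three.
toy = [("d1", "t1"), ("d2", "t1"), ("d1", "t2"), ("d2", "t2"), ("d3", "t2")]
sim = resource_allocation_similarity(toy)
```

Swapping the roles of drugs and targets in the input pairs gives the corresponding target-target similarity.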
Affiliation(s)
- Wei Wang
- College of Computer and Information Engineering, Henan Normal University, Xinxiang, China; Key Laboratory of Artificial Intelligence and Personalized Learning in Education of Henan Province, Henan Normal University, Xinxiang, China; Big Data Engineering Laboratory for Teaching Resources and Assessment of Education Quality of Henan Province, Henan Normal University, Xinxiang, China
- Yongqing Wang
- College of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Yu Zhang
- College of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Dong Liu
- College of Computer and Information Engineering, Henan Normal University, Xinxiang, China; Key Laboratory of Artificial Intelligence and Personalized Learning in Education of Henan Province, Henan Normal University, Xinxiang, China; Big Data Engineering Laboratory for Teaching Resources and Assessment of Education Quality of Henan Province, Henan Normal University, Xinxiang, China
- Hongjun Zhang
- Computer Science and Technology, Anyang University, Anyang, China
- Xianfang Wang
- Computer Science and Technology, Henan Institute of Technology, Xinxiang, China
|
16
|
Wang CC, Han CD, Zhao Q, Chen X. Circular RNAs and complex diseases: from experimental results to computational models. Brief Bioinform 2021; 22:bbab286. [PMID: 34329377 PMCID: PMC8575014 DOI: 10.1093/bib/bbab286] [Citation(s) in RCA: 99] [Impact Index Per Article: 33.0] [Received: 06/01/2021] [Revised: 06/23/2021] [Accepted: 07/03/2021] [Indexed: 12/13/2022] Open
Abstract
Circular RNAs (circRNAs) are a class of single-stranded, covalently closed RNA molecules with a variety of biological functions. Studies have shown that circRNAs are involved in a variety of biological processes and play an important role in the development of various complex diseases, so the identification of circRNA-disease associations would contribute to the diagnosis and treatment of diseases. In this review, we summarize the discovery, classifications and functions of circRNAs and introduce four important diseases associated with circRNAs. Then, we list some significant and publicly accessible databases containing comprehensive annotation resources of circRNAs and experimentally validated circRNA-disease associations. Next, we introduce some state-of-the-art computational models for predicting novel circRNA-disease associations and divide them into two categories, namely network algorithm-based and machine learning-based models. Subsequently, several evaluation methods of prediction performance of these computational models are summarized. Finally, we analyze the advantages and disadvantages of different types of computational models and provide some suggestions to promote the development of circRNA-disease association identification from the perspective of the construction of new computational models and the accumulation of circRNA-related data.
Affiliation(s)
- Chun-Chun Wang
- School of Information and Control Engineering, China University of Mining and Technology
- Chen-Di Han
- School of Information and Control Engineering, China University of Mining and Technology
- Qi Zhao
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning
- Xing Chen
- China University of Mining and Technology
|
17
|
Yi HC, You ZH, Guo ZH, Huang DS, Chan KCC. Learning Representation of Molecules in Association Network for Predicting Intermolecular Associations. IEEE/ACM Trans Comput Biol Bioinform 2021; 18:2546-2554. [PMID: 32070992 DOI: 10.1109/tcbb.2020.2973091] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 06/10/2023]
Abstract
A key aim of post-genomic biomedical research is to systematically understand molecules and their interactions in human cells. Multiple biomolecules coordinate to sustain life activities, and interactions between various biomolecules are interconnected. However, existing studies usually focus only on associations between two or very limited types of molecules. In this study, we propose a network representation learning based computational framework, MAN-SDNE, to predict any intermolecular associations. More specifically, we constructed a large-scale molecular association network of multiple biomolecules in human by integrating associations among long non-coding RNA, microRNA, protein, drug, and disease, containing 6,528 molecular nodes and 105,546 associations of 9 kinds. Each node is then represented by its network proximity and attribute features, and these features are used to train a Random Forest classifier to predict intermolecular associations. MAN-SDNE achieves a remarkable performance with an AUC of 0.9552 and an AUPR of 0.9338 under five-fold cross-validation. To indicate the ability to predict specific types of interactions, a case study predicting lncRNA-protein interactions using MAN-SDNE is also carried out. Experimental results demonstrate that this work offers systematic insight into the synergistic associations between molecules and complex diseases and provides a network-based computational tool to systematically explore intermolecular interactions.
|
18
|
Gao J, Zhang X, Tian L, Liu Y, Wang J, Li Z, Hu X. MTGNN: Multi-Task Graph Neural Network based few-shot learning for disease similarity measurement. Methods 2021; 198:88-95. [PMID: 34700014 DOI: 10.1016/j.ymeth.2021.10.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/05/2021] [Revised: 10/16/2021] [Accepted: 10/18/2021] [Indexed: 11/24/2022] Open
Abstract
Similar diseases are usually caused by molecular origins or similar phenotypes. Confirming the relationship between diseases can help researchers gain deep insight into the pathogenic mechanisms of emerging complex diseases and improve the corresponding diagnoses and treatments. Therefore, similar diseases are considerably important in biology and pathology. However, the number of labelled similar disease pairs is too small to support optimal training of the models. In this paper, we propose a Multi-Task Graph Neural Network (MTGNN) framework to measure disease similarity by few-shot learning. To tackle the shortage of labelled similar disease pairs, we design a multi-task optimization strategy that trains the graph neural network for the disease similarity task (which lacks labelled training data) by introducing a link prediction task (which has sufficient labelled training data). The similarity between diseases can then be obtained by measuring the distance between disease embeddings in the high-dimensional space learned from the two tasks. The experimental results demonstrate the performance of MTGNN and illustrate its advantages over previous methods when few labelled training samples are available.
Affiliation(s)
- Jianliang Gao
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Xiangchi Zhang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Ling Tian
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Yuxin Liu
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Jianxin Wang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Zhao Li
- Alibaba Group, Hangzhou 310000, China
- Xiaohua Hu
- College of Computing & Informatics, Drexel University, Philadelphia, PA 19104, USA
|
19
|
The Complex Structure of the Pharmacological Drug-Disease Network. Entropy 2021; 23:e23091139. [PMID: 34573762 PMCID: PMC8466955 DOI: 10.3390/e23091139] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/22/2021] [Revised: 08/23/2021] [Accepted: 08/24/2021] [Indexed: 12/29/2022]
Abstract
The complexity of drug–disease interactions is a process that has been explained in terms of the need for new drugs and the increasing cost of drug development, among other factors. Over the last years, diverse approaches have been explored to understand drug–disease relationships. Here, we construct a bipartite graph in terms of active ingredients and diseases based on thoroughly classified data from a recognized pharmacological website. We find that the connectivities between drugs (outgoing links) and diseases (incoming links) follow approximately a stretched-exponential function with different fitting parameters; for drugs, it is between exponential and power law functions, while for diseases, the behavior is purely exponential. The network projections, onto either drugs or diseases, reveal that the co-occurrence of drugs (diseases) in common target diseases (drugs) leads to the appearance of connected components, which vary as the threshold number of common target diseases (drugs) is increased. The corresponding projections built from randomized versions of the original bipartite networks are considered to evaluate the differences. The heterogeneity of association at group level between active ingredients and diseases is evaluated in terms of the Shannon entropy and algorithmic complexity, revealing that higher levels of diversity are present for diseases compared to drugs. Finally, the robustness of the original bipartite network is evaluated in terms of most-connected nodes removal (direct attack) and random removal (random failures).
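The thresholded projection described in this abstract can be sketched as follows. This is an illustrative reading, not the authors' code: two drugs are linked in the projection when they share at least `threshold` diseases, and the connected components of the resulting graph are then listed. The function names and the toy edge list are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def project_onto_drugs(edges, threshold=1):
    """Project (drug, disease) edges onto drugs.

    Two drugs are adjacent if they share at least `threshold` diseases.
    """
    diseases_of = defaultdict(set)
    for drug, disease in edges:
        diseases_of[drug].add(disease)

    adjacency = {drug: set() for drug in diseases_of}
    for a, b in combinations(sorted(diseases_of), 2):
        if len(diseases_of[a] & diseases_of[b]) >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def connected_components(adjacency):
    """Return the connected components as sorted lists of nodes."""
    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal from `start`
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical toy data: drugs A-D, diseases x-z.
toy_edges = [("A", "x"), ("A", "y"), ("B", "x"), ("B", "y"), ("C", "y"), ("D", "z")]
loose = connected_components(project_onto_drugs(toy_edges, threshold=1))
strict = connected_components(project_onto_drugs(toy_edges, threshold=2))
```

On the toy data, raising the threshold from 1 to 2 splits drug C off from the A-B group, which mirrors the way the components change as the threshold increases.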
|
20
|
Li W, Wang S, Xu J. An Ensemble Matrix Completion Model for Predicting Potential Drugs Against SARS-CoV-2. Front Microbiol 2021; 12:694534. [PMID: 34367094 PMCID: PMC8334363 DOI: 10.3389/fmicb.2021.694534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2021] [Accepted: 06/22/2021] [Indexed: 11/13/2022] Open
Abstract
Because of the catastrophic outbreak of global coronavirus disease 2019 (COVID-19) and its strong infectivity and possible persistence, computational repurposing of existing approved drugs will be a promising strategy that facilitates rapid clinical treatment decisions and provides reasonable justification for subsequent clinical trials and regulatory reviews. Since the effects of a small number of conditionally marketed vaccines need further clinical observation, there is still an urgent need to quickly and effectively repurpose potentially available drugs before the next disease peak. In this work, we have manually collected a set of experimentally confirmed virus-drug associations from publicly available databases and the literature, consisting of 175 drugs, 95 viruses, and 933 virus-drug associations. Because the samples are extremely sparse and unbalanced, negative samples cannot be easily obtained. We have therefore developed an ensemble model, EMC-Voting, based on matrix completion and weighted soft voting, a semi-supervised machine learning model for computational drug repurposing. Finally, we have evaluated the prediction performance of EMC-Voting by fivefold cross-validation and compared it with other baseline classifiers and prediction models. The case study for the virus SARS-CoV-2 included in the dataset demonstrates that our model achieves an AUPR value of 0.934 in virus-drug association prediction, outperforming the compared methods.
Affiliation(s)
- Shulin Wang
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
|
21
|
Li A, Deng Y, Tan Y, Chen M. A novel miRNA-disease association prediction model using dual random walk with restart and space projection federated method. PLoS One 2021; 16:e0252971. [PMID: 34138933 PMCID: PMC8211179 DOI: 10.1371/journal.pone.0252971] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/30/2021] [Accepted: 05/26/2021] [Indexed: 12/27/2022] Open
Abstract
A large number of studies have shown that the variation and disorder of miRNAs are important causes of diseases. The recognition of disease-related miRNAs has become an important topic in the field of biological research. However, the identification of disease-related miRNAs by biological experiments is expensive and time consuming. Thus, computational prediction models that predict disease-related miRNAs must be developed. A novel network projection-based dual random walk with restart (NPRWR) was used to predict potential disease-related miRNAs. The NPRWR model aims to estimate and accurately predict miRNA-disease associations by using dual random walk with restart and network projection technology, respectively. Leave-one-out cross validation (LOOCV) was adopted to evaluate the prediction performance of NPRWR. The results show that the area under the receiver operating characteristic curve (AUC) of NPRWR was 0.9029, which is superior to that of other advanced miRNA-disease association prediction methods. In addition, lung and kidney neoplasms were selected to present a case study. Among the first 50 miRNAs predicted, 50 and 49 miRNAs, respectively, have been confirmed in databases or relevant literature. Moreover, NPRWR can be used to predict isolated diseases and new miRNAs. LOOCV and the case study achieved good prediction results. Thus, NPRWR will become an effective and accurate disease-miRNA association prediction model.
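For readers unfamiliar with the building block named in this abstract, here is a minimal sketch of a plain random walk with restart on a single undirected graph. It does not reproduce NPRWR, which runs the walk on two networks and adds a projection step; the function name, the restart probability of 0.3 and the toy path graph are assumptions chosen for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its steady-state visiting probability.

    adjacency: {node: set of neighbours}, every node present as a key.
    seeds: nodes the walker jumps back to with probability `restart`.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Dangling node: send its walking mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1.0 - restart) * p[u] * p0[n]
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:  # stop once the distribution has stabilised
            break
    return p


# Hypothetical toy graph: a path a - b - c - d, walking from seed "a".
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
scores = random_walk_with_restart(path, seeds={"a"})
```

Nodes closer to the seed receive higher scores, which is the property such methods use to rank candidate associations.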
Affiliation(s)
- Ang Li
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang, China
- Yingwei Deng
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang, China
- Hainan Key Laboratory for Computational Science and Application, Haikou, China
- Yan Tan
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang, China
- Min Chen
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang, China
|
22
|
Ding Y, Lei X, Liao B, Wu FX. Predicting miRNA-Disease Associations Based on Multi-View Variational Graph Auto-Encoder with Matrix Factorization. IEEE J Biomed Health Inform 2021; 26:446-457. [PMID: 34111017 DOI: 10.1109/jbhi.2021.3088342] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Indexed: 11/10/2022]
Abstract
MicroRNAs (miRNAs) have been proved to play critical roles in diverse biological processes, including the human disease development process. Exploring the potential associations between miRNAs and diseases can help us better understand complex disease mechanisms. Given that traditional biological experiments are expensive and time-consuming, computational models can serve as efficient means to uncover potential miRNA-disease associations. This study presents a new computational model based on variational graph auto-encoder with matrix factorization (VGAMF) for miRNA-disease association prediction. More specifically, VGAMF first integrates four different types of information about miRNAs into an miRNA comprehensive similarity network and two types of information about diseases into a disease comprehensive similarity network, respectively. Then, VGAMF gets the non-linear representations of miRNAs and diseases, respectively, from those two comprehensive similarity networks with variational graph auto-encoders. Simultaneously, a non-negative matrix factorization is conducted on the miRNA-disease association matrix to get the linear representations of miRNAs and diseases. Finally, a fully connected neural network combines linear and non-linear representations of miRNAs and diseases to get the final predicted association score for all miRNA-disease pairs. In the 10-fold cross-validation experiments, VGAMF achieves an average AUC of 0.9280 on HMDD v2.0 and 0.9470 on HMDD v3.2, which outperforms other competing methods. Besides, the case studies on colon cancer and esophageal cancer further demonstrate the effectiveness of VGAMF in predicting novel miRNA-disease associations.
|
23
|
Kim H, Kim E, Lee I, Bae B, Park M, Nam H. Artificial Intelligence in Drug Discovery: A Comprehensive Review of Data-driven and Machine Learning Approaches. Biotechnol Bioproc E 2021; 25:895-930. [PMID: 33437151 PMCID: PMC7790479 DOI: 10.1007/s12257-020-0049-y] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 02/13/2020] [Revised: 05/27/2020] [Accepted: 06/03/2020] [Indexed: 02/07/2023]
Abstract
As expenditure on drug development increases exponentially, the overall drug discovery process requires a sustainable revolution. Since artificial intelligence (AI) is leading the fourth industrial revolution, AI can be considered a viable solution for unstable drug research and development. Generally, AI is applied to fields with sufficient data such as computer vision and natural language processing, but there are many efforts to revolutionize the existing drug discovery process by applying AI. This review provides a comprehensive, organized summary of the recent research trends in the AI-guided drug discovery process, including target identification, hit identification, ADMET prediction, lead optimization, and drug repositioning. The main data sources in each field are also summarized in this review. In addition, an in-depth analysis of the remaining challenges and limitations is provided, along with proposals for promising future directions in each of the aforementioned areas.
Affiliation(s)
- Hyunho Kim
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Eunyoung Kim
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Ingoo Lee
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Bongsung Bae
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Minsu Park
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Hojung Nam
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
|
24
|
Ding Y, Tang J, Guo F. The Computational Models of Drug-target Interaction Prediction. Protein Pept Lett 2020; 27:348-358. [PMID: 30968771 DOI: 10.2174/0929866526666190410124110] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/04/2019] [Revised: 02/22/2019] [Accepted: 04/02/2019] [Indexed: 12/19/2022]
Abstract
The identification of Drug-Target Interactions (DTIs) is an important process in drug discovery and medical research. However, traditional experimental methods for DTI identification are still time consuming, extremely expensive and challenging. In the past ten years, various computational methods have been developed to identify potential DTIs. In this paper, the identification methods of DTIs are summarized. In addition, several state-of-the-art computational methods are introduced, including network-based and machine learning-based methods. In particular, machine learning-based methods, including supervised and semi-supervised models, differ essentially in how they handle negative samples. Although these computational models have achieved significant improvements in the identification of DTIs, network-based and machine learning-based methods each have their own disadvantages. These computational methods are evaluated on four benchmark data sets via values of the Area Under the Precision Recall curve (AUPR).
Affiliation(s)
- Yijie Ding
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China
- Jijun Tang
- Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, United States; School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Fei Guo
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
|
25
|
Huang L, Luo H, Li S, Wu FX, Wang J. Drug-drug similarity measure and its applications. Brief Bioinform 2020; 22:5956929. [PMID: 33152756 DOI: 10.1093/bib/bbaa265] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/01/2020] [Revised: 09/13/2020] [Accepted: 09/14/2020] [Indexed: 02/01/2023] Open
Abstract
Drug similarities play an important role in modern biology and medicine, as they help scientists gain deep insights into drugs' therapeutic mechanisms and conduct wet-lab experiments that may significantly improve the efficiency of drug research and development. A number of drug-related databases have now been constructed, with which many methods have been developed for computing similarities between drugs in order to study associations between drugs, human diseases, proteins (drug targets) and more. In this review, we first briefly introduce the publicly available drug-related databases. Second, we summarize in detail the similarity calculation methods based on different drug features, interaction relationships and multimodal data. Then, we discuss the applications of drug similarities in various biological and medical areas. Finally, we evaluate drug similarity calculation methods with common evaluation metrics to illustrate the important roles of drug similarity measures in different applications.
Affiliation(s)
- Lan Huang
- Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering at Central South University, Hunan, China
- Huimin Luo
- School of Computer and Information Engineering at Henan University, Kaifeng, China
- Suning Li
- Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
- Fang-Xiang Wu
- College of Engineering and Department of Computer Sciences, University of Saskatchewan, Saskatoon, Canada
- Jianxin Wang
- Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering at Central South University, Hunan, China
|
26
|
Xie Q, Yang KM, Heo GE, Song M. Literature based discovery of alternative TCM medicine for adverse reactions to depression drugs. BMC Bioinformatics 2020; 21:405. [PMID: 33106157 PMCID: PMC7586667 DOI: 10.1186/s12859-020-03735-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/17/2020] [Accepted: 09/03/2020] [Indexed: 11/10/2022] Open
Abstract
Background In recent years, Traditional Chinese Medicine (TCM) and alternative medicine have been widely used along with western drugs as a complementary form of treatment. In this study, we first use the scientific literature to identify western drugs with obvious side effects. Then, we find TCM alternatives for these western drugs to ameliorate their side effects. Results We used depression as a case study. To evaluate our method, we showed the relation between herb-ingredients-target-disease for representative alternative herbs of western drugs. Further, a protein-protein interaction network of western drugs and alternative herbs was produced, and we performed enrichment analysis of the targets of the active ingredients of the herbs and examined the enrichment of Gene Ontology terms for Biological Process, Cellular Component, and Molecular Function and KEGG Pathway levels, to show how these targets affect different levels of gene expression. Conclusion Our proposed method is able to select herbs that are highly relevant to the target indication (depression) and are able to treat the side effects caused by the target drug. The compounds from our selected alternative herbal medicines can therefore be complementary to the western drugs and ameliorate their side effects, which may help in the development of new drugs.
Affiliation(s)
- Qing Xie
- Department of Library and Information Science, Yonsei University, 50 Yonsei-ro Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Kyoung Min Yang
- Department of Library and Information Science, Yonsei University, 50 Yonsei-ro Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Go Eun Heo
- Department of Library and Information Science, Yonsei University, 50 Yonsei-ro Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Min Song
- Department of Library and Information Science, Yonsei University, 50 Yonsei-ro Seodaemun-gu, Seoul, 03722, Republic of Korea.
| |
|
27
|
Jafari M, Wang Y, Amiryousefi A, Tang J. Unsupervised Learning and Multipartite Network Models: A Promising Approach for Understanding Traditional Medicine. Front Pharmacol 2020; 11:1319. [PMID: 32982738 PMCID: PMC7479204 DOI: 10.3389/fphar.2020.01319] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 05/19/2020] [Accepted: 08/07/2020] [Indexed: 12/11/2022] Open
Abstract
The ultimate goal of precision medicine is to determine right treatment for right patients based on precise diagnosis. To achieve this goal, correct stratification of patients using molecular features and clinical phenotypes is crucial. During the long history of medical science, our understanding on disease classification has been improved greatly by chemistry and molecular biology. Nowadays, we gain access to large scale patient-derived data by high-throughput technologies, generating a greater need for data science including unsupervised learning and network modeling. Unsupervised learning methods such as clustering could be a better solution to stratify patients when there is a lack of predefined classifiers. In network modularity analysis, clustering methods can be also applied to elucidate the complex structure of biological and disease networks at the systems level. In this review, we went over the main points of clustering analysis and network modeling, particularly in the context of Traditional Chinese medicine (TCM). We showed that this approach can provide novel insights on the rationale of classification for TCM herbs. In a case study, using a modularity analysis of multipartite networks, we illustrated that the TCM classifications are associated with the chemical properties of the herb ingredients. We concluded that multipartite network modeling may become a suitable data integration tool for understanding the mechanisms of actions of traditional medicine.
Affiliation(s)
- Mohieddin Jafari
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
| | - Yinyin Wang
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
| | - Ali Amiryousefi
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
| | - Jing Tang
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
| |
|
28
|
Xiang J, Zhang NR, Zhang JS, Lv XY, Li M. PrGeFNE: Predicting disease-related genes by fast network embedding. Methods 2020; 192:3-12. [PMID: 32610158 DOI: 10.1016/j.ymeth.2020.06.015] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 04/30/2020] [Revised: 06/13/2020] [Accepted: 06/22/2020] [Indexed: 12/14/2022] Open
Abstract
Identifying disease-related genes is of importance for understanding of molecule mechanisms of diseases, as well as diagnosis and treatment of diseases. Many computational methods have been proposed to predict disease-related genes, but how to make full use of multi-source biological data to enhance the ability of disease-gene prediction is still challenging. In this paper, we proposed a novel method for predicting disease-related genes by using fast network embedding (PrGeFNE), which can integrate multiple types of associations related to diseases and genes. Specifically, we first constructed a heterogeneous network by using phenotype-disease, disease-gene, protein-protein and gene-GO associations; and low-dimensional representation of nodes is extracted from the network by using a fast network embedding algorithm. Then, a dual-layer heterogeneous network was reconstructed by using the low-dimensional representation, and a network propagation was applied to the dual-layer heterogeneous network to predict disease-related genes. Through cross-validation and newly added-association validation, we displayed the important roles of different types of association data in enhancing the ability of disease-gene prediction, and confirmed the excellent performance of PrGeFNE by comparing to state-of-the-art algorithms. Furthermore, we developed a web tool that can facilitate researchers to search for candidate genes of different diseases predicted by PrGeFNE, along with the enrichment analysis of GO and pathway on candidate gene set. This may be useful for investigation of diseases' molecular mechanisms as well as their experimental validations. The web tool is available at http://bioinformatics.csu.edu.cn/prgefne/.
Affiliation(s)
- Ju Xiang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China; Neuroscience Research Center & Department of Basic Medical Sciences, Changsha Medical University, Changsha, 410219 Hunan, China
| | - Ning-Rui Zhang
- College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
| | - Jia-Shuai Zhang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Xiao-Yi Lv
- School of Software, Xinjiang University, Urumqi 830046, China
| | - Min Li
- School of Computer Science and Engineering, Central South University, Changsha 410083, China.
| |
|
29
|
Yi HC, You ZH, Huang DS, Guo ZH, Chan KCC, Li Y. Learning Representations to Predict Intermolecular Interactions on Large-Scale Heterogeneous Molecular Association Network. iScience 2020; 23:101261. [PMID: 32580123 PMCID: PMC7317230 DOI: 10.1016/j.isci.2020.101261] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/03/2019] [Revised: 04/29/2020] [Accepted: 06/08/2020] [Indexed: 02/07/2023] Open
Abstract
Molecular components that are functionally interdependent in human cells constitute molecular association networks. Disease can be caused by disturbance of multiple molecular interactions. New biomolecular regulatory mechanisms can be revealed by discovering new biomolecular interactions. To this end, a heterogeneous molecular association network is formed by systematically integrating comprehensive associations between miRNAs, lncRNAs, circRNAs, mRNAs, proteins, drugs, microbes, and complex diseases. We propose a machine learning method for predicting intermolecular interactions, named MMI-Pred. More specifically, a network embedding model is developed to fully exploit the network behavior of biomolecules, and attribute features are also calculated. Then, these discriminative features are combined to train a random forest classifier to predict intermolecular interactions. MMI-Pred achieves an outstanding performance of 93.50% accuracy in hybrid associations prediction under 5-fold cross-validation. This work provides systematic landscape and machine learning method to model and infer complex associations between various biological components.
Affiliation(s)
- Hai-Cheng Yi
- Xinjiang Laboratory of Minority Speech and Language Information Processing, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zhu-Hong You
- Xinjiang Laboratory of Minority Speech and Language Information Processing, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China.
| | - De-Shuang Huang
- Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China
| | - Zhen-Hao Guo
- Xinjiang Laboratory of Minority Speech and Language Information Processing, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
| | - Keith C C Chan
- Department of Computing, Hong Kong Polytechnic University, Hong Kong SAR 999077, China
| | - Yangming Li
- College of Engineering Technology, Rochester Institute of Technology, Rochester, NY 14623, USA
| |
|
30
|
Liu J, Zuo Z, Wu G. Link Prediction Only With Interaction Data and its Application on Drug Repositioning. IEEE Trans Nanobioscience 2020; 19:547-555. [PMID: 32340956 DOI: 10.1109/tnb.2020.2990291] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/06/2022]
Abstract
To assist drug development, many computational methods have been proposed to identify potential drug-disease treatment associations before wet experiments. Based on the assumption that similar drugs may treat similar diseases, most methods calculate the similarities of drugs and diseases by using various chemical or biological features. However, since these features may be unknown or hard to collect, such methods will not work in the face of incomplete data. Besides, due to the lack of validated negative samples in the drug-disease associations data, most methods have no choice but to simply select some unlabeled samples as negative ones, which may introduce noises and decrease the reliability of prediction. Herein, we propose a new method (TS-SVD) which only uses those known drug-protein, disease-protein and drug-disease interactions to predict the potential drug-disease associations. In a constructed drug-protein-disease heterogeneous network, assuming that drugs/diseases relating to some common proteins or diseases/drugs may be similar, we get the common neighbors count matrix of drugs/diseases, then convert it to a topological similarity matrix. After that, we get low dimensional embedding representations of drug-disease pairs by using topological features and singular value decomposition. Finally, a Random Forest classifier is trained to do the prediction. To train a more reasonable model, we select out some reliable negative samples based on the k -step neighbors relationships between drugs and diseases. Compared with some state-of-the-art methods, we use less information but achieve better or comparable performance. Meanwhile, our strategy for selecting reliable negative samples can improve the performances of these methods. Case studies have further shown the practicality of our method in discovering novel drug-disease associations.
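The first two steps of the pipeline described above (common-neighbour counts turned into a topological similarity, then a low-dimensional embedding by singular value decomposition) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how counts are converted to similarities, so the cosine normalisation is an assumption, and the function names `common_neighbor_similarity` and `svd_embedding` and the toy matrix are hypothetical. The Random Forest step and the negative-sample selection are not shown.

```python
import numpy as np

def common_neighbor_similarity(adj):
    """Similarity between rows of a binary association matrix.

    counts[i, j] is the number of neighbours shared by rows i and j.
    Dividing by the geometric mean of the two degrees (cosine
    normalisation) is an assumption, not taken from the abstract.
    """
    counts = adj @ adj.T
    degree = np.sqrt(np.diag(counts))
    degree[degree == 0] = 1.0  # isolated nodes: avoid division by zero
    return counts / np.outer(degree, degree)

def svd_embedding(sim, k):
    """Rank-k embedding of a similarity matrix from its SVD."""
    u, s, _ = np.linalg.svd(sim)
    return u[:, :k] * np.sqrt(s[:k])

# Hypothetical toy data: 3 drugs x 4 proteins.
drug_protein = np.array([[1, 1, 0, 0],
                         [1, 1, 0, 0],
                         [0, 0, 1, 1]], dtype=float)

drug_sim = common_neighbor_similarity(drug_protein)
drug_emb = svd_embedding(drug_sim, k=2)
```

In this toy case the first two drugs share both of their proteins, so their similarity is 1 and they receive identical embeddings. A drug-disease pair could then be represented by joining the drug embedding with a disease embedding built the same way, which is again an assumption about the feature layout.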
|
31
|
Wang W, Lv H, Zhao Y, Liu D, Wang Y, Zhang Y. DLS: A Link Prediction Method Based on Network Local Structure for Predicting Drug-Protein Interactions. Front Bioeng Biotechnol 2020; 8:330. [PMID: 32391341 PMCID: PMC7193019 DOI: 10.3389/fbioe.2020.00330] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/17/2019] [Accepted: 03/25/2020] [Indexed: 12/22/2022] Open
Abstract
Studies of drug-protein interactions (DPIs) are significant for drug repositioning, drug discovery, and clinical medicine. Confirming DPIs by biochemical (in vitro) experimentation takes a long time and is costly because the interactions are difficult to estimate. Therefore, a feasible solution is to predict DPIs efficiently with computers. We propose a link prediction method based on drug-protein interaction (DPI) local structural similarity (DLS) for predicting DPIs. The DLS method combines link prediction and binary network structure to predict DPIs. Ten-fold cross-validation was applied in the experiment. Compared with the improved similarity-based network prediction method, DLS gives significantly better results on the test set. Moreover, several candidate proteins were predicted for three approved drugs, namely captopril, desferrioxamine and losartan, and these predictions are further validated by the literature. In addition, the combination of the Common Neighborhood (CN) method and the DLS method provides a new idea for the integrated application of link prediction methods.
Affiliation(s)
- Wei Wang
- Department of Computer Science and Technology, College of Computer and Information Engineering, Henan Normal University, Xinxiang, China; Big Data Engineering Laboratory for Teaching Resources and Assessment of Education Quality, Xinxiang, China
| | - Hehe Lv
- Department of Computer Science and Technology, College of Computer and Information Engineering, Henan Normal University, Xinxiang, China
| | - Yuan Zhao
- Department of Computer Science and Technology, College of Computer and Information Engineering, Henan Normal University, Xinxiang, China
| | - Dong Liu
- Department of Computer Science and Technology, College of Computer and Information Engineering, Henan Normal University, Xinxiang, China
| | - Yongqing Wang
- Department of Computer Science and Technology, College of Computer and Information Engineering, Henan Normal University, Xinxiang, China; Big Data Engineering Laboratory for Teaching Resources and Assessment of Education Quality, Xinxiang, China
| | - Yu Zhang
- Department of Computer Science and Technology, College of Computer and Information Engineering, Henan Normal University, Xinxiang, China
| |
|
32
|
Huang Q, Zhang J, Wei L, Guo F, Zou Q. 6mA-RicePred: A Method for Identifying DNA N6-Methyladenine Sites in the Rice Genome Based on Feature Fusion. Front Plant Sci 2020; 11:4. [PMID: 32076430 PMCID: PMC7006724 DOI: 10.3389/fpls.2020.00004] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 12/04/2019] [Accepted: 01/06/2020] [Indexed: 06/01/2023]
Abstract
MOTIVATION The biological function of N 6-methyladenine DNA (6mA) in plants is largely unknown. Rice is one of the most important crops worldwide and is a model species for molecular and genetic studies. There are few methods for 6mA site recognition in the rice genome, and an effective computational method is needed. RESULTS In this paper, we propose a new computational method called 6mA-Pred to identify 6mA sites in the rice genome. 6mA-Pred employs a feature fusion method to combine advantageous features from other methods and thus obtain a new feature to identify 6mA sites. This method achieved an accuracy of 87.27% in the identification of 6mA sites with 10-fold cross-validation and achieved an accuracy of 85.6% in independent test sets.
Affiliation(s)
- Qianfei Huang
- College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Jun Zhang
- Rehabilitation Department, Heilongjiang Province Land Reclamation Headquarters General Hospital, Harbin, China
| | - Leyi Wei
- College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Fei Guo
- College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
| |
|
33
|
Yu L, Xu F, Gao L. Predict New Therapeutic Drugs for Hepatocellular Carcinoma Based on Gene Mutation and Expression. Front Bioeng Biotechnol 2020; 8:8. [PMID: 32047745 PMCID: PMC6997129 DOI: 10.3389/fbioe.2020.00008] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 11/21/2019] [Accepted: 01/07/2020] [Indexed: 02/01/2023] Open
Abstract
Hepatocellular carcinoma (HCC) is the fourth most common primary liver tumor and is an important medical problem worldwide. However, current therapies cannot cure HCC, and despite numerous attempts and clinical trials, few targeted treatments for HCC have been approved. It is therefore necessary to identify additional treatment strategies to prevent the growth of HCC tumors. We are looking for a systematic drug repositioning bioinformatics method to identify new drug candidates for the treatment of HCC, which considers not only aberrant genomic information but also the changes in transcriptional landscapes. First, we screen the collection of HCC feature genes, i.e., kernel genes, which are frequently mutated in most HCC samples, based on human mutation data. Then, the gene expression data of HCC in TCGA are combined to classify the kernel genes of HCC. Finally, the therapeutic score (TS) of each drug is calculated based on the Kolmogorov-Smirnov statistical method. Using this strategy, we identify five drugs that are associated with HCC, including three drugs that could treat HCC and two drugs that might have side effects on HCC. In addition, we also perform Connectivity Map (CMap) profile similarity analysis and KEGG enrichment analysis on drug targets. All these findings suggest that our approach is effective for accurately predicting novel therapeutic options for HCC and can easily be extended to other tumors.
Affiliation(s)
- Liang Yu
- School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Fengdan Xu
- School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Lin Gao
- School of Computer Science and Technology, Xidian University, Xi'an, China
| |
|
34
|
Li Z, Huang Q, Chen X, Wang Y, Li J, Xie Y, Dai Z, Zou X. Identification of Drug-Disease Associations Using Information of Molecular Structures and Clinical Symptoms via Deep Convolutional Neural Network. Front Chem 2020; 7:924. [PMID: 31998700 PMCID: PMC6966717 DOI: 10.3389/fchem.2019.00924] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 07/29/2019] [Accepted: 12/18/2019] [Indexed: 02/02/2023] Open
Abstract
Identifying drug-disease associations is helpful not only for predicting new drug indications and recognizing lead compounds, but also for preventing, diagnosing, and treating diseases. Traditional experimental methods are time consuming, laborious and expensive. Therefore, it is urgent to develop a computational method for predicting potential drug-disease associations on a large scale. Herein, a novel method was proposed to identify drug-disease associations based on the deep learning technique. Molecular structure and clinical symptom information were used to characterize drugs and diseases. Then, a novel two-dimensional matrix was constructed and mapped to a gray-scale image for representing drug-disease association. Finally, a deep convolutional neural network was introduced to build a model for identifying potential drug-disease associations. The performance of the current method was evaluated based on the training set and test set, and accuracies of 89.90% and 86.51% were obtained. Prediction ability for recognizing new drug indications, lead compounds and true drug-disease associations was also investigated and verified by performing various experiments. Additionally, 3,620,516 potential drug-disease associations were identified and some of them were further validated through docking modeling. It is anticipated that the proposed method may be a powerful large scale virtual screening tool for drug research and development. The MATLAB source code is freely available on request from the authors.
Affiliation(s)
- Zhanchao Li
- School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Guangzhou, China; School of Chemistry, Sun Yat-Sen University, Guangzhou, China
| | - Qixing Huang
- School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Guangzhou, China
| | - Xingyu Chen
- School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Guangzhou, China
| | - Yang Wang
- Key Laboratory of Digital Quality Evaluation of Chinese Materia Medica of State Administration of Traditional Chinese Medicine, Guangzhou, China
| | - Jinlong Li
- School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Guangzhou, China
| | - Yun Xie
- School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Guangzhou, China
| | - Zong Dai
- Key Laboratory of Digital Quality Evaluation of Chinese Materia Medica of State Administration of Traditional Chinese Medicine, Guangzhou, China
| | - Xiaoyong Zou
- Key Laboratory of Digital Quality Evaluation of Chinese Materia Medica of State Administration of Traditional Chinese Medicine, Guangzhou, China
| |
|
35
|
Fan Y, Cui J, Zhu Q. Heterogeneous graph inference based on similarity network fusion for predicting lncRNA–miRNA interaction. RSC Adv 2020; 10:11634-11642. [PMID: 35496629 PMCID: PMC9050493 DOI: 10.1039/c9ra11043g] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/31/2019] [Accepted: 03/14/2020] [Indexed: 12/28/2022] Open
Abstract
LncRNA and miRNA are two non-coding RNA types that are popular in current research. LncRNA interacts with miRNA to regulate gene transcription, further affecting human health and disease. Accurate identification of lncRNA–miRNA interactions contributes to the in-depth study of the biological functions and mechanisms of non-coding RNA. However, relying on biological experiments to obtain interaction information is time-consuming and expensive. Considering the rapid accumulation of gene information and the few computational methods, it is urgent to supplement the effective computational models to predict lncRNA–miRNA interactions. In this work, we propose a heterogeneous graph inference method based on similarity network fusion (SNFHGILMI) to predict potential lncRNA–miRNA interactions. First, we calculated multiple similarity data, including lncRNA sequence similarity, miRNA sequence similarity, lncRNA Gaussian nuclear similarity, and miRNA Gaussian nuclear similarity. Second, the similarity network fusion method was employed to integrate the data and get the similarity network of lncRNA and miRNA. Then, we constructed a bipartite network by combining the known interaction network and similarity network of lncRNA and miRNA. Finally, the heterogeneous graph inference method was introduced to construct a prediction model. On the real dataset, the model SNFHGILMI achieved AUC of 0.9501 and 0.9426 ± 0.0035 based on LOOCV and 5-fold cross validation, respectively. Furthermore, case studies also demonstrate that SNFHGILMI is a high-performance prediction method that can accurately predict new lncRNA–miRNA interactions. The Matlab code and readme file of SNFHGILMI can be downloaded from https://github.com/cj-DaSE/SNFHGILMI.
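The "Gaussian nuclear similarity" named in the abstract above most likely refers to the Gaussian interaction profile kernel that is widely used in this literature; that reading is an assumption, since the abstract gives no formula. Under that assumption, the similarity of two lncRNAs is exp(-γ‖p_i − p_j‖²), where p_i is the binary vector of known miRNA partners of lncRNA i and γ is a bandwidth divided by the mean squared norm of all profiles. The sketch below shows the calculation; the function name, the default bandwidth of 1, and the toy profiles are hypothetical.

```python
import math

def gaussian_profile_similarity(profiles, gamma_prime=1.0):
    """Gaussian kernel similarity between binary interaction profiles.

    profiles[i] lists the known interactions (0/1) of entity i.
    Assumes at least one known interaction overall, otherwise the
    bandwidth normalisation would divide by zero.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in p) for p in profiles) / n
    gamma = gamma_prime / mean_sq_norm
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2
                          for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * sq_dist)
    return sim

# Hypothetical toy data: 3 lncRNAs x 3 miRNAs.
lnc_profiles = [[1, 1, 0],
                [1, 0, 0],
                [0, 0, 1]]
lnc_sim = gaussian_profile_similarity(lnc_profiles)
```

The resulting matrix is symmetric with ones on the diagonal, and lncRNAs with more overlapping partners score higher. The same function would apply to miRNA profiles by passing the transposed interaction matrix.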
Affiliation(s)
- Yongxian Fan
- School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
| | - Juan Cui
- School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
| | - QingQi Zhu
- School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
| |
|
36
|
Réda C, Kaufmann E, Delahaye-Duriez A. Machine learning applications in drug development. Comput Struct Biotechnol J 2019; 18:241-252. [PMID: 33489002 PMCID: PMC7790737 DOI: 10.1016/j.csbj.2019.12.006] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Received: 09/29/2019] [Revised: 12/10/2019] [Accepted: 12/10/2019] [Indexed: 02/07/2023] Open
Abstract
Due to the huge amount of biological and medical data available today, along with well-established machine learning algorithms, the design of largely automated drug development pipelines can now be envisioned. These pipelines may guide, or speed up, drug discovery; provide a better understanding of diseases and associated biological phenomena; help planning preclinical wet-lab experiments, and even future clinical trials. This automation of the drug development process might be key to the current issue of low productivity rate that pharmaceutical companies currently face. In this survey, we will particularly focus on two classes of methods: sequential learning and recommender systems, which are active biomedical fields of research.
Affiliation(s)
- Clémence Réda
- NeuroDiderot, UMR 1141, Inserm, Université de Paris, Sorbonne Paris Cité, Hôpital Robert Debré, 48, boulevard Sérurier, Paris 75019, France
- Université Paris Diderot, Université de Paris, Sorbonne Paris Cité, 5, rue Thomas Mann, Paris 75013, France
| | - Emilie Kaufmann
- Univ. Lille, CNRS, Centrale Lille, Inria, UMR 9189 - CRIStAL - Centre de Recherche en Informatique Signal et Automatique de Lille, F-59000 Lille, France
| | - Andrée Delahaye-Duriez
- NeuroDiderot, UMR 1141, Inserm, Université de Paris, Sorbonne Paris Cité, Hôpital Robert Debré, 48, boulevard Sérurier, Paris 75019, France
- Université Paris 13, Sorbonne Paris Cité, UFR de santé, médecine et biologie humaine, Bobigny 93000, France
- Service histologie-embryologie-cytogénétique-biologie de la reproduction-CECOS, Hôpital Jean Verdier, AP-HP, Bondy 93140, France
| |
|
37
|
Zhang W, Tang G, Zhou S, Niu Y. LncRNA-miRNA interaction prediction through sequence-derived linear neighborhood propagation method with information combination. BMC Genomics 2019; 20:946. [PMID: 31856716 PMCID: PMC6923828 DOI: 10.1186/s12864-019-6284-y] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Researchers discover lncRNAs can act as decoys or sponges to regulate the behavior of miRNAs. Identification of lncRNA-miRNA interactions helps to understand the functions of lncRNAs, especially their roles in complicated diseases. Computational methods can save time and reduce cost in identifying lncRNA-miRNA interactions, but there have been only a few computational methods. RESULTS In this paper, we propose a sequence-derived linear neighborhood propagation method (SLNPM) to predict lncRNA-miRNA interactions. First, we calculate the integrated lncRNA-lncRNA similarity and the integrated miRNA-miRNA similarity by combining known lncRNA-miRNA interactions, lncRNA sequences and miRNA sequences. We consider two similarity calculation strategies respectively, namely similarity-based information combination (SC) and interaction profile-based information combination (PC). Second, the integrated lncRNA similarity-based graph and the integrated miRNA similarity-based graph are respectively constructed, and the label propagation processes are implemented on two graphs to score lncRNA-miRNA pairs. Finally, the weighted averages of their outputs are adopted as final predictions. Therefore, we construct two editions of SLNPM: sequence-derived linear neighborhood propagation method based on similarity information combination (SLNPM-SC) and sequence-derived linear neighborhood propagation method based on interaction profile information combination (SLNPM-PC). The experimental results show that SLNPM-SC and SLNPM-PC predict lncRNA-miRNA interactions with higher accuracy compared with other state-of-the-art methods. The case studies demonstrate that SLNPM-SC and SLNPM-PC help to find novel lncRNA-miRNA interactions for given lncRNAs or miRNAs. CONCLUSION The study reveals that known interactions bring the most important information for lncRNA-miRNA interaction prediction, and sequences of lncRNAs (miRNAs) also provide useful information. 
In conclusion, SLNPM-SC and SLNPM-PC are promising for lncRNA-miRNA interaction prediction.
Affiliation(s)
- Wen Zhang
- College of informatics, Huazhong Agricultural University, Wuhan, 430070 China
| | - Guifeng Tang
- School of Computer Science, Wuhan University, Wuhan, 430072 China
| | - Shuang Zhou
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
| | - Yanqing Niu
- School of Mathematics and Statistics, South-Central University for Nationalities, Wuhan, 430074 China
| |
|
38
|
Liu R, Zhang P. Towards early detection of adverse drug reactions: combining pre-clinical drug structures and post-market safety reports. BMC Med Inform Decis Mak 2019; 19:279. [PMID: 31849321 PMCID: PMC6918608 DOI: 10.1186/s12911-019-0999-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 07/01/2019] [Accepted: 12/04/2019] [Indexed: 01/10/2023] Open
Abstract
Background Adverse drug reaction (ADR) is a major burden for patients and healthcare industry. Early and accurate detection of potential ADRs can help to improve drug safety and reduce financial costs. Post-market spontaneous reports of ADRs remain a cornerstone of pharmacovigilance and a series of drug safety signal detection methods play an important role in providing drug safety insights. However, existing methods require sufficient case reports to generate signals, limiting their usages for newly approved drugs with few (or even no) reports. Methods In this study, we propose a label propagation framework to enhance drug safety signals by combining drug chemical structures with FDA Adverse Event Reporting System (FAERS). First, we compute original drug safety signals via common signal detection algorithms. Then, we construct a drug similarity network based on chemical structures. Finally, we generate enhanced drug safety signals by propagating original signals on the drug similarity network. Our proposed framework enriches post-market safety reports with pre-clinical drug similarity network, effectively alleviating issues of insufficient cases for newly approved drugs. Results We apply the label propagation framework to four popular signal detection algorithms (PRR, ROR, MGPS, BCPNN) and find that our proposed framework generates more accurate drug safety signals than the corresponding baselines. In addition, our framework identifies potential ADRs for newly approved drugs, thus paving the way for early detection of ADRs. Conclusions The proposed label propagation framework combines pre-clinical drug structures with post-market safety reports, generates enhanced drug safety signals, and can potentially help to accurately detect ADRs ahead of time. Availability The source code for this paper is available at: https://github.com/ruoqi-liu/LP-SDA.
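The three steps in the Methods above (compute an original signal, build a drug similarity network, propagate the signal over it) can be illustrated with a small sketch. The proportional reporting ratio (PRR) follows its standard 2x2 contingency-table definition. The propagation rule, the damping factor `alpha`, the iteration count, and the toy similarity values are assumptions made for illustration; the abstract does not specify them, and the authors' code at the linked repository may differ.

```python
def prr(a, b, c, d):
    """Proportional reporting ratio from a 2x2 contingency table.

    a: reports with the drug and the event
    b: reports with the drug, without the event
    c: reports with other drugs and the event
    d: reports with other drugs, without the event
    """
    return (a / (a + b)) / (c / (c + d))

def propagate(signals, similarity, alpha=0.5, iterations=50):
    """Iterate scores = alpha * W @ scores + (1 - alpha) * signals,
    where W is the row-normalised similarity matrix (assumed rule)."""
    n = len(signals)
    weights = []
    for row in similarity:
        total = sum(row)
        weights.append([w / total if total else 0.0 for w in row])
    scores = list(signals)
    for _ in range(iterations):
        scores = [alpha * sum(weights[i][j] * scores[j] for j in range(n))
                  + (1 - alpha) * signals[i]
                  for i in range(n)]
    return scores

# Hypothetical toy network: drug 1 is structurally close to drug 0,
# drug 2 is not. Only drug 0 has enough reports for a signal.
similarity = [[0.0, 0.9, 0.1],
              [0.9, 0.0, 0.1],
              [0.1, 0.1, 0.0]]
original = [prr(20, 80, 100, 900), 0.0, 0.0]
enhanced = propagate(original, similarity)
```

After propagation, the two drugs without reports both receive a non-zero score, and the one that is structurally closer to the reported drug receives the larger share, which is the behaviour the abstract describes for newly approved drugs with few or no reports.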
Affiliation(s)
- Ruoqi Liu
- Department of Computer Science and Engineering, The Ohio State University, 2015 Neil Ave, Columbus, 43210, Ohio, USA
| | - Ping Zhang
- Department of Computer Science and Engineering, The Ohio State University, 2015 Neil Ave, Columbus, 43210, Ohio, USA. .,Department of Biomedical Informatics, The Ohio State University, 1800 Cannon Drive, Columbus, 43210, Ohio, USA.
| |
Collapse
|
39
|
Chen X, Shi W, Deng L. Prediction of Disease Comorbidity Using HeteSim Scores based on Multiple Heterogeneous Networks. Curr Gene Ther 2019; 19:232-241. [DOI: 10.2174/1566523219666190917155959] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2019] [Revised: 06/14/2019] [Accepted: 06/16/2019] [Indexed: 12/25/2022]
Abstract
Background:
Accumulating experimental studies have indicated that disease comorbidity causes additional pain to patients and leads to the failure of standard treatments compared to patients who have a single disease. Therefore, accurate prediction of potential comorbidity is essential to design more efficient treatment strategies. However, only a few disease comorbidities have been discovered in the clinic.
Objective:
In this work, we propose PCHS, an effective computational method for predicting disease comorbidity.
Materials and Methods:
We utilized the HeteSim measure to calculate the relatedness score for different disease pairs in the global heterogeneous network, which integrates six networks based on biological information, including disease-disease associations, drug-drug interactions, protein-protein interactions and associations among them. We built the prediction model using the Support Vector Machine (SVM) based on the HeteSim scores.
Results and Conclusion:
The results showed that PCHS performed significantly better than previous state-of-the-art approaches and achieved an AUC score of 0.90 in 10-fold cross-validation. Furthermore, some of our predictions have been verified in the literature, indicating the effectiveness of our method.
Affiliation(s)
- Xuegong Chen
- School of Computer Science and Engineering, Central South University, Changsha, 410075, China
- Wanwan Shi
- School of Computer Science and Engineering, Central South University, Changsha, 410075, China
- Lei Deng
- School of Computer Science and Engineering, Central South University, Changsha, 410075, China
40
Li S, Xie M, Liu X. A Novel Approach Based on Bipartite Network Recommendation and KATZ Model to Predict Potential Micro-Disease Associations. Front Genet 2019; 10:1147. [PMID: 31803235 PMCID: PMC6873782 DOI: 10.3389/fgene.2019.01147] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 07/19/2019] [Accepted: 10/21/2019] [Indexed: 12/24/2022] Open
Abstract
Accumulating evidence indicates that the microbes colonizing human bodies have crucial effects on human health, and the discovery of disease-related microbes will promote the discovery of biomarkers and drugs for the prevention, diagnosis, treatment, and prognosis of diseases. However, clinical experiments on disease-microbe associations are time-consuming, laborious and expensive, and there are few methods for predicting potential microbe-disease associations. Therefore, developing effective computational models that use the accumulated public data of clinically validated microbe-disease associations to identify novel disease-microbe associations is of practical importance. We propose a novel method based on the KATZ model and Bipartite Network Recommendation Algorithm (KATZBNRA) to discover potential associations between microbes and diseases. We calculate the Gaussian interaction profile kernel similarity of diseases and microbes based on validated disease-microbe associations. Then, we construct a bipartite graph and execute a bipartite network recommendation algorithm. Finally, we integrate the disease similarity, microbe similarity and bipartite network recommendation score to obtain the final score, which is used to infer novel disease-microbe interactions. To evaluate the predictive power of KATZBNRA, we tested it with walk length 2 using global leave-one-out cross validation (LOOCV), two-fold and five-fold cross validations, with AUCs of 0.9098, 0.8463 and 0.8969, respectively. The test results also show that KATZBNRA is more accurate than two recent similar methods, KATZHMDA and BNPMDA.
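The Gaussian interaction profile kernel similarity mentioned above can be sketched as follows. The abstract does not spell out the formula, so this uses the commonly used definition K(i, j) = exp(-gamma * ||p_i - p_j||^2), with gamma equal to a bandwidth parameter divided by the mean squared norm of the profiles. The function name, the bandwidth default of 1.0, and the toy association matrix are assumptions for illustration only.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of a 0/1 matrix.

    K[i][j] = exp(-gamma * ||p_i - p_j||^2), where gamma is gamma_prime
    divided by the mean squared norm of the rows. Assumes at least one
    non-zero entry, otherwise the normalization would divide by zero.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * dist_sq)
    return kernel


# Toy association matrix (hypothetical): rows are diseases, columns are microbes.
associations = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
disease_similarity = gip_kernel(associations)
# Microbe similarity uses the transposed matrix, so columns become profiles.
microbe_similarity = gip_kernel([list(col) for col in zip(*associations)])
```

Diseases with identical association profiles get similarity 1, and similarity decays as the profiles diverge. How KATZBNRA combines these similarities with the recommendation score is not detailed in the abstract and is not modeled here.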
Affiliation(s)
- Shiru Li
- College of Information Science and Engineering, Hunan Normal University, Changsha, China
- Minzhu Xie
- College of Information Science and Engineering, Hunan Normal University, Changsha, China
- Xinqiu Liu
- Hunan Vocational College of Engineering, Changsha, China
41
Yi HC, You ZH, Guo ZH. Construction and Analysis of Molecular Association Network by Combining Behavior Representation and Node Attributes. Front Genet 2019; 10:1106. [PMID: 31788002 PMCID: PMC6854842 DOI: 10.3389/fgene.2019.01106] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 08/08/2019] [Accepted: 10/15/2019] [Indexed: 11/13/2022] Open
Abstract
A key aim of post-genomic biomedical research is to systematically understand and model complex biomolecular activities. Biomolecular interactions are widespread and interrelated: multiple biomolecules coordinate to sustain life activities, and any disturbance of these complex connections can lead to abnormal life activities or complex diseases. However, many existing studies focus only on individual intermolecular interactions. In this work, we revealed, constructed, and analyzed a large-scale molecular association network of multiple biomolecules in human by integrating associations among lncRNAs, miRNAs, proteins, drugs, and diseases, in which various associations are interconnected and any type of association can be predicted. We propose Molecular Association Network (MAN)–High-Order Proximity preserved Embedding (HOPE), a novel network representation learning based method to fully exploit latent features of biomolecules to accurately predict associations between molecules. More specifically, the network representation learning algorithm HOPE was applied to learn behavior features of nodes in the association network. Attribute features of nodes were also adopted. Then, a machine learning model, CatBoost, was trained to predict potential associations between any nodes. The performance of our method was evaluated under five-fold cross validation. A case study to predict miRNA-disease associations was also conducted to verify the prediction capability. MAN-HOPE achieves a high accuracy of 93.3% and an area under the receiver operating characteristic curve of 0.9793. The experimental results demonstrate the novelty of our systematic understanding of the intermolecular associations, and enable systematic exploration of the landscape of molecular interactions that shape specialized cellular functions.
Affiliation(s)
- Hai-Cheng Yi
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, China; University of Chinese Academy of Sciences, Beijing, China
- Zhu-Hong You
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, China
- Zhen-Hao Guo
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, China
42
Gong Y, Niu Y, Zhang W, Li X. A network embedding-based multiple information integration method for the MiRNA-disease association prediction. BMC Bioinformatics 2019; 20:468. [PMID: 31510919 PMCID: PMC6740005 DOI: 10.1186/s12859-019-3063-3] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 05/17/2019] [Accepted: 08/29/2019] [Indexed: 12/26/2022] Open
Abstract
BACKGROUND MiRNAs play significant roles in many fundamental and important biological processes, and predicting potential miRNA-disease associations makes contributions to understanding the molecular mechanism of human diseases. Existing state-of-the-art methods make use of miRNA-target associations, miRNA-family associations, miRNA functional similarity, disease semantic similarity and known miRNA-disease associations, but the known miRNA-disease associations are not well exploited. RESULTS In this paper, a network embedding-based multiple information integration method (NEMII) is proposed for the miRNA-disease association prediction. First, known miRNA-disease associations are formulated as a bipartite network, and the network embedding method Structural Deep Network Embedding (SDNE) is adopted to learn embeddings of nodes in the bipartite network. Second, the embedding representations of miRNAs and diseases are combined with biological features about miRNAs and diseases (miRNA-family associations and disease semantic similarities) to represent miRNA-disease pairs. Third, the prediction models are constructed based on the miRNA-disease pairs by using the random forest. In computational experiments, NEMII achieves high-accuracy performances and outperforms other state-of-the-art methods: GRNMF, NTSMDA and PBMDA. The usefulness of NEMII is further validated by case studies. The studies demonstrate the great potential of network embedding method for the miRNA-disease association prediction, and SDNE outperforms other popular network embedding methods: DeepWalk, High-Order Proximity preserved Embedding (HOPE) and Laplacian Eigenmaps (LE). CONCLUSION We propose a new method, named NEMII, for predicting miRNA-disease associations, which has great potential to benefit the field of miRNA-disease association prediction.
Affiliation(s)
- Yuchong Gong
- School of Computer Science, Wuhan University, Wuhan, 430072 China
- Yanqing Niu
- School of Mathematics and Statistics, South-Central University for Nationalities, Wuhan, 430074 China
- Wen Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070 China
- Xiaohong Li
- School of Computer Science, Wuhan University, Wuhan, 430072 China
43
Identification of amyloidogenic peptides via optimized integrated features space based on physicochemical properties and PSSM. Anal Biochem 2019; 583:113362. [PMID: 31310738 DOI: 10.1016/j.ab.2019.113362] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/16/2019] [Revised: 07/09/2019] [Accepted: 07/12/2019] [Indexed: 01/08/2023]
Abstract
At present, the identification of amyloid is becoming increasingly essential and meaningful because its mis-aggregation may cause diseases such as Alzheimer's and Parkinson's diseases. This paper focuses on the classification of amyloidogenic peptides and proposes a novel feature representation called PhyAve_PSSMDwt. It includes two parts. One is based on physicochemical properties involving hydrophilicity, hydrophobicity, aggregation tendency, packing density and H-bonding, from which 15-dimensional features are extracted in total. The other consists of 60-dimensional features obtained through recursive feature elimination from PSSM by discrete wavelet transform. In this step, a sliding window is introduced to reconstruct the PSSM so that the evolutionary information of short sequences can still be extracted. Finally, a support vector machine is adopted as the classifier. The experimental result on the Pep424 dataset shows that the PSSM information contributes greatly to performance. Compared with other existing methods, our results after cross-validation increase by 3.1%, 3.3%, 0.136 and 0.007 in accuracy, specificity, Matthews correlation coefficient and AUC value, respectively, indicating that our method is effective and competitive.
44
Su R, Wu H, Xu B, Liu X, Wei L. Developing a Multi-Dose Computational Model for Drug-Induced Hepatotoxicity Prediction Based on Toxicogenomics Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:1231-1239. [PMID: 30040651 DOI: 10.1109/tcbb.2018.2858756] [Citation(s) in RCA: 85] [Impact Index Per Article: 17.0] [Indexed: 06/08/2023]
Abstract
Drug-induced hepatotoxicity may cause acute and chronic liver disease, leading to great concern for patient safety. It is also one of the main reasons for drug withdrawal from the market. Toxicogenomics data have been widely used in hepatotoxicity prediction. In our study, we proposed a multi-dose computational model to predict drug-induced hepatotoxicity based on gene expression and toxicity data. The dose/concentration information after drug treatment is fully utilized based on the dose-response curve, so that a more informative representation of the dose-response relationship is considered. We also proposed a new feature selection method, named MEMO, an important component of our multi-dose model, to deal with the high-dimensional toxicogenomics data. We validated the proposed model using TG-GATEs, a large database recording toxicogenomics data from multiple views. The experimental results show that drug-induced hepatotoxicity can be predicted with high accuracy and efficiency using the proposed predictive model.
45
Xiao Q, Dai J, Luo J, Fujita H. Multi-view manifold regularized learning-based method for prioritizing candidate disease miRNAs. Knowl Based Syst 2019. [DOI: 10.1016/j.knosys.2019.03.023] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Indexed: 01/05/2023]
46
Gao YC, Zhou XH, Zhang W. An Ensemble Strategy to Predict Prognosis in Ovarian Cancer Based on Gene Modules. Front Genet 2019; 10:366. [PMID: 31068972 PMCID: PMC6491874 DOI: 10.3389/fgene.2019.00366] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 01/11/2019] [Accepted: 04/05/2019] [Indexed: 12/15/2022] Open
Abstract
Due to the high heterogeneity and complexity of cancer, it is still a challenge to predict the prognosis of cancer patients. In this work, we used a clustering algorithm to divide patients into different subtypes in order to reduce the heterogeneity of the cancer patients in each subtype. Based on the hypothesis that the gene co-expression network may reveal relationships among genes, some communities in the network could influence the prognosis of cancer patients and all the prognosis-related communities could fully reveal the prognosis of cancer patients. To predict the prognosis for cancer patients in each subtype, we adopted an ensemble classifier based on the gene co-expression network of the corresponding subtype. Using the gene expression data of ovarian cancer patients in TCGA (The Cancer Genome Atlas), three subtypes were identified. Survival analysis showed that patients in different subtypes had different survival risks. Three ensemble classifiers were constructed for each subtype. Leave-one-out and independent validation showed that our method outperformed control and literature methods. Furthermore, the function annotation of the communities in each subtype showed that some communities were cancer-related. Finally, we found that the current drug targets can partially support our method.
Affiliation(s)
- Xiong-Hui Zhou
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
- Wen Zhang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
47
Wu G, Liu J, Yue X. Prediction of drug-disease associations based on ensemble meta paths and singular value decomposition. BMC Bioinformatics 2019; 20:134. [PMID: 30925858 PMCID: PMC6439991 DOI: 10.1186/s12859-019-2644-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 11/12/2022] Open
Abstract
Background In the field of drug repositioning, it is assumed that similar drugs may treat similar diseases; therefore, many existing computational methods need to compute the similarities of drugs and diseases. However, the calculation of similarity depends on the adopted measure and the available features, which may cause the similarity scores to vary dramatically from one measure to another, and it does not work with incomplete data. Besides, supervised learning based methods usually need both positive and negative samples to train the prediction models, whereas in drug-disease pair data there are only some verified interactions (positive samples) and many unlabeled pairs. To train the models, many methods simply treat the unlabeled samples as negative ones, which may introduce artificial noise. Herein, we propose a method that predicts drug-disease associations without the need for similarity information and selects more likely negative samples. Results In the proposed EMP-SVD (Ensemble Meta Paths and Singular Value Decomposition), we introduce five meta paths corresponding to different kinds of interaction data, and for each meta path we generate a commuting matrix. Every matrix is factorized by SVD into two low-rank matrices, which are used as the latent features of drugs and diseases, respectively. The features are combined to represent drug-disease pairs. We build a base classifier via Random Forest for each meta path, and the five base classifiers are combined into the final ensemble classifier. To train a more reliable prediction model, we select more likely negative samples from the unlabeled ones under the assumption that non-associated drug-disease pairs have no common interacting proteins. The experiments have shown that the proposed EMP-SVD method outperforms several state-of-the-art approaches. Case studies by literature investigation have found that the proposed EMP-SVD can mine many drug-disease associations, which implies the practicality of EMP-SVD. Conclusions The proposed EMP-SVD can integrate the interaction data among drugs, proteins and diseases, and predict drug-disease associations without the need for similarity information. At the same time, the strategy of selecting more reliable negative samples benefits the prediction.
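The negative-sample selection rule stated in the Results (an unlabeled drug-disease pair is treated as a likely negative when the drug and the disease share no interacting protein) can be sketched in a few lines. This is only an illustration of that stated assumption: the function name, the dictionary-of-sets data layout, and the toy identifiers are hypothetical, and the meta-path, SVD, and Random Forest steps of EMP-SVD are not modeled.

```python
def reliable_negatives(drug_proteins, disease_proteins, known_pairs):
    """Return unlabeled (drug, disease) pairs that share no interacting protein."""
    negatives = []
    for drug, drug_set in drug_proteins.items():
        for disease, disease_set in disease_proteins.items():
            if (drug, disease) in known_pairs:
                continue  # verified association: a positive sample
            if not (drug_set & disease_set):
                negatives.append((drug, disease))
    return sorted(negatives)


# Toy data (hypothetical identifiers).
drug_proteins = {"drugA": {"P1", "P2"}, "drugB": {"P3"}}
disease_proteins = {"dis1": {"P2"}, "dis2": {"P3", "P4"}}
known_pairs = {("drugA", "dis1")}

negatives = reliable_negatives(drug_proteins, disease_proteins, known_pairs)
```

Here the pair (drugB, dis2) is unlabeled but shares protein P3, so it stays unlabeled instead of being used as a negative, while the two pairs with no shared protein are selected.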
Affiliation(s)
- Guangsheng Wu
- School of Computer Science, Wuhan University, Wuhan, 430072, People's Republic of China
- Juan Liu
- School of Computer Science, Wuhan University, Wuhan, 430072, People's Republic of China; Suzhou Institute of Wuhan University, Suzhou, 215123, People's Republic of China
- Xiang Yue
- School of Computer Science, Wuhan University, Wuhan, 430072, People's Republic of China; Department of Computer Science and Engineering, The Ohio State University, Ohio, 43210, USA
48
Yu LH, Huang QW, Zhou XH. Identification of Cancer Hallmarks Based on the Gene Co-expression Networks of Seven Cancers. Front Genet 2019; 10:99. [PMID: 30838028 PMCID: PMC6389798 DOI: 10.3389/fgene.2019.00099] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/08/2018] [Accepted: 01/29/2019] [Indexed: 12/20/2022] Open
Abstract
Identifying the hallmarks of cancer is essential for cancer research, and the genes involved in cancer hallmarks are likely to be cancer drivers. However, there is no appropriate method in the current literature for identifying genetic cancer hallmarks, especially considering the interrelationships among the genes. Here, we hypothesized that "dense clusters" (or "communities") in the gene co-expression networks of cancer patients may represent functional units regarding cancer formation and progression, and the communities present in the co-expression networks of multiple types of cancer may be cancer hallmarks. Consequently, we mined the conserved communities in the gene co-expression networks of seven cancers in order to identify candidate hallmarks. Functional annotation of the communities showed that they were mainly related to immune response, the cell cycle and the biological processes that maintain basic cellular functions. Survival analysis using the genes involved in the conserved communities verified that two of these hallmarks could predict the survival risks of cancer patients in multiple types of cancer. Furthermore, the genes involved in these hallmarks, one of which was related to the cell cycle, could be useful in screening for cancer drugs.
Affiliation(s)
- Ling-Hao Yu
- College of Science, Huazhong Agricultural University, Wuhan, China
- Qin-Wei Huang
- College of Science, Huazhong Agricultural University, Wuhan, China
- Xiong-Hui Zhou
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
49
Zhu R, Li G, Liu JX, Dai LY, Guo Y. ACCBN: ant-Colony-clustering-based bipartite network method for predicting long non-coding RNA-protein interactions. BMC Bioinformatics 2019; 20:16. [PMID: 30626319 PMCID: PMC6327428 DOI: 10.1186/s12859-018-2586-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 07/17/2018] [Accepted: 12/17/2018] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND Long non-coding RNAs (lncRNAs) play an important role in the development, invasion, and metastasis of tumors. The analysis and screening of the differential expression of lncRNAs in cancer and corresponding paracancerous tissues provide new clues for finding new cancer diagnostic indicators and improving treatment. Predicting lncRNA-protein interactions is very important in the analysis of lncRNAs. This article proposes an Ant-Colony-Clustering-Based Bipartite Network (ACCBN) method, which combines ant colony clustering and bipartite network inference to predict lncRNA-protein interactions. RESULTS Five-fold cross-validation was used in the experimental test. Comparing the predictive ability of ACCBN with that of the RWR, ProCF, LPIHN, and LPBNI methods shows that the evaluation indicators of ACCBN on the test set are significantly better. CONCLUSIONS With the continuous development of biology, besides the research on cellular processes, research on the interactions between proteins has become a new key topic in biology. Studies on protein-protein interactions have important implications for bioinformatics, clinical medicine, and pharmacology. However, there are many kinds of proteins, and their interactions are complicated. Moreover, interactions take time to confirm experimentally because they are difficult to estimate. Therefore, a viable solution is to predict protein-protein interactions efficiently with computers. The ACCBN method performs well in the prediction of protein-protein interactions in terms of sensitivity, precision, accuracy, and F1-score.
Affiliation(s)
- Rong Zhu
- School of Information Science and Engineering, Central South University, Changsha, 410083, China; School of Information Science and Engineering, Qufu Normal University, Rizhao, 276826, China
- Guangshun Li
- School of Information Science and Engineering, Qufu Normal University, Rizhao, 276826, China
- Jin-Xing Liu
- School of Information Science and Engineering, Qufu Normal University, Rizhao, 276826, China
- Ling-Yun Dai
- School of Information Science and Engineering, Qufu Normal University, Rizhao, 276826, China
- Ying Guo
- School of Information Science and Engineering, Central South University, Changsha, 410083, China
50
Tang G, Shi J, Wu W, Yue X, Zhang W. Sequence-based bacterial small RNAs prediction using ensemble learning strategies. BMC Bioinformatics 2018; 19:503. [PMID: 30577759 PMCID: PMC6302447 DOI: 10.1186/s12859-018-2535-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 01/08/2023] Open
Abstract
Background Bacterial small non-coding RNAs (sRNAs) have emerged as important elements in diverse physiological processes, including growth, development, cell proliferation, differentiation, metabolic reactions and carbon metabolism, and attract great attention. Accurate prediction of sRNAs is important and challenging, and helps to explore the functions and mechanisms of sRNAs. Results In this paper, we utilize a variety of sRNA sequence-derived features to develop ensemble learning methods for sRNA prediction. First, we compile a balanced dataset and four imbalanced datasets. Then, we investigate various sRNA sequence-derived features, such as spectrum profile, mismatch profile, reverse complement k-mer and pseudo nucleotide composition. Finally, we consider two ensemble learning strategies to integrate all features for building ensemble learning models for sRNA prediction. One is the weighted average ensemble method (WAEM), which uses the linear weighted sum of outputs from the individual feature-based predictors to predict sRNAs. The other is the neural network ensemble method (NNEM), which trains a deep neural network by combining diverse features. In the computational experiments, we evaluate our methods on these five datasets by using 5-fold cross validation. WAEM and NNEM can produce better results than existing state-of-the-art sRNA prediction methods. Conclusions WAEM and NNEM have great potential for sRNA prediction, and are helpful for understanding the biological mechanism of bacteria.
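The weighted average ensemble described above, a linear weighted sum of the outputs of individual feature-based predictors, can be illustrated with a short sketch. The abstract does not say how the weights are chosen, so the weights below are hypothetical, as are the function name and the toy scores. Normalizing the weights to sum to 1 is also an assumption made here so that combined scores stay on the same scale as the inputs.

```python
def weighted_average(scores_per_predictor, weights):
    """Combine per-predictor scores with a linear weighted sum.

    scores_per_predictor: one list of scores per predictor, aligned by sample.
    weights: one non-negative weight per predictor (normalized to sum to 1).
    """
    total = sum(weights)
    normalized = [w / total for w in weights]
    n_samples = len(scores_per_predictor[0])
    return [
        sum(w * scores[i] for w, scores in zip(normalized, scores_per_predictor))
        for i in range(n_samples)
    ]


# Toy scores (hypothetical): three feature-based predictors, two sequences.
scores = [
    [0.9, 0.2],  # e.g. a predictor trained on spectrum profile features
    [0.7, 0.4],  # e.g. a predictor trained on mismatch profile features
    [0.6, 0.1],  # e.g. a predictor trained on pseudo nucleotide composition
]
combined = weighted_average(scores, weights=[0.5, 0.3, 0.2])
```

A sequence would then be called an sRNA when its combined score exceeds a chosen threshold, which is likewise a choice not specified in the abstract.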
Affiliation(s)
- Guifeng Tang
- School of Computer Science, Wuhan University, Wuhan, 430072, China
- Jingwen Shi
- School of Mathematics and Statistics, Wuhan University, Wuhan, 430072, China
- Wenjian Wu
- Electronic Information School, Wuhan University, Wuhan, 430072, China
- Xiang Yue
- Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, 43210, USA
- Wen Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China