1
|
Zhou B, Ran B, Chen L. A GraphSAGE-based model with fingerprints only to predict drug-drug interactions. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2024; 21:2922-2942. [PMID: 38454713 DOI: 10.3934/mbe.2024130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/09/2024]
Abstract
Drugs are an effective way to treat various diseases. Some diseases are so complicated that the effect of a single drug for such diseases is limited, which has led to the emergence of combination drug therapy. The use multiple drugs to treat these diseases can improve the drug efficacy, but it can also bring adverse effects. Thus, it is essential to determine drug-drug interactions (DDIs). Recently, deep learning algorithms have become popular to design DDI prediction models. However, most deep learning-based models need several types of drug properties, inducing the application problems for drugs without these properties. In this study, a new deep learning-based model was designed to predict DDIs. For wide applications, drugs were first represented by commonly used properties, referred to as fingerprint features. Then, these features were perfectly fused with the drug interaction network by a type of graph convolutional network method, GraphSAGE, yielding high-level drug features. The inner product was adopted to score the strength of drug pairs. The model was evaluated by 10-fold cross-validation, resulting in an AUROC of 0.9704 and AUPR of 0.9727. Such performance was better than the previous model which directly used drug fingerprint features and was competitive compared with some other previous models that used more drug properties. Furthermore, the ablation tests indicated the importance of the main parts of the model, and we analyzed the strengths and limitations of a model for drugs with different degrees in the network. This model identified some novel DDIs that may bring expected benefits, such as the combination of PEA and cannabinol that may produce better effects. DDIs that may cause unexpected side effects have also been discovered, such as the combined use of WIN 55,212-2 and cannabinol. These DDIs can provide novel insights for treating complex diseases or avoiding adverse drug events.
Collapse
Affiliation(s)
- Bo Zhou
- Institute of Wound Prevention and Treatment, Shanghai University of Medicine and Health Sciences, Shanghai 201318, China
- School of Basic Medical Sciences, Shanghai University of Medicine and Health Sciences, Shanghai 201318, China
| | - Bing Ran
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| |
Collapse
|
2
|
Zhang Y, Chu Y, Lin S, Xiong Y, Wei DQ. ReHoGCNES-MDA: prediction of miRNA-disease associations using homogenous graph convolutional networks based on regular graph with random edge sampler. Brief Bioinform 2024; 25:bbae103. [PMID: 38517693 PMCID: PMC10959163 DOI: 10.1093/bib/bbae103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2023] [Revised: 02/04/2024] [Accepted: 02/23/2024] [Indexed: 03/24/2024] Open
Abstract
Numerous investigations increasingly indicate the significance of microRNA (miRNA) in human diseases. Hence, unearthing associations between miRNA and diseases can contribute to precise diagnosis and efficacious remediation of medical conditions. The detection of miRNA-disease linkages via computational techniques utilizing biological information has emerged as a cost-effective and highly efficient approach. Here, we introduced a computational framework named ReHoGCNES, designed for prospective miRNA-disease association prediction (ReHoGCNES-MDA). This method constructs homogenous graph convolutional network with regular graph structure (ReHoGCN) encompassing disease similarity network, miRNA similarity network and known MDA network and then was tested on four experimental tasks. A random edge sampler strategy was utilized to expedite processes and diminish training complexity. Experimental results demonstrate that the proposed ReHoGCNES-MDA method outperforms both homogenous graph convolutional network and heterogeneous graph convolutional network with non-regular graph structure in all four tasks, which implicitly reveals steadily degree distribution of a graph does play an important role in enhancement of model performance. Besides, ReHoGCNES-MDA is superior to several machine learning algorithms and state-of-the-art methods on the MDA prediction. Furthermore, three case studies were conducted to further demonstrate the predictive ability of ReHoGCNES. Consequently, 93.3% (breast neoplasms), 90% (prostate neoplasms) and 93.3% (prostate neoplasms) of the top 30 forecasted miRNAs were validated by public databases. Hence, ReHoGCNES-MDA might serve as a dependable and beneficial model for predicting possible MDAs.
Collapse
Affiliation(s)
- Yufang Zhang
- School of Mathematical Sciences and SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai 200240, China
- Peng Cheng Laboratory, Shenzhen, Guangdong 518055, China
- Zhongjing Research and Industrialization Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nanyang, Henan, 473006, China
| | - Yanyi Chu
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, 94305, USA
| | - Shenggeng Lin
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yi Xiong
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai Jiao Tong University, Shanghai 200240, China
- Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
| | - Dong-Qing Wei
- Peng Cheng Laboratory, Shenzhen, Guangdong 518055, China
- Zhongjing Research and Industrialization Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nanyang, Henan, 473006, China
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai Jiao Tong University, Shanghai 200240, China
| |
Collapse
|
3
|
Li G, Zeng F, Luo J, Liang C, Xiao Q. MNCLCDA: predicting circRNA-drug sensitivity associations by using mixed neighbourhood information and contrastive learning. BMC Med Inform Decis Mak 2023; 23:291. [PMID: 38110886 PMCID: PMC10729363 DOI: 10.1186/s12911-023-02384-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2023] [Accepted: 12/01/2023] [Indexed: 12/20/2023] Open
Abstract
BACKGROUND circRNAs play an important role in drug resistance and cancer development. Recently, many studies have shown that the expressions of circRNAs in human cells can affect the sensitivity of cells to therapeutic drugs, thus significantly influencing the therapeutic effects of these drugs. Traditional biomedical experiments required to verify this sensitivity relationship are not only time-consuming but also expensive. Hence, the development of an efficient computational approach that can accurately predict the novel associations between drug sensitivities and circRNAs is a crucial and pressing need. METHODS In this research, we present a novel computational framework called MNCLCDA, which aims to predict the potential associations between drug sensitivities and circRNAs to assist with medical research. First, MNCLCDA quantifies the similarity between the given drug and circRNA using drug structure information, circRNA gene sequence information, and GIP kernel information. Due to the existence of noise in similarity information, we employ a preprocessing approach based on random walk with restart for similarity networks to efficiently capture the useful features of circRNAs and drugs. Second, we use a mixed neighbourhood graph convolutional network to obtain the neighbourhood information of nodes. Then, a graph-based contrastive learning method is used to enhance the robustness of the model, and finally, a double Laplace-regularized least-squares method is used to predict potential circRNA-drug associations through the kernel matrices in the circRNA and drug spaces. RESULTS Numerous experimental results show that MNCLCDA outperforms six other advanced methods. In addition, the excellent performance of our proposed model in case studies illustrates that MNCLCDA also has the ability to predict the associations between drug sensitivity and circRNA in practical situations. CONCLUSIONS After a large number of experiments, it is illustrated that MNCLCDA is an efficient tool for predicting the potential associations between drug sensitivities and circRNAs, thereby can provide some guidance for clinical trials.
Collapse
Affiliation(s)
- Guanghui Li
- School of Information Engineering, East China Jiaotong University, Nanchang, China.
| | - Feifan Zeng
- School of Information Engineering, East China Jiaotong University, Nanchang, China
| | - Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China.
| | - Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
| | - Qiu Xiao
- College of Information Science and Engineering, Hunan Normal University, Changsha, China
| |
Collapse
|
4
|
Recent Deep Learning Methodology Development for RNA–RNA Interaction Prediction. Symmetry (Basel) 2022. [DOI: 10.3390/sym14071302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023] Open
Abstract
Genetic regulation of organisms involves complicated RNA–RNA interactions (RRIs) among messenger RNA (mRNA), microRNA (miRNA), and long non-coding RNA (lncRNA). Detecting RRIs is beneficial for discovering biological mechanisms as well as designing new drugs. In recent years, with more and more experimentally verified RNA–RNA interactions being deposited into databases, statistical machine learning, especially recent deep-learning-based automatic algorithms, have been widely applied to RRI prediction with remarkable success. This paper first gives a brief introduction to the traditional machine learning methods applied on RRI prediction and benchmark databases for training the models, and then provides a recent methodology overview of deep learning models in the prediction of microRNA (miRNA)–mRNA interactions and long non-coding RNA (lncRNA)–miRNA interactions.
Collapse
|
5
|
Yan C, Duan G, Li N, Zhang L, Wu FX, Wang J. PDMDA: predicting deep-level miRNA-disease associations with graph neural networks and sequence features. Bioinformatics 2022; 38:2226-2234. [PMID: 35150255 DOI: 10.1093/bioinformatics/btac077] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2021] [Revised: 01/18/2022] [Accepted: 02/05/2022] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Many studies have shown that microRNAs (miRNAs) play a key role in human diseases. Meanwhile, traditional experimental methods for miRNA-disease association identification are extremely costly, time-consuming and challenging. Therefore, many computational methods have been developed to predict potential associations between miRNAs and diseases. However, those methods mainly predict the existence of miRNA-disease associations, and they cannot predict the deep-level miRNA-disease association types. RESULTS In this study, we propose a new end-to-end deep learning method (called PDMDA) to predict deep-level miRNA-disease associations with graph neural networks (GNNs) and miRNA sequence features. Based on the sequence and structural features of miRNAs, PDMDA extracts the miRNA feature representations by a fully connected network (FCN). The disease feature representations are extracted from the disease-gene network and gene-gene interaction network by GNN model. Finally, a multilayer with three fully connected layers and a softmax layer is designed to predict the final miRNA-disease association scores based on the concatenated feature representations of miRNAs and diseases. Note that PDMDA does not take the miRNA-disease association matrix as input to compute the Gaussian interaction profile similarity. We conduct three experiments based on six association type samples (including circulations, epigenetics, target, genetics, known association of which their types are unknown and unknown association samples). We conduct fivefold cross-validation validation to assess the prediction performance of PDMDA. The area under the receiver operating characteristic curve scores is used as metric. The experiment results show that PDMDA can accurately predict the deep-level miRNA-disease associations. AVAILABILITY AND IMPLEMENTATION Data and source codes are available at https://github.com/27167199/PDMDA.
Collapse
Affiliation(s)
- Cheng Yan
- School of Information Science and Engineering, Hunan University of Chinese Medicine, Changsha 410208, China.,School of Computer Science and Engineering, Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
| | - Guihua Duan
- School of Computer Science and Engineering, Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
| | - Na Li
- School of Computer Science and Engineering, Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
| | - Lishen Zhang
- School of Computer Science and Engineering, Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
| | - Fang-Xiang Wu
- Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon SK S7N5A9, Canada
| | - Jianxin Wang
- School of Computer Science and Engineering, Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
| |
Collapse
|
6
|
Zhang G, Li M, Deng H, Xu X, Liu X, Zhang W. SGNNMD: signed graph neural network for predicting deregulation types of miRNA-disease associations. Brief Bioinform 2021; 23:6455665. [PMID: 34875683 DOI: 10.1093/bib/bbab464] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2021] [Revised: 10/08/2021] [Accepted: 10/11/2021] [Indexed: 11/13/2022] Open
Abstract
MiRNAs are a class of small non-coding RNA molecules that play an important role in many biological processes, and determining miRNA-disease associations can benefit drug development and clinical diagnosis. Although great efforts have been made to develop miRNA-disease association prediction methods, few attention has been paid to in-depth classification of miRNA-disease associations, e.g. up/down-regulation of miRNAs in diseases. In this paper, we regard known miRNA-disease associations as a signed bipartite network, which has miRNA nodes, disease nodes and two types of edges representing up/down-regulation of miRNAs in diseases, and propose a signed graph neural network method (SGNNMD) for predicting deregulation types of miRNA-disease associations. SGNNMD extracts subgraphs around miRNA-disease pairs from the signed bipartite network and learns structural features of subgraphs via a labeling algorithm and a neural network, and then combines them with biological features (i.e. miRNA-miRNA functional similarity and disease-disease semantic similarity) to build the prediction model. In the computational experiments, SGNNMD achieves highly competitive performance when compared with several baselines, including the signed graph link prediction methods, multi-relation prediction methods and one existing deregulation type prediction method. Moreover, SGNNMD has good inductive capability and can generalize to miRNAs/diseases unseen during the training.
Collapse
Affiliation(s)
- Guangzhan Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
| | - Menglu Li
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
| | - Huan Deng
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
| | - Xinran Xu
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
| | - Xuan Liu
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
| | - Wen Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
| |
Collapse
|
7
|
Chen D, Fulmer C, Gordon IO, Syed S, Stidham RW, Vande Casteele N, Qin Y, Falloon K, Cohen BL, Wyllie R, Rieder F. Application of Artificial Intelligence to Clinical Practice in Inflammatory Bowel Disease - What the Clinician Needs to Know. J Crohns Colitis 2021; 16:460-471. [PMID: 34558619 PMCID: PMC8919817 DOI: 10.1093/ecco-jcc/jjab169] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
Artificial intelligence [AI] techniques are quickly spreading across medicine as an analytical method to tackle challenging clinical questions. What were previously thought of as highly complex data sources, such as images or free text, are now becoming manageable. Novel analytical methods merge the latest developments in information technology infrastructure with advances in computer science. Once primarily associated with Silicon Valley, AI techniques are now making their way into medicine, including in the field of inflammatory bowel diseases [IBD]. Understanding potential applications and limitations of these techniques can be difficult, in particular for busy clinicians. In this article, we explain the basic terminologies and provide a particular focus on the foundations behind state-of-the-art AI methodologies in both imaging and text. We explore the growing applications of AI in medicine, with a specific focus on IBD to inform the practising gastroenterologist and IBD specialist. Finally, we outline possible future uses of these technologies in daily clinical practice.
Collapse
Affiliation(s)
- David Chen
- Medical Operations, Cleveland Clinic Foundation, Cleveland, OH, USA,Department of Gastroenterology, Hepatology and Nutrition, Digestive Diseases and Surgery Institute, Cleveland Clinic Foundation, Cleveland, OH, USA
| | - Clifton Fulmer
- Department of Pathology, Robert J. Tomsich Pathology and Laboratory Medicine Institute, Cleveland Clinic Foundation, Cleveland, OH, USA
| | - Ilyssa O Gordon
- Department of Pathology, Robert J. Tomsich Pathology and Laboratory Medicine Institute, Cleveland Clinic Foundation, Cleveland, OH, USA
| | - Sana Syed
- Division of Gastroenterology, Hepatology and Nutrition, Department of Pediatrics, School of Medicine, University of Virginia, Charlottesville, VA, USA,School of Data Science, University of Virginia, Charlottesville, VA, USA
| | - Ryan W Stidham
- Division of Gastroenterology and Hepatology, Department of Internal Medicine, Michigan Medicine, Ann Arbor, MI, USA,Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | | | - Yi Qin
- Department of Gastroenterology, Hepatology and Nutrition, Digestive Diseases and Surgery Institute, Cleveland Clinic Foundation, Cleveland, OH, USA
| | - Katherine Falloon
- Department of Gastroenterology, Hepatology and Nutrition, Digestive Diseases and Surgery Institute, Cleveland Clinic Foundation, Cleveland, OH, USA
| | - Benjamin L Cohen
- Department of Gastroenterology, Hepatology and Nutrition, Digestive Diseases and Surgery Institute, Cleveland Clinic Foundation, Cleveland, OH, USA
| | - Robert Wyllie
- Medical Operations, Cleveland Clinic Foundation, Cleveland, OH, USA,Department of Pediatric Gastroenterology, Hepatology, and Nutrition, Cleveland Clinic Foundation, Cleveland, OH, USA
| | - Florian Rieder
- Corresponding author: Florian Rieder, MD, Department of Inflammation and Immunity, and Department of Gastroenterology, Hepatology, & Nutrition, Cleveland Clinic Foundation, 9500 Euclid Ave., Cleveland, OH 44195, USA. Tel: (216) 445-5631; Fax: (216) 636-0104; E-mail:
| |
Collapse
|
8
|
Ji C, Wang Y, Ni J, Zheng C, Su Y. Predicting miRNA-Disease Associations Based on Heterogeneous Graph Attention Networks. Front Genet 2021; 12:727744. [PMID: 34512733 PMCID: PMC8424198 DOI: 10.3389/fgene.2021.727744] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2021] [Accepted: 08/02/2021] [Indexed: 11/23/2022] Open
Abstract
In recent years, more and more evidence has shown that microRNAs (miRNAs) play an important role in the regulation of post-transcriptional gene expression, and are closely related to human diseases. Many studies have also revealed that miRNAs can be served as promising biomarkers for the potential diagnosis and treatment of human diseases. The interactions between miRNA and human disease have rarely been demonstrated, and the underlying mechanism of miRNA is not clear. Therefore, computational approaches has attracted the attention of researchers, which can not only save time and money, but also improve the efficiency and accuracy of biological experiments. In this work, we proposed a Heterogeneous Graph Attention Networks (GAT) based method for miRNA-disease associations prediction, named HGATMDA. We constructed a heterogeneous graph for miRNAs and diseases, introduced weighted DeepWalk and GAT methods to extract features of miRNAs and diseases from the graph. Moreover, a fully-connected neural networks is used to predict correlation scores between miRNA-disease pairs. Experimental results under five-fold cross validation (five-fold CV) showed that HGATMDA achieved better prediction performance than other state-of-the-art methods. In addition, we performed three case studies on breast neoplasms, lung neoplasms and kidney neoplasms. The results showed that for the three diseases mentioned above, 50 out of top 50 candidates were confirmed by the validation datasets. Therefore, HGATMDA is suitable as an effective tool to identity potential diseases-related miRNAs.
Collapse
Affiliation(s)
- Cunmei Ji
- School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
| | - Yutian Wang
- School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
| | - Jiancheng Ni
- School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
| | - Chunhou Zheng
- School of Artificial Intelligence, Anhui University, Hefei, China
| | - Yansen Su
- School of Artificial Intelligence, Anhui University, Hefei, China
| |
Collapse
|
9
|
Fan Y, Chen M, Pan X. GCRFLDA: scoring lncRNA-disease associations using graph convolution matrix completion with conditional random field. Brief Bioinform 2021; 23:6363052. [PMID: 34486019 DOI: 10.1093/bib/bbab361] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2021] [Revised: 07/19/2021] [Accepted: 08/16/2021] [Indexed: 12/12/2022] Open
Abstract
Long noncoding RNAs (lncRNAs) play important roles in various biological regulatory processes, and are closely related to the occurrence and development of diseases. Identifying lncRNA-disease associations is valuable for revealing the molecular mechanism of diseases and exploring treatment strategies. Thus, it is necessary to computationally predict lncRNA-disease associations as a complementary method for biological experiments. In this study, we proposed a novel prediction method GCRFLDA based on the graph convolutional matrix completion. GCRFLDA first constructed a graph using the available lncRNA-disease association information. Then, it constructed an encoder consisting of conditional random field and attention mechanism to learn efficient embeddings of nodes, and a decoder layer to score lncRNA-disease associations. In GCRFLDA, the Gaussian interaction profile kernels similarity and cosine similarity were fused as side information of lncRNA and disease nodes. Experimental results on four benchmark datasets show that GCRFLDA is superior to other existing methods. Moreover, we conducted case studies on four diseases and observed that 70 of 80 predicted associated lncRNAs were confirmed by the literature.
Collapse
Affiliation(s)
- Yongxian Fan
- School of Computer Science and Information Security, Guilin University of Electronic Technology
| | - Meijun Chen
- Guilin University of Electronic Technology, Guilin 541004, China
| | - Xiaoyong Pan
- Department of Automation of Shanghai Jiao Tong University
| |
Collapse
|
10
|
Zhang XM, Liang L, Liu L, Tang MJ. Graph Neural Networks and Their Current Applications in Bioinformatics. Front Genet 2021; 12:690049. [PMID: 34394185 PMCID: PMC8360394 DOI: 10.3389/fgene.2021.690049] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2021] [Accepted: 05/28/2021] [Indexed: 12/22/2022] Open
Abstract
Graph neural networks (GNNs), as a branch of deep learning in non-Euclidean space, perform particularly well in various tasks that process graph structure data. With the rapid accumulation of biological network data, GNNs have also become an important tool in bioinformatics. In this research, a systematic survey of GNNs and their advances in bioinformatics is presented from multiple perspectives. We first introduce some commonly used GNN models and their basic principles. Then, three representative tasks are proposed based on the three levels of structural information that can be learned by GNNs: node classification, link prediction, and graph generation. Meanwhile, according to the specific applications for various omics data, we categorize and discuss the related studies in three aspects: disease prediction, drug discovery, and biomedical imaging. Based on the analysis, we provide an outlook on the shortcomings of current studies and point out their developing prospect. Although GNNs have achieved excellent results in many biological tasks at present, they still face challenges in terms of low-quality data processing, methodology, and interpretability and have a long road ahead. We believe that GNNs are potentially an excellent method that solves various biological problems in bioinformatics research.
Collapse
Affiliation(s)
- Xiao-Meng Zhang
- School of Information, Yunnan Normal University, Kunming, China
| | - Li Liang
- School of Information, Yunnan Normal University, Kunming, China
| | - Lin Liu
- School of Information, Yunnan Normal University, Kunming, China
- Key Laboratory of Educational Informatization for Nationalities Ministry of Education, Yunnan Normal University, Kunming, China
| | - Ming-Jing Tang
- Key Laboratory of Educational Informatization for Nationalities Ministry of Education, Yunnan Normal University, Kunming, China
- School of Life Sciences, Yunnan Normal University, Kunming, China
| |
Collapse
|
11
|
Dai Q, Chu Y, Li Z, Zhao Y, Mao X, Wang Y, Xiong Y, Wei DQ. MDA-CF: Predicting MiRNA-Disease associations based on a cascade forest model by fusing multi-source information. Comput Biol Med 2021; 136:104706. [PMID: 34371319 DOI: 10.1016/j.compbiomed.2021.104706] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2021] [Revised: 07/26/2021] [Accepted: 07/26/2021] [Indexed: 01/17/2023]
Abstract
MicroRNAs (miRNAs) are significant regulators in various biological processes. They may become promising biomarkers or therapeutic targets, which provide a new perspective in diagnosis and treatment of multiple diseases. Since the experimental methods are always costly and resource-consuming, prediction of disease-related miRNAs using computational methods is in great need. In this study, we developed MDA-CF to identify underlying miRNA-disease associations based on a cascade forest model. In this method, multi-source information was integrated to represent miRNAs and diseases comprehensively, and the autoencoder was utilized for dimension reduction to obtain the optimal feature space. The cascade forest model was then employed for miRNA-disease association prediction. As a result, the average AUC of MDA-CF was 0.9464 on HMDD v3.2 in five-fold cross-validation. Compared with previous computational methods, MDA-CF performed better on HMDD v2.0 with an average AUC of 0.9258. Moreover, MDA-CF was implemented to investigate colon neoplasm, breast neoplasm, and gastric neoplasm, and 100%, 86%, 88% of the top 50 potential miRNAs were validated by authoritative databases. In conclusion, MDA-CF appears to be a reliable method to uncover disease-associated miRNAs. The source code of MDA-CF is available at https://github.com/a1622108/MDA-CF.
Collapse
Affiliation(s)
- Qiuying Dai
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Yanyi Chu
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Zhiqi Li
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Yusong Zhao
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Xueying Mao
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Yanjing Wang
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Yi Xiong
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China.
| | - Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China; Peng Cheng Laboratory, Vanke Cloud City Phase I Building 8, Xili Street, Nanshan District, Shenzhen, Guangdong, 518055, China.
| |
Collapse
|
12
|
Xia Y, Xia CQ, Pan X, Shen HB. GraphBind: protein structural context embedded rules learned by hierarchical graph neural networks for recognizing nucleic-acid-binding residues. Nucleic Acids Res 2021; 49:e51. [PMID: 33577689 PMCID: PMC8136796 DOI: 10.1093/nar/gkab044] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2020] [Accepted: 02/09/2021] [Indexed: 11/24/2022] Open
Abstract
Knowledge of the interactions between proteins and nucleic acids is the basis of understanding various biological activities and designing new drugs. How to accurately identify the nucleic-acid-binding residues remains a challenging task. In this paper, we propose an accurate predictor, GraphBind, for identifying nucleic-acid-binding residues on proteins based on an end-to-end graph neural network. Considering that binding sites often behave in highly conservative patterns on local tertiary structures, we first construct graphs based on the structural contexts of target residues and their spatial neighborhood. Then, hierarchical graph neural networks (HGNNs) are used to embed the latent local patterns of structural and bio-physicochemical characteristics for binding residue recognition. We comprehensively evaluate GraphBind on DNA/RNA benchmark datasets. The results demonstrate the superior performance of GraphBind than state-of-the-art methods. Moreover, GraphBind is extended to other ligand-binding residue prediction to verify its generalization capability. Web server of GraphBind is freely available at http://www.csbio.sjtu.edu.cn/bioinf/GraphBind/.
Collapse
Affiliation(s)
- Ying Xia
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
| | - Chun-Qiu Xia
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
| | - Xiaoyong Pan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
| | - Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China.,School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| |
Collapse
|
13
|
Chu Y, Wang X, Dai Q, Wang Y, Wang Q, Peng S, Wei X, Qiu J, Salahub DR, Xiong Y, Wei DQ. MDA-GCNFTG: identifying miRNA-disease associations based on graph convolutional networks via graph sampling through the feature and topology graph. Brief Bioinform 2021; 22:6261915. [PMID: 34009265 DOI: 10.1093/bib/bbab165] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/10/2021] [Revised: 04/02/2021] [Accepted: 04/08/2021] [Indexed: 11/13/2022] Open
Abstract
Accurate identification of the miRNA-disease associations (MDAs) helps to understand the etiology and mechanisms of various diseases. However, the experimental methods are costly and time-consuming. Thus, it is urgent to develop computational methods towards the prediction of MDAs. Based on the graph theory, the MDA prediction is regarded as a node classification task in the present study. To solve this task, we propose a novel method MDA-GCNFTG, which predicts MDAs based on Graph Convolutional Networks (GCNs) via graph sampling through the Feature and Topology Graph to improve the training efficiency and accuracy. This method models both the potential connections of feature space and the structural relationships of MDA data. The nodes of the graphs are represented by the disease semantic similarity, miRNA functional similarity and Gaussian interaction profile kernel similarity. Moreover, we considered six tasks simultaneously on the MDA prediction problem at the first time, which ensure that under both balanced and unbalanced sample distribution, MDA-GCNFTG can predict not only new MDAs but also new diseases without known related miRNAs and new miRNAs without known related diseases. The results of 5-fold cross-validation show that the MDA-GCNFTG method has achieved satisfactory performance on all six tasks and is significantly superior to the classic machine learning methods and the state-of-the-art MDA prediction methods. Moreover, the effectiveness of GCNs via the graph sampling strategy and the feature and topology graph in MDA-GCNFTG has also been demonstrated. More importantly, case studies for two diseases and three miRNAs are conducted and achieved satisfactory performance.
Collapse
Affiliation(s)
- Yanyi Chu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
| | - Xuhong Wang
- School of Electronic, Information and Electrical Engineering (SEIEE), Shanghai Jiao Tong University, China
| | - Qiuying Dai
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
| | - Yanjing Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
| | - Qiankun Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
| | - Shaoliang Peng
- College of Computer Science and Electronic Engineering, Hunan University, China
| | | | | | - Dennis Russell Salahub
- Department of Chemistry, University of Calgary, Fellow Royal Society of Canada and Fellow of the American Association for the Advancement of Science, China
| | - Yi Xiong
- State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
| | - Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
| |
Collapse
|
14
|
Ding Y, Lei X, Liao B, Wu FX. Machine learning approaches for predicting biomolecule-disease associations. Brief Funct Genomics 2021; 20:273-287. [PMID: 33554238 DOI: 10.1093/bfgp/elab002] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open
Abstract
Biomolecules, such as microRNAs, circRNAs, lncRNAs and genes, are functionally interdependent in human cells, and all play critical roles in diverse fundamental and vital biological processes. The dysregulations of such biomolecules can cause diseases. Identifying the associations between biomolecules and diseases can uncover the mechanisms of complex diseases, which is conducive to their diagnosis, treatment, prognosis and prevention. Due to the time consumption and cost of biologically experimental methods, many computational association prediction methods have been proposed in the past few years. In this study, we provide a comprehensive review of machine learning-based approaches for predicting disease-biomolecule associations with multi-view data sources. Firstly, we introduce some databases and general strategies for integrating multi-view data sources in the prediction models. Then we discuss several feature representation methods for machine learning-based prediction models. Thirdly, we comprehensively review machine learning-based prediction approaches in three categories: basic machine learning methods, matrix completion-based methods and deep learning-based methods, while discussing their advantages and disadvantages. Finally, we provide some perspectives for further improving biomolecule-disease prediction methods.
Collapse
Affiliation(s)
- Yulian Ding
- Division of Biomedical Engineering at the University of Saskatchewan
| | - Xiujuan Lei
- School of Computer Science at Shaanxi Normal University
| | - Bo Liao
- School of Mathematics and Statistics at Hainan Normal University, Haikou, China
| | - Fang-Xiang Wu
- College of Engineering and the Department of Computer Science at University of Saskatchewan
| |
Collapse
|
15
|
Zhu L, Yang X, Zhu R, Yu L. Identifying Discriminative Biological Function Features and Rules for Cancer-Related Long Non-coding RNAs. Front Genet 2021; 11:598773. [PMID: 33391350 PMCID: PMC7772407 DOI: 10.3389/fgene.2020.598773] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2020] [Accepted: 11/23/2020] [Indexed: 01/17/2023] Open
Abstract
Cancer has been a major public health problem worldwide for many centuries. Cancer is a complex disease associated with accumulative genetic mutations, epigenetic aberrations, chromosomal instability, and expression alteration. Increasing lines of evidence suggest that many non-coding transcripts, which are termed as non-coding RNAs, have important regulatory roles in cancer. In particular, long non-coding RNAs (lncRNAs) play crucial roles in tumorigenesis. Cancer-related lncRNAs serve as oncogenic factors or tumor suppressors. Although many lncRNAs are identified as potential regulators in tumorigenesis by using traditional experimental methods, they are time consuming and expensive considering the tremendous amount of lncRNAs needed. Thus, effective and fast approaches to recognize tumor-related lncRNAs should be developed. The proposed approach should help us understand not only the mechanisms of lncRNAs that participate in tumorigenesis but also their satisfactory performance in distinguishing cancer-related lncRNAs. In this study, we utilized a decision tree (DT), a type of rule learning algorithm, to investigate cancer-related lncRNAs with functional annotation contents [gene ontology (GO) terms and KEGG pathways] of their co-expressed genes. Cancer-related and other lncRNAs encoded by the key enrichment features of GO and KEGG filtered by feature selection methods were used to build an informative DT, which further induced several decision rules. The rules provided not only a new tool for identifying cancer-related lncRNAs but also connected the lncRNAs and cancers with the combinations of GO terms. Results provided new directions for understanding cancer-related lncRNAs.
Collapse
Affiliation(s)
- Liucun Zhu
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Xin Yang
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Rui Zhu
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Lei Yu
- Department of Medical Oncology, Shanghai Concord Medical Cancer Center, Shanghai, China
| |
Collapse
|
16
|
Fan Y, Chen M, Zhu Q, Wang W. Inferring Disease-Associated Microbes Based on Multi-Data Integration and Network Consistency Projection. Front Bioeng Biotechnol 2020; 8:831. [PMID: 32850711 PMCID: PMC7418576 DOI: 10.3389/fbioe.2020.00831] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2020] [Accepted: 06/29/2020] [Indexed: 12/18/2022] Open
Abstract
Plenty of microbes in our human body play a vital role in the process of cell physiology. In recent years, there is accumulating evidence indicating that microbes are closely related to many complex human diseases. In-depth investigation of disease-associated microbes can contribute to understanding the pathogenesis of diseases and thus provide novel strategies for the treatment, diagnosis, and prevention of diseases. To date, many computational models have been proposed for predicting microbe-disease associations using available similarity networks. However, these similarity networks are not effectively fused. In this study, we proposed a novel computational model based on multi-data integration and network consistency projection for Human Microbe-Disease Associations Prediction (HMDA-Pred), which fuses multiple similarity networks by a linear network fusion method. HMDA-Pred yielded AUC values of 0.9589 and 0.9361 ± 0.0037 in the experiments of leave-one-out cross validation (LOOCV) and 5-fold cross validation (5-fold CV), respectively. Furthermore, in case studies, 10, 8, and 10 out of the top 10 predicted microbes of asthma, colon cancer, and inflammatory bowel disease were confirmed by the literatures, respectively.
Collapse
Affiliation(s)
- Yongxian Fan
- School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, China
| | | | | | | |
Collapse
|
17
|
Liu F, Dong H, Mei Z, Huang T. Investigation of miRNA and mRNA Co-expression Network in Ependymoma. Front Bioeng Biotechnol 2020; 8:177. [PMID: 32266223 PMCID: PMC7096354 DOI: 10.3389/fbioe.2020.00177] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/02/2020] [Accepted: 02/20/2020] [Indexed: 12/18/2022] Open
Abstract
Ependymoma (EPN) is a rare primary tumor of the central nervous system (CNS) that affects both children and adults. Despite the definition and classification of distinct molecular subgroups, there remains a group of EPNs with a balanced genome, which makes it difficult to predict a prognosis of patients with EPN. The role of miRNA-mRNA network on EPN is still poorly understood. We assessed the involvement of miRNA-mRNA pairs in EPN by applying a weighted co-expression network analysis (WGCNA) approach. Using whole genome expression profile analysis followed by functional enrichment, we detected hub genes involved in active proliferation and DNA replication of nerve cells. Key genes including CYP11B1, KRT33B, RUNX1T1, SIK1, MAP3K4, MLANA, and SFRP5 identified in co-expression networks were regulated by miR-15a and miR-24-1. These seven miRNA-mRNA pairs were considered to influence not only pathways in cancer and tumor suppression process, but also MAPK, NF-kappaB, and WNT signaling pathways which were associated with tumorigenesis and development. This study provides a novel insight into potential diagnostic biomarkers of EPN and may have value in choosing therapeutic targets with clinical utility.
Collapse
Affiliation(s)
- Feili Liu
- Department of Neurosurgery, Huashan Hospital, Fudan University, Shanghai, China
| | - Hang Dong
- Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Zi Mei
- Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Tao Huang
- Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
| |
Collapse
|
18
|
Xiao Q, Zhang N, Luo J, Dai J, Tang X. Adaptive multi-source multi-view latent feature learning for inferring potential disease-associated miRNAs. Brief Bioinform 2020; 22:2043-2057. [PMID: 32186712 DOI: 10.1093/bib/bbaa028] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2019] [Revised: 02/16/2020] [Accepted: 01/14/2020] [Indexed: 12/13/2022] Open
Abstract
Accumulating evidence has shown that microRNAs (miRNAs) play crucial roles in different biological processes, and their mutations and dysregulations have been proved to contribute to tumorigenesis. In silico identification of disease-associated miRNAs is a cost-effective strategy to discover those most promising biomarkers for disease diagnosis and treatment. The increasing available omics data sources provide unprecedented opportunities to decipher the underlying relationships between miRNAs and diseases by computational models. However, most existing methods are biased towards a single representation of miRNAs or diseases and are also not capable of discovering unobserved associations for new miRNAs or diseases without association information. In this study, we present a novel computational method with adaptive multi-source multi-view latent feature learning (M2LFL) to infer potential disease-associated miRNAs. First, we adopt multiple data sources to obtain similarity profiles and capture different latent features according to the geometric characteristic of miRNA and disease spaces. Then, the multi-modal latent features are projected to a common subspace to discover unobserved miRNA-disease associations in both miRNA and disease views, and an adaptive joint graph regularization term is developed to preserve the intrinsic manifold structures of multiple similarity profiles. Meanwhile, the Lp,q-norms are imposed into the projection matrices to ensure the sparsity and improve interpretability. The experimental results confirm the superior performance of our proposed method in screening reliable candidate disease miRNAs, which suggests that M2LFL could be an efficient tool to discover diagnostic biomarkers for guiding laborious clinical trials.
Collapse
|