1
Luo Y, Li S, Peng L, Ding P, Liang W. Predicting associations between drugs and G protein-coupled receptors using a multi-graph convolutional network. Comput Biol Chem 2024; 110:108060. [PMID: 38579550 DOI: 10.1016/j.compbiolchem.2024.108060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 03/12/2024] [Accepted: 03/22/2024] [Indexed: 04/07/2024]
Abstract
Developing new drugs is an expensive, time-consuming process that frequently involves safety concerns. By discovering novel uses for previously verified drugs, drug repurposing helps to bypass the time-consuming and costly process of drug development. G protein-coupled receptors (GPCRs) are the largest family of proteins targeted by verified drugs, so inferring their associations with drugs is vital for efficient drug repurposing. Drug repurposing may be sped up by computational models that predict the interaction strength of novel drug-GPCR pairs. To this end, a number of models have been put forth. Existing methods, however, cannot simultaneously take drug structure, drug-drug interactions, GPCR sequence, and subfamily information into account when detecting novel drug-GPCR relationships. In this study, an end-to-end deep model based on a multi-graph convolutional network was developed to efficiently and precisely discover latent drug-GPCR relationships by combining data from multiple sources. We demonstrated that our model outperformed rival deep learning techniques as well as non-deep learning models in inferring drug-GPCR relationships. Our results indicated that integrating data from multiple sources can lead to further advancement.
Affiliation(s)
- Yuxun Luo
  - School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China; Hunan Key Laboratory for Service Computing and Novel Software Technology, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China
- Shasha Li
  - Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong 999077, China
- Li Peng
  - School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China; Hunan Key Laboratory for Service Computing and Novel Software Technology, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China
- Pingjian Ding
  - School of Computer Science, University of South China, Hengyang, Hunan 421001, China
- Wei Liang
  - School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China; Hunan Key Laboratory for Service Computing and Novel Software Technology, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China
2
Gao Z, Ding P, Xu R. IUPHAR review - Data-driven computational drug repurposing approaches for opioid use disorder. Pharmacol Res 2024; 199:106960. [PMID: 37832859 DOI: 10.1016/j.phrs.2023.106960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 10/08/2023] [Accepted: 10/10/2023] [Indexed: 10/15/2023]
Abstract
Opioid Use Disorder (OUD) is a chronic and relapsing condition characterized by the misuse of opioid drugs, causing significant morbidity and mortality in the United States. Existing medications for OUD are limited, and there is an immediate need to discover treatments with enhanced safety and efficacy. Drug repurposing aims to find new indications for existing medications, offering a time-saving and cost-efficient alternative strategy to traditional drug discovery. Computational approaches have been developed to further facilitate the drug repurposing process. In this paper, we reviewed state-of-the-art data-driven computational drug repurposing approaches for OUD and discussed their advantages and potential challenges. We also highlighted promising repurposed candidate drugs for OUD that were identified by computational drug repurposing techniques and reviewed studies supporting their potential mechanisms of action in treating OUD.
Affiliation(s)
- Zhenxiang Gao
  - Center for Artificial Intelligence in Drug Discovery, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
- Pingjian Ding
  - Center for Artificial Intelligence in Drug Discovery, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
- Rong Xu
  - Center for Artificial Intelligence in Drug Discovery, School of Medicine, Case Western Reserve University, Cleveland, OH, USA
3
Li J, Wang D, Yang Z, Liu M. HEGANLDA: A Computational Model for Predicting Potential LncRNA-Disease Associations Based on Multiple Heterogeneous Networks. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:388-398. [PMID: 34932483 DOI: 10.1109/tcbb.2021.3136886] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]
Abstract
Long non-coding RNAs (lncRNAs) play vital regulatory roles in many complex human diseases; however, the number of validated lncRNA-disease associations remains notably small. How to predict potential lncRNA-disease associations precisely through computational methods remains challenging. In this study, we proposed a novel method, LDVCHN (LncRNA-Disease Vector Calculation Heterogeneous Networks), and also developed the corresponding model, HEGANLDA (Heterogeneous Embedding Generative Adversarial Networks LncRNA-Disease Association), for predicting potential lncRNA-disease associations. In HEGANLDA, the graph embedding algorithm (HeGAN) was introduced for mapping all nodes in the lncRNA-miRNA-disease heterogeneous network into low-dimensional vectors, which served as the inputs of LDVCHN. HEGANLDA adopted the XGBoost (eXtreme Gradient Boosting) classifier, which was trained on the low-dimensional vectors, to predict potential lncRNA-disease associations. The 10-fold cross-validation method was utilized to evaluate the performance of our model, which achieved an area under the ROC curve of 0.983. According to the experimental results, HEGANLDA outperformed each of five current state-of-the-art methods. To further evaluate the effectiveness of HEGANLDA in predicting potential lncRNA-disease associations, both case studies and robustness tests were performed, and the results confirmed its effectiveness and robustness. The source code and data of HEGANLDA are available at https://github.com/HEGANLDA/HEGANLDA.
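As a rough illustration of the evaluation protocol named in this abstract (10-fold cross-validation scored by area under the ROC curve), the sketch below uses only the Python standard library. It is not the HEGANLDA code: the function names `roc_auc` and `k_fold_indices` and the toy labels and scores are assumptions made for illustration, and in practice a trained classifier would supply the scores for each held-out fold.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k disjoint folds (round-robin, no shuffling)."""
    return [list(range(i, n_samples, k)) for i in range(k)]


# Toy data (hypothetical): three of the four positive/negative pairs are ranked correctly.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(toy_labels, toy_scores))  # 0.75

# Each fold is held out once while the model is fit on the remaining folds.
folds = k_fold_indices(25, k=10)
```

The pairwise AUC is quadratic in the number of samples, which is fine for a sketch; a rank-based formulation would be preferable for large association matrices.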
4
Zhong Y, Shen C, Wu H, Xu T, Luo L. Improving the Prediction of Potential Kinase Inhibitors with Feature Learning on Multisource Knowledge. Interdiscip Sci 2022; 14:775-785. [PMID: 35536538 DOI: 10.1007/s12539-022-00523-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2021] [Revised: 04/14/2022] [Accepted: 04/15/2022] [Indexed: 06/14/2023]
Abstract
PURPOSE The identification of potential kinase inhibitors plays a key role in drug discovery for treating human diseases. Currently, most existing computational methods only extract limited features such as sequence information from kinases and inhibitors. To further enhance the identification of kinase inhibitors, more features need to be leveraged. Hence, it is appealing to develop effective methods to aggregate feature information from multisource knowledge for predicting potential kinase inhibitors. In this paper, we propose a novel computational framework called FLMTS to improve the performance of kinase inhibitor prediction by aggregating multisource knowledge. METHOD FLMTS uses a random walk with restart (RWR) to combine multiscale information in a heterogeneous network. We used the combined information as features of compounds and kinases and input them into random forest (RF) to predict unknown compound-kinase interactions. RESULTS Experimental results reveal that FLMTS obtains significant improvement over existing state-of-the-art methods. Case studies demonstrated the reliability of FLMTS, and pathway enrichment analysis demonstrated that FLMTS could also accurately predict signaling pathways in disease treatment. CONCLUSION In conclusion, our computational framework of FLMTS for improving the prediction of potential kinase inhibitors successfully aggregates feature information from multisource knowledge, yielding better prediction performance than existing state-of-the-art methods.
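The random walk with restart (RWR) step mentioned in the METHOD sentence can be sketched as follows. This is a minimal, generic RWR on a small weighted graph, not the FLMTS implementation: the function name `random_walk_with_restart`, the restart probability of 0.5, and the toy node names are assumptions made for illustration. The resulting score vector for a seed node is the kind of feature that could then be passed to a classifier such as a random forest.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    adj maps each node to a dict {neighbor: weight}; every neighbor must
    also be a key of adj. W is adj normalized by each node's total out-weight.
    """
    nodes = list(adj)
    seeds = set(seeds)
    if not seeds or not seeds <= set(nodes):
        raise ValueError("seeds must be a non-empty subset of the graph nodes")
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    out_weight = {n: sum(adj[n].values()) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            move = (1.0 - restart) * p[u]
            if out_weight[u] == 0:
                # Dangling node: send its walking mass back to the seeds.
                for n in nodes:
                    nxt[n] += move * p0[n]
            else:
                for v, w in adj[u].items():
                    nxt[v] += move * w / out_weight[u]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: one compound linked to a kinase, which is linked to another kinase.
toy_graph = {
    "compound_1": {"kinase_1": 1.0},
    "kinase_1": {"compound_1": 1.0, "kinase_2": 1.0},
    "kinase_2": {"kinase_1": 1.0},
}
scores = random_walk_with_restart(toy_graph, seeds=["compound_1"])
```

On this toy path the steady-state scores decrease with distance from the seed, which is the property that makes RWR useful for summarizing network proximity.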
Affiliation(s)
- Yichen Zhong
  - School of Computer Science, University of South China, Hengyang, 421001, China
  - Hunan Provincial Base for Scientific and Technological Innovation Cooperation, Hengyang, 421001, China
- Cong Shen
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410083, China
- Huanhuan Wu
  - School of Computer Science, University of South China, Hengyang, 421001, China
- Tao Xu
  - School of Computer Science, University of South China, Hengyang, 421001, China
- Lingyun Luo
  - School of Computer Science, University of South China, Hengyang, 421001, China
  - Hunan Provincial Base for Scientific and Technological Innovation Cooperation, Hengyang, 421001, China
5
Königs C, Friedrichs M, Dietrich T. The heterogeneous pharmacological medical biochemical network PharMeBINet. Sci Data 2022; 9:393. [PMID: 35821017 PMCID: PMC9276653 DOI: 10.1038/s41597-022-01510-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/14/2022] [Accepted: 06/22/2022] [Indexed: 12/04/2022]
Abstract
Heterogeneous biomedical pharmacological databases are important for multiple fields in bioinformatics. Hetionet is a freely available database combining diverse entities and relationships from 29 public resources. Therefore, it is used as the basis for this project. 19 additional pharmacological medical and biological databases such as CTD, DrugBank, and ClinVar are parsed and integrated into Neo4j. Afterwards, the information is merged into the Hetionet structure. Different mapping methods are used such as external identification systems or name mapping. The resulting open-source Neo4j database PharMeBINet has 2,869,407 different nodes with 66 labels and 15,883,653 relationships with 208 edge types. It is a heterogeneous database containing interconnected information on ADRs, diseases, drugs, genes, gene variations, proteins, and more. Relationships between these entities represent drug-drug interactions or drug-causes-ADR relations, to name a few. It has much potential for developing further data analyses including machine learning applications. A web application for accessing the database is free to use for everyone and available at https://pharmebi.net. Additionally, the database is deposited on Zenodo at 10.5281/zenodo.6578218.
Measurement(s): data integration objective. Technology Type(s): database creation objective.
Affiliation(s)
- Cassandra Königs
  - Bielefeld University, Bioinformatics/Medical Informatics Department, Bielefeld, 33615, Germany
- Marcel Friedrichs
  - Bielefeld University, Bioinformatics/Medical Informatics Department, Bielefeld, 33615, Germany
- Theresa Dietrich
  - Bielefeld University, Bioinformatics/Medical Informatics Department, Bielefeld, 33615, Germany
6
Tan H, Qiu S, Wang J, Yu G, Guo W, Guo M. Weighted deep factorizing heterogeneous molecular network for genome-phenome association prediction. Methods 2022; 205:18-28. [PMID: 35690250 DOI: 10.1016/j.ymeth.2022.05.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/20/2022] [Revised: 05/14/2022] [Accepted: 05/26/2022] [Indexed: 11/18/2022]
Abstract
Genome-phenome association (GPA) prediction can promote the understanding of the biological mechanisms underlying the complex pathology of phenotypes (i.e., traits and diseases). Traditional heterogeneous network-based GPA approaches overwhelmingly need to project heterogeneous data onto a homogeneous network for data fusion and prediction; such projections result in the loss of heterogeneous network structure information. Matrix factorization-based data fusion can avoid such projection by integrating multi-type data in a coherent way, but these methods typically perform linear factorization and cannot mine the nonlinear relationships between molecules, which compromises the accuracy of GPA analysis. Furthermore, most of them cannot selectively combine network topology and node attribute information in a principled way. In this paper, we propose a weighted deep matrix factorization-based solution (WDGPA) to predict GPAs by selectively and differentially fusing a heterogeneous molecular network and diverse attributes of nodes. WDGPA first assigns weights to inter/intra-relational data matrices and attribute data matrices, and performs deep matrix factorization on these matrices of the heterogeneous network in a cooperative manner to obtain the nonlinear representations of different nodes. In addition, it performs low-rank representation learning on the attribute data with the shared nonlinear representations. In this way, both the network topology and node attributes are jointly mined to explore the representations of molecules and the complex interplay between molecules and phenotypes. WDGPA then uses the representational vectors of gene and phenotype nodes to predict GPAs. Experimental results on maize and human datasets confirm that WDGPA outperforms competitive methods by a large margin under different evaluation protocols.
Affiliation(s)
- Haojiang Tan
  - School of Software, Shandong University, Jinan, China; Joint SDU-NTU Centre For AI Research (C-FAIR), Shandong University, Jinan, China
- Sichao Qiu
  - School of Software, Shandong University, Jinan, China; Joint SDU-NTU Centre For AI Research (C-FAIR), Shandong University, Jinan, China
- Jun Wang
  - Joint SDU-NTU Centre For AI Research (C-FAIR), Shandong University, Jinan, China
- Guoxian Yu
  - Joint SDU-NTU Centre For AI Research (C-FAIR), Shandong University, Jinan, China
- Wei Guo
  - Joint SDU-NTU Centre For AI Research (C-FAIR), Shandong University, Jinan, China
- Maozu Guo
  - College of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
7
Liu P, Luo J, Chen X. miRCom: Tensor Completion Integrating Multi-View Information to Deduce the Potential Disease-Related miRNA-miRNA Pairs. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:1747-1759. [PMID: 33180730 DOI: 10.1109/tcbb.2020.3037331] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/11/2023]
Abstract
MicroRNAs (miRNAs) can regulate gene expression synergistically in combination and play a key role in various biological processes associated with the initiation and development of human diseases, which indicates that comprehending the synergistic molecular mechanism of miRNAs may facilitate understanding the pathogenesis of diseases or even overcoming them. However, most existing computational methods either capture the synergistic effect of miRNAs on the pathogenesis of complex diseases only incompletely or are hard to extend to large-scale prediction of synergistic miRNA combinations for different diseases. In this article, we propose a novel tensor completion framework integrating multi-view miRNA and disease information, called miRCom, for the discovery of potential disease-associated miRNA-miRNA pairs. We first construct an incomplete three-order association tensor and several types of similarity matrices based on existing biological knowledge. Then, we formulate an objective function by performing the factorizations of the coupled tensor and matrices simultaneously. Finally, we build an optimization scheme by adopting the ADMM algorithm. After that, we obtain the prediction of miRNA-miRNA pairs for different diseases from the full tensor. Comparative experiments against other approaches verified that miRCom effectively identifies potential disease-related miRNA-miRNA pairs. Moreover, case study results further illustrated that miRNA-miRNA pairs have greater biological significance and prognostic value than single miRNAs.
8
Abstract
Knowledge graphs (KGs) have rapidly emerged as an important area in AI over the last ten years. Building on a storied tradition of graphs in the AI community, a KG may be simply defined as a directed, labeled, multi-relational graph with some form of semantics. In part, this has been fueled by increased publication of structured datasets on the Web, and well-publicized successes of large-scale projects such as the Google Knowledge Graph and the Amazon Product Graph. However, another factor that is less discussed, but which has been equally instrumental in the success of KGs, is the cross-disciplinary nature of academic KG research. Arguably, because of the diversity of this research, a synthesis of how different KG research strands all tie together could serve a useful role in enabling more ‘moonshot’ research and large-scale collaborations. This review of the KG research landscape attempts to provide such a synthesis by first showing what the major strands of research are, and how those strands map to different communities, such as Natural Language Processing, Databases and Semantic Web. A unified framework is suggested in which to view the distinct, but overlapping, foci of KG research within these communities.
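The working definition given in this abstract, a KG as a directed, labeled, multi-relational graph, can be made concrete with a small data structure. The sketch below is only one possible reading of that definition: the class name `KnowledgeGraph`, its methods, and the example triples are hypothetical and do not come from the review or from any particular KG system.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Directed, labeled, multi-relational graph stored as (head, relation, tail) triples."""

    def __init__(self):
        # head -> set of (relation, tail); a set avoids duplicate triples.
        self._out = defaultdict(set)

    def add(self, head, relation, tail):
        """Add one directed edge head --relation--> tail."""
        self._out[head].add((relation, tail))

    def objects(self, head, relation):
        """All tails reachable from head via the given relation label."""
        return sorted(t for r, t in self._out.get(head, ()) if r == relation)

    def relations(self):
        """All distinct relation labels used in the graph."""
        return sorted({r for edges in self._out.values() for r, _ in edges})


# Hypothetical triples; the same node pair may be linked by several relation types.
kg = KnowledgeGraph()
kg.add("drug_A", "treats", "disease_X")
kg.add("drug_A", "interacts_with", "drug_B")
kg.add("drug_A", "treats", "disease_Y")
print(kg.objects("drug_A", "treats"))  # ['disease_X', 'disease_Y']
```

Storing edges as triples keeps direction and relation labels explicit; the "some form of semantics" part of the definition (for example a schema or ontology over the labels) is deliberately left out of this sketch.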
9
Xiang J, Meng X, Zhao Y, Wu FX, Li M. HyMM: hybrid method for disease-gene prediction by integrating multiscale module structure. Brief Bioinform 2022; 23:6547263. [PMID: 35275996 DOI: 10.1093/bib/bbac072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2021] [Revised: 01/18/2022] [Accepted: 02/13/2022] [Indexed: 11/12/2022]
Abstract
MOTIVATION Identifying disease-related genes is an important issue in computational biology. Module structure widely exists in biomolecule networks, and complex diseases are usually thought to be caused by perturbations of local neighborhoods in the networks, which can provide useful insights for the study of disease-related genes. However, mining and effectively utilizing the module structure is still challenging in problems such as disease-gene prediction. RESULTS We propose a hybrid disease-gene prediction method integrating multiscale module structure (HyMM), which can utilize multiscale information from local to global structure to more effectively predict disease-related genes. HyMM extracts module partitions from local to global scales by multiscale modularity optimization with exponential sampling, and estimates the disease relatedness of genes in partitions by the abundance of disease-related genes within modules. Then, a probabilistic model for integration of gene rankings is designed in order to integrate multiple predictions derived from multiscale module partitions and network propagation, and a parameter estimation strategy based on functional information is proposed to further enhance HyMM's predictive power. Through a series of experiments, we reveal the importance of module partitions at different scales, and verify the stable and good performance of HyMM compared with eight other state-of-the-art methods, as well as the further performance improvement derived from the parameter estimation. CONCLUSIONS The results confirm that HyMM is an effective framework for integrating multiscale module structure to enhance the ability to predict disease-related genes, which may provide useful insights for the study of the multiscale module structure and its application in problems such as disease-gene prediction.
Affiliation(s)
- Ju Xiang
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China; Department of Basic Medical Sciences & Academician Workstation, Changsha Medical University, Changsha, Hunan 410219, China
- Xiangmao Meng
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Yichao Zhao
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Fang-Xiang Wu
  - Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, S7N 5A9, Canada
- Min Li
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
10
Xiang J, Zhang J, Zhao Y, Wu FX, Li M. Biomedical data, computational methods and tools for evaluating disease-disease associations. Brief Bioinform 2022; 23:6522999. [PMID: 35136949 DOI: 10.1093/bib/bbac006] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/19/2021] [Revised: 01/04/2022] [Accepted: 01/05/2022] [Indexed: 12/12/2022]
Abstract
In recent decades, exploring potential relationships between diseases has been an active research field. With the rapid accumulation of disease-related biomedical data, many computational methods and tools/platforms have been developed to reveal intrinsic relationships between diseases, which can provide useful insights for the study of complex diseases, e.g. understanding molecular mechanisms of diseases and discovering new treatments. Human complex diseases involve both external phenotypic abnormalities and complex internal molecular mechanisms in organisms. Computational methods with different types of biomedical data from phenotype to genotype can evaluate disease-disease associations at different levels, providing a comprehensive perspective for understanding diseases. In this review, available biomedical data and databases for evaluating disease-disease associations are first summarized. Then, existing computational methods for disease-disease associations are reviewed and classified into five groups according to the biomedical data they use, including disease semantic-based, phenotype-based, function-based, representation learning-based and text mining-based methods. Further, we summarize software tools/platforms for computation and analysis of disease-disease associations. Finally, we give a discussion and summary of the research on disease-disease associations. This review provides a systematic overview of current disease association research, which could promote the development and applications of computational methods and tools/platforms for disease-disease associations.
Affiliation(s)
- Ju Xiang
  - School of Computer Science and Engineering, Central South University, China
- Jiashuai Zhang
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Yichao Zhao
  - School of Computer Science and Engineering, Central South University, China
- Fang-Xiang Wu
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Min Li
  - Division of Biomedical Engineering and Department of Mechanical Engineering at University of Saskatchewan, Saskatoon, Canada
11
Deligiorgi MV, Trafalis DT. The Intriguing Thyroid Hormones-Lung Cancer Association as Exemplification of the Thyroid Hormones-Cancer Association: Three Decades of Evolving Research. Int J Mol Sci 2021; 23:436. [PMID: 35008863 PMCID: PMC8745569 DOI: 10.3390/ijms23010436] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/12/2021] [Revised: 12/22/2021] [Accepted: 12/28/2021] [Indexed: 12/21/2022]
Abstract
Exemplifying the long-pursued thyroid hormones (TH)-cancer association, the TH-lung cancer association is a compelling, yet elusive, issue. The present narrative review provides background knowledge on the molecular aspects of TH actions, with a focus on the contribution of TH to hallmarks of cancer. Then, it provides a comprehensive overview of data pertinent to the TH-lung cancer association garnered over the last three decades and identifies obstacles that need to be overcome to enable harnessing this association in the clinical setting. TH contribute to all hallmarks of cancer through integration of diverse actions, currently classified according to molecular background. Despite the increasingly recognized implication of TH in lung cancer, three pending queries need to be resolved to empower a tailored approach: (1) How to stratify patients with TH-sensitive lung tumors? (2) How can it be determined whether TH promote or inhibit lung cancer progression? (3) How to mimic the antitumor and/or abrogate the tumor-promoting TH actions in lung cancer? To address these queries, research should prioritize the elucidation of the crosstalk between TH signaling and oncogenic signaling implicated in lung cancer initiation and progression, and the development of efficient, safe, and feasible strategies leveraging this crosstalk in therapeutics.
Affiliation(s)
- Maria V. Deligiorgi
  - Department of Pharmacology-Clinical Pharmacology Unit, Faculty of Medicine, National and Kapodistrian University of Athens, Building 16, 1st Floor, 75 Mikras Asias Str, 11527 Athens, Greece
12
Yi HC, You ZH, Huang DS, Kwoh CK. Graph representation learning in bioinformatics: trends, methods and applications. Brief Bioinform 2021; 23:6361044. [PMID: 34471921 DOI: 10.1093/bib/bbab340] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 06/24/2021] [Revised: 07/18/2021] [Accepted: 08/02/2021] [Indexed: 12/12/2022]
Abstract
A graph is a natural data structure for describing complex systems, containing a set of objects and the relationships among them. Ubiquitous real-life biomedical problems can be modeled as graph analytics tasks. Machine learning, especially deep learning, succeeds in vast bioinformatics scenarios with data represented in the Euclidean domain. However, rich relational information between biological elements is retained in non-Euclidean biomedical graphs, which classic machine learning methods cannot readily learn from. Graph representation learning aims to embed a graph into a low-dimensional space while preserving graph topology and node properties. It bridges biomedical graphs and modern machine learning methods and has recently raised widespread interest in both the machine learning and bioinformatics communities. In this work, we summarize the advances of graph representation learning and its representative applications in bioinformatics. To provide a comprehensive and structured analysis and perspective, we first categorize and analyze both graph embedding methods (homogeneous graph embedding, heterogeneous graph embedding, attribute graph embedding) and graph neural networks. Furthermore, we summarize their representative applications from the molecular level to the genomics, pharmaceutical and healthcare systems levels. Moreover, we provide open resource platforms and libraries for implementing these graph representation learning methods and discuss the challenges and opportunities of graph representation learning in bioinformatics. This work provides a comprehensive survey of emerging graph representation learning algorithms and their applications in bioinformatics. It is anticipated that it could bring valuable insights for researchers to contribute their knowledge to graph representation learning and future-oriented bioinformatics studies.
Affiliation(s)
- Hai-Cheng Yi
  - Chinese Academy of Sciences, Xinjiang Technical Institute of Physics and Chemistry, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Zhu-Hong You
  - School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, China
- De-Shuang Huang
  - Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China
- Chee Keong Kwoh
  - School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore
13
Chen X, Luo L, Shen C, Ding P, Luo J. An In Silico Method for Predicting Drug Synergy Based on Multitask Learning. Interdiscip Sci 2021; 13:299-311. [PMID: 33611781 DOI: 10.1007/s12539-021-00422-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/06/2020] [Revised: 01/29/2021] [Accepted: 02/07/2021] [Indexed: 12/20/2022]
Abstract
To make better use of all kinds of knowledge to predict drug synergy, it is crucial to successfully establish a drug synergy prediction model and leverage the reconstruction of sparse known drug targets. Therefore, we present an in silico method that predicts the synergy scores of drug pairs based on multitask learning (DSML) and that can fuse drug targets, protein-protein interactions, anatomical therapeutic chemical codes, and a priori knowledge of drug combinations. By simultaneously reconstructing drug-target protein interactions and synergistic drug combinations, DSML benefits indirectly from associations that are related through proteins. In cross-validation experiments, DSML improved the ability to predict drug synergy. Moreover, the reconstruction of drug-target interactions and the incorporation of multisource knowledge significantly improved drug combination predictions. The potential drug combinations predicted by DSML demonstrate its ability to predict drug synergy.
Affiliation(s)
- Xin Chen
  - School of Computer Science, University of South China, Hengyang, 421001, Hunan, China
- Lingyun Luo
  - School of Computer Science, University of South China, Hengyang, 421001, Hunan, China; Hunan Medical Big Data International Sci.&Tech. Innovation Cooperation Base, Hengyang, 421000, Hunan, China
- Cong Shen
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, Hunan, China
- Pingjian Ding
  - School of Computer Science, University of South China, Hengyang, 421001, Hunan, China; Hunan Medical Big Data International Sci.&Tech. Innovation Cooperation Base, Hengyang, 421000, Hunan, China
- Jiawei Luo
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, Hunan, China
14
Wang F, Han S, Yang J, Yan W, Hu G. Knowledge-Guided "Community Network" Analysis Reveals the Functional Modules and Candidate Targets in Non-Small-Cell Lung Cancer. Cells 2021; 10:402. [PMID: 33669233 PMCID: PMC7919838 DOI: 10.3390/cells10020402] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/13/2021] [Revised: 02/06/2021] [Accepted: 02/15/2021] [Indexed: 12/24/2022]
Abstract
Non-small-cell lung cancer (NSCLC) represents a heterogeneous group of malignancies that are the leading cause of cancer-related death worldwide. Although many NSCLC-related genes and pathways have been identified, there remains an urgent need to mechanistically understand how these genes and pathways drive NSCLC. Here, we propose a knowledge-guided and network-based integration method, called the node and edge Prioritization-based Community Analysis, to identify functional modules and their candidate targets in NSCLC. The protein–protein interaction network was prioritized by running a random walk with restart algorithm based on NSCLC seed genes and integrated edge weights, and then a “community network” was constructed by combining the Girvan–Newman and Label Propagation algorithms. This systems biology analysis revealed that the CCNB1-mediated network in the largest community provides a modular biomarker, the second community serves as a drug regulatory module, and the two are connected by some contextual signaling motifs. Moreover, integrating structural information into the signaling network suggested novel protein–protein interactions with therapeutic significance, such as interactions between GNG11 and CXCR2, CXCL3, and PPBP. This study provides new mechanistic insights into the landscape of cellular functions in the context of modular networks and will help in developing therapeutic targets for NSCLC.
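As an illustration of the random walk with restart step mentioned in this abstract, the following is a minimal sketch only, not the authors' implementation. The function name `random_walk_with_restart`, the restart probability of 0.5, the convergence tolerance, and the four-node toy network are all assumptions chosen for the example. It uses the common formulation in which a walker follows weighted edges with probability 1 - r and jumps back to the seed nodes with probability r, so nodes close to the seeds receive higher scores.

```python
import numpy as np


def random_walk_with_restart(adjacency, seeds, restart_prob=0.5,
                             tol=1e-10, max_iter=1000):
    """Score every node by its proximity to the seed nodes.

    adjacency: square (optionally weighted) symmetric matrix of the network.
    seeds: indices of the seed nodes (hypothetical stand-ins for seed genes).
    """
    A = np.asarray(adjacency, dtype=float)

    # Column-normalize so each column is a transition distribution.
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # leave isolated nodes as zero columns
    W = A / col_sums

    # Restart vector: uniform probability mass over the seed nodes.
    p0 = np.zeros(A.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)

    # Iterate p <- (1 - r) * W p + r * p0 until the L1 change is below tol.
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart_prob) * (W @ p) + restart_prob * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Toy path network 0 - 1 - 2 - 3 with node 0 as the only seed.
toy_network = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(toy_network, seeds=[0])
print(scores)
```

On this toy network the scores decrease with distance from the seed, which is the property used to rank nodes. Edge weights, if available, can be passed directly as the entries of the adjacency matrix.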
Affiliation(s)
- Fan Wang
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China; (F.W.); (S.H.); (J.Y.)
- Shuqing Han
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China; (F.W.); (S.H.); (J.Y.)
- Ji Yang
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China; (F.W.); (S.H.); (J.Y.)
- Wenying Yan
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China; (F.W.); (S.H.); (J.Y.)
- Correspondence: (W.Y.); (G.H.)
- Guang Hu
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou 215123, China; (F.W.); (S.H.); (J.Y.)
- State Key Laboratory of Radiation Medicine and Protection, Soochow University, Suzhou 215123, China
- Correspondence: (W.Y.); (G.H.)
15
|
Shen C, Luo J, Lai Z, Ding P. Multiview Joint Learning-Based Method for Identifying Small-Molecule-Associated MiRNAs by Integrating Pharmacological, Genomics, and Network Knowledge. J Chem Inf Model 2020; 60:4085-4097. [DOI: 10.1021/acs.jcim.0c00244] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/13/2022]
Affiliation(s)
- Cong Shen
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410083, China
- Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410083, China
- Zihan Lai
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410083, China
- Pingjian Ding
- School of Computer Science, University of South China, Hengyang 421001, China