1
Fan Y, Zhang C, Hu X, Huang Z, Xue J, Deng L. SGCLDGA: unveiling drug-gene associations through simple graph contrastive learning. Brief Bioinform 2024; 25:bbae231. [PMID: 38754409 PMCID: PMC11097980 DOI: 10.1093/bib/bbae231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 04/15/2024] [Accepted: 04/30/2024] [Indexed: 05/18/2024] Open
Abstract
Drug repurposing offers a viable strategy for discovering new drugs and therapeutic targets through the analysis of drug-gene interactions. However, traditional experimental methods are plagued by their costliness and inefficiency. Despite graph convolutional network (GCN)-based models' state-of-the-art performance in prediction, their reliance on supervised learning makes them vulnerable to data sparsity, a common challenge in drug discovery, further complicating model development. In this study, we propose SGCLDGA, a novel computational model leveraging graph neural networks and contrastive learning to predict unknown drug-gene associations. SGCLDGA employs GCNs to extract vector representations of drugs and genes from the original bipartite graph. Subsequently, singular value decomposition (SVD) is employed to enhance the graph and generate multiple views. The model performs contrastive learning across these views, optimizing vector representations through a contrastive loss function to better distinguish positive and negative samples. The final step involves utilizing inner product calculations to determine association scores between drugs and genes. Experimental results on the DGIdb4.0 dataset demonstrate SGCLDGA's superior performance compared with six state-of-the-art methods. Ablation studies and case analyses validate the significance of contrastive learning and SVD, highlighting SGCLDGA's potential in discovering new drug-gene associations. The code and dataset for SGCLDGA are freely available at https://github.com/one-melon/SGCLDGA.
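The contrastive and scoring steps described in this abstract can be illustrated with a minimal sketch. It assumes an InfoNCE-style loss with cosine similarity and a temperature of 0.5, a common choice in graph contrastive learning that the abstract itself does not specify. The function names `info_nce` and `association_scores` and the toy embeddings are hypothetical, and the GCN encoder and SVD-based view generation are omitted.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss (assumed form, not taken from the paper).

    Row i of view_a is the anchor, row i of view_b is its positive,
    and every other row of view_b acts as a negative.
    """
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        log_denominator = math.log(sum(math.exp(x) for x in logits))
        total += log_denominator - logits[i]
    return total / len(view_a)


def association_scores(drug_emb, gene_emb):
    """Score every drug-gene pair by the inner product of their embeddings."""
    return [[dot(d, g) for g in gene_emb] for d in drug_emb]


if __name__ == "__main__":
    # Toy embeddings (hypothetical): two nodes seen in two views.
    original_view = [[1.0, 0.0], [0.0, 1.0]]
    aligned_view = [[1.0, 0.0], [0.0, 1.0]]
    mismatched_view = [[0.0, 1.0], [1.0, 0.0]]
    print(info_nce(original_view, aligned_view))
    print(info_nce(original_view, mismatched_view))
    print(association_scores([[1, 2]], [[3, 4], [0, 1]]))
```

In this toy setting, two aligned views give a lower loss than two mismatched views, which is the behaviour a contrastive objective rewards.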
Affiliation(s)
- Yanhao Fan
  - School of Computer Science and Engineering, Central South University, 410075, Changsha, China
- Che Zhang
  - School of Software, Xinjiang University, 830046, Urumqi, China
- Xiaowen Hu
  - School of Computer Science and Engineering, Central South University, 410075, Changsha, China
- Zhijian Huang
  - School of Computer Science and Engineering, Central South University, 410075, Changsha, China
- Jiameng Xue
  - School of Computer Science and Engineering, Central South University, 410075, Changsha, China
- Lei Deng
  - School of Computer Science and Engineering, Central South University, 410075, Changsha, China
2
Mokhtaridoost M, Maass PG, Gönen M. Identifying Tissue- and Cohort-Specific RNA Regulatory Modules in Cancer Cells Using Multitask Learning. Cancers (Basel) 2022; 14:cancers14194939. [PMID: 36230862 PMCID: PMC9563725 DOI: 10.3390/cancers14194939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Revised: 09/30/2022] [Accepted: 10/06/2022] [Indexed: 11/24/2022] Open
Abstract
Simple Summary: Understanding the underlying biological mechanisms of primary tumors is crucial for predicting how tumors respond to therapies and exploring accurate treatment strategies. miRNA–mRNA interactions have a major effect on many biological processes that are important in the formation and progression of cancer. In this study, we introduced a computational pipeline to extract tissue- and cohort-specific miRNA–mRNA regulatory modules of multiple cancer types from the same origin using miRNA and mRNA expression profiles of primary tumors. Our model identified regulatory modules of underlying cancer types (i.e., cohort-specific) and shared regulatory modules between cohorts (i.e., tissue-specific).
Abstract: MicroRNA (miRNA) alterations significantly impact the formation and progression of human cancers. miRNAs interact with messenger RNAs (mRNAs) to facilitate degradation or translational repression. Thus, identifying miRNA–mRNA regulatory modules in cohorts of primary tumor tissues is fundamental for understanding the biology of tumor heterogeneity and for precise diagnosis and treatment. We established a multitask learning sparse regularized factor regression (MSRFR) method to determine key tissue- and cohort-specific miRNA–mRNA regulatory modules from expression profiles of tumors. MSRFR simultaneously models the sparse relationship between miRNAs and mRNAs and extracts tissue- and cohort-specific miRNA–mRNA regulatory modules separately. We tested the model’s ability to determine cohort-specific regulatory modules of multiple cancer cohorts from the same tissue and their underlying tissue-specific regulatory modules by extracting similarities between cancer cohorts (i.e., blood, kidney, and lung). We also detected tissue-specific and cohort-specific signatures in the corresponding regulatory modules by comparing our findings from various other tissues. We show that MSRFR effectively determines cancer-related miRNAs in cohort-specific regulatory modules, distinguishes tissue- and cohort-specific regulatory modules from each other, and extracts tissue-specific information from different cohorts of disease-related tissue. Our findings indicate that the MSRFR model can support current efforts in precision medicine to define tumor-specific miRNA–mRNA signatures.
Affiliation(s)
- Milad Mokhtaridoost
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 1X8, Canada
  - Graduate School of Sciences and Engineering, Koç University, İstanbul 34450, Turkey
- Philipp G. Maass
  - Genetics & Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 1X8, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Mehmet Gönen
  - Department of Industrial Engineering, College of Engineering, Koç University, İstanbul 34450, Turkey
  - School of Medicine, Koç University, İstanbul 34450, Turkey
  - Correspondence: Tel.: +90-212-338-1813
3
MHDMF: Prediction of miRNA-disease associations based on Deep Matrix Factorization with Multi-source Graph Convolutional Network. Comput Biol Med 2022; 149:106069. [PMID: 36115300 DOI: 10.1016/j.compbiomed.2022.106069] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/06/2022] [Revised: 07/31/2022] [Accepted: 08/27/2022] [Indexed: 11/24/2022]
Abstract
A growing number of studies have shown that microRNAs (miRNAs) are crucial biomarkers in diverse biological processes affecting various diseases. As a good complement to high-cost wet-lab experimental methods, numerous computational prediction methods have emerged. However, challenges remain in handling the high rate of false-negative associations and in making effective use of multi-source information to find potential associations. In this work, we develop an end-to-end computational framework, called MHDMF, which integrates multi-source information on a heterogeneous network to discover latent disease-miRNA associations. Since many false negatives exist among the miRNA-disease associations, MHDMF uses a multi-source Graph Convolutional Network (GCN) to correct false-negative associations by reformulating the miRNA-disease association score matrix. The score matrix reformulation is based on different similarity profiles and known associations between miRNAs, genes, and diseases. Then, MHDMF employs Deep Matrix Factorization (DMF) to predict miRNA-disease associations based on the reformulated miRNA-disease association score matrix. The experimental results show that the proposed framework outperforms highly related comparison methods by a large margin on miRNA-disease association prediction tasks. Furthermore, case studies suggest that MHDMF could be a convenient and efficient tool and may offer a new way to think about miRNA-disease association prediction.
4
Hamamoto R, Takasawa K, Machino H, Kobayashi K, Takahashi S, Bolatkan A, Shinkai N, Sakai A, Aoyama R, Yamada M, Asada K, Komatsu M, Okamoto K, Kameoka H, Kaneko S. Application of non-negative matrix factorization in oncology: one approach for establishing precision medicine. Brief Bioinform 2022; 23:6628783. [PMID: 35788277 PMCID: PMC9294421 DOI: 10.1093/bib/bbac246] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 04/01/2022] [Revised: 05/06/2022] [Accepted: 05/25/2022] [Indexed: 12/19/2022] Open
Abstract
Rising expectations of artificial intelligence (AI) technology have led to machine learning being actively used in the medical field. Non-negative matrix factorization (NMF) is a machine learning technique used for image analysis, speech recognition, and language processing; recently, it has also been applied to medical research. Precision medicine, wherein important information is extracted from large-scale medical data to provide optimal medical care for every individual, is considered important in medical policies globally, and the application of machine learning techniques to this end is being handled in several ways. NMF is also introduced differently because of the characteristics of its algorithms. In this review, the importance of NMF in the field of medicine, with a focus on oncology, is described by explaining the mathematical science of NMF and the characteristics of the algorithm, providing examples of how NMF can be used to establish precision medicine, and presenting the challenges of NMF. Finally, directions for the effective use of NMF in oncology are also discussed.
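The factorization that this review discusses can be made concrete with a small sketch. It approximates a non-negative matrix V by non-negative factors W and H using the classic multiplicative update rules for the Frobenius-norm objective (Lee and Seung). This is one standard NMF algorithm and not necessarily the variant the review emphasizes; the function names, the toy matrix, and the `eps` guard against division by zero are illustrative assumptions.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize V (n x m) into W (n x rank) and H (rank x m), all non-negative."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        numer = matmul(wt, v)
        denom = matmul(matmul(wt, w), h)
        h = [[h[a][j] * numer[a][j] / (denom[a][j] + eps)
              for j in range(m)] for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        numer = matmul(v, ht)
        denom = matmul(w, matmul(h, ht))
        w = [[w[i][a] * numer[i][a] / (denom[i][a] + eps)
              for a in range(rank)] for i in range(n)]
    return w, h


if __name__ == "__main__":
    # Toy rank-1 matrix (hypothetical data): every row is a multiple of [1, 2, 3].
    v = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
    w, h = nmf(v, rank=1)
    print(frobenius_error(v, w, h))
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout, which is what makes the factors interpretable as additive parts.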
Affiliation(s)
- Rina Aoyama
  - Showa University Graduate School of Medicine School of Medicine
- Ken Asada
  - RIKEN Center for Advanced Intelligence Project
5
Xiao Q, Dai J, Luo J. A survey of circular RNAs in complex diseases: databases, tools and computational methods. Brief Bioinform 2021; 23:6407737. [PMID: 34676391 DOI: 10.1093/bib/bbab444] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/13/2021] [Revised: 09/21/2021] [Accepted: 09/28/2021] [Indexed: 01/22/2023] Open
Abstract
Circular RNAs (circRNAs) are a newly discovered category of competing endogenous non-coding RNAs that have been implicated in many complex human diseases. A large number of circRNAs have been confirmed to be involved in cancer progression and are expected to become promising biomarkers for tumor diagnosis and targeted therapy. Deciphering the underlying relationships between circRNAs and diseases may provide new insights into the pathogenesis of complex diseases and further characterize the biological functions of circRNAs. As traditional experimental methods are usually time-consuming and laborious, computational models have made significant progress in systematically exploring potential circRNA-disease associations, which not only creates new opportunities for investigating pathogenic mechanisms at the level of circRNAs, but also helps to significantly improve the efficiency of clinical trials. In this review, we first summarize the functions and characteristics of circRNAs and introduce some representative circRNAs related to tumorigenesis. Then, we investigate the available databases and tools dedicated to circRNA and disease studies. Next, we present a comprehensive review of computational methods for predicting circRNA-disease associations and classify them into five categories: network propagation-based, path-based, matrix factorization-based, deep learning-based and other machine learning methods. Finally, we discuss the challenges and future research directions in this field.
Affiliation(s)
- Qiu Xiao
  - Hunan Normal University and Hunan Xiangjiang Artificial Intelligence Academy, Changsha, China
- Jianhua Dai
  - Hunan Normal University and Hunan Xiangjiang Artificial Intelligence Academy, Changsha, China
- Jiawei Luo
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
6
Zhang J, Liu L, Xu T, Zhang W, Li J, Rao N, Le TD. Time to infer miRNA sponge modules. WILEY INTERDISCIPLINARY REVIEWS-RNA 2021; 13:e1686. [PMID: 34342388 DOI: 10.1002/wrna.1686] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/06/2021] [Revised: 07/14/2021] [Accepted: 07/14/2021] [Indexed: 01/01/2023]
Abstract
Inferring competing endogenous RNA (ceRNA) or microRNA (miRNA) sponge modules is a challenging and meaningful task for revealing ceRNA regulatory mechanisms at the module level. Modules in this context refer to groups of miRNA sponges that compete with one another and act as functional units in biological processes. The recent development of computational methods based on heterogeneous data provides a novel way to discern the competitive effects of miRNA sponges on complex human diseases. This article aims to provide a comprehensive perspective on miRNA sponge module discovery methods. We first review the publicly available databases of cancer-related miRNA sponges, as the miRNA sponges involved in human cancers contribute to the discovery of cancer-associated modules. Then we review the existing computational methods for inferring miRNA sponge modules. Furthermore, we assess the performance of the module discovery methods on a pan-cancer dataset, and the comparison study indicates that it is useful to infer biologically meaningful miRNA sponge modules by directly mapping heterogeneous data to the competitive modules. Finally, we discuss the future directions and associated challenges in developing in silico methods to infer miRNA sponge modules. This article is categorized under: RNA Interactions with Proteins and Other Molecules > Small Molecule-RNA Interactions; Regulatory RNAs/RNAi/Riboswitches > Regulatory RNAs.
Affiliation(s)
- Junpeng Zhang
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
  - School of Engineering, Dali University, Dali, Yunnan, China
- Lin Liu
  - UniSA STEM, University of South Australia, Mawson Lakes, South Australia, Australia
- Taosheng Xu
  - Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, Anhui, China
- Wu Zhang
  - School of Agriculture and Biological Sciences, Dali University, Dali, Yunnan, China
- Jiuyong Li
  - UniSA STEM, University of South Australia, Mawson Lakes, South Australia, Australia
- Nini Rao
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Thuc Duy Le
  - UniSA STEM, University of South Australia, Mawson Lakes, South Australia, Australia
7
Xiao Q, Fu Y, Yang Y, Dai J, Luo J. NSL2CD: identifying potential circRNA-disease associations based on network embedding and subspace learning. Brief Bioinform 2021; 22:6265177. [PMID: 33954582 DOI: 10.1093/bib/bbab177] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/14/2021] [Revised: 03/29/2021] [Accepted: 04/14/2021] [Indexed: 12/28/2022] Open
Abstract
Many studies have shown that circular RNAs (circRNAs) are important regulators in various pathological processes and play vital roles in many human diseases, and could serve as promising biomarkers for disease diagnosis, treatment and prognosis. However, the functions of most circRNAs remain to be unraveled, and it is time-consuming and costly to uncover the relationships between circRNAs and diseases by conventional experimental methods. Thus, identifying candidate circRNAs for human diseases offers new opportunities to understand the functional properties of circRNAs and the pathogenesis of diseases. In this study, we propose a novel network embedding-based adaptive subspace learning method (NSL2CD) for predicting potential circRNA-disease associations and discovering disease-related circRNA candidates. The proposed method first calculates disease similarities and circRNA similarities by fully utilizing different data sources and learns low-dimensional node representations with network embedding methods. Then, we adopt an adaptive subspace learning model to discover potential associations between circRNAs and diseases. Meanwhile, an integrated weighted graph regularization term is imposed to preserve local geometric structures of data spaces, and an L1,2-norm constraint is also incorporated into the model to realize the smoothness and sparsity of projection matrices. The experimental results show that NSL2CD achieves comparable performance under different evaluation metrics, and case studies further confirm its ability to discover potential candidate circRNAs for human diseases.
Affiliation(s)
- Qiu Xiao
  - Hunan Normal University and Hunan Xiangjiang Artificial Intelligence Academy, China
- Yu Fu
  - Hunan Normal University, China
- Yide Yang
  - School of Medicine, Hunan Normal University, China
- Jianhua Dai
  - Hunan Normal University and Hunan Xiangjiang Artificial Intelligence Academy, China
8
Zhang J, Liu L, Xu T, Zhang W, Zhao C, Li S, Li J, Rao N, Le TD. miRSM: an R package to infer and analyse miRNA sponge modules in heterogeneous data. RNA Biol 2021; 18:2308-2320. [PMID: 33822666 DOI: 10.1080/15476286.2021.1905341] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022] Open
Abstract
In molecular biology, microRNA (miRNA) sponges are RNA transcripts which compete with other RNA transcripts for binding with miRNAs. Research has shown that miRNA sponges have a fundamental impact on tissue development and disease progression. Generally, to achieve a specific biological function, miRNA sponges tend to form modules or communities in a biological system. Until now, however, there has been a lack of tools to help researchers infer and analyse miRNA sponge modules from heterogeneous data. To fill this gap, we develop an R/Bioconductor package, miRSM, to facilitate inferring and analysing miRNA sponge modules. miRSM provides a collection of 50 co-expression analysis methods to identify gene co-expression modules (which are candidate miRNA sponge modules), four module discovery methods to infer miRNA sponge modules and seven modular analysis methods for investigating miRNA sponge modules. miRSM will enable researchers to quickly apply new datasets to infer and analyse miRNA sponge modules, and will consequently accelerate research on miRNA sponges.
Affiliation(s)
- Junpeng Zhang
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
  - School of Engineering, Dali University, Dali, Yunnan, China
- Lin Liu
  - UniSA STEM, University of South Australia, Mawson Lakes, SA, Australia
- Taosheng Xu
  - Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, Anhui, China
- Wu Zhang
  - School of Agriculture and Biological Sciences, Dali University, Dali, Yunnan, China
- Chunwen Zhao
  - School of Engineering, Dali University, Dali, Yunnan, China
- Sijing Li
  - School of Engineering, Dali University, Dali, Yunnan, China
- Jiuyong Li
  - UniSA STEM, University of South Australia, Mawson Lakes, SA, Australia
- Nini Rao
  - Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Thuc Duy Le
  - UniSA STEM, University of South Australia, Mawson Lakes, SA, Australia
9
Mokhtaridoost M, Gönen M. An efficient framework to identify key miRNA-mRNA regulatory modules in cancer. Bioinformatics 2020; 36:i592-i600. [PMID: 33381822 DOI: 10.1093/bioinformatics/btaa798] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Micro-RNAs (miRNAs) are known as important components of RNA silencing and post-transcriptional gene regulation, and they interact with messenger RNAs (mRNAs) either by degradation or by translational repression. miRNA alterations have a significant impact on the formation and progression of human cancers. Accordingly, it is important to establish computational methods with high predictive performance to identify cancer-specific miRNA-mRNA regulatory modules.
RESULTS: We presented a two-step framework to model miRNA-mRNA relationships and identify cancer-specific modules between miRNAs and mRNAs from their matched expression profiles of more than 9000 primary tumors. We first estimated the regulatory matrix between miRNA and mRNA expression profiles by solving multiple linear programming problems. We then formulated a unified regularized factor regression (RFR) model that simultaneously estimates the effective number of modules (i.e. latent factors) and extracts modules by decomposing the regulatory matrix into two low-rank matrices. Our RFR model groups correlated miRNAs together and correlated mRNAs together, and also controls the sparsity levels of both matrices. These attributes lead to interpretable results with high predictive performance. We applied our method to a very comprehensive data collection covering 32 TCGA cancer types. To assess the biological relevance of our approach, we performed functional gene set enrichment and survival analyses. A large portion of the identified modules are significantly enriched in Hallmark, PID and KEGG pathways/gene sets. To validate the identified modules, we also performed literature validation as well as validation using the experimentally supported miRTarBase database.
AVAILABILITY AND IMPLEMENTATION: Our implementation of the proposed two-step RFR algorithm in R is available at https://github.com/MiladMokhtaridoost/2sRFR together with the scripts that replicate the reported experiments.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mehmet Gönen
  - Department of Industrial Engineering, College of Engineering, İstanbul 34450, Turkey
  - School of Medicine, Koç University, İstanbul 34450, Turkey
  - Department of Biomedical Engineering, School of Medicine, Oregon Health & Science University, Portland, OR 97239, USA
10
Xiao Q, Zhong J, Tang X, Luo J. iCDA-CMG: identifying circRNA-disease associations by federating multi-similarity fusion and collective matrix completion. Mol Genet Genomics 2020; 296:223-233. [PMID: 33159254 DOI: 10.1007/s00438-020-01741-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/11/2020] [Accepted: 10/23/2020] [Indexed: 01/22/2023]
Abstract
Circular RNAs (circRNAs) are a special class of non-coding RNAs with covalently closed-loop structures. Studies have shown that circRNAs play critical roles in various biological processes, and the aberrant expression of circRNAs is closely related to tumorigenesis. Therefore, identifying potential circRNA-disease associations helps in understanding the pathogenesis of complex diseases at the circRNA level and helps biomedical researchers and practitioners to discover diagnostic biomarkers accurately. However, it is tremendously laborious and time-consuming to discover disease-related circRNAs with conventional biological experiments. In this study, we develop an integrative framework, called iCDA-CMG, to predict potential associations between circRNAs and diseases. By incorporating multi-source prior knowledge, including known circRNA-disease associations, disease similarities and circRNA similarities, we adopt a collective matrix completion-based graph learning model to prioritize the most promising disease-related circRNAs for guiding laborious clinical trials. The results show that iCDA-CMG outperforms other state-of-the-art models in terms of cross-validation and independent prediction. Moreover, the case studies for several representative cancers suggest the effectiveness of iCDA-CMG in screening circRNA candidates for human diseases, which will contribute to elucidating pathogenic mechanisms and unveiling new opportunities for disease diagnosis and targeted therapy.
Affiliation(s)
- Qiu Xiao
  - Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
  - Hunan Xiangjiang Artificial Intelligence Academy, Changsha, 410000, China
- Jiancheng Zhong
  - Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
- Xiwei Tang
  - School of Information Science and Engineering, Hunan First Normal University, Changsha, 410205, China
- Jiawei Luo
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
11
Xiao Q, Yu H, Zhong J, Liang C, Li G, Ding P, Luo J. An in-silico method with graph-based multi-label learning for large-scale prediction of circRNA-disease associations. Genomics 2020; 112:3407-3415. [DOI: 10.1016/j.ygeno.2020.06.017] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/03/2020] [Revised: 06/08/2020] [Accepted: 06/09/2020] [Indexed: 01/03/2023]
12
Huang J, Chen J, Zhang B, Zhu L, Cai H. Evaluation of gene-drug common module identification methods using pharmacogenomics data. Brief Bioinform 2020; 22:5860683. [PMID: 32591780 DOI: 10.1093/bib/bbaa087] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/30/2020] [Revised: 04/06/2020] [Accepted: 04/23/2020] [Indexed: 01/21/2023] Open
Abstract
Accurately identifying the interactions between genomic factors and the response to cancer drugs plays an important role in drug discovery, drug repositioning and cancer treatment. A number of studies have revealed that interactions between genes and drugs are 'many-genes-to-many-drugs' interactions, i.e. common modules, as opposed to 'one-gene-to-one-drug' interactions. Such modules fully explain the interactions between complex biological regulatory mechanisms and cancer drugs. However, strategies for effectively and robustly identifying the underlying common modules in pharmacogenomics data remain to be improved. In this paper, we aim to provide a detailed evaluation of three categories of state-of-the-art common module identification techniques from a machine learning perspective, including non-negative matrix factorization (NMF), partial least squares (PLS) and network analyses. We first evaluate the performance of six methods, namely SNMNMF, NetNMF, SNPLS, O2PLS, NSBM and HOGMMNC, using two series of simulated data sets with different noise levels and outlier ratios. Then, we conduct experiments using a real-world data set of 2091 genes and 101 drugs in 392 cancer cell lines and compare the experimental results in terms of biological process term enrichment, gene-drug and drug-drug interactions. Finally, we present interesting findings from our evaluation study and discuss the advantages and drawbacks of each method. Supplementary information: Supplementary file is available at Briefings in Bioinformatics online.
Affiliation(s)
- Jie Huang
  - South China University of Technology, School of Computer Science and Engineering, Guangzhou, 510006, China
- Jiazhou Chen
  - South China University of Technology, School of Computer Science and Engineering, Guangzhou, 510006, China
- Bin Zhang
  - South China University of Technology, School of Computer Science and Engineering, Guangzhou, 510006, China
- Lei Zhu
  - South China University of Technology, School of Computer Science and Engineering, Guangzhou, 510006, China
- Hongmin Cai
  - South China University of Technology, School of Computer Science and Engineering, Guangzhou, 510006, China
13
LMSM: A modular approach for identifying lncRNA related miRNA sponge modules in breast cancer. PLoS Comput Biol 2020; 16:e1007851. [PMID: 32324747 PMCID: PMC7200020 DOI: 10.1371/journal.pcbi.1007851] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/12/2019] [Revised: 05/05/2020] [Accepted: 04/06/2020] [Indexed: 12/12/2022] Open
Abstract
Until now, existing methods for identifying lncRNA related miRNA sponge modules have mainly relied on lncRNA related miRNA sponge interaction networks, which may not provide a full picture of miRNA sponging activities in biological conditions. Hence there is a strong need for new computational methods to identify lncRNA related miRNA sponge modules. In this work, we propose a framework, LMSM, to identify LncRNA related MiRNA Sponge Modules from heterogeneous data. To understand the miRNA sponging activities in biological conditions, LMSM uses gene expression data to evaluate the influence of the shared miRNAs on the clustered sponge lncRNAs and mRNAs. We have applied LMSM to the human breast cancer (BRCA) dataset from The Cancer Genome Atlas (TCGA). As a result, we have found that the majority of LMSM modules are significantly implicated in BRCA and most of them are BRCA subtype-specific. Most of the mediating miRNAs act as crosslinks across different LMSM modules, and all LMSM modules are statistically significant. Multi-label classification analysis shows that the performance of LMSM modules is significantly higher than the baseline's performance, indicating the biological relevance of LMSM modules in classifying BRCA subtypes. The consistent results suggest that LMSM is robust in identifying lncRNA related miRNA sponge modules. Moreover, LMSM can be used to predict miRNA targets. Finally, LMSM outperforms a graph clustering-based strategy in identifying BRCA-related modules. Altogether, our study shows that LMSM is a promising method to investigate the modular regulatory mechanism of sponge lncRNAs from heterogeneous data.
Previous studies have revealed that long non-coding RNAs (lncRNAs), as microRNA (miRNA) sponges or competing endogenous RNAs (ceRNAs), can regulate the expression levels of messenger RNAs (mRNAs) by decreasing the amount of miRNAs interacting with mRNAs. In this work, we hypothesize that the “tug-of-war” between RNA transcripts for attracting miRNAs takes place across groups or modules. Based on this hypothesis, we propose a framework called LMSM to identify LncRNA related MiRNA Sponge Modules. Based on the two miRNA sponge modular competition principles, significant sharing of miRNAs and high canonical correlation between the sponge lncRNAs and mRNAs, LMSM is also capable of predicting miRNA targets. LMSM not only extends the ceRNA hypothesis, but also provides a novel way to investigate the biological functions and modular mechanism of lncRNAs in breast cancer.
14
Xiao Q, Zhang N, Luo J, Dai J, Tang X. Adaptive multi-source multi-view latent feature learning for inferring potential disease-associated miRNAs. Brief Bioinform 2020; 22:2043-2057. [PMID: 32186712 DOI: 10.1093/bib/bbaa028] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 12/13/2019] [Revised: 02/16/2020] [Accepted: 01/14/2020] [Indexed: 12/13/2022] Open
Abstract
Accumulating evidence has shown that microRNAs (miRNAs) play crucial roles in different biological processes, and their mutations and dysregulation have been shown to contribute to tumorigenesis. In silico identification of disease-associated miRNAs is a cost-effective strategy to discover the most promising biomarkers for disease diagnosis and treatment. The increasingly available omics data sources provide unprecedented opportunities to decipher the underlying relationships between miRNAs and diseases with computational models. However, most existing methods are biased towards a single representation of miRNAs or diseases and are also not capable of discovering unobserved associations for new miRNAs or diseases without association information. In this study, we present a novel computational method with adaptive multi-source multi-view latent feature learning (M2LFL) to infer potential disease-associated miRNAs. First, we adopt multiple data sources to obtain similarity profiles and capture different latent features according to the geometric characteristics of the miRNA and disease spaces. Then, the multi-modal latent features are projected to a common subspace to discover unobserved miRNA-disease associations in both miRNA and disease views, and an adaptive joint graph regularization term is developed to preserve the intrinsic manifold structures of multiple similarity profiles. Meanwhile, Lp,q-norms are imposed on the projection matrices to ensure sparsity and improve interpretability. The experimental results confirm the superior performance of our proposed method in screening reliable candidate disease miRNAs, which suggests that M2LFL could be an efficient tool to discover diagnostic biomarkers for guiding laborious clinical trials.
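The mixed-norm penalty mentioned in this abstract can be sketched in a few lines. The abstract does not define the Lp,q-norm, so the convention below is an assumption: take the p-norm of each row, then the q-norm of the resulting vector of row norms. Conventions differ between papers (some apply the inner norm to columns), and the function name `lpq_norm` is hypothetical.

```python
def lpq_norm(matrix, p=2.0, q=1.0):
    """Mixed L_{p,q} norm under an assumed row-wise convention.

    Inner step: p-norm of each row.
    Outer step: q-norm of the vector of row norms.
    With p=2 and q=1 this is the row-sparsity-inducing L_{2,1} norm;
    with p=q=2 it reduces to the Frobenius norm.
    """
    row_norms = [sum(abs(x) ** p for x in row) ** (1.0 / p) for row in matrix]
    return sum(r ** q for r in row_norms) ** (1.0 / q)


if __name__ == "__main__":
    # Toy projection matrix (hypothetical values): the second row is all zeros,
    # so it contributes nothing to the penalty.
    projection = [[3.0, 4.0], [0.0, 0.0]]
    print(lpq_norm(projection, p=2.0, q=1.0))
    print(lpq_norm(projection, p=2.0, q=2.0))
```

With q = 1 the penalty sums whole-row magnitudes, so minimizing it tends to drive entire rows of a projection matrix to zero, which is one way such a term can encourage sparsity.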
15
16
Xiao Q, Dai J, Luo J, Fujita H. Multi-view manifold regularized learning-based method for prioritizing candidate disease miRNAs. Knowl Based Syst 2019. [DOI: 10.1016/j.knosys.2019.03.023] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Indexed: 01/05/2023]