1
Pham VVH, Jue TR, Bell JL, Luciani F, Michniewicz F, Cirillo G, Vahdat L, Mayoh C, Vittorio O. A novel network-based method identifies a cuproplasia-related pan-cancer gene signature to predict patient outcome. Hum Genet 2024; 143:1145-1162. [PMID: 38642129 PMCID: PMC11485146 DOI: 10.1007/s00439-024-02673-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 03/26/2024] [Indexed: 04/22/2024]
Abstract
Copper is a vital micronutrient involved in many biological processes and is an essential component of tumour cell growth and migration. Copper influences tumour growth through a process called cuproplasia, defined as abnormal copper-dependent cell-growth and proliferation. Copper-chelation therapy targeting this process has demonstrated efficacy in several clinical trials against cancer. While the molecular pathways associated with cuproplasia are partially known, genetic heterogeneity across different cancer types has limited the understanding of how cuproplasia impacts patient survival. Utilising RNA-sequencing data from The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) datasets, we generated gene regulatory networks to identify the critical cuproplasia-related genes across 23 different cancer types. From this, we identified a novel 8-gene cuproplasia-related gene signature associated with pan-cancer survival, and a 6-gene prognostic risk score model in low grade glioma. These findings highlight the use of gene regulatory networks to identify cuproplasia-related gene signatures that could be used to generate risk score models. This can potentially identify patients who could benefit from copper-chelation therapy and identifies novel targeted therapeutic strategies.
Affiliation(s)
- Vu Viet Hoang Pham
  - Children's Cancer Institute, Lowy Cancer Research Centre, UNSW, Kensington, NSW, Australia
  - School of Biomedical Sciences, UNSW Sydney, Kensington, NSW, Australia
- Toni Rose Jue
  - Children's Cancer Institute, Lowy Cancer Research Centre, UNSW, Kensington, NSW, Australia
  - School of Biomedical Sciences, UNSW Sydney, Kensington, NSW, Australia
- Jessica Lilian Bell
  - Children's Cancer Institute, Lowy Cancer Research Centre, UNSW, Kensington, NSW, Australia
  - School of Biomedical Sciences, UNSW Sydney, Kensington, NSW, Australia
- Fabio Luciani
  - School of Biomedical Sciences, UNSW Sydney, Kensington, NSW, Australia
- Filip Michniewicz
  - School of Biomedical Sciences, UNSW Sydney, Kensington, NSW, Australia
- Giuseppe Cirillo
  - Department of Pharmacy, Health and Nutritional Sciences, University of Calabria, Rende, Italy
- Linda Vahdat
  - Dartmouth-Hitchcock Medical Center, Lebanon, New Hampshire, US
- Chelsea Mayoh
  - Children's Cancer Institute, Lowy Cancer Research Centre, UNSW, Kensington, NSW, Australia
  - School of Clinical Medicine, UNSW Medicine & Health, UNSW Sydney, Kensington, NSW, Australia
- Orazio Vittorio
  - Children's Cancer Institute, Lowy Cancer Research Centre, UNSW, Kensington, NSW, Australia
  - School of Biomedical Sciences, UNSW Sydney, Kensington, NSW, Australia
2
Wang B, He J, Meng Q. Detection of minimal extended driver nodes in energetic costs reduction. Chaos (Woodbury, N.Y.) 2024; 34:083122. [PMID: 39146454 DOI: 10.1063/5.0214746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Accepted: 07/29/2024] [Indexed: 08/17/2024]
Abstract
Structures of complex networks are fundamental to system dynamics, where node state and connectivity patterns determine the cost of a control system, a key aspect in unraveling complexity. However, minimizing the energy required to control a system with the fewest input nodes remains an open problem. This study investigates the relationship between the structure of closed-connected function modules and control energy. We discovered that small structural adjustments, such as adding a few extended driver nodes, can significantly reduce control energy. Thus, we propose MInimal extended driver nodes in Energetic costs Reduction (MIER). Next, we transform the detection of MIER into a multi-objective optimization problem and choose an NSGA-II algorithm to solve it. Compared with the baseline methods, NSGA-II can approximate the optimal solution to the greatest extent. Through experiments using synthetic and real data, we validate that MIER can exponentially decrease control energy. Furthermore, random perturbation tests confirm the stability of MIER. Subsequently, we applied MIER to three representative scenarios: regulation of differential expression genes affected by cancer mutations in the human protein-protein interaction network, trade relations among developed countries in the world trade network, and regulation of body-wall muscle cells by motor neurons in Caenorhabditis elegans nervous network. The results reveal that the involvement of MIER significantly reduces control energy required for these original modules from a topological perspective. Additionally, MIER nodes enhance functionality, supplement key nodes, and uncover potential mechanisms. Overall, our work provides practical computational tools for understanding and presenting control strategies in biological, social, and neural systems.
Affiliation(s)
- Bingbo Wang
  - School of Computer Science and Technology, Xidian University, Xi'an 710071, China
- Jiaojiao He
  - School of Computer Science and Technology, Xidian University, Xi'an 710071, China
- Qingdou Meng
  - School of Computer Science and Technology, Xidian University, Xi'an 710071, China
3
Ma X, Li Z, Du Z, Xu Y, Chen Y, Zhuo L, Fu X, Liu R. Advancing cancer driver gene detection via Schur complement graph augmentation and independent subspace feature extraction. Comput Biol Med 2024; 174:108484. [PMID: 38643595 DOI: 10.1016/j.compbiomed.2024.108484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2024] [Revised: 03/18/2024] [Accepted: 04/15/2024] [Indexed: 04/23/2024]
Abstract
Accurately identifying cancer driver genes (CDGs) is crucial for guiding cancer treatment and has recently received great attention from researchers. However, the high complexity and heterogeneity of cancer gene regulatory networks limit the prediction accuracy of existing deep learning models. To address this, we introduce a model called SCIS-CDG that utilizes Schur complement graph augmentation and independent subspace feature extraction techniques to effectively predict potential CDGs. Firstly, a random Schur complement strategy is adopted to generate two augmented views of the gene network within a graph contrastive learning framework. Rapid randomization of the random Schur complement strategy enhances the model's generalization and its ability to handle complex networks effectively. Upholding the Schur complement principle in expectation promotes the preservation of the original gene network's vital structure in the augmented views. Subsequently, we employ feature extraction technology using multiple independent subspaces, each trained with independent weights to reduce inter-subspace dependence and improve the model's expressiveness. Concurrently, we introduce a feature expansion component based on the structure of the gene network to address issues arising from the limited dimensionality of node features. Moreover, it can alleviate the challenges posed by the heterogeneity of cancer gene networks to some extent. Finally, we integrate a learnable attention weight mechanism into the graph neural network (GNN) encoder, utilizing feature expansion technology to optimize the significance of various feature levels in the prediction task. Following extensive experimental validation, the SCIS-CDG model has exhibited high efficiency in identifying known CDGs and uncovering potential unknown CDGs in external datasets. In particular, its performance has improved significantly compared with previous conventional GNN models.
The code and data are publicly available at: https://github.com/mxqmxqmxq/SCIS-CDG.
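As a rough illustration of the Schur complement idea mentioned in the abstract, the sketch below eliminates a random subset of vertices from a small weighted graph so that the Laplacian of the result equals the Schur complement of the original Laplacian onto the kept vertices. This is exact vertex elimination, not the randomized approximation that SCIS-CDG relies on; the function names `eliminate_vertex` and `schur_view` and the `keep_fraction` parameter are hypothetical and are not taken from the authors' repository.

```python
import random


def eliminate_vertex(adj, v):
    """Remove vertex v from a weighted undirected graph via one Schur complement step.

    adj maps each node to a dict {neighbour: weight} (symmetric, no self-loops).
    Every pair (a, b) of v's neighbours gains weight w_av * w_bv / deg(v).
    """
    nbrs = list(adj[v].items())
    deg = sum(w for _, w in nbrs)
    # Copy the graph without v and without edges pointing at v.
    new = {u: {x: w for x, w in nb.items() if x != v}
           for u, nb in adj.items() if u != v}
    for i, (a, wa) in enumerate(nbrs):
        for b, wb in nbrs[i + 1:]:
            extra = wa * wb / deg
            new[a][b] = new[a].get(b, 0.0) + extra
            new[b][a] = new[b].get(a, 0.0) + extra
    return new


def schur_view(adj, keep_fraction=0.5, seed=None):
    """Return an augmented view that keeps a random subset of the nodes (assumed scheme)."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    n_drop = len(nodes) - max(1, int(round(keep_fraction * len(nodes))))
    view = adj
    for v in rng.sample(nodes, n_drop):
        view = eliminate_vertex(view, v)
    return view


# Toy path graph a - b - c with unit weights.
path = {"a": {"b": 1.0}, "b": {"a": 1.0, "c": 1.0}, "c": {"b": 1.0}}
reduced = eliminate_vertex(path, "b")
view1 = schur_view(path, keep_fraction=0.67, seed=1)
view2 = schur_view(path, keep_fraction=0.67, seed=2)
```

Eliminating the middle vertex of the path leaves a single edge between the two end vertices, which is the graph analogue of combining two resistors in series. Two calls with different seeds give the two views that a contrastive objective would then compare.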
Affiliation(s)
- Xinqian Ma
  - School of Data Science and Artificial Intelligence, Wenzhou University of Technology, 325027, Wenzhou, China
- Zhen Li
  - School of Computer Science of Information Technology, Qiannan Normal University for Nationalities, Duyun, Guizhou 558000, China; Institute of Computational Science and Technology, Guangzhou University, 510000, Guangzhou, China
- Zhenya Du
  - Guangzhou Xinhua University, 510520, Guangzhou, China
- Yan Xu
  - School of Data Science and Artificial Intelligence, Wenzhou University of Technology, 325027, Wenzhou, China
- Yifan Chen
  - College of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha, Hunan, 410004, China
- Linlin Zhuo
  - School of Data Science and Artificial Intelligence, Wenzhou University of Technology, 325027, Wenzhou, China
- Xiangzheng Fu
  - College of Computer Science and Electronic Engineering, Hunan University, 410012, Changsha, China
- Ruijun Liu
  - School of Software, Beihang University, Beijing, China
4
Huang Y, Chen F, Sun H, Zhong C. Exploring gene-patient association to identify personalized cancer driver genes by linear neighborhood propagation. BMC Bioinformatics 2024; 25:34. [PMID: 38254011 PMCID: PMC10804660 DOI: 10.1186/s12859-024-05662-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 01/18/2024] [Indexed: 01/24/2024] Open
Abstract
BACKGROUND Driver genes play a vital role in the development of cancer. Identifying driver genes is critical for diagnosing and understanding cancer. However, challenges remain in identifying personalized driver genes due to tumor heterogeneity of cancer. Although many computational methods have been developed to solve this problem, few efforts have been undertaken to explore gene-patient associations to identify personalized driver genes. RESULTS Here we propose a method called LPDriver to identify personalized cancer driver genes by employing linear neighborhood propagation model on individual genetic data. LPDriver builds personalized gene network based on the genetic data of individual patients, extracts the gene-patient associations from the bipartite graph of the personalized gene network and utilizes a linear neighborhood propagation model to mine gene-patient associations to detect personalized driver genes. The experimental results demonstrate that as compared to the existing methods, our method shows competitive performance and can predict cancer driver genes in a more accurate way. Furthermore, these results also show that besides revealing novel driver genes that have been reported to be related with cancer, LPDriver is also able to identify personalized cancer driver genes for individual patients by their network characteristics even if the mutation data of genes are hidden. CONCLUSIONS LPDriver can provide an effective approach to predict personalized cancer driver genes, which could promote the diagnosis and treatment of cancer. The source code and data are freely available at https://github.com/hyr0771/LPDriver .
Affiliation(s)
- Yiran Huang
  - School of Computer, Electronics and Information, Guangxi University, Nanning, 530004, China
  - Key Laboratory of Parallel, Distributed and Intelligent Computing in Guangxi Universities and Colleges, Guangxi University, Nanning, 530004, China
  - Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, 530004, China
- Fuhao Chen
  - School of Computer, Electronics and Information, Guangxi University, Nanning, 530004, China
- Hongtao Sun
  - School of Computer, Electronics and Information, Guangxi University, Nanning, 530004, China
- Cheng Zhong
  - School of Computer, Electronics and Information, Guangxi University, Nanning, 530004, China
  - Key Laboratory of Parallel, Distributed and Intelligent Computing in Guangxi Universities and Colleges, Guangxi University, Nanning, 530004, China
  - Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, 530004, China
5
Nourbakhsh M, Degn K, Saksager A, Tiberti M, Papaleo E. Prediction of cancer driver genes and mutations: the potential of integrative computational frameworks. Brief Bioinform 2024; 25:bbad519. [PMID: 38261338 PMCID: PMC10805075 DOI: 10.1093/bib/bbad519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 11/27/2023] [Accepted: 12/11/2023] [Indexed: 01/24/2024] Open
Abstract
The vast amount of available sequencing data allows the scientific community to explore different genetic alterations that may drive cancer or favor cancer progression. Software developers have proposed a myriad of predictive tools, allowing researchers and clinicians to compare and prioritize driver genes and mutations and their relative pathogenicity. However, there is little consensus on the computational approach or a golden standard for comparison. Hence, benchmarking the different tools depends highly on the input data, indicating that overfitting is still a massive problem. One of the solutions is to limit the scope and usage of specific tools. However, such limitations force researchers to walk on a tightrope between creating and using high-quality tools for a specific purpose and describing the complex alterations driving cancer. While the knowledge of cancer development increases daily, many bioinformatic pipelines rely on single nucleotide variants or alterations in a vacuum without accounting for cellular compartments, mutational burden or disease progression. Even within bioinformatics and computational cancer biology, the research fields work in silos, risking overlooking potential synergies or breakthroughs. Here, we provide an overview of databases and datasets for building or testing predictive cancer driver tools. Furthermore, we introduce predictive tools for driver genes, driver mutations, and the impact of these based on structural analysis. Additionally, we suggest and recommend directions in the field to avoid silo-research, moving towards integrative frameworks.
Affiliation(s)
- Mona Nourbakhsh
  - Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
- Kristine Degn
  - Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
- Astrid Saksager
  - Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
- Matteo Tiberti
  - Cancer Structural Biology, Danish Cancer Institute, 2100 Copenhagen, Denmark
- Elena Papaleo
  - Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
  - Cancer Structural Biology, Danish Cancer Institute, 2100 Copenhagen, Denmark
6
Meng P, Wang G, Guo H, Jiang T. Identifying cancer driver genes using a two-stage random walk with restart on a gene interaction network. Comput Biol Med 2023; 158:106810. [PMID: 37011433 DOI: 10.1016/j.compbiomed.2023.106810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2022] [Revised: 03/08/2023] [Accepted: 03/20/2023] [Indexed: 04/03/2023]
Abstract
Cancer development and progression are significantly influenced by cancer driver genes. Understanding cancer driver genes and their mechanisms of action is essential for developing effective cancer treatments. As a result, identifying driver genes is important for drug development, cancer diagnosis, and treatment. Here, we present an algorithm to discover driver genes based on the two-stage random walk with restart (RWR), and the modified method for calculating the transition probability matrix in random walk algorithm. First, we performed the first stage of RWR on the whole gene interaction network, in which we employ a new method for calculating the transition probability matrix and extracted the subnetwork based on nodes that had a high correlation with the seed nodes. The subnetwork was then applied to the second stage of RWR and the nodes were re-ranked in the subnetwork. Our approach outperformed existing methods in identifying driver genes. The outcome of the effect of three gene interaction networks, two rounds of random walk, and the seed nodes' sensitivity were all compared at the same time. In addition, we identified several potential driver genes, some of which are involved in driving cancer development. Overall, our method is efficient in various cancer types, significantly outperforms existing methods, and can identify possible driver genes.
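The two-stage procedure described in the abstract can be sketched roughly as follows: run a random walk with restart (RWR) from the seed genes on the whole network, keep the highest-scoring nodes as a subnetwork, then run RWR again on that subnetwork and re-rank. The sketch assumes the standard degree-normalized transition probabilities, not the modified transition matrix the authors propose; `rwr`, `two_stage_rwr`, `restart`, and `top_k` are hypothetical names, and the toy graph is invented for illustration.

```python
def rwr(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Random walk with restart on an undirected graph.

    adj maps each node to a set of neighbours; seeds is a non-empty
    collection of nodes that all appear in adj. Returns {node: score}.
    """
    seeds = set(seeds)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in adj}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in adj}
        for u, nbrs in adj.items():
            if nbrs:
                # Spread the walking mass evenly over the neighbours.
                share = (1.0 - restart) * p[u] / len(nbrs)
                for v in nbrs:
                    nxt[v] += share
            else:
                # Isolated node: its probability mass stays where it is.
                nxt[u] += (1.0 - restart) * p[u]
        delta = sum(abs(nxt[n] - p[n]) for n in adj)
        p = nxt
        if delta < tol:
            break
    return p


def two_stage_rwr(adj, seeds, top_k, restart=0.5):
    """Stage 1 ranks the whole network; stage 2 re-ranks the top_k subnetwork."""
    seeds = set(seeds)
    stage1 = rwr(adj, seeds, restart)
    ranked = sorted(stage1, key=stage1.get, reverse=True)
    keep = set(ranked[:top_k]) | seeds  # seeds are always retained
    sub = {u: {v for v in adj[u] if v in keep} for u in keep}
    stage2 = rwr(sub, seeds, restart)
    return sorted(stage2, key=stage2.get, reverse=True)


# Toy path network A - B - C - D - E with a single seed gene A.
toy = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C", "E"}, "E": {"D"}}
scores = rwr(toy, {"A"})
ranking = two_stage_rwr(toy, {"A"}, top_k=3)
```

Each iteration preserves the total probability mass, so the scores always sum to one, and a larger `restart` value keeps the ranking closer to the seed genes.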
7
Li F, Li H, Shang J, Liu JX, Dai L, Liu X, Li Y. A network-based method for identifying cancer driver genes based on node control centrality. Exp Biol Med (Maywood) 2022; 248:232-241. [PMID: 36573462 PMCID: PMC10107394 DOI: 10.1177/15353702221139201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2022] Open
Abstract
Cancer is one of the major contributors to human mortality and has a serious influence on human survival and health. In biomedical research, the identification of cancer driver genes (cancer drivers for short) is an important task; cancer drivers can promote the progression and generation of cancer. To identify cancer drivers, many methods have been developed. These computational models only identify coding cancer drivers; however, non-coding drivers likewise play significant roles in the progression of cancer. Hence, we propose a Network-based Method for identifying cancer Driver Genes based on node Control Centrality (NMDGCC), which can identify coding and non-coding cancer driver genes. The process of NMDGCC for identifying driver genes mainly includes the following two steps. In the first step, we construct a gene interaction network by using mRNAs and miRNAs expression data in the cancer state. In the second step, the control centrality of the node is used to identify cancer drivers in the constructed network. We use the breast cancer dataset from The Cancer Genome Atlas (TCGA) to verify the effectiveness of NMDGCC. Compared with the existing methods of cancer driver genes identification, NMDGCC has a better performance. NMDGCC also identifies 295 miRNAs as non-coding cancer drivers, of which 158 are related to tumorigenesis of BRCA. We also apply NMDGCC to identify driver genes related to the different breast cancer subtypes. The result shows that NMDGCC detects many cancer drivers of specific cancer subtypes.
Affiliation(s)
- Feng Li
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Han Li
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Junliang Shang
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Jin-Xing Liu
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Lingyun Dai
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Xikui Liu
  - Department of Electrical Engineering and Information Technology, Shandong University of Science and Technology, Jinan 250031, China
- Yan Li
  - Department of Electrical Engineering and Information Technology, Shandong University of Science and Technology, Jinan 250031, China
8
Cifuentes-Bernal AM, Pham VVH, Li X, Liu L, Li J, Duy Le T. Dynamic cancer drivers: a causal approach for cancer driver discovery based on bio-pathological trajectories. Brief Funct Genomics 2022; 21:455-465. [PMID: 36124841 PMCID: PMC10467634 DOI: 10.1093/bfgp/elac030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Revised: 08/08/2022] [Accepted: 08/23/2022] [Indexed: 12/14/2022] Open
Abstract
The traditional way for discovering genes which drive cancer (namely cancer drivers) neglects the dynamic information of cancer development, even though it is well known that cancer progresses dynamically. To enhance cancer driver discovery, we expand cancer driver concept to dynamic cancer driver as a gene driving one or more bio-pathological transitions during cancer progression. Our method refers to the fact that cancer should not be considered as a single process but a compendium of altered biological processes causing the disease to develop over time. Reciprocally, different drivers of cancer can potentially be discovered by analysing different bio-pathological pathways. We propose a novel approach for causal inference of genes driving one or more core processes during cancer development (i.e. dynamic cancer driver). We use the concept of pseudotime for inferring the latent progression of samples along a biological transition during cancer and identifying a critical event when such a process is significantly deviated from normal to carcinogenic. We infer driver genes by assessing the causal effect they have on the process after such a critical event. We have applied our method to single-cell and bulk sequencing datasets of breast cancer. The evaluation results show that our method outperforms well-recognized cancer driver inference methods. These results suggest that including information of the underlying dynamics of cancer improves the inference process (in comparison with using static data), and allows us to discover different sets of driver genes from different processes in cancer. R scripts and datasets can be found at https://github.com/AndresMCB/DynamicCancerDriver.
Affiliation(s)
- Andres M Cifuentes-Bernal
  - UniSA STEM Unit, University of South Australia, Mawson Lakes Blvd, 5095, South Australia, Australia
- Vu V H Pham
  - UniSA STEM Unit, University of South Australia, Mawson Lakes Blvd, 5095, South Australia, Australia
- Xiaomei Li
  - UniSA STEM Unit, University of South Australia, Mawson Lakes Blvd, 5095, South Australia, Australia
- Lin Liu
  - UniSA STEM Unit, University of South Australia, Mawson Lakes Blvd, 5095, South Australia, Australia
- Jiuyong Li
  - UniSA STEM Unit, University of South Australia, Mawson Lakes Blvd, 5095, South Australia, Australia
- Thuc Duy Le
  - UniSA STEM Unit, University of South Australia, Mawson Lakes Blvd, 5095, South Australia, Australia
9
Li X, Liu L, Whitehead C, Li J, Thierry B, Le TD, Winter M. OUP accepted manuscript. Brief Funct Genomics 2022; 21:296-309. [PMID: 35484822 PMCID: PMC9328024 DOI: 10.1093/bfgp/elac006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2021] [Revised: 03/11/2022] [Accepted: 03/18/2022] [Indexed: 11/24/2022] Open
Abstract
Preeclampsia is a pregnancy-specific disease that can have serious effects on the health of both mothers and their offspring. Predicting which women will develop preeclampsia in early pregnancy with high accuracy will allow for improved management. The clinical symptoms of preeclampsia are well recognized, however, the precise molecular mechanisms leading to the disorder are poorly understood. This is compounded by the heterogeneous nature of preeclampsia onset, timing and severity. Indeed a multitude of poorly defined causes including genetic components implicates etiologic factors, such as immune maladaptation, placental ischemia and increased oxidative stress. Large datasets generated by microarray and next-generation sequencing have enabled the comprehensive study of preeclampsia at the molecular level. However, computational approaches to simultaneously analyze the preeclampsia transcriptomic and network data and identify clinically relevant information are currently limited. In this paper, we proposed a control theory method to identify potential preeclampsia-associated genes based on both transcriptomic and network data. First, we built a preeclampsia gene regulatory network and analyzed its controllability. We then defined two types of critical preeclampsia-associated genes that play important roles in the constructed preeclampsia-specific network. Benchmarking against differential expression, betweenness centrality and hub analysis we demonstrated that the proposed method may offer novel insights compared with other standard approaches. Next, we investigated subtype specific genes for early and late onset preeclampsia. This control theory approach could contribute to a further understanding of the molecular mechanisms contributing to preeclampsia.
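One common way to analyze the controllability of a directed regulatory network is structural controllability, where a minimum set of driver nodes is read off a maximum matching: any node left without a matched incoming edge has to be driven directly. The sketch below illustrates that general formulation on toy graphs. The abstract does not state the exact procedure used, so this should not be read as the authors' implementation, and `min_driver_nodes` is a hypothetical name.

```python
def min_driver_nodes(nodes, edges):
    """Return one minimum driver-node set of a directed graph.

    nodes is a list of node labels; edges is a list of (source, target) pairs.
    A maximum matching is found with augmenting paths (Kuhn's algorithm);
    nodes with no matched incoming edge are the driver nodes.
    """
    out = {n: [] for n in nodes}
    for u, v in edges:
        out[u].append(v)
    matched_by = {}  # target node -> source node whose edge is matched to it

    def augment(u, seen):
        for v in out[u]:
            if v in seen:
                continue
            seen.add(v)
            # Take v if it is free, or if its current source can be re-routed.
            if v not in matched_by or augment(matched_by[v], seen):
                matched_by[v] = u
                return True
        return False

    for u in nodes:
        augment(u, set())
    drivers = {n for n in nodes if n not in matched_by}
    # With a perfect matching, a single input is still required.
    return drivers if drivers else {nodes[0]}


chain = min_driver_nodes(["a", "b", "c"], [("a", "b"), ("b", "c")])
star = min_driver_nodes(["a", "b", "c"], [("a", "b"), ("a", "c")])
cycle = min_driver_nodes(["a", "b", "c"], [("a", "b"), ("b", "c"), ("c", "a")])
```

A regulatory chain needs only its first gene to be driven, whereas a regulator with two targets needs a second input because its single signal cannot steer both targets independently.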
Affiliation(s)
- Xiaomei Li
  - UniSA STEM, University of South Australia, Mawson Lakes, 5095, SA, Australia
- Lin Liu
  - UniSA STEM, University of South Australia, Mawson Lakes, 5095, SA, Australia
- Clare Whitehead
  - Pregnancy Research Centre, Dept of Obstetrics & Gynaecology, University of Melbourne, Royal Women’s Hospital, Melbourne, 3052, VIC, Australia
- Jiuyong Li
  - UniSA STEM, University of South Australia, Mawson Lakes, 5095, SA, Australia
- Benjamin Thierry
  - Future Industries Institute, University of South Australia, Mawson Lakes, 5095, SA, Australia
- Thuc D Le
  - UniSA STEM, University of South Australia, Mawson Lakes, 5095, SA, Australia (corresponding author)
- Marnie Winter
  - Future Industries Institute, University of South Australia, Mawson Lakes, 5095, SA, Australia (corresponding author)
10
Kosvyra A, Ntzioni E, Chouvarda I. Network analysis with biological data of cancer patients: A scoping review. J Biomed Inform 2021; 120:103873. [PMID: 34298154 DOI: 10.1016/j.jbi.2021.103873] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/08/2020] [Revised: 06/30/2021] [Accepted: 07/18/2021] [Indexed: 12/25/2022]
Abstract
BACKGROUND & OBJECTIVE Network Analysis (NA) is a mathematical method that allows exploring relations between units and representing them as a graph. Although NA was initially associated with the social sciences, it was introduced into bioinformatics over the past two decades. The recent growth in the use of networks in biological data analysis reveals the need to further investigate this area. In this work, we attempt to identify the use of NA with biological data, and specifically: (a) what types of data are used and whether they are integrated or not, (b) what is the purpose of this analysis, predictive or descriptive, and (c) the outcome of such analyses, specifically in cancer diseases. METHODS & MATERIALS The literature review was conducted on two databases, PubMed & IEEE, and was restricted to journal articles of the last decade (January 2010 - December 2019). At a first level, all articles were screened by title and abstract, and at a second level the screening was conducted by reading the full-text article, following the predefined inclusion & exclusion criteria, leading to 131 articles of interest. A table was created with the information of interest and was used for the classification of the articles. The articles were initially classified into analysis studies and studies that propose a new algorithm or methodology. Each one of these categories was further screened by the following clustering criteria: (a) data used, (b) study purpose, (c) study outcome. Specifically for the studies proposing a new algorithm, the novelty presented in each one was detected. RESULTS & CONCLUSIONS Over the past five years, researchers have focused on creating new algorithms and methodologies to enhance this field. The articles' classification revealed that only 25% of the analyses integrate multi-omics data, although 50% of the new algorithms developed follow this integrative direction. Moreover, only 20% of the analyses and 10% of the newly developed methodologies have a predictive purpose.
Regarding the results of the works reviewed, 75% of the studies focus on identifying gene signatures, prognostic or not. In conclusion, this review revealed the need for deploying predictive and multi-omics integrative algorithms and methodologies that can be used to enhance cancer diagnosis, prognosis and treatment.
Affiliation(s)
- A Kosvyra
  - Laboratory of Computing, Medical Informatics and Biomedical Imaging Technologies, School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece
- E Ntzioni
  - Laboratory of Computing, Medical Informatics and Biomedical Imaging Technologies, School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece
- I Chouvarda
  - Laboratory of Computing, Medical Informatics and Biomedical Imaging Technologies, School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece
11
Zhang T, Zhang SW, Li Y. Identifying Driver Genes for Individual Patients through Inductive Matrix Completion. Bioinformatics 2021; 37:4477-4484. [PMID: 34175939 DOI: 10.1093/bioinformatics/btab477] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/31/2020] [Revised: 04/30/2021] [Accepted: 06/25/2021] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION Driver genes play a key role in the evolutionary process of cancer. Effectively identifying these driver genes is crucial to cancer diagnosis and treatment. However, due to the high heterogeneity of cancers, it remains challenging to identify the driver genes for individual patients. Although some computational methods have been proposed to tackle this problem, they seldom consider the fact that genes functionally similar to well-established driver genes are likely to play similar roles in the cancer process, which could aid driver gene identification. Thus, here we developed a novel approach, IMCDriver, to improve driver gene identification for both cohorts and individual patients. RESULTS IMCDriver first considers the well-established driver genes as prior information, and uses multi-omics data (e.g., somatic mutation, gene expression and protein-protein interaction) to compute the similarity between patients/genes. Then, IMCDriver prioritizes the personalized mutated genes according to their functional similarity to the well-established driver genes via Inductive Matrix Completion. Finally, IMCDriver identifies the highest-ranked genes as the personalized driver genes. The results on five cancer datasets from TCGA show that IMCDriver outperforms other existing state-of-the-art methods in both cohort and patient-specific driver gene identification. IMCDriver also reveals some novel driver genes that potentially drive cancer development. In addition, even for driver genes rarely mutated among a population, IMCDriver can still identify them and rank them highly. AVAILABILITY Code available at https://github.com/NWPU-903PR/IMCDriver. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Tong Zhang
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, China; School of Electrical and Mechanical Engineering, Pingdingshan University, Pingdingshan, China
- Shao-Wu Zhang
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, China
- Yan Li
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, China
12
Pham VVH, Liu L, Bracken CP, Nguyen T, Goodall GJ, Li J, Le TD. pDriver: A novel method for unravelling personalised coding and miRNA cancer drivers. Bioinformatics 2021; 37:3285-3292. [PMID: 33904576 DOI: 10.1093/bioinformatics/btab262] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/13/2020] [Revised: 03/19/2021] [Accepted: 04/22/2021] [Indexed: 02/07/2023]
Abstract
MOTIVATION: Unravelling cancer driver genes is important in cancer research. Although computational methods have been developed to identify cancer drivers, most of them detect cancer drivers at the population level. However, two patients who have the same cancer type and receive the same treatment may have different outcomes, because each patient has a different genome and their disease might be driven by different driver genes. New methods are therefore being developed to discover cancer drivers at the individual level, but existing personalised methods focus only on coding drivers, even though microRNAs (miRNAs) have been shown to drive cancer progression as well. Thus, novel methods are required to discover both coding and miRNA cancer drivers at the individual level.
RESULTS: We propose a novel method, pDriver, to discover personalised cancer drivers. pDriver includes two stages: (1) constructing gene networks for each cancer patient and (2) discovering cancer drivers for each patient based on the constructed gene networks. To demonstrate the effectiveness of pDriver, we applied it to five TCGA cancer datasets and compared it with state-of-the-art methods. The results indicate that pDriver is more effective than the other methods. Furthermore, pDriver can also detect miRNA cancer drivers, most of which the literature confirms to be associated with cancer. We further analysed the predicted personalised drivers for breast cancer patients, and the results show that they are significantly enriched in many GO processes and KEGG pathways involved in breast cancer.
AVAILABILITY AND IMPLEMENTATION: pDriver is available at https://github.com/pvvhoang/pDriver.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Vu V H Pham
- UniSA STEM, University of South Australia, Mawson Lakes, SA 5095, Australia
- Lin Liu
- UniSA STEM, University of South Australia, Mawson Lakes, SA 5095, Australia
- Cameron P Bracken
- Centre for Cancer Biology, an alliance of SA Pathology and University of South Australia, Adelaide, SA 5000, Australia; Department of Medicine, The University of Adelaide, Adelaide, SA 5005, Australia
- Thin Nguyen
- Applied Artificial Intelligence Institute, Deakin University, Australia
- Gregory J Goodall
- Centre for Cancer Biology, an alliance of SA Pathology and University of South Australia, Adelaide, SA 5000, Australia; Department of Medicine, The University of Adelaide, Adelaide, SA 5005, Australia
- Jiuyong Li
- UniSA STEM, University of South Australia, Mawson Lakes, SA 5095, Australia
- Thuc D Le
- UniSA STEM, University of South Australia, Mawson Lakes, SA 5095, Australia
13
Guo WF, Zhang SW, Feng YH, Liang J, Zeng T, Chen L. Network controllability-based algorithm to target personalized driver genes for discovering combinatorial drugs of individual patients. Nucleic Acids Res 2021; 49:e37. [PMID: 33434272 PMCID: PMC8053130 DOI: 10.1093/nar/gkaa1272] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 10/07/2020] [Revised: 12/02/2020] [Accepted: 12/22/2020] [Indexed: 12/27/2022]
Abstract
Multiple driver genes in individual patient samples may cause resistance to individual drugs in precision medicine. However, current computational methods have not addressed the gap between personalized driver gene identification and combinatorial drug discovery for individual patients. Here, we developed a novel structural network controllability-based personalized driver gene and combinatorial drug identification algorithm (CPGD), which aims to identify combinatorial drugs for an individual patient by targeting personalized driver genes from a network controllability perspective. On two benchmark disease datasets (i.e. breast cancer and lung cancer datasets), CPGD outperforms other state-of-the-art driver gene-focused methods in terms of discovery rate among previously known, clinically efficacious combinatorial drugs. On the breast cancer dataset in particular, CPGD evaluated the synergistic effect of pairwise drug combinations by measuring the synergistic effect of their corresponding personalized driver gene modules, which are affected by the set of personalized driver genes that the drugs target. The results showed that CPGD performs better than existing synergistic combinatorial strategies in identifying clinically efficacious paired combinatorial drugs. Furthermore, CPGD enhanced cancer subtyping by computationally providing personalized side effect signatures for individual patients. In addition, CPGD identified 90 drug combination candidates from a SARS-CoV-2 dataset as potential drug repurposing candidates for COVID-19.
Affiliation(s)
- Wei-Feng Guo
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China; School of Electrical Engineering, Zhengzhou University, Zhengzhou 450001, China
- Shao-Wu Zhang
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Yue-Hua Feng
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Jing Liang
- School of Electrical Engineering, Zhengzhou University, Zhengzhou 450001, China
- Tao Zeng
- CAS Key Laboratory of Computational Biology, Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China; Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
- Luonan Chen
- Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China; School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China; Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
14
Zhang Y, Zhu L, Wang X. NEM-Tar: A Probabilistic Graphical Model for Cancer Regulatory Network Inference and Prioritization of Potential Therapeutic Targets From Multi-Omics Data. Front Genet 2021; 12:608042. [PMID: 33968127 PMCID: PMC8100334 DOI: 10.3389/fgene.2021.608042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2020] [Accepted: 03/22/2021] [Indexed: 11/13/2022]
Abstract
Targeted therapy has been widely adopted as an effective treatment strategy against cancer. However, cancers are not single disease entities but comprise multiple molecularly distinct subtypes, and this heterogeneity prevents precise selection of patients for optimized therapy. Dissecting cancer subtype-specific signaling pathways is crucial to pinpointing dysregulated genes for the prioritization of novel therapeutic targets. Nested effects models (NEMs) are a group of graphical models that encode subset relations between observed downstream effects under perturbations to upstream signaling genes, providing a prototype for mapping the inner workings of the cell. In this study, we developed NEM-Tar, which extends the original NEMs to predict drug targets by incorporating causal information of (epi)genetic aberrations for signaling pathway inference. An information theory-based score, weighted information gain (WIG), was proposed to assess the impact of signaling genes on a specific downstream biological process of interest. Subsequently, we conducted simulation studies to compare three inference methods and found that the greedy hill-climbing algorithm demonstrated the highest accuracy and robustness to noise. Furthermore, two case studies were conducted using multi-omics data for colorectal cancer (CRC) and gastric cancer (GC) in the TCGA database. Using NEM-Tar, we inferred signaling networks driving the poor-prognosis subtypes of CRC and GC, respectively. Our model prioritized not only potential individual drug targets such as HER2, for which FDA-approved inhibitors are available, but also combinations of multiple targets potentially useful for the design of combination therapies.
Affiliation(s)
- Yuchen Zhang
- Department of Biomedical Sciences, City University of Hong Kong, Hong Kong, China
- Lina Zhu
- Department of Biomedical Sciences, City University of Hong Kong, Hong Kong, China
- Xin Wang
- Department of Biomedical Sciences, City University of Hong Kong, Hong Kong, China; Key Laboratory of Biochip Technology, Biotech and Health Centre, Shenzhen Research Institute, City University of Hong Kong, Shenzhen, China
15
Pham VVH, Liu L, Bracken C, Goodall G, Li J, Le TD. Computational methods for cancer driver discovery: A survey. Theranostics 2021; 11:5553-5568. [PMID: 33859763 PMCID: PMC8039954 DOI: 10.7150/thno.52670] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/01/2020] [Accepted: 01/20/2021] [Indexed: 12/21/2022]
Abstract
Identifying the genes responsible for driving cancer is of critical importance for directing treatment. Accordingly, multiple computational tools have been developed to facilitate this task. Because these tools employ different methods and consider different data, and because the field is evolving rapidly, selecting an appropriate tool for cancer driver discovery is not straightforward. This survey seeks to provide a comprehensive review of the different computational methods for discovering cancer drivers. We categorise the methods into three groups: methods for single driver identification, methods for driver module identification, and methods for identifying personalised cancer drivers. In addition to providing a “one-stop” reference of these methods, we evaluate and compare their performance to give readers information about the different capabilities of the methods in identifying biologically significant cancer drivers. The biologically relevant information identified by these tools can be seen through the enrichment of discovered cancer drivers in GO biological processes and KEGG pathways and through our identification of a small cancer-driver cohort that is capable of stratifying patient survival.
16
Pham VVH, Liu L, Bracken CP, Goodall GJ, Li J, Le TD. DriverGroup: a novel method for identifying driver gene groups. Bioinformatics 2021; 36:i583-i591. [PMID: 33381812 DOI: 10.1093/bioinformatics/btaa797] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/05/2023]
Abstract
MOTIVATION: Identifying cancer driver genes is a key task in cancer informatics. Most existing methods focus on individual cancer drivers that regulate biological processes leading to cancer. However, the effect of a single gene may not be sufficient to drive cancer progression. Here, we hypothesize that there are driver gene groups that work in concert to regulate cancer, and we develop a novel computational method to detect those driver gene groups.
RESULTS: We develop a novel method named DriverGroup to detect driver gene groups by using gene expression and gene interaction data. The proposed method has three stages: (i) constructing the gene network, (ii) discovering critical nodes of the constructed network and (iii) identifying driver gene groups based on the discovered critical nodes. Before evaluating the performance of DriverGroup in detecting cancer driver groups, we first assess its performance in detecting the influence of gene groups, a key step of DriverGroup. The application of DriverGroup to DREAM4 data demonstrates that it is more effective than other methods in detecting the regulation of gene groups. We then apply DriverGroup to the BRCA dataset to identify driver groups for breast cancer. The identified driver groups are promising, as several group members are confirmed in the literature to be related to cancer. We further use the predicted driver groups in survival analysis, and the results show that the survival curves of patient subpopulations classified using the predicted driver groups are significantly differentiated, indicating the usefulness of DriverGroup.
AVAILABILITY AND IMPLEMENTATION: DriverGroup is available at https://github.com/pvvhoang/DriverGroup.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Vu V H Pham
- UniSA STEM, University of South Australia, Mawson Lakes, SA, 5095, Australia
- Lin Liu
- UniSA STEM, University of South Australia, Mawson Lakes, SA, 5095, Australia
- Cameron P Bracken
- Centre for Cancer Biology, an alliance of SA Pathology and University of South Australia, Adelaide, SA, 5000, Australia; Department of Medicine, The University of Adelaide, Adelaide, SA 5005, Australia
- Gregory J Goodall
- Centre for Cancer Biology, an alliance of SA Pathology and University of South Australia, Adelaide, SA, 5000, Australia; Department of Medicine, The University of Adelaide, Adelaide, SA 5005, Australia
- Jiuyong Li
- UniSA STEM, University of South Australia, Mawson Lakes, SA, 5095, Australia
- Thuc D Le
- UniSA STEM, University of South Australia, Mawson Lakes, SA, 5095, Australia
17
Chaudhary MS, Pham VVH, Le TD. NIBNA: A network-based node importance approach for identifying breast cancer drivers. Bioinformatics 2021; 37:2521-2528. [PMID: 33677485 DOI: 10.1093/bioinformatics/btab145] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/06/2020] [Revised: 01/21/2021] [Accepted: 02/28/2021] [Indexed: 12/11/2022]
Abstract
MOTIVATION: Identifying meaningful cancer driver genes in a cohort of tumors is a challenging task in cancer genomics. Although existing studies have identified known cancer drivers, most of them focus on detecting coding drivers with mutations. It is acknowledged that non-coding drivers can regulate driver mutations to promote cancer growth. In this work, we propose a novel node importance based network analysis (NIBNA) framework to detect coding and non-coding cancer drivers. We hypothesize that cancer drivers are crucial to the formation of community structures in the cancer network, and that removing them from the network greatly perturbs the network structure, thereby critically affecting the functioning of the network. NIBNA detects cancer drivers using a three-step process: first, a condition-specific network is built by incorporating gene expression data and gene networks; second, the community structures in the network are estimated; and third, a centrality-based metric is applied to compute node importance.
RESULTS: We apply NIBNA to the BRCA dataset, where it outperforms existing state-of-the-art methods in detecting coding cancer drivers. NIBNA also predicts 265 miRNA drivers, the majority of which have been validated in the literature. Further, we apply NIBNA to detect cancer subtype-specific drivers, and several predicted drivers have been validated to be associated with cancer subtypes. Lastly, we evaluate NIBNA's performance in detecting epithelial-mesenchymal transition (EMT) drivers, and we confirmed 8 coding and 13 miRNA drivers in the list of known genes.
AVAILABILITY: The source code can be accessed at: https://github.com/mandarsc/NIBNA.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
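The hypothesis in this abstract (removing an important node greatly perturbs the network structure) can be illustrated with a small removal-based importance score. This is a hedged sketch, not the NIBNA implementation: it skips community detection and the centrality-based metric, and instead scores each node by how much the largest connected component shrinks when the node is removed. The toy graph and function names are hypothetical.

```python
def largest_component_size(adj, removed=None):
    """Size of the largest connected component, optionally ignoring one node."""
    seen = set()
    if removed is not None:
        seen.add(removed)  # treat the removed node as already visited
    best = 0
    for start in adj:
        if start in seen:
            continue
        # Depth-first traversal of the component containing `start`.
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        best = max(best, size)
    return best


def removal_importance(adj):
    """Score each node by the drop in largest-component size on its removal."""
    baseline = largest_component_size(adj)
    return {n: baseline - largest_component_size(adj, removed=n) for n in adj}


# Hypothetical undirected toy network: A-B, B-C, C-D, C-E.
adj = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D", "E"},
    "D": {"C"},
    "E": {"C"},
}
scores = removal_importance(adj)
print(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
```

Here node C scores highest because removing it splits the network into three pieces, which mirrors the intuition that structurally critical nodes are driver candidates.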
Affiliation(s)
- Vu V H Pham
- UniSA STEM, University of South Australia, Mawson Lakes, SA 5095, Australia
- Thuc D Le
- UniSA STEM, University of South Australia, Mawson Lakes, SA 5095, Australia