1
Wang Y, Zhou B, Ru J, Meng X, Wang Y, Liu W. Advances in computational methods for identifying cancer driver genes. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:21643-21669. [PMID: 38124614 DOI: 10.3934/mbe.2023958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
Cancer driver genes (CDGs) are crucial in cancer prevention, diagnosis and treatment. This study employed computational methods for identifying CDGs, categorizing them into four groups. The major frameworks for each of these four categories were summarized. Additionally, we systematically gathered data from public databases and biological networks, and we elaborated on computational methods for identifying CDGs using the aforementioned databases. Further, we summarized the algorithms, mainly involving statistics and machine learning, used for identifying CDGs. Notably, the performances of nine typical identification methods for eight types of cancer were compared to analyze the applicability areas of these methods. Finally, we discussed the challenges and prospects associated with methods for identifying CDGs. The present study revealed that the network-based algorithms and machine learning-based methods demonstrated superior performance.
Affiliation(s)
- Ying Wang
  - School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500, China
- Bohao Zhou
  - School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500, China
- Jidong Ru
  - School of Textile Garment and Design, Changshu Institute of Technology, Changshu 215500, China
- Xianglian Meng
  - School of Computer Information and Engineering, Changzhou Institute of Technology, Changzhou 213032, China
- Yundong Wang
  - School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500, China
- Wenjie Liu
  - School of Computer Information and Engineering, Changzhou Institute of Technology, Changzhou 213032, China
2
Liu J, Ma F, Zhu Y, Zhang N, Kong L, Mi J, Cong H, Gao R, Wang M, Zhang Y. MaxCLK: discovery of cancer driver genes via maximal clique and information entropy of modules. Bioinformatics 2023; 39:btad737. [PMID: 38065693 PMCID: PMC10739565 DOI: 10.1093/bioinformatics/btad737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2023] [Revised: 11/10/2023] [Accepted: 12/07/2023] [Indexed: 12/23/2023]
Abstract
MOTIVATION Cancer is caused by the accumulation of somatic mutations in multiple pathways, in which driver mutations typically show high coverage and high exclusivity across patients. Identifying cancer driver genes plays a pivotal role in understanding the mechanisms of oncogenesis and in guiding treatment. RESULTS Here, we introduced MaxCLK, an algorithm for identifying cancer driver genes, which was developed by an integrated analysis of somatic mutation data and protein-protein interaction (PPI) networks and further improved by an information entropy index. Tested on pancancer and single cancers, MaxCLK outperformed other existing methods with higher accuracy. For the pancancer data, we predicted 154 driver genes and 787 driver modules. The analysis of co-occurrence and exclusivity between modules and pathways reveals the correlation of their combinations. Overall, our study has deepened the understanding of driver mechanisms in PPI topology and found novel driver genes. AVAILABILITY AND IMPLEMENTATION The source code for MaxCLK is freely available at https://github.com/ShandongUniversityMasterMa/MaxCLK-main.
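The abstract above names maximal cliques in a PPI network as a building block but does not describe how they are enumerated. As a minimal sketch, assuming the textbook Bron-Kerbosch algorithm with pivoting (a swapped-in standard technique, not necessarily what MaxCLK implements) and a hypothetical toy network whose node labels are made up for illustration:

```python
def maximal_cliques(adjacency):
    """Enumerate maximal cliques of an undirected graph.

    adjacency: dict mapping each node to the set of its neighbours
    (assumed symmetric, with no self-loops).
    """
    cliques = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            if clique:  # skip the empty clique of an empty graph
                cliques.append(sorted(clique))
            return
        # Pivot on the node with the most neighbours among the candidates;
        # only non-neighbours of the pivot need to be tried at this level.
        pivot = max(candidates | excluded,
                    key=lambda v: len(adjacency[v] & candidates))
        for v in list(candidates - adjacency[pivot]):
            expand(clique | {v},
                   candidates & adjacency[v],
                   excluded & adjacency[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adjacency), set())
    return sorted(cliques)


# Hypothetical toy PPI network: a triangle A-B-C plus a pendant edge C-D.
toy_ppi = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
toy_cliques = maximal_cliques(toy_ppi)
```

On this toy network the function returns the triangle and the pendant edge as the two maximal cliques. How such cliques would then be scored with mutation data and an entropy index is specific to the paper and is not reproduced here.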
Affiliation(s)
- Jian Liu
  - School of Mathematics and Statistics, Shandong University at Weihai, Weihai, Shandong 264209, China
- Fubin Ma
  - School of Mathematics and Statistics, Shandong University at Weihai, Weihai, Shandong 264209, China
- Yongdi Zhu
  - School of Mathematics and Statistics, Shandong University at Weihai, Weihai, Shandong 264209, China
- Naiqian Zhang
  - School of Mathematics and Statistics, Shandong University at Weihai, Weihai, Shandong 264209, China
- Lingming Kong
  - Marine College, Shandong University at Weihai, Weihai, Shandong 264209, China
- Jia Mi
  - Precision Medicine Research Center, School of Pharmacy, Binzhou Medical University, Yantai, Shandong 264003, China
- Haiyan Cong
  - Department of Central Lab, Weihai Municipal Hospital, Weihai, Shandong 264209, China
- Rui Gao
  - School of Control Science and Engineering, Shandong University, Jinan, Shandong 250100, China
- Mingyi Wang
  - Department of Central Lab, Weihai Municipal Hospital, Weihai, Shandong 264209, China
  - Department of Central Lab, Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, Shandong 264200, China
- Yusen Zhang
  - School of Mathematics and Statistics, Shandong University at Weihai, Weihai, Shandong 264209, China
3
Wang C, Shi J, Cai J, Zhang Y, Zheng X, Zhang N. DriverRWH: discovering cancer driver genes by random walk on a gene mutation hypergraph. BMC Bioinformatics 2022; 23:277. [PMID: 35831792 PMCID: PMC9281118 DOI: 10.1186/s12859-022-04788-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/21/2021] [Accepted: 06/08/2022] [Indexed: 12/24/2022]
Abstract
Background Recent advances in next-generation sequencing technologies have helped investigators generate massive amounts of cancer genomic data. A critical challenge in cancer genomics is the identification of a few cancer driver genes whose mutations cause tumor growth. However, the majority of existing computational approaches underuse the co-occurrence mutation information of individuals, which is deemed to be important in tumorigenesis and tumor progression, resulting in a high false-positive rate. Results To make full use of co-mutation information, we present a random walk algorithm referred to as DriverRWH on a weighted gene mutation hypergraph model, using somatic mutation data and molecular interaction network data to prioritize candidate driver genes. Applied to tumor samples of different cancer types from The Cancer Genome Atlas, DriverRWH shows significantly better performance than state-of-the-art prioritization methods in terms of the area under the curve scores and the cumulative number of known driver genes recovered in top-ranked candidate genes. In addition, DriverRWH discovers several potential drivers, which are enriched in cancer-related pathways. DriverRWH recovers approximately 50% of known driver genes in the top 30 ranked candidate genes for more than half of the cancer types. DriverRWH is also highly robust to perturbations in the mutation data and gene functional network data. Conclusion DriverRWH is effective across various cancer types in prioritizing cancer driver genes and provides considerable improvement over other tools with a better balance of precision and sensitivity. It can be a useful tool for detecting potential driver genes and facilitating targeted cancer therapies. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04788-7.
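The abstract above describes prioritizing genes by a random walk but does not give the update rule or the hypergraph weighting. As a rough illustration of the general idea only, the following sketch runs a plain random walk with restart on an ordinary weighted graph; the restart probability, the toy network, and all names are assumptions made for this example and are not taken from DriverRWH.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3,
                             tol=1e-12, max_iter=10_000):
    """Return visiting probabilities of a random walk with restart.

    adjacency: dict node -> dict of neighbour -> positive edge weight;
               every neighbour must also appear as a key.
    seeds:     dict node -> non-negative restart weight
               (for example, a mutation-derived score).
    """
    nodes = list(adjacency)
    total = sum(seeds.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("at least one seed node needs positive weight")
    start = {n: seeds.get(n, 0.0) / total for n in nodes}
    prob = dict(start)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to the seeds.
        nxt = {n: restart * start[n] for n in nodes}
        for u in nodes:
            out_weight = sum(adjacency[u].values())
            if out_weight == 0:
                # Isolated node: send its mass back to the seed distribution.
                for n in nodes:
                    nxt[n] += (1 - restart) * prob[u] * start[n]
                continue
            for v, w in adjacency[u].items():
                nxt[v] += (1 - restart) * prob[u] * w / out_weight
        change = sum(abs(nxt[n] - prob[n]) for n in nodes)
        prob = nxt
        if change < tol:
            break
    return prob


# Hypothetical toy interaction network: a path A - B - C - D, seeded at A.
toy_network = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0},
}
scores = random_walk_with_restart(toy_network, seeds={"A": 1.0})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Genes closer to the seed receive more probability mass, so the ranking follows network distance from the seed in this toy case. A hypergraph version would replace the neighbour step with transitions through hyperedges, which the abstract does not specify.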
Affiliation(s)
- Chenye Wang
  - School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Junhan Shi
  - School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Jiansheng Cai
  - Department of Mathematics, Weifang University, Weifang, 261061, Shandong, China
- Yusen Zhang
  - School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Xiaoqi Zheng
  - Department of Mathematics, Shanghai Normal University, Shanghai, 200234, China
- Naiqian Zhang
  - School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
4
Van Daele D, Weytjens B, De Raedt L, Marchal K. OMEN: Network-based Driver Gene Identification using Mutual Exclusivity. Bioinformatics 2022; 38:3245-3251. [PMID: 35552634 DOI: 10.1093/bioinformatics/btac312] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/28/2021] [Revised: 04/28/2022] [Accepted: 05/09/2022] [Indexed: 11/13/2022]
Abstract
MOTIVATION Network-based driver identification methods that can exploit mutual exclusivity typically fail to detect rare drivers because of their statistical rigor. Propagation-based methods, in contrast, allow recovering rare driver genes, but the interplay between network topology and high-scoring nodes often results in spurious predictions. The specificity of driver gene detection can be improved by taking into account both gene-specific and gene-set properties. Combining these requires a formalism that can adjust gene-set properties depending on the exact network context within which a gene is analyzed. RESULTS We developed OMEN: a logic programming framework based on random walk semantics. OMEN presents a number of novel concepts. In particular, its design is unique in that it presents an effective approach to combine both gene-specific driver properties and gene-set properties, and includes a novel method to avoid restrictive, a priori filtering of genes by exploiting the gene-set property of mutual exclusivity, expressed in terms of the functional impact scores of mutations, rather than in terms of simple binary mutation calls. Applying OMEN to a benchmark data set derived from TCGA illustrates how OMEN is able to robustly identify driver genes and modules of driver genes as proxies of driver pathways. AVAILABILITY The source code is freely available for download at www.github.com/DriesVanDaele/OMEN. The data set is archived at https://doi.org/10.5281/zenodo.6419097 and the code at https://doi.org/10.5281/zenodo.6419764. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Bram Weytjens
  - Department of Plant Biotechnology and Bioinformatics, Department of Information Technology, IDLab, Gent, 9000, Belgium IMEC
- Luc De Raedt
  - Department of Computer Science, KU Leuven, 3001, Belgium
- Kathleen Marchal
  - Department of Plant Biotechnology and Bioinformatics, Department of Information Technology, IDLab, Gent, 9000, Belgium IMEC
5
Gao B, Zhao Y, Gao Y, Li G, Wu L. Identification of Common Driver Gene Modules and Associations between Cancers through Integrated Network Analysis. GLOBAL CHALLENGES (HOBOKEN, NJ) 2021; 5:2100006. [PMID: 34504716 PMCID: PMC8414517 DOI: 10.1002/gch2.202100006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/20/2021] [Revised: 04/26/2021] [Indexed: 05/12/2023]
Abstract
High-throughput biological data have created an unprecedented opportunity for illuminating the mechanisms of tumor emergence and evolution. An important and challenging problem in deciphering cancers is to investigate the commonalities of driver genes and pathways and the associations between cancers. To address this problem, a tool, ComCovEx, is developed to identify common cancer driver gene modules between two cancers by searching for candidates in local signaling networks using an exclusivity-coverage iteration strategy and outputting those with significant coverage and exclusivity for both cancers. The associations of the cancer pairs are further evaluated by Fisher's exact test. Applied to 11 TCGA cancer datasets, ComCovEx identifies 13 significantly associated cancer pairs with many biologically significant common gene modules. The novel results on cancer relationships and common gene modules reveal the relevant pathological basis of different cancer types and provide new clues to diagnosis and drug treatment in associated cancers.
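The abstract above states that associations between cancer pairs are evaluated with Fisher's exact test but does not describe the contingency table. The sketch below shows only the test itself, as a one-sided (enrichment) p-value computed from the hypergeometric distribution; the 2x2 table layout and the toy counts are hypothetical and chosen purely for illustration.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability, under fixed margins, of observing a top-left
    count of at least `a` (that is, enrichment of the top-left cell).
    """
    row1 = a + b          # size of the first row
    col1 = a + c          # size of the first column
    n = a + b + c + d     # grand total
    denominator = comb(n, col1)
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / denominator


# Hypothetical counts: rows = item is / is not significant in cancer X,
# columns = item is / is not significant in cancer Y.
p_value = fisher_exact_greater(3, 0, 0, 3)
```

For the perfectly associated toy table the p-value is 1/20, since only one of the 20 ways of filling the table under fixed margins is as extreme. In practice a library routine such as SciPy's `fisher_exact` would normally be preferred over a hand-rolled version.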
Affiliation(s)
- Bo Gao
  - IAM, MADIS, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  - School of Mathematics, Shandong University, Jinan 250100, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
  - School of Public Health, Capital Medical University, Beijing 100069, China
  - Beijing Municipal Key Laboratory of Clinical Epidemiology, Beijing 100069, China
- Yue Zhao
  - IAM, MADIS, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Yonghang Gao
  - IAM, MADIS, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Guojun Li
  - School of Mathematics, Shandong University, Jinan 250100, China
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Ling‐Yun Wu
  - IAM, MADIS, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
6
Klein MI, Cannataro VL, Townsend JP, Newman S, Stern DF, Zhao H. Identifying modules of cooperating cancer drivers. Mol Syst Biol 2021; 17:e9810. [PMID: 33769711 PMCID: PMC7995435 DOI: 10.15252/msb.20209810] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/23/2020] [Revised: 01/20/2021] [Accepted: 01/26/2021] [Indexed: 12/22/2022]
Abstract
Identifying cooperating modules of driver alterations can provide insights into cancer etiology and advance the development of effective personalized treatments. We present Cancer Rule Set Optimization (CRSO) for inferring the combinations of alterations that cooperate to drive tumor formation in individual patients. Application to 19 TCGA cancer types revealed a mean of 11 core driver combinations per cancer, comprising 2-6 alterations per combination and accounting for a mean of 70% of samples per cancer type. CRSO is distinct from methods based on statistical co-occurrence, which we demonstrate is a suboptimal criterion for investigating driver cooperation. CRSO identified well-studied driver combinations that were not detected by other approaches and nominated novel combinations that correlate with clinical outcomes in multiple cancer types. Novel synergies were identified in NRAS-mutant melanomas that may be therapeutically relevant. Core driver combinations involving NFE2L2 mutations were identified in four cancer types, supporting the therapeutic potential of NRF2 pathway inhibition. CRSO is available at https://github.com/mikekleinsgit/CRSO/.
Affiliation(s)
- Michael I Klein
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
  - Bioinformatics R&D, Sema4, Stamford, CT, USA
- Vincent L Cannataro
  - Department of Biology, Emmanuel College, Boston, MA, USA
  - Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
- Jeffrey P Townsend
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
  - Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
  - Yale Cancer Center, Yale University, New Haven, CT, USA
- David F Stern
  - Yale Cancer Center, Yale University, New Haven, CT, USA
  - Department of Pathology, Yale School of Medicine, New Haven, CT, USA
- Hongyu Zhao
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
  - Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
  - Yale Cancer Center, Yale University, New Haven, CT, USA
7
Gu H, Xu X, Qin P, Wang J. FI-Net: Identification of Cancer Driver Genes by Using Functional Impact Prediction Neural Network. Front Genet 2020; 11:564839. [PMID: 33244318 PMCID: PMC7683798 DOI: 10.3389/fgene.2020.564839] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/22/2020] [Accepted: 09/30/2020] [Indexed: 12/24/2022]
Abstract
Identification of driver genes, whose mutations cause the development of tumors, is crucial for the improvement of cancer research and precision medicine. To overcome the problem that traditional frequency-based methods cannot detect driver genes with low mutation recurrence, researchers have focused on the functional impact of gene mutations and proposed function-based methods. However, most function-based methods estimate the distribution of the null model through a non-parametric method, which is sensitive to sample size. Moreover, such methods can lead to underselection or overselection. In this study, we proposed a method to identify driver genes by using a functional impact prediction neural network (FI-net). An artificial neural network was constructed as a parametric model to estimate the functional impact scores for genes, with multi-omics features used as the multivariate inputs. The estimation of the background distribution and the identification of driver genes were then conducted in each cluster obtained by a hierarchical clustering algorithm. We applied FI-net and 22 other state-of-the-art methods to 31 datasets from The Cancer Genome Atlas project. According to the comprehensive evaluation criterion, FI-net was powerful across various datasets and outperformed the other methods in terms of the overlap fraction with the Cancer Gene Census and Network of Cancer Genes databases, and the consensus in predictions among methods. Furthermore, the results illustrated that FI-net can identify known and potential novel driver genes.
Affiliation(s)
- Hong Gu
  - Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian, China
- Xiaolu Xu
  - Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian, China
- Pan Qin
  - Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian, China
- Jia Wang
  - Department of Breast Surgery, Institute of Breast Disease, Second Hospital of Dalian Medical University, Dalian, China
8
Gao B, Zhao Y, Li Y, Liu J, Wang L, Li G, Su Z. Prediction of Driver Modules via Balancing Exclusive Coverages of Mutations in Cancer Samples. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2019; 6:1801384. [PMID: 30828525 PMCID: PMC6382311 DOI: 10.1002/advs.201801384] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 08/19/2018] [Revised: 10/04/2018] [Indexed: 05/07/2023]
Abstract
Mutual exclusivity of cancer driving mutations is a frequently observed phenomenon in the mutational landscape of cancer. The long tail of rare mutations complicates the discovery of mutually exclusive driver modules. Existing methods usually suffer from the problem that only a few genes in some identified modules cover most of the cancer samples. To overcome this hurdle, an efficient method, UniCovEx, is presented that identifies mutually exclusive driver modules with balanced exclusive coverage. UniCovEx first searches for candidate driver modules with a strong topological relationship in signaling networks using a greedy strategy. It then evaluates the candidate modules by considering their coverage, exclusivity, and balance of coverage, using a novel metric termed exclusive entropy of modules, which measures how balanced the modules are. Finally, UniCovEx predicts sample-specific driver modules by solving a minimum set cover problem using a greedy strategy. When tested on 12 The Cancer Genome Atlas datasets of different cancer types, UniCovEx shows a significant superiority over the previous methods. The software is available at: https://sourceforge.net/projects/cancer-pathway/files/.
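The abstract above mentions three module properties (coverage, exclusivity, balance of coverage) and a greedy minimum set cover step, without giving formulas. The following sketch is one plausible reading under stated assumptions: the balance score here is a normalized Shannon entropy over each gene's share of exclusively covered samples, which is a stand-in and not the paper's "exclusive entropy" definition, and the greedy cover is the standard textbook heuristic. All gene, module, and sample names are hypothetical.

```python
from math import log


def module_stats(mutations, module):
    """Return (coverage, exclusive coverage, balance) for a gene module.

    mutations: dict gene -> set of sample ids carrying a mutation in it.
    balance is a normalized entropy in [0, 1] (assumed stand-in metric).
    """
    hits = {}
    for gene in module:
        for sample in mutations[gene]:
            hits[sample] = hits.get(sample, 0) + 1
    coverage = len(hits)                                  # samples hit at all
    exclusive = {s for s, c in hits.items() if c == 1}    # hit by one gene only
    shares = [len(mutations[g] & exclusive) for g in module]
    total = sum(shares)
    if total == 0 or len(module) < 2:
        balance = 0.0
    else:
        entropy = -sum((x / total) * log(x / total) for x in shares if x > 0)
        balance = entropy / log(len(module))
    return coverage, len(exclusive), balance


def greedy_set_cover(samples, modules):
    """Greedily pick modules until the samples are covered or no gain remains.

    modules: dict module name -> set of samples it covers.
    Returns (chosen module names in order, samples left uncovered).
    """
    uncovered = set(samples)
    chosen = []
    while uncovered and modules:
        best = max(modules, key=lambda name: len(modules[name] & uncovered))
        gain = modules[best] & uncovered
        if not gain:
            break
        chosen.append(best)
        uncovered -= gain
    return chosen, uncovered


# Hypothetical toy data.
toy_mutations = {"g1": {1, 2}, "g2": {3, 4}, "g3": {4, 5}}
stats = module_stats(toy_mutations, ["g1", "g2"])
toy_modules = {"m1": {1, 2, 3}, "m2": {3, 4}, "m3": {4, 5}}
cover, left = greedy_set_cover({1, 2, 3, 4, 5}, toy_modules)
```

In the toy data the module `["g1", "g2"]` covers four samples, all exclusively and evenly split between the two genes, so its balance is at the maximum. The greedy cover picks the module with the largest remaining gain at each step, which gives an approximate rather than a guaranteed minimum cover.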
Affiliation(s)
- Bo Gao
  - School of Mathematics, Shandong University, Jinan 250100, China
  - State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, China
- Yue Zhao
  - IAM, MADIS, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Yang Li
  - School of Mathematics, Shandong University, Jinan 250100, China
  - State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, China
- Juntao Liu
  - School of Mathematics, Shandong University, Jinan 250100, China
- Lushan Wang
  - State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, China
- Guojun Li
  - School of Mathematics, Shandong University, Jinan 250100, China
  - State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, China
- Zhengchang Su
  - Department of Bioinformatics and Genomics, College of Computing and Informatics, The University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC 28223, USA
9
Zhou XH, Chu XY, Xue G, Xiong JH, Zhang HY. Identifying cancer prognostic modules by module network analysis. BMC Bioinformatics 2019; 20:85. [PMID: 30777030 PMCID: PMC6380061 DOI: 10.1186/s12859-019-2674-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/09/2017] [Accepted: 02/08/2019] [Indexed: 02/08/2023]
Abstract
Background The identification of prognostic genes that can distinguish the prognostic risks of cancer patients remains a significant challenge. Previous works have proven that functional gene sets were more reliable for this task than the gene signature. However, few works have considered the cross-talk among functional gene sets, which may result in neglecting important prognostic gene sets for cancer. Results Here, we proposed a new method that considers both the interactions among modules and the prognostic correlation of the modules to identify prognostic modules in cancers. First, dense sub-networks in the gene co-expression network of cancer patients were detected. Second, cross-talk between every two modules was identified by a permutation test, thus generating the module network. Third, the prognostic correlation of each module was evaluated by the resampling method. Then, the GeneRank algorithm, which takes the module network and the prognostic correlations of all the modules as input, was applied to prioritize the prognostic modules. Finally, the selected modules were validated by survival analysis in various data sets. Our method was applied in three kinds of cancers, and the results show that our method succeeded in identifying prognostic modules in all the three cancers. In addition, our method outperformed state-of-the-art methods. Furthermore, the selected modules were significantly enriched with known cancer-related genes and drug targets of cancer, which may indicate that the genes involved in the modules may be drug targets for therapy. Conclusions We proposed a useful method to identify key modules in cancer prognosis and our prognostic genes may be good candidates for drug targets. Electronic supplementary material The online version of this article (10.1186/s12859-019-2674-z) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Xiong-Hui Zhou
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, People's Republic of China
- Xin-Yi Chu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, People's Republic of China
- Gang Xue
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, People's Republic of China
- Jiang-Hui Xiong
  - State Key Laboratory of Space Medicine Fundamentals and Application, China Astronaut Research and Training Center, Beijing, People's Republic of China
  - Lab of Epigenetics and Health Tracking Technology, Space Institute of Southern China, Shenzhen, People's Republic of China
- Hong-Yu Zhang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, People's Republic of China
10
Hou Y, Gao B, Li G, Su Z. MaxMIF: A New Method for Identifying Cancer Driver Genes through Effective Data Integration. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2018; 5:1800640. [PMID: 30250803 PMCID: PMC6145398 DOI: 10.1002/advs.201800640] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 04/26/2018] [Revised: 06/14/2018] [Indexed: 05/05/2023]
Abstract
Identification of a few cancer driver mutation genes from a much larger number of passenger mutation genes in cancer samples remains a highly challenging task. Here, a novel method for distinguishing the driver genes from the passenger genes by effective integration of somatic mutation data and molecular interaction data using a maximal mutational impact function (MaxMIF) is presented. When evaluated on six somatic mutation datasets of Pan-Cancer and 19 datasets of different cancer types from TCGA, MaxMIF almost always significantly outperforms all the existing state-of-the-art methods in terms of predictive accuracy, sensitivity, and specificity. It recovers about 30% more known cancer genes in 500 top-ranked candidate genes than the best among the other tools evaluated. MaxMIF is also highly robust to data perturbation. Intriguingly, MaxMIF is able to identify potential cancer driver genes, with strong experimental data support. Therefore, MaxMIF can be very useful for identifying or prioritizing cancer driver genes in the increasing number of available cancer genomic data.
Affiliation(s)
- Yingnan Hou
  - School of Mathematics, Shandong University, Jinan 250100, P. R. China
  - State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, P. R. China
- Bo Gao
  - School of Mathematics, Shandong University, Jinan 250100, P. R. China
  - State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, P. R. China
- Guojun Li
  - School of Mathematics, Shandong University, Jinan 250100, P. R. China
  - State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, P. R. China
  - Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC 28223, USA
- Zhengchang Su
  - Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC 28223, USA