1
Zhou C, Weng J, Liu S, Zhou Q, Hu Z, Yin Y, Lv P, Sun J, Li H, Yi Y, Shen Y, Ye Q, Shi Y, Dong Q, Liu C, Zhu X, Ren N. Whole-exome sequencing reveals the metastatic potential of hepatocellular carcinoma from the perspective of tumor and circulating tumor DNA. Hepatol Int 2023; 17:1461-1476. [PMID: 37217808 DOI: 10.1007/s12072-023-10540-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2023] [Accepted: 04/15/2023] [Indexed: 05/24/2023]
Abstract
BACKGROUND Relapse of hepatocellular carcinoma (HCC) due to vascular invasion is common, but the genomic mechanisms remain unclear, and molecular determinants of high-risk relapse cases are lacking. We aimed to reveal the evolutionary trajectory of microvascular invasion (MVI) and develop a predictive signature for relapse in HCC. METHODS Whole-exome sequencing was performed on tumor and peritumor tissues, portal vein tumor thrombus (PVTT), and circulating tumor DNA (ctDNA) to compare the genomic profiles between 5 HCC patients with MVI and 5 patients without MVI. We conducted an integrated analysis of exome and transcriptome to develop and validate a prognostic signature in two public cohorts and one cohort from Zhongshan Hospital, Fudan University. RESULTS Shared genomic landscapes and identical clonal origins among tumor, PVTT, and ctDNA were observed in MVI(+) HCC, suggesting that genomic changes favoring metastasis occur at the primary tumor stage and are inherited in metastatic lesions and ctDNA. There was no clonal relatedness between the primary tumor and ctDNA in MVI(-) HCC. HCC had dynamic mutation alterations during MVI and exhibited genetic heterogeneity between primary and metastatic tumors, which can be comprehensively reflected by ctDNA. A relapse-related gene signature named RGSHCC was developed based on the significantly mutated genes associated with MVI and shown to be a robust classifier of HCC relapse. CONCLUSIONS We characterized the genomic alterations during HCC vascular invasion and revealed a previously undescribed evolution pattern of ctDNA in HCC. A novel multiomics-based signature was developed to identify high-risk relapse populations.
Affiliation(s)
- Chenhao Zhou
- Department of Liver Surgery and Transplantation, Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Liver Cancer Institute, Zhongshan Hospital, Fudan University, 180 Fenglin Road, Shanghai, 200032, People's Republic of China
- Department of Molecular and Cellular Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Key Laboratory of Whole-Period Monitoring and Precise Intervention of Digestive Cancer of Shanghai Municipal Health Commission, Shanghai, 201199, People's Republic of China
- Jialei Weng
- Department of Liver Surgery and Transplantation, Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Liver Cancer Institute, Zhongshan Hospital, Fudan University, 180 Fenglin Road, Shanghai, 200032, People's Republic of China
- Key Laboratory of Whole-Period Monitoring and Precise Intervention of Digestive Cancer of Shanghai Municipal Health Commission, Shanghai, 201199, People's Republic of China
- Shaoqing Liu
- Department of Liver Surgery and Transplantation, Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Liver Cancer Institute, Zhongshan Hospital, Fudan University, 180 Fenglin Road, Shanghai, 200032, People's Republic of China
- Key Laboratory of Whole-Period Monitoring and Precise Intervention of Digestive Cancer of Shanghai Municipal Health Commission, Shanghai, 201199, People's Republic of China
- Qiang Zhou
- Department of Liver Surgery and Transplantation, Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Liver Cancer Institute, Zhongshan Hospital, Fudan University, 180 Fenglin Road, Shanghai, 200032, People's Republic of China
- Key Laboratory of Whole-Period Monitoring and Precise Intervention of Digestive Cancer of Shanghai Municipal Health Commission, Shanghai, 201199, People's Republic of China
- Zhiqiu Hu
- Key Laboratory of Whole-Period Monitoring and Precise Intervention of Digestive Cancer of Shanghai Municipal Health Commission, Shanghai, 201199, People's Republic of China
- Institute of Fudan-Minhang Academic Health System, Minhang Hospital, Fudan University, Shanghai, 201199, People's Republic of China
- Yirui Yin
- Department of Liver Surgery, Xiamen Branch, Zhongshan Hospital, Fudan University, Xiamen, 361015, People's Republic of China
- Peng Lv
- Department of Liver Surgery, Xiamen Branch, Zhongshan Hospital, Fudan University, Xiamen, 361015, People's Republic of China
- Jialei Sun
- Department of Liver Surgery and Transplantation, Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Liver Cancer Institute, Zhongshan Hospital, Fudan University, 180 Fenglin Road, Shanghai, 200032, People's Republic of China
- Hui Li
- Department of Liver Surgery and Transplantation, Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Liver Cancer Institute, Zhongshan Hospital, Fudan University, 180 Fenglin Road, Shanghai, 200032, People's Republic of China
- Yong Yi
- Department of Liver Surgery and Transplantation, Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Liver Cancer Institute, Zhongshan Hospital, Fudan University, 180 Fenglin Road, Shanghai, 200032, People's Republic of China
- Yinghao Shen
- Department of Liver Surgery and Transplantation, Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Liver Cancer Institute, Zhongshan Hospital, Fudan University, 180 Fenglin Road, Shanghai, 200032, People's Republic of China
- Qinghai Ye
- Department of Liver Surgery and Transplantation, Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Liver Cancer Institute, Zhongshan Hospital, Fudan University, 180 Fenglin Road, Shanghai, 200032, People's Republic of China
- Yi Shi
- Biomedical Research Centre, Zhongshan Hospital, Fudan University, Shanghai, 200032, People's Republic of China
- Qiongzhu Dong
- Key Laboratory of Whole-Period Monitoring and Precise Intervention of Digestive Cancer of Shanghai Municipal Health Commission, Shanghai, 201199, People's Republic of China
- Institute of Fudan-Minhang Academic Health System, Minhang Hospital, Fudan University, Shanghai, 201199, People's Republic of China
- Chunxiao Liu
- Department of Molecular and Cellular Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Xiaoqiang Zhu
- State Key Laboratory for Oncogenes and Related Genes, Division of Gastroenterology and Hepatology, Key Laboratory of Gastroenterology and Hepatology, School of Medicine, Ministry of Health, Shanghai Institute of Digestive Disease, Renji Hospital, Shanghai Jiao Tong University, Shanghai, 200001, People's Republic of China
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, 999077, People's Republic of China
- Ning Ren
- Department of Liver Surgery and Transplantation, Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Liver Cancer Institute, Zhongshan Hospital, Fudan University, 180 Fenglin Road, Shanghai, 200032, People's Republic of China
- Key Laboratory of Whole-Period Monitoring and Precise Intervention of Digestive Cancer of Shanghai Municipal Health Commission, Shanghai, 201199, People's Republic of China
- Institute of Fudan-Minhang Academic Health System, Minhang Hospital, Fudan University, Shanghai, 201199, People's Republic of China
2
Zhang W, Xiang X, Zhao B, Huang J, Yang L, Zeng Y. Identifying Cancer Driver Pathways Based on the Mouth Brooding Fish Algorithm. Entropy (Basel, Switzerland) 2023; 25:841. [PMID: 37372185 DOI: 10.3390/e25060841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 05/05/2023] [Accepted: 05/23/2023] [Indexed: 06/29/2023]
Abstract
Identifying the driver genes of cancer progression is of great significance in improving our understanding of the causes of cancer and promoting the development of personalized treatment. In this paper, we identify the driver genes at the pathway level via an existing intelligent optimization algorithm, named the Mouth Brooding Fish (MBF) algorithm. Many methods based on the maximum weight submatrix model to identify driver pathways attach equal importance to coverage and exclusivity and assign them equal weight, but those methods ignore the impact of mutational heterogeneity. Here, we use principal component analysis (PCA) to incorporate covariate data to reduce the complexity of the algorithm and construct a maximum weight submatrix model considering different weights of coverage and exclusivity. Using this strategy, the unfavorable effect of mutational heterogeneity is overcome to some extent. Data involving lung adenocarcinoma and glioblastoma multiforme were tested with this method and the results compared with the MDPFinder, Dendrix, and Mutex methods. When the driver pathway size was 10, the recognition accuracy of the MBF method reached 80% in both datasets, and the weight values of the submatrix were 1.7 and 1.89, respectively, which are better than those of the compared methods. At the same time, in the signal pathway enrichment analysis, the important role of the driver genes identified by our MBF method in the cancer signaling pathway is revealed, and the validity of these driver genes is demonstrated from the perspective of their biological effects.
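The maximum weight submatrix idea mentioned in this abstract can be illustrated with a small sketch. It assumes the commonly cited Dendrix-style score, W(M) = |Γ(M)| - ω(M), where Γ(M) is the set of samples mutated in at least one gene of M and ω(M) is the coverage overlap. The `alpha` parameter is a hypothetical stand-in for giving coverage and exclusivity different weights; it is not the weighting used in the paper, and the gene and sample names are made-up toy data.

```python
from typing import Dict, Iterable, Set

# Toy mutation data (hypothetical): gene -> set of samples in which it is mutated.
MUTATIONS: Dict[str, Set[str]] = {
    "G1": {"s1", "s2"},
    "G2": {"s3", "s4"},
    "G3": {"s1", "s3", "s5"},
}


def coverage(genes: Iterable[str], mutations: Dict[str, Set[str]]) -> int:
    """|Gamma(M)|: number of samples mutated in at least one gene of the set."""
    covered: Set[str] = set()
    for gene in genes:
        covered |= mutations[gene]
    return len(covered)


def overlap(genes: Iterable[str], mutations: Dict[str, Set[str]]) -> int:
    """omega(M): sum of per-gene coverage minus the coverage of the whole set."""
    gene_list = list(genes)
    per_gene_total = sum(len(mutations[g]) for g in gene_list)
    return per_gene_total - coverage(gene_list, mutations)


def weight(genes: Iterable[str], mutations: Dict[str, Set[str]],
           alpha: float = 1.0) -> float:
    """W(M) = |Gamma(M)| - alpha * omega(M); alpha = 1 is the equal-weight form."""
    gene_list = list(genes)
    return coverage(gene_list, mutations) - alpha * overlap(gene_list, mutations)
```

In this toy data, the mutually exclusive pair `G1`, `G2` scores higher than the overlapping pair `G1`, `G3` under equal weights, and raising `alpha` penalizes the overlap more strongly.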
Affiliation(s)
- Wei Zhang
- College of Computer Science and Engineering, Changsha University, Changsha 410022, China
- Hunan Province Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha 410022, China
- Xiaowen Xiang
- College of Computer Science and Engineering, Changsha University, Changsha 410022, China
- Bihai Zhao
- College of Computer Science and Engineering, Changsha University, Changsha 410022, China
- Hunan Province Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha 410022, China
- Jianlin Huang
- College of Computer Science and Engineering, Changsha University, Changsha 410022, China
- Lan Yang
- College of Computer Science and Engineering, Changsha University, Changsha 410022, China
- Yifu Zeng
- College of Computer Science and Engineering, Changsha University, Changsha 410022, China
- Hunan Province Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha 410022, China
3
Zhang J, Croft J, Le A. Familial CCM Genes Might Not Be Main Drivers for Pathogenesis of Sporadic CCMs-Genetic Similarity between Cancers and Vascular Malformations. J Pers Med 2023; 13:673. [PMID: 37109059 PMCID: PMC10143507 DOI: 10.3390/jpm13040673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Revised: 04/05/2023] [Accepted: 04/15/2023] [Indexed: 04/29/2023]
Abstract
Cerebral cavernous malformations (CCMs) are abnormally dilated intracranial capillaries that form cerebrovascular lesions with a high risk of hemorrhagic stroke. Recently, several somatic "activating" gain-of-function (GOF) point mutations in PIK3CA (phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit p110α) were discovered as a dominant mutation in the lesions of sporadic forms of cerebral cavernous malformation (sCCM), raising the possibility that CCMs, like other types of vascular malformations, fall in the PIK3CA-related overgrowth spectrum (PROS). However, this possibility has been challenged with different interpretations. In this review, we will continue our efforts to expound the phenomenon of the coexistence of GOF point mutations in the PIK3CA gene and loss-of-function (LOF) mutations in CCM genes in the CCM lesions of sCCM and try to delineate the relationship between mutagenic events and CCM lesions in a temporospatial manner. Since GOF PIK3CA point mutations have been well studied in reproductive cancers, especially breast cancer, as a driver oncogene, we will perform a comparative meta-analysis for GOF PIK3CA point mutations in an attempt to demonstrate the genetic similarities shared by both cancers and vascular anomalies.
Affiliation(s)
- Jun Zhang
- Departments of Molecular & Translational Medicine (MTM), Texas Tech University Health Science Center El Paso (TTUHSCEP), El Paso, TX 79905, USA
- Jacob Croft
- Departments of Molecular & Translational Medicine (MTM), Texas Tech University Health Science Center El Paso (TTUHSCEP), El Paso, TX 79905, USA
- Alexander Le
- Departments of Molecular & Translational Medicine (MTM), Texas Tech University Health Science Center El Paso (TTUHSCEP), El Paso, TX 79905, USA
4
Chen Y, Zhang XF, Ou-Yang L. Inferring cancer common and specific gene networks via multi-layer joint graphical model. Comput Struct Biotechnol J 2023; 21:974-990. [PMID: 36733706 PMCID: PMC9873583 DOI: 10.1016/j.csbj.2023.01.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Revised: 01/08/2023] [Accepted: 01/14/2023] [Indexed: 01/19/2023]
Abstract
Cancer is a complex disease caused primarily by genetic variants. Reconstructing gene networks within tumors is essential for understanding the functional regulatory mechanisms of carcinogenesis. Advances in high-throughput sequencing technologies have provided tremendous opportunities for inferring gene networks via computational approaches. However, due to the heterogeneity of the same cancer type and the similarities between different cancer types, it remains a challenge to systematically investigate the commonalities and specificities between gene networks of different cancer types, which is a crucial step towards precision cancer diagnosis and treatment. In this study, we propose a new sparse regularized multi-layer decomposition graphical model to jointly estimate the gene networks of multiple cancer types. Our model can handle various types of gene expression data and decomposes each cancer-type-specific network into three components, i.e., globally shared, partially shared and cancer-type-unique components. By identifying the globally and partially shared gene network components, our model can explore the heterogeneous similarities between different cancer types, and our identified cancer-type-unique components can help to reveal the regulatory mechanisms unique to each cancer type. Extensive experiments on synthetic data illustrate the effectiveness of our model in joint estimation of multiple gene networks. We also apply our model to two real data sets to infer the gene networks of multiple cancer subtypes or cell lines. By analyzing our estimated globally shared, partially shared, and cancer-type-unique components, we identified a number of important genes associated with common and specific regulatory mechanisms across different cancer types.
Affiliation(s)
- Yuanxiao Chen
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy(SZ), Shenzhen University, Shenzhen, China
- Xiao-Fei Zhang
- School of Mathematics and Statistics & Hubei Key Laboratory of Mathematical Sciences, Central China Normal University, Wuhan, China
- Le Ou-Yang (corresponding author)
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy(SZ), Shenzhen University, Shenzhen, China
5
Wang C, Zhang H, Ma H, Wang Y, Cai K, Guo T, Yang Y, Li Z, Zhu Y. Inference of pan-cancer related genes by orthologs matching based on enhanced LSTM model. Front Microbiol 2022; 13:963704. [PMID: 36267181 PMCID: PMC9577021 DOI: 10.3389/fmicb.2022.963704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 08/16/2022] [Indexed: 11/13/2022]
Abstract
Many disease-related genes have been found to be associated with cancer diagnosis, which is useful for understanding the pathophysiology of cancer, generating targeted drugs, and developing new diagnostic and treatment techniques. With the development of the pan-cancer project and the ongoing expansion of sequencing technology, many scientists are focusing on mining common genes from The Cancer Genome Atlas (TCGA) across various cancer types. In this study, we attempted to infer pan-cancer associated genes by examining the microbial model organism Saccharomyces cerevisiae (yeast) by homology matching, which was motivated by the benefits of reverse genetics. First, a background network of protein-protein interactions and a pathogenic gene set involving several cancer types in humans and yeast were created. The homology between the human gene and yeast gene was then discovered by homology matching, and its interaction sub-network was obtained. This was undertaken following the principle that the homologous genes of the common ancestor may have similarities in expression. Then, using bidirectional long short-term memory (BiLSTM) in combination with adaptive integration of heterogeneous information, we further explored the topological characteristics of the yeast protein interaction network and presented a node representation score to evaluate the node ability in graphs. Finally, homologous mapping for human genes matched the important genes identified by ensemble classifiers for yeast, which may be thought of as genes connected to all types of cancer. One way to assess the performance of the BiLSTM model is through experiments on the database. On the other hand, enrichment analysis, survival analysis, and other outcomes can be used to confirm the biological importance of the prediction results. The complete experimental protocols and programs are available at https://github.com/zhuyuan-cug/AI-BiLSTM/tree/master.
Affiliation(s)
- Chao Wang
- Department of Surgery, Hepatic Surgery Center, Institute of Hepato-Pancreato-Biliary Surgery, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Houwang Zhang
- Department of Electrical Engineering, City University of Hong Kong, Kowloon, Hong Kong SAR, China
- Haishu Ma
- School of Automation, China University of Geosciences, Wuhan, China
- Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Wuhan, China
- Engineering Research Center of Intelligent Technology for Geo-Exploration, Wuhan, China
- Yawen Wang
- School of Mathematics and Physics, China University of Geosciences, Wuhan, China
- Ke Cai
- School of Automation, China University of Geosciences, Wuhan, China
- Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Wuhan, China
- Engineering Research Center of Intelligent Technology for Geo-Exploration, Wuhan, China
- Tingrui Guo
- School of Automation, China University of Geosciences, Wuhan, China
- Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Wuhan, China
- Engineering Research Center of Intelligent Technology for Geo-Exploration, Wuhan, China
- Yuanhang Yang
- School of Mathematics and Physics, China University of Geosciences, Wuhan, China
- Zhen Li
- School of Mathematics and Physics, China University of Geosciences, Wuhan, China
- Yuan Zhu
- School of Automation, China University of Geosciences, Wuhan, China
- Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Wuhan, China
- Engineering Research Center of Intelligent Technology for Geo-Exploration, Wuhan, China
- Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Shanghai, China
- Correspondence: Yuan Zhu
6
A nonlinear model and an algorithm for identifying cancer driver pathways. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.109578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
7
Wu J, Wu C, Li G. Identifying common driver modules by equilibrating coverage and mutual exclusivity across pan-cancer data. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.04.050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
8
Zhu Y, Zhang H, Yang Y, Zhang C, Ou-Yang L, Bai L, Deng M, Yi M, Liu S, Wang C. Discovery of pan-cancer related genes via integrative network analysis. Brief Funct Genomics 2022; 21:325-338. [PMID: 35760070 DOI: 10.1093/bfgp/elac012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/01/2022] [Revised: 05/14/2022] [Accepted: 05/25/2022] [Indexed: 01/02/2023]
Abstract
Identification of cancer-related genes is helpful for understanding the pathogenesis of cancer, developing targeted drugs and creating new diagnostic and therapeutic methods. Considering the complexity of the biological laboratory methods, many network-based methods have been proposed to identify cancer-related genes at the global perspective with the increasing availability of high-throughput data. Some studies have focused on the tissue-specific cancer networks. However, cancers from different tissues may share common features, and those methods may ignore the differences and similarities across cancers during the establishment of modeling. In this work, in order to make full use of global information of the network, we first establish the pan-cancer network via differential network algorithm, which not only contains heterogeneous data across multiple cancer types but also contains heterogeneous data between tumor samples and normal samples. Second, the node representation vectors are learned by network embedding. In contrast to ranking analysis-based methods, with the help of integrative network analysis, we transform the cancer-related gene identification problem into a binary classification problem. The final results are obtained via ensemble classification. We further applied these methods to the most commonly used gene expression data involving six tissue-specific cancer types. As a result, an integrative pan-cancer network and several biologically meaningful results were obtained. As examples, nine genes were ultimately identified as potential pan-cancer-related genes. Most of these genes have been reported in published studies, thus showing our method's potential for application in identifying driver gene candidates for further biological experimental verification.
Affiliation(s)
- Yuan Zhu
- School of Automation, China University of Geosciences, Lumo Road, 430074, Wuhan, China
- Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Lumo Road, 430074, Wuhan, China
- Engineering Research Center of Intelligent Technology for Geo-Exploration, Lumo Road, 430074, Wuhan, China
- Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Handan Road, 200433, Shanghai, China
- Houwang Zhang
- Electrical Engineering, City University of Hong Kong, Kowloon, 999077, Hong Kong, China
- Yuanhang Yang
- School of Mathematics and Physics, China University of Geosciences, Lumo Road, 430074, Wuhan, China
- Chaoyang Zhang
- School of Computing Sciences and Computer Engineering, The University of Southern Mississippi, Hattiesburg, USA
- Le Ou-Yang
- Guangdong Key Laboratory of Intelligent Information Processing and Shenzhen Key Laboratory of Media Security, Shenzhen University, Nanhai Avenue, 518060, Shenzhen, China
- Litai Bai
- School of Automation, China University of Geosciences, Lumo Road, 430074, Wuhan, China
- Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Lumo Road, 430074, Wuhan, China
- Engineering Research Center of Intelligent Technology for Geo-Exploration, Lumo Road, 430074, Wuhan, China
- Minghua Deng
- School of Mathematical Sciences, Peking University, No.5 Yiheyuan Road, 100871, Beijing, China
- Ming Yi
- School of Mathematics and Physics, China University of Geosciences, Lumo Road, 430074, Wuhan, China
- Song Liu
- School of Automation, China University of Geosciences, Lumo Road, 430074, Wuhan, China
- Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Lumo Road, 430074, Wuhan, China
- Engineering Research Center of Intelligent Technology for Geo-Exploration, Lumo Road, 430074, Wuhan, China
- Chao Wang
- Hepatic Surgery Center, Institute of Hepato-Pancreato-Biliary Surgery, Department of Surgery, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Jiefang Avenue, 430030, Wuhan, China
9
Zhang Y, Chang X, Xia J, Huang Y, Sun S, Chen L, Liu X. Identifying network biomarkers of cancer by sample-specific differential network. BMC Bioinformatics 2022; 23:230. [PMID: 35705908 PMCID: PMC9202129 DOI: 10.1186/s12859-022-04772-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/01/2021] [Accepted: 06/02/2022] [Indexed: 02/08/2023]
Abstract
Abundant datasets generated from various big science projects on diseases have presented great challenges and opportunities, which contributed to unfolding the complexity of diseases. The discovery of disease-associated molecular networks for each individual, based on reference networks, plays an important role in personalized therapy and precision treatment of cancer. However, there are no effective ways to distinguish the consistency of different reference networks. In this study, we developed a statistical method, i.e. a sample-specific differential network (SSDN), to construct and analyze such networks based on gene expression of a single sample against a reference dataset. We proved that the SSDN is structurally consistent even with different reference datasets if the reference dataset satisfies certain conditions. The SSDN can also be used to identify patient-specific disease modules or network biomarkers as well as predict the potential driver genes of a tumor sample.
Affiliation(s)
- Yu Zhang
- Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310024, China
- Key Laboratory of Systems Health Science of Zhejiang Province, Hangzhou, 310024, China
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, Shandong, China
- Xiao Chang
- Institute of Statistics and Applied Mathematics, Anhui University of Finance & Economics, Bengbu, 233030, China
- Jie Xia
- Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Science, Shanghai, 200031, China
- Yanhong Huang
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, Shandong, China
- Shaoyan Sun
- School of Mathematics and Statistics, Ludong University, Yantai, 264025, China
- Luonan Chen
- Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310024, China
- Key Laboratory of Systems Health Science of Zhejiang Province, Hangzhou, 310024, China
- Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Science, Shanghai, 200031, China
- West China Biomedical Big Data Center, Med-X center for informatics, West China Hospital, Sichuan University, Chengdu, 610041, China
- Xiaoping Liu
- Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310024, China
- Key Laboratory of Systems Health Science of Zhejiang Province, Hangzhou, 310024, China
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, Shandong, China
10
Cheng X, Liu Y, Wang J, Chen Y, Robertson AG, Zhang X, Jones SJM, Taubert S. cSurvival: a web resource for biomarker interactions in cancer outcomes and in cell lines. Brief Bioinform 2022; 23:6562683. [PMID: 35368077 PMCID: PMC9116376 DOI: 10.1093/bib/bbac090] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/05/2021] [Revised: 02/02/2022] [Accepted: 02/24/2022] [Indexed: 11/14/2022]
Abstract
Survival analysis is a technique for identifying prognostic biomarkers and genetic vulnerabilities in cancer studies. Large-scale consortium-based projects have profiled >11 000 adult and >4000 pediatric tumor cases with clinical outcomes and multiomics approaches. This provides a resource for investigating molecular-level cancer etiologies using clinical correlations. Although cancers often arise from multiple genetic vulnerabilities and have deregulated gene sets (GSs), existing survival analysis protocols can report only on individual genes. Additionally, there is no systematic method to connect clinical outcomes with experimental (cell line) data. To address these gaps, we developed cSurvival (https://tau.cmmt.ubc.ca/cSurvival). cSurvival provides a user-adjustable analytical pipeline with a curated, integrated database and offers three main advances: (i) joint analysis with two genomic predictors to identify interacting biomarkers, including new algorithms to identify optimal cutoffs for two continuous predictors; (ii) survival analysis not only at the gene, but also the GS level; and (iii) integration of clinical and experimental cell line studies to generate synergistic biological insights. To demonstrate these advances, we report three case studies. We confirmed findings of autophagy-dependent survival in colorectal cancers and of synergistic negative effects between high expression of SLC7A11 and SLC2A1 on outcomes in several cancers. We further used cSurvival to identify high expression of the Nrf2-antioxidant response element pathway as a main indicator for lung cancer prognosis and for cellular resistance to oxidative stress-inducing drugs. Altogether, these analyses demonstrate cSurvival’s ability to support biomarker prognosis and interaction analysis via gene- and GS-level approaches and to integrate clinical and experimental biomedical studies.
Affiliation(s)
- Xuanjin Cheng
- Centre for Molecular Medicine and Therapeutics, The University of British Columbia, Vancouver, British Columbia, Canada
- British Columbia Children’s Hospital Research Institute, Vancouver, British Columbia, Canada
- Department of Medical Genetics, The University of British Columbia, Vancouver, British Columbia, Canada
| | - Yongxing Liu
- Centre for Molecular Medicine and Therapeutics, The University of British Columbia, Vancouver, British Columbia, Canada
- British Columbia Children’s Hospital Research Institute, Vancouver, British Columbia, Canada
- Department of Medical Genetics, The University of British Columbia, Vancouver, British Columbia, Canada
| | - Jiahe Wang
- Centre for Molecular Medicine and Therapeutics, The University of British Columbia, Vancouver, British Columbia, Canada
- British Columbia Children’s Hospital Research Institute, Vancouver, British Columbia, Canada
- Department of Medical Genetics, The University of British Columbia, Vancouver, British Columbia, Canada
| | - Yujie Chen
- Centre for Molecular Medicine and Therapeutics, The University of British Columbia, Vancouver, British Columbia, Canada
- British Columbia Children’s Hospital Research Institute, Vancouver, British Columbia, Canada
- Department of Medical Genetics, The University of British Columbia, Vancouver, British Columbia, Canada
| | - Andrew Gordon Robertson
- Canada’s Michael Smith Genome Sciences Centre at BC Cancer Agency, Vancouver, British Columbia, Canada
| | - Xuekui Zhang
- Department of Mathematics and Statistics, University of Victoria, Victoria, British Columbia, Canada
| | - Steven J M Jones
- Department of Medical Genetics, The University of British Columbia, Vancouver, British Columbia, Canada
- Canada’s Michael Smith Genome Sciences Centre at BC Cancer Agency, Vancouver, British Columbia, Canada
| | - Stefan Taubert
- Centre for Molecular Medicine and Therapeutics, The University of British Columbia, Vancouver, British Columbia, Canada
- British Columbia Children’s Hospital Research Institute, Vancouver, British Columbia, Canada
- Department of Medical Genetics, The University of British Columbia, Vancouver, British Columbia, Canada
| |
|
11
|
Ren X, Zhang L, Ma X, Li J, Lu Z. Integrated bioinformatics and experiments reveal the roles and driving forces for HSF1 in colorectal cancer. Bioengineered 2022; 13:2536-2552. [PMID: 35006040 PMCID: PMC8974194 DOI: 10.1080/21655979.2021.2018235] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/09/2023]
Abstract
Heat shock factor 1 (HSF1) is of major significance in various tumors. However, the roles of HSF1 in colorectal cancer (CRC), and the forces driving its expression, are poorly understood. We analyzed both using bioinformatics and experiments. The expression and prognostic characteristics of HSF1 were analyzed via the UALCAN, GEPIA2, TISIDB, Prognoscan and HPA databases. We then analyzed the correlation between HSF1 expression and immune features via the TIMER2 database. Subsequently, we explored the driving forces for abnormal HSF1 expression in CRC by bioinformatics and experiments. Our results showed that HSF1 was overexpressed and correlated with poor prognosis in CRC. HSF1 expression was significantly correlated with the infiltration of multiple immune cell types and negatively correlated with immunomodulators such as programmed cell death 1 ligand 1 (PD-L1). Many driver genes, in particular TP53, together with super-enhancers, miRNAs and DNA methylation, were responsible for HSF1 overexpression in CRC. Moreover, we demonstrated that β-catenin could promote the translation of HSF1 mRNA by interacting with HuR, which could directly bind to the coding sequence (CDS) region of HSF1 mRNA. Collectively, HSF1 may be useful as a diagnostic and prognostic biomarker for CRC. HSF1 was closely correlated with immune features, and genetic and epigenetic alterations contributed to its overexpression in CRC. More importantly, we demonstrated that HSF1 may be regulated at the level of mRNA translation by β-catenin-induced HuR activity.
Affiliation(s)
- Xiaomin Ren
- Department of Oncology, Affiliated Hospital of Weifang Medical University, Weifang, China; Jinming Yu Academician Workstation of Oncology, Clinical Research Center, Affiliated Hospital of Weifang Medical University, Weifang, China
| | - Liyuan Zhang
- Department of Clinical Medicine, Medical College of Qingdao Binhai University, Qingdao, China
| | - Xiaolin Ma
- Department of Oncology, Affiliated Hospital of Weifang Medical University, Weifang, China
| | - Jiaqiu Li
- Department of Oncology, Affiliated Hospital of Weifang Medical University, Weifang, China; Jinming Yu Academician Workstation of Oncology, Clinical Research Center, Affiliated Hospital of Weifang Medical University, Weifang, China
| | - Zhong Lu
- Department of Oncology, Affiliated Hospital of Weifang Medical University, Weifang, China; Jinming Yu Academician Workstation of Oncology, Clinical Research Center, Affiliated Hospital of Weifang Medical University, Weifang, China
| |
|
12
|
Yan X, Chen M, Xiao C, Fu J, Sun X, Hu Z, Zhou H. Effect of unfolded protein response on the immune infiltration and prognosis of transitional cell bladder cancer. Ann Med 2021; 53:1048-1058. [PMID: 34187252 PMCID: PMC8253203 DOI: 10.1080/07853890.2021.1918346] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/05/2021] [Accepted: 04/12/2021] [Indexed: 12/24/2022]
Abstract
Background: Bladder cancer (BC) is one of the most common human malignancies worldwide. Previous research has shown that the unfolded protein response (UPR) pathway could contribute to the tumorigenesis of BC. However, the role of the UPR in the immune infiltration, progression, and prognosis of BC is unclear. Methods: The GSVA and ssGSEA methods were used to assess the UPR score and the immune cell infiltration score, respectively, in three public BC datasets. The relationship between the UPR pathway and clinicopathological characteristics was analyzed with the Kruskal-Wallis test, the Wilcoxon test, and the log-rank test. The association of the UPR pathway with various tumor-infiltrating immune cells was evaluated by correlation analysis. Univariate Cox regression analysis was performed to identify risk factors significantly associated with prognosis. Predictive models were built from these risk factors and visualized with nomograms. Model performance was evaluated with calibration curves, Harrell's concordance index (c-index), and receiver operating characteristic (ROC) analysis. Results: We found that the UPR pathway and many UPR-related genes were significantly associated with the pathologic grade, tumor type, and invasive progression of transitional cell bladder cancer (TCBC), and a high UPR score predicted a poor prognosis in patients. The UPR score was positively correlated with the infiltration abundance of many tumor immune cells in TCBC. In addition, the predictive models we constructed from the UPR score performed well, with c-indexes ranging from 0.74 to 0.87. Conclusions: Our study indicates that the UPR pathway may have an important impact on progression, prognosis, and tumor immune infiltration in TCBC, and the models we built may provide effective and reliable guides for prognosis assessment and treatment decision-making for TCBC patients.
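The abstract above reports Harrell's concordance index as its performance measure. As a rough illustration of what that number means, the following is a minimal sketch of the c-index for right-censored data. The function name `harrell_c_index` and the simplified handling of tied follow-up times (such pairs are skipped) are assumptions made here, not details taken from the study.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       -- observed follow-up time per patient
    events      -- 1 if the event was observed, 0 if the patient was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification (assumption): pairs with tied times are skipped.
        # A pair is comparable only if the shorter time is an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # the higher-risk patient failed first
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to uninformative predictions and 1.0 to a perfect ranking, which puts the reported range of 0.74 to 0.87 in context.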
Affiliation(s)
- Xiaokai Yan
- Department of Oncology, the Second Affiliated Hospital of Zunyi Medical University, Zunyi, China
| | - Min Chen
- Department of Oncology, the Second Affiliated Hospital of Zunyi Medical University, Zunyi, China
| | - Chiying Xiao
- Department of Oncology, the Second Affiliated Hospital of Zunyi Medical University, Zunyi, China
| | - Jiandong Fu
- Department of Oncology, the Second Affiliated Hospital of Zunyi Medical University, Zunyi, China
| | - Xia Sun
- Department of Oncology, the Second Affiliated Hospital of Zunyi Medical University, Zunyi, China
| | - Zuohuai Hu
- Department of Oncology, the Second Affiliated Hospital of Zunyi Medical University, Zunyi, China
| | - Hang Zhou
- Department of Oncology, the Second Affiliated Hospital of Zunyi Medical University, Zunyi, China
| |
|
13
|
Zhang W, Wang SL, Liu Y. Identification of Cancer Driver Modules Based on Graph Clustering from Multiomics Data. J Comput Biol 2021; 28:1007-1020. [PMID: 34529511 DOI: 10.1089/cmb.2021.0052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023]
Abstract
A major challenge in cancer genomics is to identify cancer driver genes and modules. Most existing methods to identify cancer driver modules (iCDM) identify groups of genes whose somatic mutational patterns exhibit either mutual exclusivity or high coverage of patient samples, without considering other biological information from multiomics data sets. Here we integrate mutual exclusivity, coverage, and protein-protein interaction information to construct an edge-weighted network, and present a graph clustering approach based on symmetric non-negative matrix factorization to iCDM. iCDM was tested on pan-cancer data and the results were compared with those from several advanced computational methods. Our approach outperformed other methods in recovering known cancer driver modules, and the identified driver modules showed high accuracy in classifying normal and tumor samples.
Affiliation(s)
- Wei Zhang
- College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China; Hunan Province Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha, China
| | - Shu-Lin Wang
- College of Computer Science and Electronics Engineering, Hunan University, Changsha, China
| | - Yue Liu
- College of Computer Science and Electronics Engineering, Hunan University, Changsha, China
| |
|
14
|
Yang Z, Yu G, Guo M, Yu J, Zhang X, Wang J. CDPath: Cooperative Driver Pathways Discovery Using Integer Linear Programming and Markov Clustering. IEEE/ACM Trans Comput Biol Bioinform 2021; 18:1384-1395. [PMID: 31581094 DOI: 10.1109/tcbb.2019.2945029] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 06/10/2023]
Abstract
Discovering driver pathways is essential for understanding the pathogenesis of cancer and for designing precise treatments for cancer patients. Increasing evidence indicates that multiple pathways often function cooperatively in carcinogenesis. In this study, we propose an approach called CDPath to discover cooperative driver pathways. CDPath first uses Integer Linear Programming to explore driver core modules from mutation profiles by enforcing co-occurrence and functional interaction relations between modules, and by maximizing the mutual exclusivity and coverage within modules. Next, to enforce cooperation of pathways and to support the subsequent exact discovery of cooperative driver pathways, it performs Markov clustering on a pathway-pathway interaction network to cluster pathways. It then identifies pathways that lie in different modules but in the same cluster as cooperative driver pathways. We apply CDPath to two TCGA datasets: breast cancer (BRCA) and endometrial cancer (UCEC). The results show that CDPath can identify known driver genes (e.g., TP53) and potential driver genes (e.g., SPTBN2). In addition, the identified cooperative driver pathways are related to the target cancer and are involved in carcinogenesis and several key biological processes. CDPath can uncover more potential biological associations between pathways (over 100 percent) and more cooperative driver pathways (over 200 percent) than competitive approaches. The demo codes of CDPath are available at http://mlda.swu.edu.cn/codes.php?name=CDPath.
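The Markov clustering step mentioned in the abstract above can be illustrated with a generic sketch of the algorithm: alternate expansion (matrix squaring) and inflation (element-wise power followed by column normalization) until the flow matrix stops changing. This is not the CDPath code. The function names, the default inflation of 2.0, and the dense list-of-lists representation are assumptions chosen for a small, dependency-free example with non-negative edge weights.

```python
def _normalize_columns(m):
    """Scale every column of a square matrix so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def markov_cluster(adjacency, inflation=2.0, iterations=50, tol=1e-9):
    """Cluster the nodes of an undirected graph with Markov clustering.

    adjacency -- square list of lists with non-negative edge weights
    Returns a sorted list of clusters, each a sorted list of node indices.
    """
    n = len(adjacency)
    # Add self-loops so every column has mass, then make it column-stochastic.
    m = _normalize_columns(
        [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    )
    for _ in range(iterations):
        # Expansion: two-step random-walk flow (matrix product m * m).
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]
        # Inflation: strengthen strong flows, weaken weak ones.
        inflated = _normalize_columns(
            [[x ** inflation for x in row] for row in expanded]
        )
        change = max(abs(inflated[i][j] - m[i][j])
                     for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break
    # Rows that keep mass are attractors; their non-zero columns form a cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)
```

In a setting like the one described, the nodes would be pathways and the weights pathway-pathway interaction strengths. A higher `inflation` value generally yields more, smaller clusters.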
|
15
|
Ebadi AR, Soleimani A, Ghaderzadeh A. Providing an optimized model to detect driver genes from heterogeneous cancer samples using restriction in subspace learning. Sci Rep 2021; 11:9171. [PMID: 33911156 PMCID: PMC8080706 DOI: 10.1038/s41598-021-88548-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2020] [Accepted: 04/13/2021] [Indexed: 11/09/2022]
Abstract
Extracting driver genes from mutated genes and separating drivers from passengers are among the most debated problems in cancer studies. Because cancer is heterogeneous, it is difficult to identify groups of associated drivers that indicate which subgroups of patients they relate to. Precise identification of the relevant driver genes using artificial intelligence techniques therefore remains a challenge for researchers. In this research, a new method was developed using unsupervised subspace learning with additional constraints, with the aim of extracting driver genes more precisely and accurately. The results show that the proposed method is better able to predict driver genes and subgroups of driver genes, which have the highest degree of overlap (by p-value) with known driver genes in established databases. With MSigDB driver genes as the benchmark, the selected driver genes show more overlap than those of earlier methods. In addition to recovering the driver genes defined in previous work, the method introduces newer driver genes and defines newer groups of driver genes than other methods. For 200 genes, the p-value of the proposed method was 9.21e-7, better than that of previous methods. The overlap p-value of the proposed method is about 2.7 times lower than that of the DriverSub method, indicating that the proposed method can identify driver genes in cancerous tumors with greater accuracy and reliability.
Affiliation(s)
- Ali Reza Ebadi
- Department of Computer Engineering, Sanandaj Branch, Islamic Azad University, Sanandaj, Iran
| | - Ali Soleimani
- Department of Computer Engineering, College of Technical and Engineering, Malard Branch, Islamic Azad University, Tehran, Iran
| | - Abdulbaghi Ghaderzadeh
- Department of Computer Engineering, Sanandaj Branch, Islamic Azad University, Sanandaj, Iran
| |
|
16
|
Xiang J, Zhang J, Zheng R, Li X, Li M. NIDM: network impulsive dynamics on multiplex biological network for disease-gene prediction. Brief Bioinform 2021; 22:6236070. [PMID: 33866352 DOI: 10.1093/bib/bbab080] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 01/12/2021] [Revised: 02/11/2021] [Accepted: 02/21/2021] [Indexed: 12/12/2022]
Abstract
Predicting disease-related genes is important for the study of diseases because biological experiments are costly and time-consuming. Network propagation is a popular strategy for disease-gene prediction. However, existing methods focus on the stable solution of the dynamics while ignoring the useful information hidden in the dynamical process, and it remains a challenge to use multiple types of physical and functional relationships between proteins or genes to predict disease-related genes effectively. We therefore propose a framework of network impulsive dynamics on multiplex biological networks (NIDM) to predict disease-related genes, along with four variants of NIDM models and four kinds of impulsive dynamical signatures (IDSs). NIDM identifies disease-related genes by mining the dynamical responses of nodes to impulsive signals exerted at specific nodes. Through a series of experimental evaluations on various types of biological networks, we confirmed the advantage of the multiplex network and the important role of functional associations in disease-gene prediction, demonstrated the superior performance of NIDM compared with four types of network-based algorithms, and made recommendations on the most effective NIDM models and IDS signatures. To facilitate the prioritization and analysis of (candidate) genes associated with specific diseases, we developed a user-friendly web server, which provides three kinds of filtering patterns for genes, network visualization, enrichment analysis and a wealth of external links (http://bioinformatics.csu.edu.cn/DGP/NID.jsp). NIDM is a protocol for disease-gene prediction that integrates different types of biological networks and may become a very useful computational tool for the study of disease-related genes.
Affiliation(s)
- Ju Xiang
- School of Computer Science and Engineering, Central South University, Hunan, China
| | - Jiashuai Zhang
- School of Computer Science and Engineering, Central South University, Hunan, China
| | - Ruiqing Zheng
- School of Computer Science and Engineering, Central South University, China
| | - Xingyi Li
- School of Computer Science and Engineering, Central South University, Changsha, China
| | - Min Li
- School of Computer Science and Engineering, Central South University, Changsha, China
| |
|
17
|
Fang H, Zhang Z, Zhou Y, Jin L, Yang Y. A greedy approach for mutual exclusivity analysis in cancer study. Biostatistics 2021; 23:910-925. [PMID: 33634822 DOI: 10.1093/biostatistics/kxab004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2020] [Revised: 11/26/2020] [Accepted: 01/13/2021] [Indexed: 11/14/2022]
Abstract
The main challenge in cancer genomics is to distinguish driver genes from passenger or neutral genes. Cancer genomes exhibit such extensive mutational heterogeneity that no two genomes contain exactly the same somatic mutations. Mutual exclusivity (ME) of mutations has been observed in cancer data and is associated with functional pathways. Analysis of ME patterns may provide useful clues to driver genes or pathways and may suggest a novel understanding of cancer progression. In this article, we consider a probabilistic, generative model of ME and propose a powerful greedy algorithm to select mutually exclusive gene sets. The greedy method includes a pre-selection procedure and a stepwise forward algorithm, which together can significantly reduce computation time. Power calculations suggest that the new method is efficient and powerful for one ME set or for multiple ME sets with overlapping genes. We illustrate this approach by analyzing whole-exome sequencing data of cancer types from TCGA.
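The two-stage structure described above (pre-selection, then stepwise forward selection) can be sketched as follows. Note the substitution: the article scores gene sets with a probabilistic generative model that the abstract does not spell out, so this sketch instead uses the simpler, widely used coverage-minus-overlap weight (number of covered samples minus the number of redundant mutations). The function names and the `min_freq` filter are hypothetical.

```python
def me_weight(mutations, genes):
    """Coverage-minus-overlap weight of a gene set.

    mutations -- dict mapping gene -> set of sample IDs carrying a mutation
    The weight equals 2 * (covered samples) - (total mutations in the set),
    so it is highest for sets that cover many samples with little overlap.
    """
    covered = set()
    total_hits = 0
    for g in genes:
        covered |= mutations[g]
        total_hits += len(mutations[g])
    return 2 * len(covered) - total_hits


def greedy_me_set(mutations, max_size, min_freq=1):
    """Greedy forward selection of a mutually exclusive gene set."""
    # Pre-selection: drop genes mutated in fewer than `min_freq` samples.
    candidates = sorted(g for g, s in mutations.items() if len(s) >= min_freq)
    selected, best = [], 0
    while candidates and len(selected) < max_size:
        # Forward step: find the gene whose addition raises the weight most.
        gene, score = None, best
        for g in candidates:
            w = me_weight(mutations, selected + [g])
            if w > score:
                gene, score = g, w
        if gene is None:
            break  # no remaining gene improves the weight
        selected.append(gene)
        candidates.remove(gene)
        best = score
    return selected, best
```

Each forward step evaluates only the remaining candidates, so the cost grows roughly with the number of genes times the set size rather than with the number of possible subsets, which is the kind of saving the abstract attributes to the greedy strategy.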
Affiliation(s)
- Hongyan Fang
- School of Mathematical Sciences, Anhui University, Hefei, Anhui, China
| | - Zeyu Zhang
- Department of Statistics and Finance, University of Science and Technology of China, Hefei, Anhui, China
| | - Yinsheng Zhou
- Department of Statistics and Finance, University of Science and Technology of China, Hefei, Anhui, China
| | - Lishuai Jin
- School of Mathematical Sciences, Anhui University, Hefei, Anhui, China
| | - Yaning Yang
- Department of Statistics and Finance, University of Science and Technology of China, Hefei, Anhui, China
| |
|
18
|
Autoencoded DNA methylation data to predict breast cancer recurrence: Machine learning models and gene-weight significance. Artif Intell Med 2020; 110:101976. [PMID: 33250148 DOI: 10.1016/j.artmed.2020.101976] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 09/09/2019] [Revised: 08/05/2020] [Accepted: 10/18/2020] [Indexed: 12/29/2022]
Abstract
Breast cancer is the most frequent cancer in women and the second most frequent overall after lung cancer. Although the 5-year survival rate of breast cancer is relatively high, recurrence is also common and often involves metastasis, with the consequent threat to patients. DNA methylation-derived databases have become an interesting primary source for supervised knowledge extraction regarding breast cancer. Unfortunately, the study of DNA methylation involves processing hundreds of thousands of features for every patient. DNA methylation data are High Dimension Low Sample Size data, which pose well-known problems for feature selection and generation. Autoencoders (AEs) are a technique for nonlinear feature fusion. Our main objective in this work is to design a procedure that summarizes DNA methylation by taking advantage of AEs. Our proposal generates new features from the values of CpG sites of patients with and without recurrence. A limited set of relevant genes that characterize breast cancer recurrence is then proposed by applying survival analysis and a weighted ranking of genes according to the distribution of their CpG sites. To test our proposal, we selected a dataset from The Cancer Genome Atlas data portal and an AE with a single hidden layer. A literature review and an enrichment analysis (based on genomic context and functional annotation) of the genes obtained in our experiment confirmed that all of these genes were related to breast cancer recurrence.
|
19
|
Kim BH, Yu K, Lee PCW. Cancer classification of single-cell gene expression data by neural network. Bioinformatics 2020; 36:1360-1366. [PMID: 31603465 DOI: 10.1093/bioinformatics/btz772] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 04/28/2019] [Revised: 08/13/2019] [Accepted: 10/08/2019] [Indexed: 01/16/2023]
Abstract
MOTIVATION Cancer classification based on gene expression profiles has provided insight on the causes of cancer and cancer treatment. Recently, machine learning-based approaches have been attempted in downstream cancer analysis to address the large differences in gene expression values, as determined by single-cell RNA sequencing (scRNA-seq). RESULTS We designed cancer classifiers that can identify 21 types of cancers and normal tissues based on bulk RNA-seq as well as scRNA-seq data. Training was performed with 7398 cancer samples and 640 normal samples from 21 tumors and normal tissues in TCGA based on the 300 most significant genes expressed in each cancer. Then, we compared neural network (NN), support vector machine (SVM), k-nearest neighbors (kNN) and random forest (RF) methods. The NN performed consistently better than other methods. We further applied our approach to scRNA-seq transformed by kNN smoothing and found that our model successfully classified cancer types and normal samples. AVAILABILITY AND IMPLEMENTATION Cancer classification by neural network. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Bong-Hyun Kim
- Department of Biomedical Sciences, University of Ulsan College of Medicine, ASAN Medical Center, Seoul 05505, Korea; Advanced Bio Computing Center, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA
| | - Kijin Yu
- Department of Biomedical Sciences, University of Ulsan College of Medicine, ASAN Medical Center, Seoul 05505, Korea
| | - Peter C W Lee
- Department of Biomedical Sciences, University of Ulsan College of Medicine, ASAN Medical Center, Seoul 05505, Korea
| |
|
20
|
Xi J, Yuan X, Wang M, Li A, Li X, Huang Q. Inferring subgroup-specific driver genes from heterogeneous cancer samples via subspace learning with subgroup indication. Bioinformatics 2020; 36:1855-1863. [PMID: 31626284 DOI: 10.1093/bioinformatics/btz793] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 06/17/2019] [Revised: 09/23/2019] [Accepted: 10/16/2019] [Indexed: 11/14/2022]
Abstract
MOTIVATION Detecting driver genes from gene mutation data is a fundamental task in tumorigenesis research. Because cancer is a heterogeneous disease with various subgroups, subgroup-specific driver genes are key to the development of precision medicine for heterogeneous cancers. However, existing driver gene detection methods are not designed to identify the subgroup specificity of the driver genes they detect. They therefore cannot indicate which group of patients is associated with the detected driver genes, which makes it difficult to provide specific clinical guidance for individual patients. RESULTS By incorporating the subspace learning framework, we propose a novel bioinformatics method called DriverSub, which can efficiently predict subgroup-specific driver genes when subgroup annotations are not available. When evaluated on simulation datasets with known ground truth and compared with existing methods, DriverSub yields the best prediction of driver genes and the best inference of their related subgroups. When we apply DriverSub to the mutation data of real heterogeneous cancers, the predicted results are highly enriched for experimentally validated known driver genes. Moreover, the subgroups inferred by DriverSub are significantly associated with the annotated molecular subgroups, indicating its capability of predicting subgroup-specific driver genes. AVAILABILITY AND IMPLEMENTATION The source code is publicly available at https://github.com/JianingXi/DriverSub. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jianing Xi
- School of Mechanical Engineering, Northwestern Polytechnical University, Xi'an, 710072, China; Center for OPTical IMagery Analysis and Learning (OPTIMAL), Northwestern Polytechnical University, Xi'an, 710072, China
| | - Xiguo Yuan
- School of Computer Science and Technology, Xidian University, Xi'an 710071, China
| | - Minghui Wang
- School of Information Science and Technology, University of Science and Technology of China, Hefei 230027, China
| | - Ao Li
- School of Information Science and Technology, University of Science and Technology of China, Hefei 230027, China
| | - Xuelong Li
- Center for OPTical IMagery Analysis and Learning (OPTIMAL), Northwestern Polytechnical University, Xi'an, 710072, China; School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China
| | - Qinghua Huang
- School of Mechanical Engineering, Northwestern Polytechnical University, Xi'an, 710072, China; Center for OPTical IMagery Analysis and Learning (OPTIMAL), Northwestern Polytechnical University, Xi'an, 710072, China
| |
|
21
|
Xue L, Li W, Fan X, Zhao Z, Zhou W, Feng Z, Liu L, Lin H, Li L, Xue X, Huang X, Huang P, Guo J, Du P, Lu N, Li L, Zhan Q, Song Y. Identification of second primary tumors from lung metastases in patients with esophageal squamous cell carcinoma using whole-exome sequencing. Theranostics 2020; 10:10606-10618. [PMID: 32929369 PMCID: PMC7482800 DOI: 10.7150/thno.45311] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/25/2020] [Accepted: 08/15/2020] [Indexed: 12/30/2022]
Abstract
Esophageal squamous cell carcinoma (ESCC) patients with a synchronous or metachronous lung tumor can be diagnosed with lung metastasis (LM) or a second primary tumor (SPT), but the accurate discrimination between LM and SPT remains a clinical dilemma. This study aimed to investigate the feasibility of using the whole-exome sequencing (WES) technique to distinguish SPT from LM. Methods: We performed WES on 40 tumors from 14 patients, including 12 patients with double squamous cell carcinomas (SCCs) of the esophagus and lung (lymph node metastases were sequenced as internal controls) diagnosed as LM according to pathological information and 2 patients with paired primary ESCC and non-lung metastases examined as external controls. Results: Shared genomic profiles between esophageal (T) and lung (D) tumors were observed in 7 patients, suggesting their clonal relatedness, thus indicating that the lung tumors of these patients should be LM. However, distinct genomic profiles between T and D tumors were observed in the other 5 patients, suggesting the possibility of SPTs that were likely formed through independent multifocal oncogenesis. Conclusions: Our data demonstrate the limitations and insufficiency of clinicopathological criteria and that WES could be useful in understanding the clonal relationships of multiple SCCs.
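The comparison of "shared" versus "distinct" genomic profiles in the abstract above can be made concrete with a simple overlap measure. The sketch below computes the Jaccard index of two sets of somatic mutations. This is an illustrative assumption, not the criterion used in the study: the abstract gives no metric or threshold, and the function name and mutation identifiers are hypothetical.

```python
def shared_mutation_fraction(tumor_a, tumor_b):
    """Jaccard index of two sets of somatic mutation identifiers.

    Returns a value between 0.0 (no shared mutations) and 1.0 (identical
    mutation sets). Two empty sets are treated as sharing nothing.
    """
    union = tumor_a | tumor_b
    if not union:
        return 0.0
    return len(tumor_a & tumor_b) / len(union)


# Hypothetical mutation identifiers for an esophageal and a lung tumor.
esophageal = {"m1", "m2", "m3"}
lung = {"m2", "m3", "m4"}
fraction = shared_mutation_fraction(esophageal, lung)
```

A high fraction would be consistent with clonal relatedness (metastasis), and a fraction near zero with independent origins. Any cutoff between the two would have to be justified separately.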
|
22
|
Wei PJ, Wu FX, Xia J, Su Y, Wang J, Zheng CH. Prioritizing Cancer Genes Based on an Improved Random Walk Method. Front Genet 2020; 11:377. [PMID: 32411180 PMCID: PMC7198854 DOI: 10.3389/fgene.2020.00377] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 12/13/2019] [Accepted: 03/26/2020] [Indexed: 12/18/2022]
Abstract
Identifying the driver genes that contribute to cancer progression among numerous passenger genes is a central goal and a major challenge. The protein-protein interaction network provides convenient and reasonable assistance for driver gene discovery. Random walk-based methods have been widely used to prioritize nodes in social or biological networks. However, most studies select the next node uniformly from the random walker's neighbors, and few consider a transition preference based on the degree of those neighbors. In this study, we propose a novel random walk-based approach named Driver_IRW (Driver genes discovery with Improved Random Walk method) to prioritize cancer genes in a cancer-related network. The key idea of Driver_IRW is to assign different transition probabilities to different edges of a constructed cancer-related network according to the degree of the nodes' neighbors. Furthermore, a global centrality (here, betweenness centrality) and Katz feedback centrality are incorporated into the framework to evaluate the probability of walking to the seed nodes. Experimental results on four cancer types indicate that Driver_IRW performs better than some previously published methods at uncovering known cancer-related genes. In conclusion, our method can aid in prioritizing cancer-related genes and complements traditional frequency-based and network-based methods.
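To illustrate the idea of a non-uniform transition preference, the following sketch runs a random walk with restart in which the walker moves to a neighbor with probability proportional to that neighbor's degree. This is only one plausible reading of the abstract. The exact transition rule of Driver_IRW and its use of betweenness and Katz centrality are not reproduced here, and the function name, the `restart` default and the toy graph are assumptions.

```python
def degree_biased_rwr(graph, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Random walk with restart using degree-proportional transitions.

    graph -- dict mapping node -> set of neighbors (undirected, symmetric)
    seeds -- set of seed nodes to which the walker restarts
    Returns a dict mapping node -> steady-state visiting probability.
    """
    degree = {u: len(nbrs) for u, nbrs in graph.items()}
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in graph}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: a fraction of the mass always returns to the seeds.
        nxt = {u: restart * p0[u] for u in graph}
        for u, nbrs in graph.items():
            total = sum(degree[v] for v in nbrs)
            if total == 0:
                nxt[u] += (1 - restart) * p[u]  # isolated node keeps its mass
                continue
            for v in nbrs:
                # Assumption: preference proportional to the neighbor's degree.
                nxt[v] += (1 - restart) * p[u] * degree[v] / total
        diff = sum(abs(nxt[u] - p[u]) for u in graph)
        p = nxt
        if diff < tol:
            break
    return p


# Toy network: one hub connected to three nodes, two of which are also linked.
toy_graph = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub", "c"},
    "c": {"hub", "b"},
}
scores = degree_biased_rwr(toy_graph, seeds={"hub"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Replacing `degree[v] / total` with `1 / len(nbrs)` recovers the uniform walk that the abstract contrasts against, which makes the effect of the bias easy to compare on the same graph.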
Affiliation(s)
- Pi-Jing Wei
- Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, College of Computer Science and Technology, Anhui University, Hefei, China
| | - Fang-Xiang Wu
- Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
- Department of Computer Sciences, University of Saskatchewan, Saskatoon, SK, Canada
- Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
| | - Junfeng Xia
- Institutes of Physical Science and Information Technology, Anhui University, Hefei, China
| | - Yansen Su
- Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, College of Computer Science and Technology, Anhui University, Hefei, China
| | - Jing Wang
- Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, College of Computer Science and Technology, Anhui University, Hefei, China
- College of Computer and Information Engineering, Fuyang Normal University, Fuyang, China
| | - Chun-Hou Zheng
- Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, College of Computer Science and Technology, Anhui University, Hefei, China
| |
Collapse
|
23
|
Li F, Gao L, Wang B. Detection of Driver Modules with Rarely Mutated Genes in Cancers. IEEE/ACM Trans Comput Biol Bioinform 2020; 17:390-401. [PMID: 29994261 DOI: 10.1109/tcbb.2018.2846262] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 06/08/2023]
Abstract
Identifying driver modules or pathways is a key challenge in interpreting the molecular mechanisms and pathogenesis underlying cancer. An increasing number of studies suggest that rarely mutated genes are important for the development of cancer. However, driver modules consisting of genes with low-frequency driver mutations are not well characterized. To identify driver modules with rarely mutated genes, we propose a functional similarity index to quantify the functional relationship between rarely mutated genes and other genes in the same module. We then develop a method to detect Driver Modules with Rarely mutated Genes (DMRG) by incorporating functional similarity, coverage and mutual exclusivity. By applying DMRG to the TCGA cancer dataset on three networks (HINT+HI2012, iRefIndex and MultiNet), we detect driver modules that intersect with well-known signalling pathways and protein complexes, such as the cell cycle pathway and the mediator complex. DMRG can also detect driver modules effectively when only 20, 40, 60 or 80 percent of samples are randomly selected. When compared with HotNet2, DMRG detects more rarely mutated cancer genes and has higher pathway enrichment. Overall, DMRG provides an effective method for the identification of driver modules with rarely mutated genes.
|
24
|
Wang J, Yang Z, Domeniconi C, Zhang X, Yu G. Cooperative driver pathway discovery via fusion of multi-relational data of genes, miRNAs and pathways. Brief Bioinform 2020; 22:1984-1999. [PMID: 32103253 DOI: 10.1093/bib/bbz167] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/04/2019] [Revised: 12/13/2019] [Accepted: 12/29/2019] [Indexed: 12/19/2022]
Abstract
Discovering driver pathways is an essential step in uncovering the molecular mechanisms underlying cancer and in exploring precise treatments for cancer patients. However, due to the difficulty of mapping genes to pathways and the limited knowledge about pathway interactions, most previous work focuses on identifying individual pathways. In practice, two (or even more) pathways interact and often cooperatively trigger cancer. In this study, we propose a new approach called CDPathway to discover cooperative driver pathways. First, CDPathway introduces a driver impact quantification function to quantify the driver weight of each gene. CDPathway assumes that genes with larger weights contribute more to the occurrence of the target disease and identifies them as candidate driver genes. Next, it constructs a heterogeneous network composed of gene, miRNA and pathway nodes based on the known intra(inter)-relations between them and assigns the quantified driver weights to gene-pathway and gene-miRNA relational edges. To transfer the driver impacts of genes to pathway interaction pairs, CDPathway collaboratively factorizes the weighted adjacency matrices of the heterogeneous network to explore the latent relations between genes, miRNAs and pathways. It then reconstructs the pathway interaction network and identifies the pathway pairs with maximal interactive and driver weights as cooperative driver pathways. Experimental results on breast cancer, uterine corpus endometrial carcinoma and ovarian cancer data from The Cancer Genome Atlas show that CDPathway can effectively identify candidate driver genes [area under the receiver operating characteristic curve (AUROC) of ≥0.9] and reconstruct the pathway interaction network (AUROC of >0.9), and that it uncovers many more known (potential) driver genes than other competitive methods. In addition, CDPathway identifies 150% more driver pathways and 60% more potential cooperative driver pathways than the competing methods.
The code of CDPathway is available at http://mlda.swu.edu.cn/codes.php?name=CDPathway.
Affiliation(s)
- Jun Wang
  - Professor of the School of Software, Shandong University
- Ziying Yang
  - Professor of the School of Software, Shandong University
- Xiangliang Zhang
  - Computational Bioscience Research Center (CBRC), Computer Science, Electrical and Mathematical Science and Engineering Division, King Abdullah University of Science and Technology, SA
- Guoxian Yu
  - Computational Bioscience Research Center (CBRC), Computer Science, Electrical and Mathematical Science and Engineering Division, King Abdullah University of Science and Technology, SA
  - Professor of the School of Software, Shandong University and Computational Bioscience Research Center
|
25
|
Multi-omics analysis reveals epithelial-mesenchymal transition-related gene FOXM1 as a novel prognostic biomarker in clear cell renal carcinoma. Aging (Albany NY) 2019; 11:10316-10337. [PMID: 31743108 PMCID: PMC6914426 DOI: 10.18632/aging.102459] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 06/14/2019] [Accepted: 11/08/2019] [Indexed: 12/14/2022]
Abstract
Identification of novel clinical biomarkers in clear cell renal carcinoma (ccRCC) is warranted. Integrating transcriptome (n=1669), DNA methylation (n=577) and copy number data (n=832), we developed a method to identify driver biomarkers by analyzing the omics-level dynamics of Epithelial-Mesenchymal Transition (EMT)-related genes in ccRCC. We first identified 504 genes with dynamic expression changes involved in ccRCC-associated key pathways such as EMT, cell cycle, EGFR and PI3K/AKT signaling. Further analysis identified 229 (90 gene promoters) aberrant expression quantitative trait methylation (eQTM) and 256 genes with expression quantitative trait copy number (eQTCN) alterations. Among them, FOXM1 was affected by both eQTM and eQTCN. FOXM1 copy number amplification (115/500, 23% of patients), which occurred in an amplified peak on chromosome 12q13.3, was enriched in late-stage ccRCC samples and was associated with worse survival. FOXM1-overexpressing pT3 patients with distant metastasis showed ~25% shorter overall survival in both training (log-rank P=0.006) and validation (log-rank P=0.018) cohorts. The eQTM-gene hybrid signature (cg00044170 and FOXM1), superior to either gene expression or DNA methylation alone, showed great potential in diagnosing localized ccRCC in training (area under curve = 0.958) and validation datasets. FOXM1 could be a novel prognostic biomarker and could shed light on early diagnosis at the molecular level in ccRCC.
|
26
|
Yang K, Kondo MA, Jaaro-Peled H, Cash-Padgett T, Kano SI, Ishizuka K, Pevsner J, Tomoda T, Sawa A, Niwa M. The transcriptome landscape associated with Disrupted-in-Schizophrenia-1 locus impairment in early development and adulthood. Schizophr Res 2019; 210:149-156. [PMID: 31204062 PMCID: PMC8050833 DOI: 10.1016/j.schres.2019.05.032] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/06/2019] [Revised: 05/21/2019] [Accepted: 05/26/2019] [Indexed: 01/08/2023]
Abstract
DISC1 was originally expected to be a genetic risk factor for schizophrenia, but genome-wide association studies have not supported this idea. In contrast, neurobiological studies of DISC1 in cell and animal models have demonstrated that direct perturbation of DISC1 protein elicits neurobiological and behavioral abnormalities relevant to a wide range of psychiatric conditions, in particular psychosis. Thus, the utility of DISC1 as a biological lead for psychosis research is clear. In the present study, we aimed to capture changes in the molecular landscape of the prefrontal cortex upon perturbation of DISC1, using the Disc1 locus impairment (Disc1-LI) model in which the majority of Disc1 isoforms have been depleted, and to explore potential molecular mediators relevant to psychiatric conditions. We observed a robust change in the gene expression profile elicited by Disc1-LI, with stronger effects on molecular networks in the early stage than in adulthood. Significant alterations were found in specific pathways relevant to psychiatric conditions, such as signaling by G protein-coupled receptors, the neurotransmitter release cycle, and voltage-gated potassium channels. The differentially expressed genes (DEGs) between Disc1-LI and wild-type mice are significantly enriched not only in neurons, but also in astrocytes and oligodendrocyte precursor cells. Brain-disorder-associated genes at the mRNA and protein levels, rather than those at the genomic level, are enriched in the DEGs. Together, our study supports the utility of Disc1-LI mice in biological research on psychiatric disorder-associated molecular networks.
Affiliation(s)
- Kun Yang
  - Department of Psychiatry, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Mari A Kondo
  - Department of Psychiatry, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Hanna Jaaro-Peled
  - Department of Psychiatry, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Tyler Cash-Padgett
  - Department of Psychiatry, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Shin-Ichi Kano
  - Department of Psychiatry, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Koko Ishizuka
  - Department of Psychiatry, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Jonathan Pevsner
  - Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
  - Kennedy Krieger Institute, Baltimore, MD 21205, USA
- Toshifumi Tomoda
  - Medical Innovation Center, Kyoto University, Kyoto 606-8397, Japan
- Akira Sawa
  - Department of Psychiatry, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
  - Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
  - Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
  - Department of Mental Health, Johns Hopkins University Bloomberg School of Medicine, Baltimore, MD 21205, USA
- Minae Niwa
  - Department of Psychiatry, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
|
27
|
Adaptively Weighted and Robust Mathematical Programming for the Discovery of Driver Gene Sets in Cancers. Sci Rep 2019; 9:5959. [PMID: 30976053 PMCID: PMC6459865 DOI: 10.1038/s41598-019-42500-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/24/2018] [Accepted: 03/28/2019] [Indexed: 12/14/2022]
Abstract
High coverage and mutual exclusivity (HCME), which are considered two combinatorial properties of mutations in a collection of driver genes in cancers, have been used to develop mathematical programming models for distinguishing cancer driver gene sets. In this paper, we summarize a weak HCME pattern that better describes practical mutation datasets. We then present AWRMP, a method for identifying driver gene sets that adaptively assigns appropriate weights to gene candidates to tune the balance between coverage and mutual exclusivity. It embeds a genetic algorithm into a subsampling strategy to make the optimization results robust against uncertainty and noise in the data. Using biological datasets, we show that AWRMP can identify driver gene sets that satisfy the weak HCME pattern and outperforms state-of-the-art methods in terms of robustness.
|
28
|
Wu J, Cai Q, Wang J, Liao Y. Identifying mutated driver pathways in cancer by integrating multi-omics data. Comput Biol Chem 2019; 80:159-167. [PMID: 30959272 DOI: 10.1016/j.compbiolchem.2019.03.019] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/07/2019] [Accepted: 03/23/2019] [Indexed: 10/27/2022]
Abstract
Because driver pathways play a crucial role in the formation and progression of cancer, it is imperative to identify them, as they offer important information for precision or personalized medicine. In this paper, an improved maximum weight submatrix problem model is proposed by integrating three kinds of omics data: somatic mutations, copy number variations, and gene expression. The model adjusts coverage and mutual exclusivity with the average weight of genes in a pathway and simultaneously considers the correlation among genes, so that pathways with high coverage but moderate mutual exclusivity can be identified. By introducing a short chromosome code and a greedy-based recombination operator, a parthenogenetic algorithm, PGA-MWS, is presented to solve the model. Experimental comparisons among the algorithms GA, MOGA, iMCMC and PGA-MWS were performed on biological and simulated data sets. The results show that, compared with the other three algorithms, PGA-MWS based on the improved model can identify gene sets with high coverage but moderate mutual exclusivity and scales well. Many of the identified gene sets are involved in known signaling pathways, and most of the implicated genes are oncogenes or tumor suppressors previously reported in the literature. These results indicate that the proposed approach may become a useful complementary tool for detecting cancer pathways.
Affiliation(s)
- Jingli Wu
  - Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin 541004, China
  - College of Computer Science and Information Technology, Guangxi Normal University, Guilin 541004, China
- Qirong Cai
  - College of Computer Science and Information Technology, Guangxi Normal University, Guilin 541004, China
- Jinyan Wang
  - Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin 541004, China
  - College of Computer Science and Information Technology, Guangxi Normal University, Guilin 541004, China
- Yuanxiu Liao
  - College of Computer Science and Information Technology, Guangxi Normal University, Guilin 541004, China
|
29
|
Sun S, Sun F, Wang Y. Multi-Level Comparative Framework Based on Gene Pair-Wise Expression Across Three Insulin Target Tissues for Type 2 Diabetes. Front Genet 2019; 10:252. [PMID: 30972105 PMCID: PMC6443994 DOI: 10.3389/fgene.2019.00252] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 01/24/2019] [Accepted: 03/06/2019] [Indexed: 11/30/2022]
Abstract
Type 2 diabetes (T2D) is known as a disease caused by gene alterations and characterized by insulin resistance; thus the insulin-responsive tissues are of great interest for T2D study. It is highly relevant to systematically investigate the commonalities and specificities of T2D among those tissues. Here we establish a multi-level comparative framework across three insulin target tissues (white adipose, skeletal muscle, and liver) to provide a better understanding of T2D. Starting from the ranks of gene expression, we constructed the 'disease network' by detecting diverse interactions to characterize the disease-affected tissues. Then, we applied the random walk with restart algorithm to the disease network to prioritize its nodes and edges according to their association with T2D. Finally, we identified a merged core module by combining the clustering coefficient and Jaccard index, which provides a detailed and visible illustration of the common and specific features of different tissues at the network level. Taken together, our network-, gene-, and module-level characterization across different tissues of T2D holds the promise of providing a broader and deeper understanding of the T2D mechanism.
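The random walk with restart step mentioned in this abstract can be sketched in its standard textbook form, iterating p ← (1 − r)·W·p + r·p0 until convergence. This is a minimal sketch, not the authors' code: the restart probability of 0.3, the function name, and the four-node path graph are assumptions chosen only for illustration.

```python
import numpy as np

def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that returns to the
    seed nodes with probability `restart` at every step."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    W = adj / col_sums                     # column-normalized adjacency matrix
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)     # uniform restart distribution over seeds
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Hypothetical path graph 0 - 1 - 2 - 3 with node 0 as the only seed.
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
scores = random_walk_with_restart(adj, seeds=[0])
ranking = np.argsort(-scores)              # nodes ordered by proximity to the seed
```

Nodes closer to the seed receive higher scores, which is the property used to rank nodes by their association with a set of known disease genes.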
Affiliation(s)
- Shaoyan Sun
  - School of Mathematics and Statistics, Ludong University, Yantai, China
- Fengnan Sun
  - Clinical Laboratory, Yantaishan Hospital, Yantai, China
- Yong Wang
  - CEMS, NCMIS, MDIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
|
30
|
Zhang W, Wang SL. An Integrated Framework for Identifying Mutated Driver Pathway and Cancer Progression. IEEE/ACM Trans Comput Biol Bioinform 2019; 16:455-464. [PMID: 29990286 DOI: 10.1109/tcbb.2017.2788016] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 06/08/2023]
Abstract
Next-generation sequencing (NGS) technologies provide large amounts of somatic mutation data from a large number of patients. Identifying mutated driver pathways and cancer progression from these data is a challenging task because of interpatient heterogeneity. In addition, modeling cancer progression at the pathway level has been shown to be more reasonable than at the gene level. In this paper, we introduce an integrated framework to identify mutated driver pathways and cancer progression (iMDPCP) at the pathway level from somatic mutation data. First, we use the uncertainty coefficient to quantify mutual exclusivity in driver pathways and develop a computational framework to identify mutated driver pathways based on the adaptive discrete differential evolution algorithm. Then, we construct a cancer progression model for driver pathways based on a Bayesian network. Finally, we evaluate the performance of iMDPCP on real cancer somatic mutation datasets. The experimental results indicate that iMDPCP is more accurate than state-of-the-art methods according to the enrichment of KEGG pathways, and it also provides new insights into identifying cancer progression at the pathway level.
|
31
|
Gao B, Zhao Y, Li Y, Liu J, Wang L, Li G, Su Z. Prediction of Driver Modules via Balancing Exclusive Coverages of Mutations in Cancer Samples. Adv Sci (Weinh) 2019; 6:1801384. [PMID: 30828525 PMCID: PMC6382311 DOI: 10.1002/advs.201801384] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 08/19/2018] [Revised: 10/04/2018] [Indexed: 05/07/2023]
Abstract
Mutual exclusivity of cancer-driving mutations is a frequently observed phenomenon in the mutational landscape of cancer. The long tail of rare mutations complicates the discovery of mutually exclusive driver modules. Existing methods usually suffer from the problem that only a few genes in some identified modules cover most of the cancer samples. To overcome this hurdle, an efficient method, UniCovEx, is presented that identifies mutually exclusive driver modules with balanced exclusive coverages. UniCovEx first searches for candidate driver modules with a strong topological relationship in signaling networks using a greedy strategy. It then evaluates the candidate modules by considering their coverage, exclusivity, and balance of coverage, using a novel metric termed exclusive entropy of modules, which measures how balanced the modules are. Finally, UniCovEx predicts sample-specific driver modules by solving a minimum set cover problem using a greedy strategy. When tested on 12 The Cancer Genome Atlas datasets of different cancer types, UniCovEx shows significant superiority over previous methods. The software is available at: https://sourceforge.net/projects/cancer-pathway/files/.
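The final step described in this abstract, a greedy solution to a minimum set cover problem, can be illustrated with the classic greedy heuristic: repeatedly pick the candidate that covers the most still-uncovered samples. This is a generic sketch rather than the UniCovEx code; the module names and sample identifiers below are hypothetical.

```python
def greedy_set_cover(universe, candidates):
    """Greedy heuristic for set cover.

    universe   -- iterable of items (e.g. sample IDs) to cover
    candidates -- dict mapping a name (e.g. a module) to the set of items it covers
    Returns the chosen names in selection order and any items left uncovered.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the candidate that covers the most currently uncovered items.
        best = max(candidates, key=lambda name: len(candidates[name] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            break                      # remaining items cannot be covered
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered

# Hypothetical candidate modules and the samples in which each is mutated.
samples = {1, 2, 3, 4, 5, 6}
modules = {
    "A": {1, 2, 3, 4},
    "B": {3, 4, 5},
    "C": {5, 6},
    "D": {1, 6},
}
chosen, leftover = greedy_set_cover(samples, modules)
```

The greedy heuristic does not guarantee a minimum cover in general, but it is fast and gives a logarithmic approximation guarantee, which is why it is a common choice for this kind of selection step.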
Affiliation(s)
- Bo Gao
  - School of Mathematics, Shandong University, Jinan 250100, China
  - State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, China
- Yue Zhao
  - IAM, MADIS, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Yang Li
  - School of Mathematics, Shandong University, Jinan 250100, China
  - State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, China
- Juntao Liu
  - School of Mathematics, Shandong University, Jinan 250100, China
- Lushan Wang
  - State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, China
- Guojun Li
  - School of Mathematics, Shandong University, Jinan 250100, China
  - State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, China
- Zhengchang Su
  - Department of Bioinformatics and Genomics, College of Computing and Informatics, The University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC 28223, USA
|
32
|
Identifying Cancer Specific Driver Modules Using a Network-Based Method. Molecules 2018; 23:1114. [PMID: 29738475 PMCID: PMC6100049 DOI: 10.3390/molecules23051114] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/12/2018] [Revised: 04/26/2018] [Accepted: 05/07/2018] [Indexed: 02/01/2023]
Abstract
Detecting driver modules is a key challenge for understanding the mechanisms of carcinogenesis at the pathway level. Identifying cancer-specific driver modules helps to interpret the different principles underlying different cancer types. However, most methods are designed to identify driver modules in a single cancer, and few have been introduced to detect cancer-specific driver modules. We propose a network-based method to detect cancer-specific driver modules (CSDM) in a given cancer type relative to other cancer types. We construct the specific network of a cancer by combining specific coverage and mutual exclusivity across all cancer types, to capture the specificity of the cancer at the pathway level. To illustrate the performance of the method, we apply CSDM to 12 TCGA cancer types. When we compare CSDM with SpeMDP and HotNet2 with regard to specific coverage and the enrichment of GO terms and KEGG pathways, CSDM is more accurate. We find that the specific driver modules of two different cancers have little overlap, which indicates that the driver modules detected by CSDM are specific. Finally, we also analyze three specific driver modules of BRCA, BLCA, and LAML that intersect with well-known pathways. The source code of CSDM is freely accessible at https://github.com/fengli28/CSDM.git.
|
33
|
Zhang J, Zhang S. The Discovery of Mutated Driver Pathways in Cancer: Models and Algorithms. IEEE/ACM Trans Comput Biol Bioinform 2018; 15:988-998. [PMID: 28113329 DOI: 10.1109/tcbb.2016.2640963] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Indexed: 06/06/2023]
Abstract
The pathogenesis of cancer in humans is still poorly understood. With the rapid development of high-throughput sequencing technologies, huge volumes of cancer genomics data have been generated. Deciphering these data poses great opportunities and challenges for computational biologists. One such key challenge is to distinguish driver mutations, genes, and pathways from passenger ones. Mutual exclusivity of gene mutations (each patient has no more than one mutation in the gene set) has been observed in various cancer types and has thus been used as an important property of a driver gene set or pathway. In this article, we review the recent development of computational models and algorithms for discovering driver pathways or modules in cancer, with a focus on mutual exclusivity-based ones.
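To make the coverage and mutual exclusivity properties concrete, the following sketch scores a gene set on a binary patient-by-gene mutation matrix. It also computes the weight W(M) = 2|Γ(M)| − Σ_g |Γ(g)| that is commonly used in the maximum weight submatrix formulation, where Γ(g) is the set of patients with a mutation in gene g and Γ(M) is their union. The function name and the toy matrix are assumptions for illustration and do not come from any specific method reviewed here.

```python
import numpy as np

def coverage_and_exclusivity(mut, genes):
    """Score a gene set on a binary mutation matrix (rows: patients, columns: genes).

    Returns (coverage, overlap, weight, exclusive) where
      coverage  = number of patients with at least one mutation in the set,
      overlap   = sum of per-gene counts minus coverage,
      weight    = coverage - overlap = 2*coverage - sum of per-gene counts,
      exclusive = True if no patient has more than one mutated gene in the set.
    """
    sub = np.asarray(mut)[:, genes]
    hits = sub.sum(axis=1)                 # mutated genes per patient within the set
    coverage = int((hits >= 1).sum())
    overlap = int(hits.sum()) - coverage
    weight = coverage - overlap
    exclusive = bool((hits <= 1).all())
    return coverage, overlap, weight, exclusive

# Hypothetical data: 5 patients x 4 genes.
mut = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 1, 0, 0],   # this patient carries mutations in genes 0 and 1
    [0, 0, 0, 1],
])
set_a = coverage_and_exclusivity(mut, [0, 1, 2])   # higher coverage, not exclusive
set_b = coverage_and_exclusivity(mut, [0, 2])      # lower coverage, fully exclusive
```

In this toy case both gene sets receive the same weight, which shows how the weight trades coverage against overlap: adding gene 1 covers one more patient but introduces one co-occurring mutation.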
|
34
|
Gao B, Li G, Liu J, Li Y, Huang X. Identification of driver modules in pan-cancer via coordinating coverage and exclusivity. Oncotarget 2018; 8:36115-36126. [PMID: 28415609 PMCID: PMC5482642 DOI: 10.18632/oncotarget.16433] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 02/04/2017] [Accepted: 03/13/2017] [Indexed: 12/30/2022]
Abstract
It is widely accepted that cancer is driven by somatic mutations accumulated during the lifetime of an individual. Cancer mutations may target a relatively small number of cellular functional modules. The heterogeneity among cancer patients makes it difficult to identify driver mutations or functional modules related to cancer. It is biologically desirable to identify cancer pathway modules by coordinating coverage and exclusivity. A few approaches have been developed for this purpose, but they all have practical limitations in computational complexity and prediction accuracy. We present a network-based approach, CovEx, to predict patient-oriented modules by 1) discovering candidate modules for each considered gene, 2) extracting significant candidates by harmonizing coverage and exclusivity, and 3) further selecting the patient-oriented modules based on a set cover model. Applied to pan-cancer datasets spanning 12 cancer types collected from the public database TCGA, CovEx demonstrates significant superiority over the current leading competitors in performance. It is published under the GNU General Public License, and the source code is available at: https://sourceforge.net/projects/cancer-pathway/files/
Affiliation(s)
- Bo Gao
  - School of Mathematics, Shandong University, Jinan, Shandong, 250100, China
  - Department of Computer Science, Arkansas State University, Jonesboro, Arkansas, 72401, USA
- Guojun Li
  - School of Mathematics, Shandong University, Jinan, Shandong, 250100, China
  - Department of Computer Science, Arkansas State University, Jonesboro, Arkansas, 72401, USA
- Juntao Liu
  - School of Mathematics, Shandong University, Jinan, Shandong, 250100, China
- Yang Li
  - School of Mathematics, Shandong University, Jinan, Shandong, 250100, China
- Xiuzhen Huang
  - Department of Computer Science, Arkansas State University, Jonesboro, Arkansas, 72401, USA
  - Molecular Biosciences Program, Arkansas State University, Jonesboro, Arkansas, 72401, USA
|
35
|
Detection of Somatic Mutations in Exome Sequencing of Tumor-only Samples. Sci Rep 2017; 7:15959. [PMID: 29162841 PMCID: PMC5698426 DOI: 10.1038/s41598-017-14896-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 02/22/2017] [Accepted: 09/22/2017] [Indexed: 12/27/2022]
Abstract
Because normal samples are often lacking in clinical diagnosis, and to reduce costs, detection of small-scale mutations from tumor-only samples is required but remains relatively unexplored. We developed an algorithm (GATKcan) that augments GATK with two statistics and machine learning to detect mutations in cancer. Averaged over ten experiments, GATKcan outperformed GATK in detecting mutations in 231 tumors randomly sampled from 241 TCGA endometrial tumors (EC). In external validations, GATKcan outperformed GATK in TCGA breast cancer (BC), ovarian cancer (OC) and melanoma tumors in terms of Matthews correlation coefficient (MCC) and precision, where MCC takes both sensitivity and specificity into account. Further, GATKcan removed a high fraction of the false positives detected by GATK. For somatic variants classified in common by VarScan 2 and MuTect among the called variants in BC, OC and melanoma, GATKcan ranked first by adjusted MCC (adjusted precision), followed by MuTect, VarScan 2 and GATK. Importantly, GATKcan enables detection of mutations when alternate alleles exist in normal samples. These results suggest that GATKcan trained on one cancer is able to detect mutations in future patients with the same type of cancer and is likely applicable to other cancers with similar mutations.
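For readers unfamiliar with the Matthews correlation coefficient used as the main metric in this abstract, it can be computed directly from the four cells of a confusion matrix as MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below uses this standard definition; the counts in the example are made up and are not results from the paper.

```python
import math

def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction). Returns 0.0 when the denominator is zero,
    a common convention for degenerate confusion matrices.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical variant-calling counts: true/false positives and negatives.
mcc = matthews_corrcoef(tp=90, fp=10, tn=80, fn=20)
precision = 90 / (90 + 10)
```

Unlike precision alone, MCC uses all four cells, so a caller cannot score well simply by making very few positive calls.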
|