1
Deng J, Wei K, Fang J, Li Y. Deep self-reconstruction driven joint nonnegative matrix factorization model for identifying multiple genomic imaging associations in complex diseases. J Biomed Inform 2024; 156:104684. [PMID: 38936566 DOI: 10.1016/j.jbi.2024.104684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2024] [Revised: 06/14/2024] [Accepted: 06/24/2024] [Indexed: 06/29/2024]
Abstract
OBJECTIVE: Comprehensive analysis of histopathology images and transcriptomics data enables the identification of candidate biomarkers and multimodal association patterns. Most existing multimodal data association studies are derived from extensions of the joint nonnegative matrix factorization model for identifying complex data associations, which can make full use of clinical prior information. However, the raw data are usually taken as the input without considering the underlying complex multi-subspace structure, which influences the subsequent integration analysis results. METHODS: This study proposes a deep self-reconstructed joint nonnegative matrix factorization (DSRJNMF) model that uses self-expressive properties to reconstruct the raw data and characterize the similarity structure associated with clinical labels. Then, sparsity, orthogonality, and regularization constraints constructed from prior information are added to the DSRJNMF model to determine the sparse set of biologically relevant features across modalities. RESULTS: The algorithm was applied to identify imaging genetic associations in triple negative breast cancer (TNBC). Multilevel experimental results demonstrate that the proposed algorithm better estimates potential associations between pathological image features and miRNA-gene and identifies consistent multimodal imaging genetic biomarkers to guide the interpretation of TNBC. CONCLUSION: The proposed method provides a novel approach to data association analysis for complex diseases.
Affiliation(s)
- Jin Deng
  - College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China
- Kai Wei
  - Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Jiana Fang
  - College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China
- Ying Li
  - Shanghai Institute of Technology, Shanghai 201418, China
2
Zhuo X, Deng H, Qiu M, Qiu X. Pathomic model based on histopathological features and machine learning to predict IDO1 status and its association with breast cancer prognosis. Breast Cancer Res Treat 2024; 207:151-165. [PMID: 38780888 PMCID: PMC11230954 DOI: 10.1007/s10549-024-07350-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Accepted: 04/18/2024] [Indexed: 05/25/2024]
Abstract
PURPOSE: To establish a pathomic model using histopathological image features for predicting indoleamine 2,3-dioxygenase 1 (IDO1) status and its relationship with overall survival (OS) in breast cancer. METHODS: A pathomic model was constructed using machine learning and histopathological images obtained from The Cancer Genome Atlas database to predict IDO1 expression. The model performance was evaluated based on the area under the curve, calibration curve, and decision curve analysis (DCA). Prediction scores (PSes) were generated from the model and used to divide the patients into two groups. Survival outcomes, gene set enrichment, immune microenvironment, and tumor mutations were compared between the two groups. RESULTS: Survival analysis followed by multivariate correction revealed that high IDO1 is a protective factor for OS. Further, the model was calibrated, and it exhibited good discrimination. Additionally, the DCA showed that the proposed model provided a good clinical net benefit. The Kaplan-Meier analysis revealed a positive correlation between high PS and improved OS. Univariate and multivariate Cox regression analyses demonstrated that PS is an independent protective factor for OS. Moreover, differentially expressed genes were enriched in various essential biological processes, including extracellular matrix receptor interaction, angiogenesis, transforming growth factor β signaling, epithelial mesenchymal transition, cell junction, tryptophan metabolism, and heme metabolic processes. PS was positively correlated with M1 macrophages, CD8+ T cells, T follicular helper cells, and tumor mutational burden. CONCLUSION: These results indicate the potential ability of the proposed pathomic model to predict IDO1 status and the OS of breast cancer patients.
Affiliation(s)
- Xiaohua Zhuo
  - Department of Pathology, Longyan First Affiliated Hospital of Fujian Medical University, Longyan, Fujian, China
- Hailong Deng
  - Department of Pathology, Longyan First Affiliated Hospital of Fujian Medical University, Longyan, Fujian, China
- Mingzhu Qiu
  - Department of Pathology, Longyan First Affiliated Hospital of Fujian Medical University, Longyan, Fujian, China
- Xiaoming Qiu
  - Department of Pathology, Longyan First Affiliated Hospital of Fujian Medical University, Longyan, Fujian, China
3
Li Y, Du P, Zeng H, Wei Y, Fu H, Zhong X, Ma X. Integrative models of histopathological images and multi-omics data predict prognosis in endometrial carcinoma. PeerJ 2023; 11:e15674. [PMID: 37583914 PMCID: PMC10424667 DOI: 10.7717/peerj.15674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Accepted: 06/11/2023] [Indexed: 08/17/2023] Open
Abstract
Objective: This study aimed to predict the molecular features of endometrial carcinoma (EC) and the overall survival (OS) of EC patients using histopathological imaging. Methods: Patients from The Cancer Genome Atlas (TCGA) were split into a training set (n = 215) and a test set (n = 214) at a ratio of 1:1. By analyzing quantitative histological image features and building a random forest model verified by cross-validation, we constructed prognostic models for OS. Model performance was evaluated on the test set with the area under the time-dependent receiver operating characteristic curve (AUC). Results: Prognostic models based on histopathological imaging features (HIF) predicted OS in the test set (5-year AUC = 0.803). The performance of combining histopathology and omics exceeded that of genomics, transcriptomics, or proteomics alone. Additionally, the model integrating multi-dimensional data, including HIF, genomics, transcriptomics, and proteomics, attained the largest AUCs of 0.866, 0.869, and 0.856 at years 1, 3, and 5, respectively, and showed the largest difference in survival (HR = 18.347, 95% CI [11.09-25.65], p < 0.001). Conclusions: The results indicate that the complementary features of HIF could improve prognostic performance for EC patients. Moreover, the integration of HIF and multi-dimensional omics data might improve survival prediction and risk stratification in clinical practice.
Affiliation(s)
- Yueyi Li
  - Department of Targeting Therapy & Immunology, Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan, China
- Peixin Du
  - Laboratory of Integrative Medicine, Clinical Research Center for Breast, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan, China
- Hao Zeng
  - Laboratory of Integrative Medicine, Clinical Research Center for Breast, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan, China
- Yuhao Wei
  - West China School of Medicine, West China Hospital, Sichuan University, Chengdu, Sichuan, China
- Haoxuan Fu
  - Department of Statistics and Data Science, Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Xi Zhong
  - Department of Critical Care Medicine, West China Hospital of Sichuan University, Chengdu, Sichuan, China
- Xuelei Ma
  - Department of Targeting Therapy & Immunology, Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan, China
4
Hou J, Jia X, Xie Y, Qin W. Integrative Histology-Genomic Analysis Predicts Hepatocellular Carcinoma Prognosis Using Deep Learning. Genes (Basel) 2022; 13:1770. [PMID: 36292654 PMCID: PMC9601633 DOI: 10.3390/genes13101770] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/12/2022] [Revised: 09/25/2022] [Accepted: 09/28/2022] [Indexed: 11/04/2022] Open
Abstract
Cancer prognosis analysis is of essential interest in clinical practice. To explore the prognostic power of computational histopathology and genomics, this paper constructs a multi-modality prognostic model for survival prediction. We collected 346 patients diagnosed with hepatocellular carcinoma (HCC) from The Cancer Genome Atlas (TCGA); each patient has 1-3 whole slide images (WSIs) and an mRNA expression file. WSIs were processed by a multi-instance deep learning model to obtain patient-level survival risk scores; mRNA expression data were processed by weighted gene co-expression network analysis (WGCNA), and the top hub genes of each module were extracted as risk factors. Information from the two modalities was integrated by a Cox proportional hazards model to predict patient outcomes. The overall survival predictions of the multi-modality model (concordance index (C-index): 0.746, 95% confidence interval (CI): ±0.077) outperformed those based on the histopathology risk score or hub genes alone. Furthermore, in the prediction of 1-year and 3-year survival, the model achieved areas under the curve of 0.816 and 0.810, respectively. In conclusion, this paper provides an effective workflow for multi-modality prognosis of HCC, and the integration of histopathology and genomic information has the potential to assist clinical prognosis management.
Affiliation(s)
- Jiaxin Hou
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
  - Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
- Xiaoqi Jia
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
  - School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China
- Yaoqin Xie
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Wenjian Qin
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
5
Chen L, Zeng H, Xiang Y, Huang Y, Luo Y, Ma X. Histopathological Images and Multi-Omics Integration Predict Molecular Characteristics and Survival in Lung Adenocarcinoma. Front Cell Dev Biol 2021; 9:720110. [PMID: 34708036 PMCID: PMC8542778 DOI: 10.3389/fcell.2021.720110] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/30/2021] [Accepted: 09/14/2021] [Indexed: 02/05/2023] Open
Abstract
Histopathological images and omics profiles play important roles in the prognosis of cancer patients. Here, we extracted quantitative features from histopathological images to predict molecular characteristics and prognosis, and integrated image features with mutations, transcriptomics, and proteomics data for prognosis prediction in lung adenocarcinoma (LUAD). Patients obtained from The Cancer Genome Atlas (TCGA) were divided into a training set (n = 235) and a test set (n = 235). We developed machine learning models in the training set and estimated their predictive performance in the test set. In the test set, the machine learning models could predict genetic aberrations: ALK (AUC = 0.879), BRAF (AUC = 0.847), EGFR (AUC = 0.855), ROS1 (AUC = 0.848), and transcriptional subtypes: proximal-inflammatory (AUC = 0.897), proximal-proliferative (AUC = 0.861), and terminal respiratory unit (AUC = 0.894) from histopathological images. Moreover, we obtained tissue microarrays from 316 LUAD patients, including four external validation sets. The prognostic model using image features was predictive of overall survival in the test set and the four validation sets, with 5-year AUCs from 0.717 to 0.825. High-risk and low-risk groups stratified by the model showed different survival in the test set (HR = 4.94, p < 0.0001) and three validation sets (HR = 1.64–2.20, p < 0.05). The combination of image features and single omics had greater prognostic power in the test set, such as the histopathology + transcriptomics model (5-year AUC = 0.840; HR = 7.34, p < 0.0001). Finally, the model integrating image features with multi-omics achieved the best performance (5-year AUC = 0.908; HR = 19.98, p < 0.0001). Our results indicated that the machine learning models based on histopathological image features could predict genetic aberrations, transcriptional subtypes, and survival outcomes of LUAD patients. The integration of histopathological images and multi-omics may provide better survival prediction for LUAD.
Affiliation(s)
- Linyan Chen
  - State Key Laboratory of Biotherapy, Department of Biotherapy, Cancer Center, West China Hospital, Sichuan University, Chengdu, China
- Hao Zeng
  - State Key Laboratory of Biotherapy, Department of Biotherapy, Cancer Center, West China Hospital, Sichuan University, Chengdu, China
- Yu Xiang
  - State Key Laboratory of Biotherapy, Department of Biotherapy, Cancer Center, West China Hospital, Sichuan University, Chengdu, China
- Yeqian Huang
  - Department of Pathology, West China Hospital, Sichuan University, Chengdu, China
- Yuling Luo
  - Department of Pathology, West China Hospital, Sichuan University, Chengdu, China
- Xuelei Ma
  - State Key Laboratory of Biotherapy, Department of Biotherapy, Cancer Center, West China Hospital, Sichuan University, Chengdu, China
6
Karimi MR, Karimi AH, Abolmaali S, Sadeghi M, Schmitz U. Prospects and challenges of cancer systems medicine: from genes to disease networks. Brief Bioinform 2021; 23:6361045. [PMID: 34471925 PMCID: PMC8769701 DOI: 10.1093/bib/bbab343] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/04/2021] [Revised: 08/02/2021] [Accepted: 08/03/2021] [Indexed: 12/20/2022] Open
Abstract
It is becoming evident that holistic perspectives toward cancer are crucial in deciphering the overwhelming complexity of tumors. Single-layer analysis of genome-wide data has greatly contributed to our understanding of cellular systems and their perturbations. However, fundamental gaps in our knowledge persist and hamper the design of effective interventions. It is becoming more apparent than ever that cancer should be viewed not only as a disease of the genome but as a disease of the cellular system. Integrative multilayer approaches are emerging as powerful assets in our endeavors to achieve systemic views on cancer biology. Herein, we provide a comprehensive review of the approaches, methods and technologies that can serve to achieve systemic perspectives of cancer. We start with genome-wide single-layer approaches of omics analyses of cellular systems and move on to multilayer integrative approaches, for which in-depth descriptions of proteogenomics and network-based data analysis are provided. Proteogenomics is a remarkable example of how the integration of multiple levels of information can reduce our blind spots and increase the accuracy and reliability of our interpretations, while network-based data analysis is a major approach for data interpretation and a robust scaffold for data integration and modeling. Overall, this review aims to increase cross-field awareness of the approaches and challenges regarding the omics-based study of cancer and to facilitate the necessary shift toward holistic approaches.
Affiliation(s)
- Mehdi Sadeghi
  - Department of Cell & Molecular Biology, Semnan University, Semnan, Iran
- Ulf Schmitz
  - Department of Molecular & Cell Biology, James Cook University, Townsville, QLD 4811, Australia
7
Zeng H, Chen L, Zhang M, Luo Y, Ma X. Integration of histopathological images and multi-dimensional omics analyses predicts molecular features and prognosis in high-grade serous ovarian cancer. Gynecol Oncol 2021; 163:171-180. [PMID: 34275655 DOI: 10.1016/j.ygyno.2021.07.015] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 05/27/2021] [Revised: 07/04/2021] [Accepted: 07/09/2021] [Indexed: 02/05/2023]
Abstract
OBJECTIVE: This study used histopathological image features to predict molecular features, and combined them with multi-dimensional omics data to predict overall survival (OS) in high-grade serous ovarian cancer (HGSOC). METHODS: Patients from The Cancer Genome Atlas (TCGA) were divided into a training set (n = 115) and a test set (n = 114). In addition, we collected tissue microarrays of 92 patients as an external validation set. Quantitative features were extracted from histopathological images using CellProfiler and used to establish prediction models by machine learning methods in the training set. The prediction performance was assessed in the test set and the validation set. RESULTS: The prediction models were able to identify BRCA1 mutation (AUC = 0.952), BRCA2 mutation (AUC = 0.912), microsatellite instability-high (AUC = 0.919), microsatellite stable (AUC = 0.924), and molecular subtypes: proliferative (AUC = 0.961), differentiated (AUC = 0.952), immunoreactive (AUC = 0.941), mesenchymal (AUC = 0.918) in the test set. The prognostic model based on histopathological image features could predict OS in the test set (5-year AUC = 0.825) and the validation set (5-year AUC = 0.703). We next explored the integrative prognostic models of image features, genomics, transcriptomics and proteomics. In the test set, the models combining two omics had higher prediction accuracy, such as image features and genomics (5-year AUC = 0.834). The multi-omics model including all features showed the best prediction performance (5-year AUC = 0.911). According to the risk score of the multi-omics model, the high-risk and low-risk groups had significant survival differences (HR = 18.23, p < 0.001). CONCLUSIONS: These results indicated the potential ability of histopathological image features to predict the above molecular features and the survival risk of HGSOC patients. The integration of image features and multi-omics data may improve prognosis prediction in HGSOC patients.
Affiliation(s)
- Hao Zeng
  - Department of Biotherapy, Cancer Center, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, and Collaborative Innovation Center, Chengdu, China
- Linyan Chen
  - Department of Biotherapy, Cancer Center, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, and Collaborative Innovation Center, Chengdu, China
- Mingxuan Zhang
  - Department of Pathology, West China Hospital, Sichuan University, Chengdu, China
- Yuling Luo
  - Department of Pathology, West China Hospital, Sichuan University, Chengdu, China
- Xuelei Ma
  - Department of Biotherapy, Cancer Center, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, and Collaborative Innovation Center, Chengdu, China