1. Hezi H, Shats D, Gurevich D, Maruvka YE, Freiman M. Exploring the interplay between colorectal cancer subtypes genomic variants and cellular morphology: A deep-learning approach. PLoS One 2024; 19:e0309380. [PMID: 39255280 PMCID: PMC11386451 DOI: 10.1371/journal.pone.0309380] [Received: 02/27/2024] [Accepted: 08/10/2024] [Indexed: 09/12/2024] Open
Abstract
Molecular subtypes of colorectal cancer (CRC) significantly influence treatment decisions. While convolutional neural networks (CNNs) have recently been introduced for automated CRC subtype identification using H&E stained histopathological images, the correlation between CRC subtype genomic variants and their corresponding cellular morphology expressed by their imaging phenotypes is yet to be fully explored. The goal of this study was to determine such correlations by incorporating genomic variants in CNN models for CRC subtype classification from H&E images. We utilized the publicly available TCGA-CRC-DX dataset, which comprises whole slide images from 360 CRC-diagnosed patients (260 for training and 100 for testing). This dataset also provides information on CRC subtype classifications and genomic variations. We trained CNN models for CRC subtype classification that account for potential correlation between genomic variations within CRC subtypes and their corresponding cellular morphology patterns. We assessed the interplay between CRC subtypes' genomic variations and cellular morphology patterns by evaluating the CRC subtype classification accuracy of the different models in a stratified 5-fold cross-validation experimental setup using the area under the ROC curve (AUROC) and average precision (AP) as the performance metrics. The CNN models that account for potential correlation between genomic variations within CRC subtypes and their cellular morphology pattern achieved superior accuracy compared to the baseline CNN classification model that does not account for genomic variations when using either single-nucleotide-polymorphism (SNP) molecular features (AUROC: 0.824±0.02 vs. 0.761±0.04, p<0.05, AP: 0.652±0.06 vs. 0.58±0.08) or CpG-Island methylation phenotype (CIMP) molecular features (AUROC: 0.834±0.01 vs. 0.787±0.03, p<0.05, AP: 0.687±0.02 vs. 0.64±0.05). 
Combining the CNN models that account for variations in CIMP and SNP further improved classification accuracy (AUROC: 0.847±0.01 vs. 0.787±0.03, p = 0.01, AP: 0.68±0.02 vs. 0.64±0.05). The improved accuracy of CNN models for CRC subtype classification that account for potential correlation between genomic variations within CRC subtypes and their corresponding cellular morphology, as expressed by H&E imaging phenotypes, may elucidate the biological cues impacting cancer histopathological imaging phenotypes. Moreover, considering CRC subtypes' genomic variations has the potential to improve the accuracy of deep-learning models in discerning cancer subtype from histopathological imaging data.
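The evaluation protocol in this abstract (stratified 5-fold cross-validation scored by AUROC and AP) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' code: the function names `auroc`, `average_precision`, and `stratified_folds` are hypothetical, the labels and scores are toy values, and tied scores are handled in the simplest possible way.

```python
import random
from statistics import mean, stdev


def auroc(labels, scores):
    """AUROC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))  # assumes both classes are present


def average_precision(labels, scores):
    """AP as the mean precision at the rank of each positive (highest score first)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits  # assumes at least one positive


def stratified_folds(labels, k, seed=0):
    """Split indices into k folds that each keep roughly the overall class ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for j, i in enumerate(members):
            folds[j % k].append(i)
    return folds


# Toy data (hypothetical): 20 negatives, 10 positives, noisy scores.
# In a real experiment the scores for each fold would come from a model
# trained on the other four folds.
rng = random.Random(1)
labels = [0] * 20 + [1] * 10
scores = [0.3 * y + rng.random() for y in labels]

fold_aurocs = []
for fold in stratified_folds(labels, k=5):
    y = [labels[i] for i in fold]
    s = [scores[i] for i in fold]
    fold_aurocs.append(auroc(y, s))
print(f"AUROC: {mean(fold_aurocs):.3f} ± {stdev(fold_aurocs):.3f}")
```

Reporting the mean and standard deviation over folds mirrors the "value±spread" format used for the results in the abstract.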
Affiliation(s)
- Hadar Hezi
  - Faculty of Biomedical Engineering, Technion - Israel Institute of Technology, Haifa, Israel
- Daniel Shats
  - Faculty of Computer Science, Technion - Israel Institute of Technology, Haifa, Israel
- Daniel Gurevich
  - Faculty of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, Haifa, Israel
  - Lokey Center for Life Science and Engineering, Technion - Israel Institute of Technology, Haifa, Israel
- Yosef E Maruvka
  - Faculty of Biotechnology and Food Engineering, Technion - Israel Institute of Technology, Haifa, Israel
  - Lokey Center for Life Science and Engineering, Technion - Israel Institute of Technology, Haifa, Israel
- Moti Freiman
  - Faculty of Biomedical Engineering, Technion - Israel Institute of Technology, Haifa, Israel
2. Shi J, Shu T, Wu K, Jiang Z, Zheng L, Wang W, Wu H, Zheng Y. Masked hypergraph learning for weakly supervised histopathology whole slide image classification. Computer Methods and Programs in Biomedicine 2024; 253:108237. [PMID: 38820715 DOI: 10.1016/j.cmpb.2024.108237] [Received: 04/06/2024] [Revised: 05/16/2024] [Accepted: 05/20/2024] [Indexed: 06/02/2024]
Abstract
BACKGROUND AND OBJECTIVES: Graph neural networks (GNNs) have been extensively used in histopathology whole slide image (WSI) analysis due to their efficiency and flexibility in modelling relationships among entities. However, most existing GNN-based WSI analysis methods only consider the pairwise correlation of patches from one single perspective (e.g. spatial affinity or embedding similarity) yet ignore the intrinsic non-pairwise relationships present in gigapixel WSIs, which are likely to contribute to feature learning and downstream tasks. The objective of this study is therefore to explore the non-pairwise relationships in histopathology WSIs and exploit them to guide the learning of slide-level representations for better classification performance.

METHODS: In this paper, we propose a novel Masked HyperGraph Learning (MaskHGL) framework for weakly supervised histopathology WSI classification. Compared with most GNN-based WSI classification methods, MaskHGL exploits the non-pairwise correlations between patches through a hypergraph, with global message passing conducted by hypergraph convolution. Concretely, multi-perspective hypergraphs are first built for each WSI, then hypergraph attention is introduced into the joint hypergraph to propagate the non-pairwise relationships and thus yield more discriminative node representations. More importantly, a masked hypergraph reconstruction module is devised to guide the hypergraph learning, which yields greater robustness and generalization than hypergraph modelling alone. Additionally, a self-attention-based node aggregator is applied to explore the global correlation of patches in the WSI and produce the slide-level representation for classification.

RESULTS: The proposed method is evaluated on two public TCGA benchmark datasets and one in-house dataset. On the public TCGA-LUNG (1494 WSIs) and TCGA-EGFR (696 WSIs) test sets, the areas under the receiver operating characteristic (ROC) curve (AUC) were 0.9752±0.0024 and 0.7421±0.0380, respectively. On the USTC-EGFR (754 WSIs) dataset, MaskHGL achieved significantly better performance with an AUC of 0.8745±0.0100, which surpassed the second-best state-of-the-art method, SlideGraph+, by 2.64%.

CONCLUSIONS: MaskHGL shows a great improvement, brought by considering the intrinsic non-pairwise relationships within WSIs, in multiple downstream WSI classification tasks. In particular, the designed masked hypergraph reconstruction module alleviates data scarcity and greatly enhances the robustness and classification ability of MaskHGL. Notably, it has shown great potential in cancer subtyping and fine-grained lung cancer gene mutation prediction from hematoxylin and eosin (H&E) stained WSIs.
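To make the idea of non-pairwise relationships concrete, the sketch below builds hyperedges from two perspectives (spatial position and embedding similarity) and runs one round of node-to-hyperedge-to-node averaging. It is a heavily simplified illustration, not MaskHGL itself: the k-nearest-neighbour hyperedge construction is an assumption, the helper names `knn_hyperedges` and `hypergraph_mean_pass` are hypothetical, and there are no learned weights, no hypergraph attention, and no masked reconstruction.

```python
import math


def knn_hyperedges(points, k):
    """One hyperedge per node: the node together with its k nearest neighbours."""
    edges = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        edges.append(frozenset([i, *others[:k]]))
    return edges


def hypergraph_mean_pass(features, edges):
    """Node -> hyperedge -> node mean aggregation, with no learned weights."""
    dim = len(features[0])
    # Each hyperedge feature is the mean of its member nodes.
    edge_feats = [[sum(features[v][d] for v in e) / len(e) for d in range(dim)]
                  for e in edges]
    out = []
    for v in range(len(features)):
        # Each node then averages the hyperedges it belongs to.
        incident = [ef for e, ef in zip(edges, edge_feats) if v in e]
        out.append([sum(ef[d] for ef in incident) / len(incident) for d in range(dim)])
    return out


# Toy slide (hypothetical): four patches with 2-D coordinates and 1-D embeddings.
coords = [(0, 0), (0, 1), (5, 5), (5, 6)]
embeddings = [[0.0], [10.0], [0.5], [10.5]]

# Joint hypergraph: union of spatial and embedding-similarity hyperedges, deduplicated.
joint = list(dict.fromkeys(knn_hyperedges(coords, 1) + knn_hyperedges(embeddings, 1)))
updated = hypergraph_mean_pass(embeddings, joint)
print(updated)
```

A hyperedge can connect any number of patches at once, which is what distinguishes this structure from an ordinary graph where every edge links exactly two nodes; with larger `k` each hyperedge here would group `k + 1` patches.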
Affiliation(s)
- Jun Shi
  - School of Software, Hefei University of Technology, Hefei, 230601, Anhui Province, China
- Tong Shu
  - School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, 230601, Anhui Province, China
- Kun Wu
  - Image Processing Center, School of Astronautics, Beihang University, Beijing, 102206, China
- Zhiguo Jiang
  - Image Processing Center, School of Astronautics, Beihang University, Beijing, 102206, China; Tianmushan Laboratory, Hangzhou, 311115, Zhejiang Province, China
- Liping Zheng
  - School of Software, Hefei University of Technology, Hefei, 230601, Anhui Province, China
- Wei Wang
  - Department of Pathology, the First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230036, Anhui Province, China; Intelligent Pathology Institute, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230036, Anhui Province, China
- Haibo Wu
  - Department of Pathology, the First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230036, Anhui Province, China; Intelligent Pathology Institute, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230036, Anhui Province, China
- Yushan Zheng
  - School of Engineering Medicine, Beijing Advanced Innovation Center for Biomedical Engineering, Beihang University, Beijing, 100191, China
3. Yang R, Liu P, Ji L. ProDiv: Prototype-driven consistent pseudo-bag division for whole-slide image classification. Computer Methods and Programs in Biomedicine 2024; 249:108161. [PMID: 38608349 DOI: 10.1016/j.cmpb.2024.108161] [Received: 08/30/2023] [Revised: 01/26/2024] [Accepted: 03/31/2024] [Indexed: 04/14/2024]
Abstract
BACKGROUND AND OBJECTIVE: Pathology image classification is one of the most essential auxiliary processes in cancer diagnosis. To overcome the problem of inadequate Whole-Slide Image (WSI) samples with weak labels, pseudo-bag-based multiple instance learning (MIL) methods have attracted wide attention in pathology image classification. In this type of method, the division scheme of pseudo-bags is usually a primary factor affecting classification performance. To improve on existing random/clustering approaches to dividing WSIs into pseudo-bags, this paper proposes a new Prototype-driven Division (ProDiv) scheme for the pseudo-bag-based MIL classification framework on pathology images.

METHODS: This scheme first designs an attention-based method to generate a bag prototype for each slide. On this basis, it further groups WSI patch instances into a series of instance clusters according to the feature similarities between the prototype and patches. Finally, pseudo-bags are obtained by randomly combining the non-overlapping patch instances of different instance clusters. Moreover, the design of ProDiv considers practicality, and it can be readily combined with almost all recent MIL-based WSI classification methods.

RESULTS: Empirical results show that ProDiv, when integrated with several existing methods, can deliver classification AUC improvements of up to 7.3% and 10.3%, respectively, on two public WSI datasets.

CONCLUSIONS: ProDiv almost always brought clear performance improvements to the compared MIL models on typical metrics, which suggests the effectiveness of the scheme. Experimental visualizations also illustrate the correctness of the proposed ProDiv.
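The three steps in the METHODS paragraph (attention-based prototype, similarity-based instance clusters, random combination into pseudo-bags) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the published ProDiv implementation: the attention scores are taken as given inputs instead of coming from a trained attention module, cosine similarity and equal-size rank-based grouping stand in for whatever similarity and clustering the paper uses, and all function names are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def bag_prototype(features, attn_scores):
    """Attention-weighted mean of the patch features."""
    weights = softmax(attn_scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def divide_pseudo_bags(features, attn_scores, n_clusters, n_bags, seed=0):
    """Return n_bags disjoint lists of patch indices, each drawing from every cluster."""
    proto = bag_prototype(features, attn_scores)
    sims = [cosine(proto, f) for f in features]
    # Assumption: clusters are equal-size groups of the similarity ranking.
    order = sorted(range(len(features)), key=lambda i: sims[i], reverse=True)
    size = math.ceil(len(order) / n_clusters)
    clusters = [order[i:i + size] for i in range(0, len(order), size)]
    rng = random.Random(seed)
    bags = [[] for _ in range(n_bags)]
    for cluster in clusters:
        rng.shuffle(cluster)  # random assignment within each cluster
        for j, idx in enumerate(cluster):
            bags[j % n_bags].append(idx)  # deal round-robin so bags never overlap
    return bags


# Toy slide (hypothetical): 12 patches with 2-D features and made-up attention scores.
features = [[1.0, i / 10] for i in range(12)]
attn_scores = [0.1 * i for i in range(12)]
bags = divide_pseudo_bags(features, attn_scores, n_clusters=3, n_bags=2)
print(bags)
```

Dealing each cluster across all pseudo-bags keeps every pseudo-bag a mix of prototype-like and prototype-unlike patches, which is the consistency property the scheme's name points to; each pseudo-bag would then inherit the slide-level label for MIL training.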
Affiliation(s)
- Rui Yang
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, PR China
- Pei Liu
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, PR China
- Luping Ji
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, PR China
4. Liang M, Jiang X, Cao J, Zhang S, Liu H, Li B, Wang L, Zhang C, Jia X. HSG-MGAF Net: Heterogeneous subgraph-guided multiscale graph attention fusion network for interpretable prediction of whole-slide image. Computer Methods and Programs in Biomedicine 2024; 247:108099. [PMID: 38442623 DOI: 10.1016/j.cmpb.2024.108099] [Received: 12/17/2023] [Revised: 02/12/2024] [Accepted: 02/22/2024] [Indexed: 03/07/2024]
Abstract
BACKGROUND AND OBJECTIVE: Pathological whole slide image (WSI) prediction and region of interest (ROI) localization are important issues in computer-aided diagnosis and postoperative analysis in clinical applications. Existing computer-aided methods for predicting WSIs are mainly based on multiple instance learning (MIL) and its variants. However, most of these methods assume that instances are independent and identically distributed and operate at a single scale, which does not fully exploit the hierarchical multiscale heterogeneous information contained in a WSI.

METHODS: A Heterogeneous Subgraph-Guided Multiscale Graph Attention Fusion Network (HSG-MGAF Net) is proposed to build the topology of critical image patches at two scales for adaptive WSI prediction and lesion localization. The HSG-MGAF Net models the hierarchical heterogeneous information of a WSI through a graph and a hypergraph at two scales, respectively. This framework not only fully exploits the low-order and potential high-order correlations of image patches at each scale, but also leverages the heterogeneous information of the two scales for adaptive WSI prediction.

RESULTS: We validate the proposed method on the CAMELYON16 and TCGA-NSCLC datasets, and the results show that HSG-MGAF Net outperforms state-of-the-art methods on both. The average ACC, AUC and F1 score of HSG-MGAF Net reach 92.7%/0.951/0.892 and 92.2%/0.957/0.919, respectively. The obtained heatmaps also localize the positive regions more accurately and are highly consistent with the pixel-level labels.

CONCLUSIONS: The results demonstrate that HSG-MGAF Net outperforms existing weakly supervised learning methods by introducing critical heterogeneous information between the two scales. This approach paves the way for further research on lightweight heterogeneous graph-based WSI prediction and ROI localization.
Affiliation(s)
- Meiyan Liang
  - School of Physics and Electronic Engineering, Shanxi University, Taiyuan 030006, China
- Xing Jiang
  - School of Physics and Electronic Engineering, Shanxi University, Taiyuan 030006, China
- Jie Cao
  - School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Shupeng Zhang
  - School of Physics and Electronic Engineering, Shanxi University, Taiyuan 030006, China
- Haishun Liu
  - Department of Automation, Tsinghua University, Beijing 100084, China
- Bo Li
  - Department of Rehabilitation Treatment, Shanxi Rongjun Hospital, Taiyuan 030000, China
- Lin Wang
  - Department of Pathology, Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Tongji Shanxi Hospital, Third Hospital of Shanxi Medical University, Taiyuan 030032, China
- Cunlin Zhang
  - Department of Physics, Capital Normal University, Beijing 100048, China
- Xiaojun Jia
  - School of Physics and Electronic Engineering, Shanxi University, Taiyuan 030006, China
5. Sun X, Li W, Fu B, Peng Y, He J, Wang L, Yang T, Meng X, Li J, Wang J, Huang P, Wang R. TGMIL: A hybrid multi-instance learning model based on the Transformer and the Graph Attention Network for whole-slide images classification of renal cell carcinoma. Computer Methods and Programs in Biomedicine 2023; 242:107789. [PMID: 37722310 DOI: 10.1016/j.cmpb.2023.107789] [Received: 03/22/2023] [Revised: 08/30/2023] [Accepted: 09/01/2023] [Indexed: 09/20/2023]
Abstract
BACKGROUND AND OBJECTIVES: The pathological diagnosis of renal cell carcinoma is crucial for treatment. Currently, multi-instance learning is commonly used for whole-slide image classification of renal cell carcinoma, and it is mainly based on the assumption that instances are independent and identically distributed. This is inconsistent with the need to consider the correlation between different instances in the diagnosis process. Furthermore, the high resource consumption of pathology images still urgently needs to be solved. Therefore, we propose a new multi-instance learning method to address these problems.

METHODS: In this study, we propose a hybrid multi-instance learning model based on the Transformer and the Graph Attention Network, called TGMIL, to achieve whole-slide image classification of renal cell carcinoma without pixel-level annotation or region of interest extraction. Our approach is divided into three steps. First, we design a feature pyramid, named MMFP, over multiple low magnifications of the whole-slide image. It lets the model incorporate richer information and reduces memory consumption and training time compared with using the highest magnification. Second, TGMIL combines the capabilities of the Transformer and the Graph Attention Network, addressing the loss of instance contextual and spatial information. Within the Graph Attention Network stream, a simple and efficient approach employing max pooling and mean pooling yields the graph adjacency matrix without extra memory consumption. Finally, the outputs of the two streams of TGMIL are aggregated to classify renal cell carcinoma.

RESULTS: On the TCGA-RCC validation set, a public dataset for renal cell carcinoma, the area under the receiver operating characteristic (ROC) curve (AUC) and the accuracy of TGMIL were 0.98±0.0015 and 0.9191±0.0062, respectively. On the private validation set of renal cell carcinoma pathology images, it attained an AUC of 0.9386±0.0162 and an ACC of 0.9197±0.0124. Furthermore, on the public breast cancer whole-slide image test dataset, CAMELYON 16, our model showed good classification performance with an accuracy of 0.8792.

CONCLUSIONS: TGMIL models the diagnostic process of pathologists and shows good classification performance on multiple datasets. In addition, the MMFP module efficiently reduces resource requirements, offering a novel angle for exploring computational pathology images.
Affiliation(s)
- Xinhuan Sun
  - Engineering Research Center of Text Computing & Cognitive Intelligence, Ministry of Education, Key Laboratory of Intelligent Medical Image Analysis and Precise Diagnosis of Guizhou Province, State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang, 550025, China; Department of Radiology, International Exemplary Cooperation Base of Precision Imaging for Diagnosis and Treatment, Guizhou Provincial People's Hospital, Guiyang, 550002, China
- Wuchao Li
  - Department of Radiology, International Exemplary Cooperation Base of Precision Imaging for Diagnosis and Treatment, Guizhou Provincial People's Hospital, Guiyang, 550002, China
- Bangkang Fu
  - Department of Radiology, International Exemplary Cooperation Base of Precision Imaging for Diagnosis and Treatment, Guizhou Provincial People's Hospital, Guiyang, 550002, China
- Yunsong Peng
  - Department of Radiology, International Exemplary Cooperation Base of Precision Imaging for Diagnosis and Treatment, Guizhou Provincial People's Hospital, Guiyang, 550002, China
- Junjie He
  - Engineering Research Center of Text Computing & Cognitive Intelligence, Ministry of Education, Key Laboratory of Intelligent Medical Image Analysis and Precise Diagnosis of Guizhou Province, State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang, 550025, China; Department of Radiology, International Exemplary Cooperation Base of Precision Imaging for Diagnosis and Treatment, Guizhou Provincial People's Hospital, Guiyang, 550002, China
- Lihui Wang
  - Engineering Research Center of Text Computing & Cognitive Intelligence, Ministry of Education, Key Laboratory of Intelligent Medical Image Analysis and Precise Diagnosis of Guizhou Province, State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang, 550025, China
- Tongyin Yang
  - Department of Pathology, Guizhou Provincial People's Hospital, Guiyang, 550002, China
- Xue Meng
  - Department of Pathology, Affiliated Hospital of Zunyi Medical University, Zunyi, 563000, China
- Jin Li
  - Department of Pathology, Affiliated Hospital of Zunyi Medical University, Zunyi, 563000, China
- Jinjing Wang
  - Department of Pathology, Affiliated Hospital of Zunyi Medical University, Zunyi, 563000, China
- Ping Huang
  - Department of Pathology, Guizhou Provincial People's Hospital, Guiyang, 550002, China
- Rongpin Wang
  - Department of Radiology, International Exemplary Cooperation Base of Precision Imaging for Diagnosis and Treatment, Guizhou Provincial People's Hospital, Guiyang, 550002, China
6. Al-Thelaya K, Gilal NU, Alzubaidi M, Majeed F, Agus M, Schneider J, Househ M. Applications of discriminative and deep learning feature extraction methods for whole slide image analysis: A survey. J Pathol Inform 2023; 14:100335. [PMID: 37928897 PMCID: PMC10622844 DOI: 10.1016/j.jpi.2023.100335] [Received: 05/29/2023] [Revised: 07/17/2023] [Accepted: 07/19/2023] [Indexed: 11/07/2023] Open
Abstract
Digital pathology technologies, including whole slide imaging (WSI), have significantly improved modern clinical practices by facilitating storing, viewing, processing, and sharing digital scans of tissue glass slides. Researchers have proposed various artificial intelligence (AI) solutions for digital pathology applications, such as automated image analysis, to extract diagnostic information from WSIs for improving pathology productivity, accuracy, and reproducibility. Feature extraction methods play a crucial role in transforming raw image data into meaningful representations for analysis, facilitating the characterization of tissue structures, cellular properties, and pathological patterns. These features serve several digital pathology applications, such as cancer prognosis and diagnosis. Deep learning-based feature extraction methods have emerged as a promising approach to accurately represent WSI contents and have demonstrated superior performance in histology-related tasks. In this survey, we provide a comprehensive overview of feature extraction methods, including both manual and deep learning-based techniques, for the analysis of WSIs. We review relevant literature, analyze the discriminative and geometric features of WSIs (i.e., features suited to support the diagnostic process and extracted by "engineered" methods as opposed to AI), and explore predictive modeling techniques using AI and deep learning. This survey examines the advances, challenges, and opportunities in this rapidly evolving field, emphasizing the potential for accurate diagnosis, prognosis, and decision-making in digital pathology.
Affiliation(s)
- Khaled Al-Thelaya
  - Department of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Nauman Ullah Gilal
  - Department of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Mahmood Alzubaidi
  - Department of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Fahad Majeed
  - Department of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Marco Agus
  - Department of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Jens Schneider
  - Department of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Mowafa Househ
  - Department of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar