1
Wang J, Quan H, Wang C, Yang G. Pyramid-based self-supervised learning for histopathological image classification. Comput Biol Med 2023; 165:107336. [PMID: 37708715 DOI: 10.1016/j.compbiomed.2023.107336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2022] [Revised: 07/14/2023] [Accepted: 08/07/2023] [Indexed: 09/16/2023]
Abstract
Large-scale labeled datasets are crucial for the success of supervised learning in medical imaging. However, annotating histopathological images is a time-consuming and labor-intensive task that requires highly trained professionals. To address this challenge, self-supervised learning (SSL) can be used to pre-train models on large amounts of unlabeled data and transfer the learned representations to various downstream tasks. In this study, we propose a self-supervised Pyramid-based Local Wavelet Transformer (PLWT) model for effectively extracting rich image representations. The PLWT model extracts both local and global features and is pre-trained on a large number of unlabeled histopathology images in a self-supervised manner. A wavelet transform replaces average pooling in the downsampling step of the multi-head attention, significantly reducing information loss during the transmission of image features. Additionally, we introduce a Local Squeeze-and-Excitation (Local SE) module in the feedforward network in combination with the inverse residual to capture local image information. We evaluate PLWT's performance on three histopathological image datasets and demonstrate the impact of pre-training. Our experimental results indicate that PLWT with self-supervised learning is highly competitive with other SSL methods, and that the transferability of visual representations generated by SSL on domain-relevant histopathological images exceeds that of the supervised baseline trained on ImageNet.
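To make the contrast between average pooling and wavelet downsampling concrete, the following is a minimal sketch, assuming a single-level orthonormal 2D Haar transform on a plain NumPy array. It is not the PLWT implementation: the abstract does not name the wavelet family or say how the subbands are used inside the attention block, and the function names here are hypothetical. The sketch shows that the low-pass band carries the same information as 2x2 average pooling, while the three detail bands retain what pooling discards, so the input can be reconstructed exactly.

```python
import numpy as np


def avg_pool_2x2(x):
    """Average pooling with a 2x2 window and stride 2 on an (H, W) array."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def haar_dwt_2x2(x):
    """One-level orthonormal 2D Haar transform (illustrative choice of wavelet).

    Returns the low-pass band and three detail bands, each of shape (H/2, W/2).
    """
    a = x[0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-pass: scaled block average
    lh = (a - b + c - d) / 2.0  # detail: left/right difference
    hl = (a + b - c - d) / 2.0  # detail: top/bottom difference
    hh = (a - b - c + d) / 2.0  # detail: diagonal difference
    return ll, lh, hl, hh


def haar_idwt_2x2(ll, lh, hl, hh):
    """Inverse of haar_dwt_2x2; rebuilds the full-resolution array."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=float)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


feature_map = np.arange(16, dtype=float).reshape(4, 4)
ll, lh, hl, hh = haar_dwt_2x2(feature_map)
pooled = avg_pool_2x2(feature_map)              # equals ll / 2
restored = haar_idwt_2x2(ll, lh, hl, hh)        # equals feature_map
```

Average pooling keeps only one quarter of the coefficients; keeping all four subbands at the reduced spatial size is what makes the wavelet route lossless in this toy setting.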
Affiliation(s)
- Junjie Wang
- Ningbo Artificial Intelligence Institute of Shanghai Jiao Tong University, Zhejiang 315000, PR China; Department of Automation, Shanghai Jiao Tong University, Shanghai 200240, PR China.
- Hao Quan
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110016, PR China.
- Chengguang Wang
- Ningbo Industrial Internet Institute, Zhejiang 315000, PR China.
- Genke Yang
- Ningbo Artificial Intelligence Institute of Shanghai Jiao Tong University, Zhejiang 315000, PR China; Department of Automation, Shanghai Jiao Tong University, Shanghai 200240, PR China.
2
Xu Y, Cui X, Zhang L, Zhao T, Wang Y. Metastasis-related gene identification by compound constrained NMF and a semisupervised cluster approach using pancancer multiomics features. Comput Biol Med 2022; 151:106263. [PMID: 36371902 DOI: 10.1016/j.compbiomed.2022.106263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Revised: 10/16/2022] [Accepted: 10/30/2022] [Indexed: 11/11/2022]
Abstract
In recent years, with the gradual increase in pancancer-related research, more attention has been given to the field of pancancer metastasis. However, the molecular mechanism of pancancer metastasis remains unclear, and identification methods for pancancer metastasis-related genes are still lacking. In view of this research status, we developed a novel pipeline to identify pancancer metastasis-related genes based on compound constrained nonnegative matrix factorization (CCNMF). To solve the above problems, the following modules were designed. A correntropy operator and feature similarity fusion (FSF) were first adopted to process the multiomics features of genes; thus, the influences caused by irrelevant biomolecular patterns, manifested as non-Gaussian noise, were minimized. CCNMF was then adopted to handle the above features with compound constraints consisting of a gene relation network and a "metastasis-related" gene set, which maximizes the biological interpretability of the metafeatures generated by NMF. Since a negative set of pancancer "metastasis-related" genes could hardly be obtained, semisupervised analyses were performed on the gene features acquired at each step of our pipeline to examine our method's effect. Of the 236 candidates identified by the above method, 83% were associated with the metastasis of one or more cancers, and 71.9% were identified as immune-related in pancancer in addition to the hallmark genes. Our study provides an effective and interpretable method for identifying metastasis-related as well as immune-related genes, and the method was successfully applied to TCGA pancancer data.
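As background for the factorization step, here is a minimal sketch of plain nonnegative matrix factorization using the standard multiplicative update rules for the Frobenius objective. It is not CCNMF: the abstract does not give the form of the network and gene-set constraints, so none are included, and the matrix shapes, the rank, and the function name `nmf` are assumptions made for illustration.

```python
import numpy as np


def nmf(v, rank, n_iter=200, eps=1e-9, seed=0):
    """Unconstrained NMF, v ~= w @ h, via multiplicative updates.

    v    : nonnegative (n_items, n_features) matrix
    rank : number of metafeatures (columns of w)
    Returns w, h and the reconstruction error after each iteration.
    """
    rng = np.random.default_rng(seed)
    n, m = v.shape
    w = rng.random((n, rank)) + eps
    h = rng.random((rank, m)) + eps
    errors = []
    for _ in range(n_iter):
        # Updates only multiply by nonnegative ratios, so w and h stay nonnegative.
        h *= (w.T @ v) / (w.T @ w @ h + eps)
        w *= (v @ h.T) / (w @ h @ h.T + eps)
        errors.append(np.linalg.norm(v - w @ h))
    return w, h, errors


# Toy stand-in for a gene-by-feature matrix (random data, not real omics values).
toy = np.random.default_rng(1).random((20, 12))
w, h, errors = nmf(toy, rank=3)
```

In a constrained variant, extra terms such as a graph regularizer over a gene relation network would enter the objective and change the update rules; that extension is left out here because the source does not specify it.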
Affiliation(s)
- Yining Xu
- Faculty of Computing, Harbin Institute of Technology, 92 Xidazhi Street, TIB #20, Harbin, 150000, Hei Long Jiang, China.
- Xinran Cui
- Faculty of Computing, Harbin Institute of Technology, 92 Xidazhi Street, TIB #20, Harbin, 150000, Hei Long Jiang, China.
- Liyuan Zhang
- Faculty of Computing, Harbin Institute of Technology, 92 Xidazhi Street, TIB #20, Harbin, 150000, Hei Long Jiang, China.
- Tianyi Zhao
- School of Medicine and Health, Harbin Institute of Technology, 92 Xidazhi Street, TIB #20, Harbin, 150000, Hei Long Jiang, China.
- Yadong Wang
- Faculty of Computing, Harbin Institute of Technology, 92 Xidazhi Street, TIB #20, Harbin, 150000, Hei Long Jiang, China.
3
Rashmi R, Prasad K, Udupa CBK. Region-based feature enhancement using channel-wise attention for classification of breast histopathological images. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07966-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Breast histopathological image analysis at 400x magnification is essential for the determination of malignant breast tumours. However, manual analysis of these images is tedious, subjective, error-prone and requires domain knowledge. To this end, computer-aided tools have gained much attention recently, as they aid pathologists and save time. Furthermore, advances in computational power have facilitated the use of computer tools. Yet using computer-aided tools to analyse these images is challenging for various reasons, such as the heterogeneity of malignant tumours, colour variations and the presence of artefacts. Moreover, these images are captured at high resolutions, which poses a major challenge to designing deep learning models because of the high computational requirements. In this context, the present work proposes a new approach to efficiently and effectively extract features from these high-resolution images. In addition, at 400x magnification, the characteristics and structure of nuclei play a prominent role in the decision of malignancy. In this regard, the study introduces a novel CNN architecture called CWA-Net that uses a colour channel attention module to enhance the features of potential regions of interest such as nuclei. The developed model is qualitatively and quantitatively evaluated on private and public datasets and achieved an accuracy of 0.95% and 0.96%, respectively. The experimental evaluation demonstrates that the proposed method outperforms state-of-the-art methods on both datasets.
4
Eckardt JN, Bornhäuser M, Wendt K, Middeke JM. Semi-supervised learning in cancer diagnostics. Front Oncol 2022; 12:960984. [PMID: 35912249 PMCID: PMC9329803 DOI: 10.3389/fonc.2022.960984] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/03/2022] [Accepted: 06/24/2022] [Indexed: 12/01/2022] Open
Abstract
In cancer diagnostics, a considerable amount of data is acquired during routine work-up. Recently, machine learning has been used to build classifiers that are tasked with cancer detection and aid in clinical decision-making. Most of these classifiers are based on supervised learning (SL), which needs time- and cost-intensive manual labeling of samples by medical experts for model training. Semi-supervised learning (SSL), however, works with only a fraction of labeled data by including unlabeled samples for information abstraction and thus can utilize the vast discrepancy between available labeled data and overall available data in cancer diagnostics. In this review, we provide a comprehensive overview of essential functionalities and assumptions of SSL and survey key studies with regard to cancer care, differentiating between image-based and non-image-based applications. We highlight current state-of-the-art models in histopathology, radiology and radiotherapy, as well as genomics. Further, we discuss potential pitfalls in SSL study design, such as discrepancies in data distributions and comparison to baseline SL models, and point out future directions for SSL in oncology. We believe well-designed SSL models will contribute strongly to computer-guided diagnostics in malignant disease by overcoming current hindrances in the form of sparse labeled and abundant unlabeled data.
Affiliation(s)
- Jan-Niklas Eckardt
- Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden, Germany
- Else Kröner Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
- *Correspondence: Jan-Niklas Eckardt,
- Martin Bornhäuser
- Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden, Germany
- German Consortium for Translational Cancer Research, Heidelberg, Germany
- National Center for Tumor Disease (NCT), Dresden, Germany
- Karsten Wendt
- Else Kröner Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
- Institute of Software and Multimedia Technology, Technical University Dresden, Dresden, Germany
- Jan Moritz Middeke
- Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden, Germany
- Else Kröner Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
5
He W, Liu T, Han Y, Ming W, Du J, Liu Y, Yang Y, Wang L, Jiang Z, Wang Y, Yuan J, Cao C. A review: The detection of cancer cells in histopathology based on machine vision. Comput Biol Med 2022; 146:105636. [PMID: 35751182 DOI: 10.1016/j.compbiomed.2022.105636] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/28/2021] [Revised: 04/04/2022] [Accepted: 04/28/2022] [Indexed: 12/24/2022]
Abstract
Machine vision is being employed in defect detection, size measurement, pattern recognition, image fusion, target tracking and 3D reconstruction. Traditional cancer detection methods are dominated by manual detection, which wastes time and manpower, and heavily relies on the pathologists' skill and work experience. Therefore, these manual detection approaches are not convenient for the inheritance of domain knowledge, and are not suitable for the rapid development of medical care in the future. The emergence of machine vision can iteratively update and learn the domain knowledge of cancer cell pathology detection to achieve automated, high-precision, and consistent detection. Consequently, this paper reviews the use of machine vision to detect cancer cells in histopathology images, as well as the benefits and drawbacks of various detection approaches. First, we review the application of image preprocessing and image segmentation in histopathology for the detection of cancer cells, and compare the benefits and drawbacks of different algorithms. Secondly, for the characteristics of histopathological cancer cell images, the research progress of shape, color and texture features and other methods is mainly reviewed. Furthermore, for the classification methods of histopathological cancer cell images, the benefits and drawbacks of traditional machine vision approaches and deep learning methods are compared and analyzed. Finally, the above research is discussed and forecasted, with the expected future development tendency serving as a guide for future research.
Affiliation(s)
- Wenbin He
- Henan Key Lab of Intelligent Manufacturing of Mechanical Equipment, Zhengzhou University of Light Industry, Zhengzhou, 450002, China
- Ting Liu
- Henan Key Lab of Intelligent Manufacturing of Mechanical Equipment, Zhengzhou University of Light Industry, Zhengzhou, 450002, China
- Yongjie Han
- Henan Key Lab of Intelligent Manufacturing of Mechanical Equipment, Zhengzhou University of Light Industry, Zhengzhou, 450002, China
- Wuyi Ming
- Henan Key Lab of Intelligent Manufacturing of Mechanical Equipment, Zhengzhou University of Light Industry, Zhengzhou, 450002, China; Guangdong HUST Industrial Technology Research Institute, Guangdong Provincial Key Laboratory of Digital Manufacturing Equipment, Dongguan, 523808, China.
- Jinguang Du
- Henan Key Lab of Intelligent Manufacturing of Mechanical Equipment, Zhengzhou University of Light Industry, Zhengzhou, 450002, China
- Yinxia Liu
- Laboratory Medicine of Dongguan Kanghua Hospital, Dongguan, 523808, China
- Yuan Yang
- Guangdong Provincial Hospital of Chinese Medicine, Guangzhou, 510120, China.
- Leijie Wang
- School of Mechanical Engineering, Dongguan University of Technology, Dongguan, 523808, China
- Zhiwen Jiang
- Henan Key Lab of Intelligent Manufacturing of Mechanical Equipment, Zhengzhou University of Light Industry, Zhengzhou, 450002, China
- Yongqiang Wang
- Zhengzhou Coal Mining Machinery Group Co., Ltd, Zhengzhou, 450016, China
- Jie Yuan
- Henan Key Lab of Intelligent Manufacturing of Mechanical Equipment, Zhengzhou University of Light Industry, Zhengzhou, 450002, China
- Chen Cao
- Henan Key Lab of Intelligent Manufacturing of Mechanical Equipment, Zhengzhou University of Light Industry, Zhengzhou, 450002, China; Guangdong HUST Industrial Technology Research Institute, Guangdong Provincial Key Laboratory of Digital Manufacturing Equipment, Dongguan, 523808, China
6
Tran QT, Alom MZ, Orr BA. Comprehensive study of semi-supervised learning for DNA methylation-based supervised classification of central nervous system tumors. BMC Bioinformatics 2022; 23:223. [PMID: 35676649 PMCID: PMC9178802 DOI: 10.1186/s12859-022-04764-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/21/2022] [Accepted: 05/31/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Precision medicine for cancer treatment relies on an accurate pathological diagnosis. The number of known tumor classes has increased rapidly, and reliance on traditional methods of histopathologic classification alone has become unfeasible. To help reduce variability and validation costs and to standardize the histopathological diagnostic process, supervised machine learning models using DNA-methylation data have been developed for tumor classification. These methods require large labeled training data sets to obtain clinically acceptable classification accuracy. While there is abundant unlabeled epigenetic data across multiple databases, labeling pathology data for machine learning models is time-consuming and resource-intensive, especially for rare tumor types. Semi-supervised learning (SSL) approaches have been used to maximize the utility of labeled and unlabeled data for classification tasks and are effectively applied in genomics. SSL methods have not yet been explored with epigenetic data nor shown to be beneficial for central nervous system (CNS) tumor classification. RESULTS: This paper explores the application of semi-supervised machine learning on methylation data to improve the accuracy of supervised learning models in classifying CNS tumors. We comprehensively evaluated 11 SSL methods and developed a novel combination approach that included a self-training with editing using support vector machine (SETRED-SVM) model and an L2-penalized, multinomial logistic regression model to obtain high-confidence labels from a few labeled instances. Results across eight random forest and neural net models show that the pseudo-labels derived from our SSL method can significantly increase prediction accuracy for 82 CNS tumors and 9 normal controls. CONCLUSIONS: The proposed combination of semi-supervised technique and multinomial logistic regression holds the potential to leverage the abundant publicly available unlabeled methylation data effectively. Such an approach is highly beneficial in providing additional training examples, especially for scarce tumor types, to boost the prediction accuracy of supervised models.
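The pseudo-labeling idea behind self-training can be sketched in a few lines. This is a generic illustration, assuming a nearest-centroid base classifier and a fixed confidence threshold; it is not SETRED-SVM (there is no SVM and no editing step), and the function names, threshold, and softmax-over-distance confidence score are hypothetical choices made to keep the example dependency-free. It also assumes every class appears at least once in the labeled set.

```python
import numpy as np


def fit_centroids(x, y, n_classes):
    """Base classifier: one mean vector per class."""
    return np.stack([x[y == k].mean(axis=0) for k in range(n_classes)])


def predict_proba(centroids, x):
    """Softmax over negative squared distances to each centroid."""
    d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


def self_train(x_lab, y_lab, x_unlab, n_classes, threshold=0.9, max_rounds=5):
    """Iteratively move confidently predicted unlabeled points into the training set."""
    x_train, y_train = x_lab.copy(), y_lab.copy()
    pool = x_unlab.copy()
    for _ in range(max_rounds):
        if len(pool) == 0:
            break
        centroids = fit_centroids(x_train, y_train, n_classes)
        proba = predict_proba(centroids, pool)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break
        # Pseudo-labels are the predicted classes of the confident points.
        x_train = np.vstack([x_train, pool[confident]])
        y_train = np.concatenate([y_train, proba[confident].argmax(axis=1)])
        pool = pool[~confident]
    return fit_centroids(x_train, y_train, n_classes), x_train, y_train
```

The enlarged training set returned by `self_train` plays the role of the pseudo-labeled examples that would then be handed to a downstream supervised model.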
Affiliation(s)
- Quynh T Tran
- Department of Pathology, St. Jude Children's Research Hospital, 262 Danny Thomas Place, MS 250, Memphis, TN, 38105-3678, USA
- Md Zahangir Alom
- Department of Pathology, St. Jude Children's Research Hospital, 262 Danny Thomas Place, MS 250, Memphis, TN, 38105-3678, USA
- Brent A Orr
- Department of Pathology, St. Jude Children's Research Hospital, 262 Danny Thomas Place, MS 250, Memphis, TN, 38105-3678, USA.
7
A New Method of Deep Convolutional Neural Network Image Classification Based on Knowledge Transfer in Small Label Sample Environment. SENSORS 2022; 22:s22030898. [PMID: 35161644 PMCID: PMC8839952 DOI: 10.3390/s22030898] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/11/2021] [Revised: 01/17/2022] [Accepted: 01/18/2022] [Indexed: 01/27/2023]
Abstract
This paper offers a preliminary solution to the problem of deep-network image classification when a large number of image samples are available but only a small fraction of them are annotated. First, a support vector machine (SVM) expert labeling system is constructed by using a bag-of-words model to extract image features from a small number of labeled samples. The large number of unlabeled image samples are then annotated automatically by the constructed SVM expert labeling system. Second, the small number of labeled samples and the automatically labeled image samples are combined to form an augmented training set, on which a deep convolutional neural network model is trained. In this way, knowledge is transferred from SVMs trained on a small number of manually annotated image samples to deep neural network classifiers, and the problem of overfitting in neural network training with small samples is addressed. Finally, the public dataset Caltech-256 is used for experimental verification and mechanism analysis of the performance of the new method.