1
Ma Y, Guo Y, Cui W, Liu J, Li Y, Wang Y, Qiang Y. SG-Transunet: A segmentation-guided Transformer U-Net model for KRAS gene mutation status identification in colorectal cancer. Comput Biol Med 2024; 173:108293. [PMID: 38574528] [DOI: 10.1016/j.compbiomed.2024.108293]
Abstract
Accurately identifying Kirsten rat sarcoma virus (KRAS) gene mutation status in colorectal cancer (CRC) patients can help doctors decide whether to use specific targeted drugs for treatment. Although deep learning methods are popular, they are often affected by redundant features from non-lesion areas. Moreover, existing methods commonly extract spatial features from imaging data while neglecting important frequency-domain features, which may degrade the performance of KRAS gene mutation status identification. To address this deficiency, we propose a segmentation-guided Transformer U-Net (SG-Transunet) model for KRAS gene mutation status identification in CRC. Integrating the strengths of convolutional neural networks (CNNs) and Transformers, SG-Transunet offers a unique approach to both lesion segmentation and KRAS mutation status identification. Specifically, for precise lesion localization, we employ an encoder-decoder to obtain segmentation results that guide the KRAS gene mutation status identification task. Subsequently, a frequency-domain supplement block is designed to capture frequency-domain features, which are integrated with high-level spatial features extracted in the encoding path to derive advanced spatial-frequency features. Furthermore, we introduce a pre-trained Xception block to mitigate the risk of overfitting associated with small-scale datasets. Following this, an aggregate attention module is devised to consolidate spatial-frequency features with global information extracted by the Transformer at shallow and deep levels, thereby enhancing feature discriminability. Finally, we propose a mutual-constrained loss function that simultaneously constrains segmentation mask acquisition and gene status identification. Experimental results demonstrate the superior performance of SG-Transunet over state-of-the-art methods in discriminating KRAS gene mutation status.
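The abstract does not specify how the frequency-domain supplement block is implemented. Purely as an illustration of the general idea (the 2-D FFT, the log-magnitude scaling, and the channel concatenation below are assumptions, not the paper's design), frequency-domain features can be derived from a spatial feature map like this:

```python
import numpy as np

def frequency_supplement(feat):
    """Hypothetical frequency-domain supplement: take the 2-D FFT of each
    channel of a spatial feature map and keep the shifted, log-scaled
    magnitude spectrum as extra feature channels."""
    # feat: (C, H, W) spatial features from the encoder
    spectrum = np.fft.fftshift(np.fft.fft2(feat, axes=(-2, -1)), axes=(-2, -1))
    magnitude = np.log1p(np.abs(spectrum))  # compress the dynamic range
    # concatenate spatial and frequency channels -> spatial-frequency features
    return np.concatenate([feat, magnitude], axis=0)

feat = np.random.rand(16, 32, 32)
fused = frequency_supplement(feat)
print(fused.shape)  # (32, 32, 32)
```

In a real network the two feature groups would typically be fused by learned layers rather than plain concatenation.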
Affiliation(s)
- Yulan Ma
- Department of Automation Science and Electrical Engineering, Beihang University, Beijing, 100191, China
- Yuzhu Guo
- Department of Automation Science and Electrical Engineering, Beihang University, Beijing, 100191, China
- Weigang Cui
- School of Engineering Medicine, Beihang University, Beijing, 100191, China
- Jingyu Liu
- School of Medical Technology, Beijing Institute of Technology, Beijing 100081, China
- Yang Li
- Department of Automation Science and Electrical Engineering, Beihang University, Beijing, 100191, China
- Yingsen Wang
- College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan, China
- Yan Qiang
- School of Software, North University of China, Taiyuan, China; College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan, China
2
Kodipalli A, Fernandes SL, Dasar S. An Empirical Evaluation of a Novel Ensemble Deep Neural Network Model and Explainable AI for Accurate Segmentation and Classification of Ovarian Tumors Using CT Images. Diagnostics (Basel) 2024; 14:543. [PMID: 38473015] [DOI: 10.3390/diagnostics14050543]
Abstract
Ovarian cancer is one of the leading causes of death worldwide among the female population. Early diagnosis is crucial for patient treatment. In this work, our main objective is to accurately detect and classify ovarian cancer. To achieve this, two datasets are considered: CT scan images of patients with cancer and those without, and biomarker (clinical parameters) data from all patients. We propose an ensemble deep neural network model and an ensemble machine learning model for the automatic binary classification of ovarian CT scan images and biomarker data. The proposed model incorporates four convolutional neural network models: VGG16, ResNet 152, Inception V3, and DenseNet 101, with transformers applied for feature extraction. These extracted features are fed into our proposed ensemble multi-layer perceptron model for classification. Preprocessing and CNN tuning techniques such as hyperparameter optimization, data augmentation, and fine-tuning are utilized during model training. Our ensemble model outperforms single classifiers and machine learning algorithms, achieving a mean accuracy of 98.96%, a precision of 97.44%, and an F1-score of 98.7%. We compared these results with those obtained using features extracted by the UNet model, followed by classification with our ensemble model. The transformer demonstrated superior performance in feature extraction over the UNet, with a mean Dice score and mean Jaccard score of 0.98 and 0.97, respectively, and standard deviations of 0.04 and 0.06 for benign tumors, and 0.99 and 0.98 with standard deviations of 0.01 for malignant tumors. For the biomarker data, the combination of five machine learning models (KNN, logistic regression, SVM, decision tree, and random forest) resulted in an improved accuracy of 92.8% compared to single classifiers.
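The abstract does not state the exact rule used to combine the base classifiers. As a hedged sketch only (the three-model ensemble and the probability values below are made up), the common soft-voting scheme averages each model's predicted class probabilities before taking the argmax:

```python
import numpy as np

def soft_vote(prob_list):
    """Average class probabilities from several base models (soft voting)
    and return the winning class index per sample."""
    return np.mean(prob_list, axis=0).argmax(axis=1)

# three hypothetical base models scoring four samples on two classes
p1 = np.array([[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.7, 0.3]])
p2 = np.array([[0.8, 0.2], [0.6, 0.4], [0.1, 0.9], [0.6, 0.4]])
p3 = np.array([[0.7, 0.3], [0.3, 0.7], [0.3, 0.7], [0.8, 0.2]])
labels = soft_vote([p1, p2, p3])
print(labels)  # [0 1 1 0]
```

Averaging probabilities (rather than hard votes) lets a confident model outweigh two uncertain ones, which is one reason ensembles often beat single classifiers.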
Affiliation(s)
- Ashwini Kodipalli
- Department of Artificial Intelligence and Data Science, Global Academy of Technology, Bangalore 560098, India
- Steven L Fernandes
- Department of Computer Science, Design, Journalism, Creighton University, Omaha, NE 68178, USA
- Santosh Dasar
- Department of Radiology, SDM College of Medical Sciences & Hospital, Shri Dharmasthala Manjunatheshwara University, Dharwad 580009, India
3
Wang C, Wang L, Wang N, Wei X, Feng T, Wu M, Yao Q, Zhang R. CFATransUnet: Channel-wise cross fusion attention and transformer for 2D medical image segmentation. Comput Biol Med 2024; 168:107803. [PMID: 38064854] [DOI: 10.1016/j.compbiomed.2023.107803]
Abstract
Medical image segmentation faces two current challenges: effectively extracting and fusing long-distance and local semantic information, and mitigating or eliminating semantic gaps during the encoding and decoding process. To alleviate these two problems, we propose a new U-shaped network structure, called CFATransUnet, with Transformer and CNN blocks as the backbone network, equipped with a Channel-wise Cross Fusion Attention and Transformer (CCFAT) module containing a Channel-wise Cross Fusion Transformer (CCFT) and a Channel-wise Cross Fusion Attention (CCFA). Specifically, we use Transformer and CNN blocks to construct the encoder and decoder for adequate extraction and fusion of long-range and local semantic features. The CCFT module utilizes the self-attention mechanism to reintegrate semantic information from different stages into cross-level global features, reducing the semantic asymmetry between features at different levels. The CCFA module adaptively acquires the importance of each feature channel from a global perspective in a learnable manner, enhancing the capture of effective information and suppressing unimportant features to mitigate semantic gaps. The combination of CCFT and CCFA guides the effective fusion of different levels of features more powerfully from a global perspective. The consistent architecture of the encoder and decoder also alleviates the semantic gap. Experimental results suggest that the proposed CFATransUnet achieves state-of-the-art performance on four datasets. The code is available at https://github.com/CPU0808066/CFATransUnet.
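The CCFA module's "importance of each feature channel from a global perspective" is described only at a high level here. As an illustrative sketch (the squeeze-and-sigmoid gating below is a generic channel-attention pattern, not necessarily the paper's exact design):

```python
import numpy as np

def channel_attention(feat):
    """Generic channel attention: squeeze each channel to a global
    descriptor, gate it through a sigmoid, and rescale the channels so
    informative ones dominate and unimportant ones are suppressed."""
    # feat: (C, H, W)
    squeeze = feat.mean(axis=(1, 2))           # global average pool -> (C,)
    weights = 1.0 / (1.0 + np.exp(-squeeze))   # per-channel gate in (0, 1)
    return feat * weights[:, None, None]       # reweight channels

feat = np.random.rand(8, 16, 16)
out = channel_attention(feat)
print(out.shape)  # (8, 16, 16)
```

A trained module would insert learnable fully-connected layers between the squeeze and the gate; the fixed version above only shows the data flow.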
Affiliation(s)
- Cheng Wang
- Department of Optical Science and Engineering, Fudan University, Shanghai 200433, China
- Le Wang
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China
- Nuoqi Wang
- Department of Optical Science and Engineering, Fudan University, Shanghai 200433, China
- Xiaoling Wei
- Department of Endodontics, Shanghai Stomatological Hospital, Fudan University, Shanghai 200001, China
- Ting Feng
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China
- Minfeng Wu
- Department of Dermatology, Huadong Hospital Affiliated to Fudan University, Shanghai, 200040, China
- Qi Yao
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China
- Rongjun Zhang
- Department of Optical Science and Engineering, Fudan University, Shanghai 200433, China; Academy for Engineering and Technology, Fudan University, Shanghai 200433, China; Zhuhai Fudan Innovation Institute, Zhuhai 519031, China
4
Song P, Li J, Fan H, Fan L. TGDAUNet: Transformer and GCNN based dual-branch attention UNet for medical image segmentation. Comput Biol Med 2023; 167:107583. [PMID: 37890420] [DOI: 10.1016/j.compbiomed.2023.107583]
Abstract
Accurate and automatic segmentation of medical images is a key step in clinical diagnosis and analysis. Following the successful application of Transformer models in computer vision, researchers have begun to explore the use of Transformers in medical image segmentation, especially in combination with convolutional neural networks with encoder-decoder structures, which have achieved remarkable results in the field. However, most studies have combined Transformers with CNNs at a single scale or processed only the highest-level semantic feature information, ignoring the rich location information in lower-level semantic features. At the same time, for problems such as blurred structural boundaries and heterogeneous textures, most existing methods simply connect contour information to capture the boundaries of the target. These methods cannot capture the precise outline of the target and ignore the potential relationship between the boundary and the region. In this paper, we propose TGDAUNet, which consists of a dual-branch backbone network of CNNs and Transformers and a parallel attention mechanism, to achieve accurate segmentation of lesions in medical images. Firstly, high-level semantic feature information of the CNN backbone branch is fused at multiple scales, and the high-level and low-level feature information complement each other's location and spatial information. We further use the polarized self-attention (PSA) module to reduce the impact of redundant information caused by multiple scales, to better couple with the feature information extracted from the Transformer backbone branch, and to establish global contextual long-range dependencies at multiple scales. In addition, we design the Reverse Graph-reasoned Fusion (RGF) module and the Feature Aggregation (FA) module to jointly guide the global context. The FA module aggregates high-level semantic feature information to generate an original global predictive segmentation map. The RGF module captures non-significant boundary features in the original or secondary global prediction segmentation map through a reverse attention mechanism, establishing a graph reasoning module to explore the potential semantic relationships between boundaries and regions and further refine the target boundaries. Finally, to validate the effectiveness of the proposed method, we compare it with current popular methods on the CVC-ClinicDB, Kvasir-SEG, ETIS, CVC-ColonDB, and CVC-300 datasets, as well as the skin cancer segmentation datasets ISIC-2016 and ISIC-2017. Extensive experimental results show that our method outperforms currently popular methods. Source code is released at https://github.com/sd-spf/TGDAUNet.
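The reverse attention idea mentioned for the RGF module has a common minimal form: erase the confidently predicted region so the network attends to the uncertain boundary pixels that remain. A sketch under that assumption (shapes and the sigmoid erasure are illustrative, not the paper's exact formulation):

```python
import numpy as np

def reverse_attention(feat, pred):
    """Reverse attention: weight features by (1 - sigmoid(pred)) so that
    already-confident foreground regions are suppressed and the ambiguous
    boundary area dominates the refined features."""
    # feat: (C, H, W) features; pred: (H, W) coarse prediction logits
    reverse_map = 1.0 - 1.0 / (1.0 + np.exp(-pred))  # high where pred is low
    return feat * reverse_map[None, :, :]

feat = np.random.rand(4, 8, 8)
pred = np.random.randn(8, 8)  # hypothetical coarse segmentation logits
refined = reverse_attention(feat, pred)
print(refined.shape)  # (4, 8, 8)
```

In TGDAUNet this map would then feed the graph-reasoning step; the sketch shows only the erasure.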
- Pengfei Song
- Co-Innovation Center of Shandong Colleges and Universities: Future Intelligent Computing, School of Computer Science and Technology, Shandong Technology and Business University, Laishan District, Yantai, 264005, China
- Jinjiang Li
- Co-Innovation Center of Shandong Colleges and Universities: Future Intelligent Computing, School of Computer Science and Technology, Shandong Technology and Business University, Laishan District, Yantai, 264005, China
- Hui Fan
- Co-Innovation Center of Shandong Colleges and Universities: Future Intelligent Computing, School of Computer Science and Technology, Shandong Technology and Business University, Laishan District, Yantai, 264005, China
- Linwei Fan
- School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, Shandong, 250014, China
5
Hu X, Cao Y, Hu W, Zhang W, Li J, Wang C, Mukhopadhyay SC, Li Y, Liu Z, Li S. Refined Feature-based Multi-frame and Multi-scale Fusing Gate network for accurate segmentation of plaques in ultrasound videos. Comput Biol Med 2023; 163:107091. [PMID: 37331099] [DOI: 10.1016/j.compbiomed.2023.107091]
Abstract
The accurate segmentation of carotid plaques in ultrasound videos provides evidence for clinicians to evaluate plaque properties and treat patients effectively. However, the confusing background, blurry boundaries, and plaque movement in ultrasound videos make accurate plaque segmentation challenging. To address these challenges, we propose the Refined Feature-based Multi-frame and Multi-scale Fusing Gate Network (RMFG_Net), which captures spatial and temporal features in consecutive video frames for high-quality segmentation results without manual annotation of the first frame. A spatial-temporal feature filter is proposed to suppress the noise of low-level CNN features and promote the detailed target area. To obtain a more accurate plaque position, we propose a transformer-based cross-scale spatial location algorithm, which models the relationship between adjacent layers of consecutive video frames to achieve stable positioning. To make full use of detailed and semantic information, multi-layer gated computing is applied to fuse features of different layers, ensuring sufficient useful feature aggregation for segmentation. Experiments on two clinical datasets demonstrate that the proposed method outperforms other state-of-the-art methods under different evaluation metrics, and it processes images at 68 frames per second, which is suitable for real-time segmentation. Extensive ablation experiments demonstrate the effectiveness of each component and experimental setting, as well as the potential of the proposed method for ultrasound video plaque segmentation tasks. The code is publicly available at https://github.com/xifengHuu/RMFG_Net.git.
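The multi-layer gated computing mentioned above is not detailed in the abstract. As a minimal sketch of one plausible gating scheme (the additive sigmoid gate and element-wise blend are assumptions, not the paper's method), fusing a detailed shallow layer with a semantic deep layer could look like:

```python
import numpy as np

def gated_fusion(low, high):
    """Hypothetical gated fusion of two feature layers: a sigmoid gate
    computed from both inputs decides, per element, how much low-level
    detail versus high-level semantics to keep."""
    gate = 1.0 / (1.0 + np.exp(-(low + high)))  # element-wise gate in (0, 1)
    return gate * low + (1.0 - gate) * high     # convex blend of the two layers

low = np.random.rand(4, 8, 8)   # detailed shallow features
high = np.random.rand(4, 8, 8)  # semantic deep features (already upsampled)
fused = gated_fusion(low, high)
print(fused.shape)  # (4, 8, 8)
```

A learned gate would replace the raw sum with a small convolution, but the convex-blend structure is the essence of gated feature fusion.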
Affiliation(s)
- Xifeng Hu
- School of Information Science and Engineering, Shandong University, Qingdao 266237, China
- Yankun Cao
- School of Software, Shandong University, Jinan 250101, China
- Weifeng Hu
- School of Information Science and Engineering, Shandong University, Qingdao 266237, China
- Wenzhen Zhang
- School of Information Science and Engineering, Shandong University, Qingdao 266237, China
- Jing Li
- Beijing Hospital National Geriatrics Center, No. 1 Dahua Road, Dongcheng District, Beijing 100730, China
- Chuanyu Wang
- Beijing Hospital National Geriatrics Center, No. 1 Dahua Road, Dongcheng District, Beijing 100730, China
- Yujun Li
- School of Information Science and Engineering, Shandong University, Qingdao 266237, China
- Zhi Liu
- School of Information Science and Engineering, Shandong University, Qingdao 266237, China
- Shuo Li
- Case Western Reserve University, Cleveland, OH, USA
6
Liu Z, Lv Q, Yang Z, Li Y, Lee CH, Shen L. Recent progress in transformer-based medical image analysis. Comput Biol Med 2023; 164:107268. [PMID: 37494821] [DOI: 10.1016/j.compbiomed.2023.107268]
Abstract
The transformer is primarily used in the field of natural language processing. Recently, it has been adopted and shows promise in the computer vision (CV) field. Medical image analysis (MIA), as a critical branch of CV, also greatly benefits from this state-of-the-art technique. In this review, we first recap the core component of the transformer, the attention mechanism, and the detailed structures of the transformer. After that, we depict the recent progress of the transformer in the field of MIA. We organize the applications in a sequence of different tasks, including classification, segmentation, captioning, registration, detection, enhancement, localization, and synthesis. The mainstream classification and segmentation tasks are further divided into eleven medical image modalities. A large number of experiments studied in this review illustrate that the transformer-based method outperforms existing methods through comparisons with multiple evaluation metrics. Finally, we discuss the open challenges and future opportunities in this field. This task-modality review with the latest contents, detailed information, and comprehensive comparison may greatly benefit the broad MIA community.
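The attention mechanism the review recaps is the standard scaled dot-product attention, softmax(QKᵀ/√d_k)V. A minimal NumPy sketch (the random matrices are placeholders for learned projections of the input):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core transformer operation: each query attends to all keys and
    returns a similarity-weighted sum of the values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (n_q, n_k) similarity logits
    scores -= scores.max(axis=-1, keepdims=True)  # subtract max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # weighted sum of values

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))  # 4 queries of dimension 8
K = rng.standard_normal((6, 8))  # 6 keys
V = rng.standard_normal((6, 8))  # 6 values
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Multi-head attention runs several such operations in parallel on learned projections and concatenates the results.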
Affiliation(s)
- Zhaoshan Liu
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore
- Qiujie Lv
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore; School of Intelligent Systems Engineering, Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, 518107, China
- Ziduo Yang
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore; School of Intelligent Systems Engineering, Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, 518107, China
- Yifan Li
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore
- Chau Hung Lee
- Department of Radiology, Tan Tock Seng Hospital, 11 Jalan Tan Tock Seng, Singapore, 308433, Singapore
- Lei Shen
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore
7
Xiao H, Li L, Liu Q, Zhu X, Zhang Q. Transformers in medical image segmentation: A review. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104791]