1. Dong X, Yang K, Liu J, Tang F, Liao W, Zhang Y, Liang S. Cross-Domain Mutual-Assistance Learning Framework for Fully Automated Diagnosis of Primary Tumor in Nasopharyngeal Carcinoma. IEEE Trans Med Imaging 2024; 43:3676-3689. [PMID: 38739507] [DOI: 10.1109/tmi.2024.3400406]
Abstract
Accurate T-staging of nasopharyngeal carcinoma (NPC) holds paramount importance in guiding treatment decisions and prognosticating outcomes for distinct risk groups. Regrettably, the landscape of deep learning-based techniques for T-staging in NPC remains sparse, and existing methodologies often exhibit suboptimal performance due to their neglect of crucial domain-specific knowledge pertinent to primary tumor diagnosis. To address these issues, we propose a new cross-domain mutual-assistance learning framework for fully automated diagnosis of the primary tumor using head and neck (H&N) MR images. Specifically, we tackle the primary tumor diagnosis task with a convolutional neural network consisting of a 3D cross-domain knowledge perception network (CKP net), which excavates cross-domain-invariant features emphasizing tumor intensity variations and internal tumor heterogeneity, and a multi-domain mutual-information sharing fusion network (M2SF net), comprising a dual-pathway domain-specific representation module and a mutual information fusion module, which intelligently gauges and amalgamates multi-domain, multi-scale T-stage diagnosis-oriented features. The proposed 3D cross-domain mutual-assistance learning framework not only embraces task-specific multi-domain diagnostic knowledge but also automates the entire process of primary tumor diagnosis. We evaluate our model on an internal and an external MR image dataset in a three-fold cross-validation paradigm. Exhaustive experimental results demonstrate that our method outperforms the other algorithms and obtains promising performance for tumor segmentation and T-staging. These findings underscore its potential for clinical application, offering valuable assistance to clinicians in treatment decision-making and prognostication for various risk groups.
2. Rong C, Li Z, Li R, Wang Y. Spatial-aware contrastive learning for cross-domain medical image registration. Med Phys 2024; 51:8141-8150. [PMID: 39031488] [DOI: 10.1002/mp.17311]
Abstract
BACKGROUND With the rapid advancement of medical imaging technologies, precise image analysis and diagnosis play a crucial role in enhancing treatment outcomes and patient care. Computed tomography (CT) and magnetic resonance imaging (MRI), as pivotal technologies in medical imaging, exhibit unique advantages in bone imaging and soft tissue contrast, respectively. However, cross-domain medical image registration confronts significant challenges due to the substantial differences in contrast, texture, and noise levels between different imaging modalities. PURPOSE The purpose of this study is to address the major challenges encountered in the field of cross-domain medical image registration by proposing a spatial-aware contrastive learning approach that effectively integrates shared information from CT and MRI images. Our objective is to optimize the feature space representation by employing advanced reconstruction and contrastive loss functions, overcoming the limitations of traditional registration methods when dealing with different imaging modalities. Through this approach, we aim to enhance the model's ability to learn structural similarities across domain images, improve registration accuracy, and provide more precise imaging analysis tools for clinical diagnosis and treatment planning. METHODS With the prior knowledge that different domains of images (CT and MRI) share the same content-style information, we extract equivalent feature spaces from both images, enabling accurate cross-domain point matching. We employ a structure resembling that of an autoencoder, augmented with designed reconstruction and contrastive losses to fulfill our objectives. We also propose a region mask to resolve the conflict between spatial correlation and distinctiveness, yielding a better representation space. RESULTS Our research results demonstrate the significant superiority of the proposed spatial-aware contrastive learning approach in the domain of cross-domain medical image registration. Quantitatively, our method achieved an average Dice similarity coefficient (DSC) of 85.68%, a target registration error (TRE) of 1.92 mm, and a mean Hausdorff distance (MHD) of 1.26 mm, surpassing current state-of-the-art methods. Additionally, the registration processing time was significantly reduced to 2.67 s on a GPU, highlighting the efficiency of our approach. The experimental outcomes not only validate the effectiveness of our method in improving the accuracy of cross-domain image registration but also prove its adaptability across different medical image analysis scenarios, offering robust support for enhancing diagnostic precision and patient treatment outcomes. CONCLUSIONS The spatial-aware contrastive learning approach proposed in this paper introduces a new perspective and solution to the domain of cross-domain medical image registration. By effectively optimizing the feature space representation through carefully designed reconstruction and contrastive loss functions, our method significantly improves the accuracy and stability of registration between CT and MRI images. The experimental results demonstrate the clear advantages of our approach in enhancing the accuracy of cross-domain image registration, offering significant application value in promoting precise diagnosis and personalized treatment planning.
In the future, we look forward to further exploring the application of this method in a broader range of medical imaging datasets and its potential integration with other advanced technologies, contributing more innovations to the field of medical image analysis and processing.
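For illustration only, the kind of objective the abstract describes can be sketched as an InfoNCE-style loss over spatially corresponding CT and MR feature locations, with a region mask that drops ambiguous near-neighbors of each positive from the negatives. All names, shapes, the temperature, and the mask radius below are assumptions, not the authors' implementation.

```python
# Minimal sketch of a spatial contrastive loss with a region mask, assuming
# feat_ct and feat_mr are aligned feature maps from a shared autoencoder.
import torch
import torch.nn.functional as F

def spatial_contrastive_loss(feat_ct, feat_mr, radius=2, tau=0.07):
    """feat_ct, feat_mr: (C, H, W) feature maps of the same anatomy."""
    c, h, w = feat_ct.shape
    f1 = F.normalize(feat_ct.reshape(c, -1), dim=0)   # (C, HW)
    f2 = F.normalize(feat_mr.reshape(c, -1), dim=0)
    logits = f1.t() @ f2 / tau                        # (HW, HW) similarities

    # Region mask: spatial neighbours of the positive are highly correlated
    # with it, so exclude them from the negatives (the assumed conflict
    # between spatial correlation and distinctiveness).
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    coords = torch.stack([ys.flatten(), xs.flatten()], dim=1).float()
    dist = torch.cdist(coords, coords)                # (HW, HW) pixel distances
    ambiguous = (dist > 0) & (dist <= radius)
    logits = logits.masked_fill(ambiguous, float("-inf"))

    # Each CT location should match the MR feature at the same location.
    target = torch.arange(h * w)
    return F.cross_entropy(logits, target)

loss = spatial_contrastive_loss(torch.randn(16, 24, 24), torch.randn(16, 24, 24))
```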
Affiliations
- Chenchu Rong: School of Electronic Science and Engineering, Nanjing University, Nanjing, China
- Zhiru Li: School of Electronic Science and Engineering, Nanjing University, Nanjing, China
- Rui Li: The Second Affiliated Hospital of Nantong University, Medical School of Nantong University, Nantong, China
- Yuanqing Wang: School of Electronic Science and Engineering, Nanjing University, Nanjing, China
3. Chen Z, Pan Y, Ye Y, Wang Z, Xia Y. TriLA: Triple-Level Alignment Based Unsupervised Domain Adaptation for Joint Segmentation of Optic Disc and Optic Cup. IEEE J Biomed Health Inform 2024; 28:5497-5508. [PMID: 38805331] [DOI: 10.1109/jbhi.2024.3406447]
Abstract
Cross-domain joint segmentation of optic disc and optic cup on fundus images is essential, yet challenging, for effective glaucoma screening. Although many unsupervised domain adaptation (UDA) methods have been proposed, these methods can hardly achieve complete domain alignment, leading to suboptimal performance. In this paper, we propose a triple-level alignment (TriLA) model to address this issue by aligning the source and target domains at the input level, feature level, and output level simultaneously. At the input level, a learnable Fourier domain adaptation (LFDA) module is developed to learn the cut-off frequency adaptively for frequency-domain translation. At the feature level, we disentangle the style and content features and align them in the corresponding feature spaces using consistency constraints. At the output level, we design a segmentation consistency constraint to emphasize the segmentation consistency across domains. The proposed model is trained on the RIGA+ dataset and widely evaluated on six different UDA scenarios. Our comprehensive results not only demonstrate that the proposed TriLA substantially outperforms other state-of-the-art UDA methods in joint segmentation of optic disc and optic cup, but also suggest the effectiveness of the triple-level alignment strategy.
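The input-level step named above, Fourier-domain translation with a learnable cut-off, can be sketched as follows: the low-frequency amplitude of a source image is replaced by that of a target image, using a soft radial mask whose radius is a trainable parameter. The soft-mask parameterization is an assumption made here for differentiability, not necessarily the authors' exact LFDA design.

```python
# Sketch of Fourier domain adaptation with a learnable cut-off frequency.
import torch

class FourierTranslate(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.cutoff = torch.nn.Parameter(torch.tensor(0.1))  # learnable radius

    def forward(self, src, tgt):
        """Replace the low-frequency amplitude of src with that of tgt."""
        fs, ft = torch.fft.fft2(src), torch.fft.fft2(tgt)
        amp_s, pha_s, amp_t = fs.abs(), fs.angle(), ft.abs()

        h, w = src.shape[-2:]
        fy = torch.fft.fftfreq(h).abs().unsqueeze(1)   # (H, 1) frequency grid
        fx = torch.fft.fftfreq(w).abs().unsqueeze(0)   # (1, W)
        r = torch.sqrt(fy ** 2 + fx ** 2)
        low = torch.sigmoid((self.cutoff - r) * 50.0)  # soft low-pass mask

        amp = low * amp_t + (1 - low) * amp_s          # swap low frequencies
        return torch.fft.ifft2(amp * torch.exp(1j * pha_s)).real

m = FourierTranslate()
x = m(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64))
```

In classic Fourier domain adaptation the cut-off is a hand-tuned hyperparameter; making the mask soft is what lets the cut-off be learned by gradient descent, which appears to be the point of the LFDA module.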
4. Tao R, Zou X, Gao X, Li X, Wang Z, Zhao X, Zheng G, Hang D. Incremental regression of localization context for automatic segmentation of ossified ligamentum flavum from CT data. Int J Comput Assist Radiol Surg 2024; 19:1723-1731. [PMID: 38568402] [DOI: 10.1007/s11548-024-03109-y]
Abstract
PURPOSE Segmentation of the ossified ligamentum flavum (OLF) plays a crucial role in developing computer-assisted, image-guided systems for decompressive thoracic laminectomy. Manual segmentation is time-consuming, tedious, and labor-intensive. It also suffers from inter- and intra-observer variability. Automatic segmentation is highly desired. METHODS A two-stage, localization context-aware framework is developed for automatic segmentation of the ossified ligamentum flavum. In the first stage, localization heatmaps of OLFs are obtained via incremental regression. In the second stage, the obtained heatmaps are treated as the localization context for a segmentation U-Net. Our framework can directly map whole volumetric data to volume-wise labels. RESULTS We designed and conducted comprehensive experiments on datasets of 100 patients to evaluate the performance of the proposed method. Our method achieved an average Dice similarity coefficient of 61.2 ± 7.6%, an average surface distance of 1.1 ± 0.5 mm, and an average positive predictive value of 62.0 ± 12.8%. CONCLUSION To the best of the authors' knowledge, this is the first study aiming for automatic segmentation of the ossified ligamentum flavum. Results from the comprehensive experiments demonstrate the superior performance of the proposed method over the state-of-the-art methods.
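The two-stage wiring can be sketched minimally: a stage-one network regresses OLF localization heatmaps, which are then concatenated to the CT volume as extra context channels for the stage-two segmentation U-Net. The single-convolution networks below are placeholders; only the data flow follows the paper's description.

```python
# Sketch of heatmap-as-localization-context conditioning for segmentation.
import torch
import torch.nn as nn

localizer = nn.Conv3d(1, 1, kernel_size=3, padding=1)  # stands in for stage 1
segmenter = nn.Conv3d(2, 2, kernel_size=3, padding=1)  # stands in for the U-Net

ct = torch.rand(1, 1, 32, 96, 96)                      # (B, C, D, H, W) CT volume
with torch.no_grad():
    heatmap = torch.sigmoid(localizer(ct))             # stage 1: localization heatmap
x = torch.cat([ct, heatmap], dim=1)                    # inject context as a channel
logits = segmenter(x)                                  # stage 2: voxel-wise labels
```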
Affiliations
- Rong Tao: Institute of Medical Robotics, Shanghai Jiao Tong University, Dongchuan Road, Shanghai, 200240, China
- Xiaoyang Zou: Institute of Medical Robotics, Shanghai Jiao Tong University, Dongchuan Road, Shanghai, 200240, China
- Xiaoru Gao: Institute of Medical Robotics, Shanghai Jiao Tong University, Dongchuan Road, Shanghai, 200240, China
- Xinhua Li: Department of Orthopedics, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200080, China
- Zhiyu Wang: Department of Radiology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200080, China
- Xin Zhao: Department of Orthopedics, Shanghai 9th People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China
- Guoyan Zheng: Institute of Medical Robotics, Shanghai Jiao Tong University, Dongchuan Road, Shanghai, 200240, China
- Donghua Hang: Department of Orthopedics, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200080, China
5. Wang R, Zheng G. PFMNet: Prototype-based feature mapping network for few-shot domain adaptation in medical image segmentation. Comput Med Imaging Graph 2024; 116:102406. [PMID: 38824715] [DOI: 10.1016/j.compmedimag.2024.102406]
Abstract
Lack of data is one of the biggest hurdles for rare disease research using deep learning. Due to the lack of rare-disease images and annotations, training a robust network for automatic rare-disease image segmentation is very challenging. To address this challenge, few-shot domain adaptation (FSDA) has emerged as a practical research direction, aiming to leverage a limited number of annotated images from a target domain to facilitate adaptation of models trained on other large datasets in a source domain. In this paper, we present a novel prototype-based feature mapping network (PFMNet) designed for FSDA in medical image segmentation. PFMNet adopts an encoder-decoder structure for segmentation, with the prototype-based feature mapping (PFM) module positioned at the bottom of the encoder-decoder structure. The PFM module transforms high-level features from the target domain into source domain-like features that are more easily comprehensible by the decoder. By leveraging these source domain-like features, the decoder can effectively perform few-shot segmentation in the target domain and generate accurate segmentation masks. We evaluate the performance of PFMNet through experiments on three typical yet challenging few-shot medical image segmentation tasks: cross-center optic disc/cup segmentation, cross-center polyp segmentation, and cross-modality cardiac structure segmentation. We consider four different settings: 5-shot, 10-shot, 15-shot, and 20-shot. The experimental results substantiate the efficacy of our proposed approach for few-shot domain adaptation in medical image segmentation.
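A minimal sketch of the prototype-mapping idea: target-domain features are re-expressed as similarity-weighted mixtures of source-domain prototypes, so the decoder sees source-domain-like inputs. The attention form and the per-class-mean prototype construction below are assumptions, not PFMNet's actual module.

```python
# Sketch of a prototype-based feature mapping step.
import torch
import torch.nn.functional as F

def map_to_source_like(target_feat, prototypes, tau=0.1):
    """target_feat: (N, C) target-domain features; prototypes: (K, C)."""
    t = F.normalize(target_feat, dim=1)
    p = F.normalize(prototypes, dim=1)
    attn = F.softmax(t @ p.t() / tau, dim=1)   # (N, K) similarity weights
    return attn @ prototypes                   # (N, C) source-like features

# Prototypes taken as per-class means of source features (an assumption):
source_feat = torch.randn(500, 64)
labels = torch.randint(0, 4, (500,))
protos = torch.stack([source_feat[labels == k].mean(0) for k in range(4)])
mapped = map_to_source_like(torch.randn(100, 64), protos)
```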
Affiliations
- Runze Wang: Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Shanghai, 200240, China
- Guoyan Zheng: Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Shanghai, 200240, China
6. Huang L, Zhang N, Yi Y, Zhou W, Zhou B, Dai J, Wang J. SAMCF: Adaptive global style alignment and multi-color spaces fusion for joint optic cup and disc segmentation. Comput Biol Med 2024; 178:108639. [PMID: 38878394] [DOI: 10.1016/j.compbiomed.2024.108639]
Abstract
The optic cup (OC) and optic disc (OD) are two critical structures in retinal fundus images, and their relative positions and sizes are essential for effectively diagnosing eye diseases. With the success of deep learning in computer vision, deep learning-based segmentation models have been widely used for joint optic cup and disc segmentation. However, three prominent issues impact segmentation performance. First, significant differences among datasets collected from various institutions, protocols, and devices lead to performance degradation of models. Second, we find that images with only RGB information struggle to counteract the interference caused by brightness variations, affecting color representation capability. Finally, existing methods typically ignore edge perception and face challenges in obtaining clear and smooth edge segmentation results. To address these drawbacks, we propose a novel framework based on Style Alignment and Multi-Color Fusion (SAMCF) for joint OC and OD segmentation. Initially, we introduce a domain generalization method to generate uniformly styled images without damaging image content, mitigating domain shift issues. Next, based on multiple color spaces, we propose a feature extraction and fusion network aiming to handle brightness variation interference and improve color representation capability. Lastly, an edge-aware loss is designed to generate fine edge segmentation results. Our experiments conducted on three public datasets, DGS, RIM, and REFUGE, demonstrate that our proposed SAMCF achieves superior performance to existing state-of-the-art methods. Moreover, SAMCF exhibits remarkable generalization ability across multiple retinal fundus image datasets, showcasing its outstanding generality.
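The multi-color-space input stage can be sketched as follows: the RGB fundus image is re-expressed in HSV and LAB, and the three representations feed parallel branches whose features are concatenated. The single-convolution branches and the LAB rescaling are placeholders standing in for SAMCF's actual fusion network.

```python
# Sketch of multi-color-space feature extraction and fusion.
import numpy as np
import torch
from skimage import color

rgb = np.random.rand(256, 256, 3).astype(np.float32)  # stand-in fundus image
hsv = color.rgb2hsv(rgb)
lab = color.rgb2lab(rgb) / 100.0                      # roughly rescale LAB

def to_tensor(img):
    return torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0).float()

branches = [torch.nn.Conv2d(3, 8, 3, padding=1) for _ in range(3)]
feats = [b(to_tensor(x)) for b, x in zip(branches, (rgb, hsv, lab))]
fused = torch.cat(feats, dim=1)                       # (1, 24, 256, 256) fused features
```

The motivation, per the abstract, is that HSV and LAB separate luminance from chromaticity, so brightness variations perturb fewer of the fused channels than in a pure RGB input.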
Affiliations
- Longjun Huang: School of Software, Nanchang Key Laboratory for Blindness and Visual Impairment Prevention Technology and Equipment, Jiangxi Normal University, Nanchang, 330022, China
- Ningyi Zhang: School of Software, Nanchang Key Laboratory for Blindness and Visual Impairment Prevention Technology and Equipment, Jiangxi Normal University, Nanchang, 330022, China
- Yugen Yi: School of Software, Nanchang Key Laboratory for Blindness and Visual Impairment Prevention Technology and Equipment, Jiangxi Normal University, Nanchang, 330022, China
- Wei Zhou: College of Computer Science, Shenyang Aerospace University, Shenyang, 110136, China
- Bin Zhou: School of Software, Nanchang Key Laboratory for Blindness and Visual Impairment Prevention Technology and Equipment, Jiangxi Normal University, Nanchang, 330022, China
- Jiangyan Dai: School of Computer Engineering, Weifang University, 261061, China
- Jianzhong Wang: College of Information Science and Technology, Northeast Normal University, Changchun, 130117, China
7. Kumari S, Singh P. Deep learning for unsupervised domain adaptation in medical imaging: Recent advancements and future perspectives. Comput Biol Med 2024; 170:107912. [PMID: 38219643] [DOI: 10.1016/j.compbiomed.2023.107912]
Abstract
Deep learning has demonstrated remarkable performance across various tasks in medical imaging. However, these approaches primarily focus on supervised learning, assuming that the training and testing data are drawn from the same distribution. Unfortunately, this assumption may not always hold true in practice. To address these issues, unsupervised domain adaptation (UDA) techniques have been developed to transfer knowledge from a labeled domain to a related but unlabeled domain. In recent years, significant advancements have been made in UDA, resulting in a wide range of methodologies, including feature alignment, image translation, self-supervision, and disentangled representation methods, among others. In this paper, we provide a comprehensive literature review of recent deep UDA approaches in medical imaging from a technical perspective. Specifically, we categorize current UDA research in medical imaging into six groups and further divide them into finer subcategories based on the different tasks they perform. We also discuss the respective datasets used in the studies to assess the divergence between the different domains. Finally, we discuss emerging areas and provide insights and discussions on future research directions to conclude this survey.
Affiliations
- Suruchi Kumari: Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India
- Pravendra Singh: Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India
8. Tiwary P, Bhattacharyya K, A P P. Cycle consistent twin energy-based models for image-to-image translation. Med Image Anal 2024; 91:103031. [PMID: 37988920] [DOI: 10.1016/j.media.2023.103031]
Abstract
Domain shift refers to a change of distributional characteristics between the training (source) and the testing (target) datasets of a learning task, leading to a performance drop. For tasks involving medical images, domain shift may be caused by several factors such as changes in underlying imaging modalities, measuring devices, and staining mechanisms. Recent approaches address this issue via generative models based on the principles of adversarial learning, albeit they suffer from issues such as difficulty in training and lack of diversity. Motivated by the aforementioned observations, we adapt an alternative class of deep generative models called Energy-Based Models (EBMs) for the task of unpaired image-to-image translation of medical images. Specifically, we propose a novel method called Cycle Consistent Twin EBMs (CCT-EBM), which employs a pair of EBMs in the latent space of an Auto-Encoder trained on the source data. While one of the EBMs translates the source to the target domain, the other does the reverse, together with a novel consistency loss ensuring translation symmetry and coupling between the domains. We theoretically analyze the proposed method and show that our design leads to better translation between the domains with reduced Langevin mixing steps. We demonstrate the efficacy of our method through detailed quantitative and qualitative experiments on image segmentation tasks on three different datasets vis-a-vis state-of-the-art methods.
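The underlying translation mechanism, Langevin dynamics under an EBM in an auto-encoder latent space, can be sketched as below: a latent code is iteratively pushed toward low energy under the target-domain EBM, then decoded. The energy network, step size, and step count are illustrative stand-ins, not the CCT-EBM architecture.

```python
# Sketch of Langevin updates in a latent space under a target-domain EBM.
import torch

energy = torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.SiLU(),
                             torch.nn.Linear(64, 1))  # stand-in EBM

def langevin_translate(z, steps=20, step_size=0.01):
    z = z.clone().requires_grad_(True)
    for _ in range(steps):
        grad = torch.autograd.grad(energy(z).sum(), z)[0]
        noise = torch.randn_like(z)
        z = (z - 0.5 * step_size * grad
             + (step_size ** 0.5) * noise).detach().requires_grad_(True)
    return z.detach()

z_src = torch.randn(4, 32)           # latent codes of source-domain images
z_tgt = langevin_translate(z_src)    # decoded by the shared decoder afterwards
```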
Affiliations
- Piyush Tiwary: Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore, Karnataka 560012, India
- Kinjawl Bhattacharyya: Department of Electrical Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
- Prathosh A P: Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore, Karnataka 560012, India
9. Zhong W, Luo J, Du W. Deep learning with fetal ECG recognition. Physiol Meas 2023; 44:115006. [PMID: 37939396] [DOI: 10.1088/1361-6579/ad0ab7]
Abstract
Objective. Independent component analysis (ICA) is widely used in the extraction of fetal ECG (FECG). However, the amplitude, order, and sign of the ICA outputs are uncertain. The main objective is to present a novel approach to FECG recognition using a deep learning strategy. Approach. A cross-domain consistent convolutional neural network (CDC-Net) is developed for the task of FECG recognition. The output of the ICA algorithm is used as input to the CDC-Net, and the CDC-Net identifies which channel's signal is the target FECG. Main results. Signals from two databases are used to test the efficiency of the proposed method. The proposed deep learning method exhibits good performance on FECG recognition. Specifically, the Precision, Recall, and F1-score of the proposed method on the ADFECGDB database are 91.69%, 91.37%, and 91.52%, respectively. The Precision, Recall, and F1-score of the proposed method on the Daisy database are 97.85%, 97.42%, and 97.63%, respectively. Significance. This study is a proof of concept that the proposed method can automatically recognize the FECG signals in multi-channel ECG data. The development of FECG recognition technology contributes to automated FECG monitoring.
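The recognition setting can be sketched as below: FastICA separates the multi-channel mixtures into components whose order and sign are arbitrary, and a classifier scores each component to pick the fetal ECG. The tiny 1-D scorer is a placeholder standing in for the CDC-Net described in the paper.

```python
# Sketch of ICA separation followed by learned FECG channel recognition.
import numpy as np
import torch
from sklearn.decomposition import FastICA

mixtures = np.random.randn(2000, 4)               # stand-in multi-channel ECG
components = FastICA(n_components=4, random_state=0).fit_transform(mixtures)

scorer = torch.nn.Sequential(torch.nn.Conv1d(1, 8, 9, padding=4),
                             torch.nn.AdaptiveAvgPool1d(1),
                             torch.nn.Flatten(), torch.nn.Linear(8, 1))

x = torch.from_numpy(components.T.copy()).float().unsqueeze(1)  # (4, 1, 2000)
fecg_channel = scorer(x).squeeze(1).argmax().item()  # index of the FECG component
```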
Affiliations
- Wei Zhong: Guangdong Police College, Guangzhou, 510000, People's Republic of China
- Jiahui Luo: Guangdong Police College, Guangzhou, 510000, People's Republic of China
- Wei Du: Guangdong Police College, Guangzhou, 510000, People's Republic of China
10. Li D, Peng Y, Sun J, Guo Y. Unsupervised deep consistency learning adaptation network for cardiac cross-modality structural segmentation. Med Biol Eng Comput 2023; 61:2713-2732. [PMID: 37450212] [DOI: 10.1007/s11517-023-02833-y]
Abstract
Deep neural networks have recently been successful in the field of medical image segmentation; however, they typically suffer performance degradation when well-trained models are tested in a new domain with a different data distribution. Given that annotated cross-domain images may be inaccessible, unsupervised domain adaptation methods that transfer learnable information from annotated source domains to unannotated target domains with different distributions have attracted substantial attention. Many methods leverage image-level or pixel-level translation networks to align domain-invariant information and mitigate domain shift issues. However, these methods rarely perform well when there is a large domain gap. A new unsupervised deep consistency learning adaptation network, which adopts input space consistency learning and output space consistency learning to realize unsupervised domain adaptation and cardiac structural segmentation, is introduced in this paper. The framework mainly includes a domain translation path and a cross-modality segmentation path. In the domain translation path, a symmetric alignment generator network with attention to cross-modality features and anatomy is introduced to align bidirectional domain features. In the segmentation path, entropy map minimization, output probability map minimization, and segmentation prediction minimization are leveraged to align the output space features. The model conducts supervised learning to extract source domain features and unsupervised deep consistency learning to extract target domain features. Through experimental testing on two challenging cross-modality segmentation tasks, our method shows robust performance compared to previous methods. Furthermore, ablation experiments are conducted to confirm the effectiveness of our framework.
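One of the output-space terms named above, entropy map minimization, has a standard form that can be sketched directly: unlabeled target-domain predictions are pushed toward confident, low-entropy segmentation maps. Shapes and weighting are illustrative.

```python
# Sketch of entropy-map minimization on target-domain segmentation logits.
import torch
import torch.nn.functional as F

def entropy_loss(logits, eps=1e-8):
    """logits: (B, K, H, W) target-domain segmentation logits."""
    p = F.softmax(logits, dim=1)
    ent = -(p * torch.log(p + eps)).sum(dim=1)  # per-pixel entropy map
    return ent.mean()

target_logits = torch.randn(2, 4, 128, 128, requires_grad=True)
loss = entropy_loss(target_logits)              # added to the supervised source loss
loss.backward()
```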
Affiliations
- Dapeng Li: College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, Shandong, China
- Yanjun Peng: College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, Shandong, China; Shandong Province Key Laboratory of Wisdom Mining Information Technology, Shandong University of Science and Technology, Qingdao, 266590, Shandong, China
- Jindong Sun: College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, Shandong, China
- Yanfei Guo: College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, Shandong, China
11. Wang R, Zhou Q, Zheng G. EDRL: Entropy-guided disentangled representation learning for unsupervised domain adaptation in semantic segmentation. Comput Methods Programs Biomed 2023; 240:107729. [PMID: 37531690] [DOI: 10.1016/j.cmpb.2023.107729]
Abstract
BACKGROUND AND OBJECTIVE Deep learning-based approaches are excellent at learning from large amounts of data, but can be poor at generalizing the learned knowledge to testing datasets with domain shift, i.e., when there exists distribution discrepancy between the training dataset (source domain) and the testing dataset (target domain). In this paper, we investigate unsupervised domain adaptation (UDA) techniques to train a cross-domain segmentation method which is robust to domain shift, eliminating the requirement of any annotations on the target domain. METHODS To this end, we propose an Entropy-guided Disentangled Representation Learning, referred as EDRL, for UDA in semantic segmentation. Concretely, we synergistically integrate image alignment via disentangled representation learning with feature alignment via entropy-based adversarial learning into one network, which is trained end-to-end. We additionally introduce a dynamic feature selection mechanism via soft gating, which helps to further enhance the task-specific feature alignment. We validate the proposed method on two publicly available datasets: the CT-MR dataset and the multi-sequence cardiac MR (MS-CMR) dataset. RESULTS On both datasets, our method achieved better results than the state-of-the-art (SOTA) methods. Specifically, on the CT-MR dataset, our method achieved an average DSC of 84.8% when taking CT as the source domain and MR as the target domain, and an average DSC of 84.0% when taking MR as the source domain and CT as the target domain. CONCLUSIONS Results from comprehensive experiments demonstrate the efficacy of the proposed EDRL model for cross-domain medical image segmentation.
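The dynamic feature selection via soft gating mentioned above can be sketched as a squeeze-and-excite-style gate: a small head produces per-channel weights in [0, 1] that rescale features before alignment. This particular form is an assumption, not necessarily EDRL's exact mechanism.

```python
# Sketch of a soft-gating mechanism for dynamic feature selection.
import torch
import torch.nn as nn

class SoftGate(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, feat):
        w = self.gate(feat).unsqueeze(-1).unsqueeze(-1)  # (B, C, 1, 1) gates
        return feat * w          # emphasize task-specific channels for alignment

gated = SoftGate(64)(torch.randn(2, 64, 32, 32))
```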
Affiliations
- Runze Wang: Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Shanghai, 200240, China
- Qin Zhou: Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Shanghai, 200240, China
- Guoyan Zheng: Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Shanghai, 200240, China
12. Apivanichkul K, Phasukkit P, Dankulchai P, Sittiwong W, Jitwatcharakomol T. Enhanced Deep-Learning-Based Automatic Left-Femur Segmentation Scheme with Attribute Augmentation. Sensors (Basel) 2023; 23:5720. [PMID: 37420884] [PMCID: PMC10305208] [DOI: 10.3390/s23125720]
Abstract
This research proposes augmenting cropped computed tomography (CT) slices with data attributes to enhance the performance of a deep-learning-based automatic left-femur segmentation scheme. The data attribute is the lying position for the left-femur model. In the study, the deep-learning-based automatic left-femur segmentation scheme was trained, validated, and tested using eight categories of CT input datasets for the left femur (F-I to F-VIII). The segmentation performance was assessed by the Dice similarity coefficient (DSC) and intersection over union (IoU), and the similarity between the predicted 3D reconstruction images and ground-truth images was determined by the spectral angle mapper (SAM) and the structural similarity index measure (SSIM). The left-femur segmentation model achieved the highest DSC (88.25%) and IoU (80.85%) under category F-IV (using cropped and augmented CT input datasets with large feature coefficients), with SAM and SSIM values of 0.117-0.215 and 0.701-0.732, respectively. The novelty of this research lies in the use of attribute augmentation in medical image preprocessing to enhance the performance of the deep-learning-based automatic left-femur segmentation scheme.
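The spectral angle mapper (SAM) used above as a similarity metric has a standard definition that can be sketched directly: the angle between paired intensity vectors, averaged over positions (smaller is more similar). Treating the reconstructions as (N, C) arrays of paired vectors is an assumption about data layout made here for illustration.

```python
# Sketch of the spectral angle mapper between prediction and ground truth.
import numpy as np

def sam(pred, gt, eps=1e-8):
    """pred, gt: (N, C) arrays of paired intensity/spectral vectors."""
    dot = (pred * gt).sum(axis=1)
    denom = np.linalg.norm(pred, axis=1) * np.linalg.norm(gt, axis=1) + eps
    return np.arccos(np.clip(dot / denom, -1.0, 1.0)).mean()  # radians

score = sam(np.random.rand(1000, 3), np.random.rand(1000, 3))
```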
Affiliations
- Kamonchat Apivanichkul: School of Engineering, King Mongkut’s Institute of Technology Ladkrabang, Bangkok 10520, Thailand
- Pattarapong Phasukkit: School of Engineering, King Mongkut’s Institute of Technology Ladkrabang, Bangkok 10520, Thailand; King Mongkut Chaokhunthahan Hospital (KMCH), King Mongkut’s Institute of Technology Ladkrabang, Bangkok 10520, Thailand
- Pittaya Dankulchai: Division of Radiation Oncology, Department of Radiology, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Wiwatchai Sittiwong: Division of Radiation Oncology, Department of Radiology, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Tanun Jitwatcharakomol: Division of Radiation Oncology, Department of Radiology, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
13. van Tulder G, de Bruijne M. Unpaired, unsupervised domain adaptation assumes your domains are already similar. Med Image Anal 2023; 87:102825. [PMID: 37116296] [DOI: 10.1016/j.media.2023.102825]
Abstract
Unsupervised domain adaptation is a popular method in medical image analysis, but it can be tricky to make it work: without labels to link the domains, domains must be matched using feature distributions. If there is no additional information, this often leaves a choice between multiple possibilities to map the data that may be equally likely but not equally correct. In this paper we explore the fundamental problems that may arise in unsupervised domain adaptation, and discuss conditions that might still make it work. Focusing on medical image analysis, we argue that images from different domains may have similar class balance, similar intensities, similar spatial structure, or similar textures. We demonstrate how these implicit conditions can affect domain adaptation performance in experiments with synthetic data, MNIST digits, and medical images. We observe that practical success of unsupervised domain adaptation relies on existing similarities in the data, and is anything but guaranteed in the general case. Understanding these implicit assumptions is a key step in identifying potential problems in domain adaptation and improving the reliability of the results.
Affiliations
- Gijs van Tulder: Data Science group, Faculty of Science, Radboud University, Postbus 9010, 6500 GL Nijmegen, The Netherlands; Biomedical Imaging Group, Erasmus MC, Postbus 2040, 3000 CA Rotterdam, The Netherlands
- Marleen de Bruijne: Biomedical Imaging Group, Erasmus MC, Postbus 2040, 3000 CA Rotterdam, The Netherlands; Department of Computer Science, University of Copenhagen, Universitetsparken 1, 2100 Copenhagen, Denmark
14. Ding W, Abdel-Basset M, Hawash H, Pedrycz W. MIC-Net: A deep network for cross-site segmentation of COVID-19 infection in the fog-assisted IoMT. Inf Sci (N Y) 2023; 623:20-39. [PMID: 36532157] [PMCID: PMC9745980] [DOI: 10.1016/j.ins.2022.12.017]
Abstract
The automatic segmentation of COVID-19 pneumonia from a computed tomography (CT) scan has become a major interest for scholars in developing a powerful diagnostic framework in the Internet of Medical Things (IoMT). Federated deep learning (FDL) is considered a promising approach for efficient and cooperative training from multi-institutional image data. However, non-independent and identically distributed (non-IID) data from health care remain a remarkable challenge, limiting the applicability of FDL in the real world. The variability in features incurred by different scanning protocols, scanners, or acquisition parameters produces a learning drift phenomenon during training, which impairs both the training speed and segmentation performance of the model. This paper proposes a novel FDL approach for reliable and efficient multi-institutional COVID-19 segmentation, called MIC-Net. MIC-Net consists of three main building modules: the down-sampler, the context enrichment (CE) module, and the up-sampler. The down-sampler was designed to effectively learn both local and global representations from input CT scans by combining the advantages of lightweight convolutional and attention modules. The contextual enrichment (CE) module is introduced to enable the network to capture contextual representations that can later be exploited to enrich the semantic knowledge of the up-sampler through skip connections. To further tackle inter-site heterogeneity within the model, the approach uses adaptive and switchable normalization (ASN) to adaptively choose the best normalization strategy according to the underlying data. A novel federated periodic selection protocol (FED-PCS) is proposed to fairly select training participants according to their resource state, data quality, and local model loss. The results of an experimental evaluation of MIC-Net on three publicly available data sets show its robust performance, with an average Dice score of 88.90% and an average surface Dice of 87.53%.
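The ASN idea follows the switchable-normalization family, which can be sketched in simplified form: statistics from instance, layer, and batch normalization are mixed with learned softmax weights, letting the underlying data pick the best strategy. The exact ASN module may differ; this is a generic sketch with a single shared mixing weight for means and variances.

```python
# Simplified sketch of switchable normalization over {IN, LN, BN} statistics.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchNorm2d(nn.Module):
    def __init__(self, channels, eps=1e-5):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.bias = nn.Parameter(torch.zeros(1, channels, 1, 1))
        self.mix = nn.Parameter(torch.zeros(3))  # logits over {IN, LN, BN}
        self.eps = eps

    def forward(self, x):
        mean_in = x.mean((2, 3), keepdim=True); var_in = x.var((2, 3), keepdim=True)
        mean_ln = x.mean((1, 2, 3), keepdim=True); var_ln = x.var((1, 2, 3), keepdim=True)
        mean_bn = x.mean((0, 2, 3), keepdim=True); var_bn = x.var((0, 2, 3), keepdim=True)
        w = F.softmax(self.mix, dim=0)           # learned normalization choice
        mean = w[0] * mean_in + w[1] * mean_ln + w[2] * mean_bn
        var = w[0] * var_in + w[1] * var_ln + w[2] * var_bn
        return self.weight * (x - mean) / torch.sqrt(var + self.eps) + self.bias

y = SwitchNorm2d(8)(torch.randn(2, 8, 16, 16))
```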
Affiliations
- Weiping Ding: School of Information Science and Technology, Nantong University, Nantong, China; Faculty of Data Science, City University of Macau, Macau, China
- Witold Pedrycz: Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6R 2V4, Canada
15. Multi-point attention-based semi-supervised learning for diabetic retinopathy classification. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104412]
16. Liu H, Zhuang Y, Song E, Xu X, Hung CC. A bidirectional multilayer contrastive adaptation network with anatomical structure preservation for unpaired cross-modality medical image segmentation. Comput Biol Med 2022; 149:105964. [PMID: 36007288] [DOI: 10.1016/j.compbiomed.2022.105964]
Abstract
Multi-modal medical image segmentation has achieved great success through supervised deep learning networks. However, because of domain shift and limited annotation information, unpaired cross-modality segmentation tasks are still challenging. Unsupervised domain adaptation (UDA) methods can alleviate the performance degradation of cross-modality segmentation through knowledge transfer between different domains, but current methods still suffer from the problems of model collapse, adversarial training instability, and mismatch of anatomical structures. To tackle these issues, we propose a bidirectional multilayer contrastive adaptation network (BMCAN) for unpaired cross-modality segmentation. A shared encoder is first adopted for learning modality-invariant encoding representations in image synthesis and segmentation simultaneously. Secondly, to retain anatomical structure consistency in cross-modality image synthesis, we present a structure-constrained cross-modality image translation approach for image alignment. Thirdly, we construct a bidirectional multilayer contrastive learning approach to preserve anatomical structures and enhance encoding representations, which utilizes two groups of domain-specific multilayer perceptron (MLP) networks to learn modality-specific features. Finally, a semantic information adversarial learning approach is designed to learn structural similarities of semantic outputs for output space alignment. Our proposed method was tested on three different cross-modality segmentation tasks: brain tissue, brain tumor, and cardiac substructure segmentation. Compared with other UDA methods, experimental results show that our proposed BMCAN achieves state-of-the-art segmentation performance on the above three tasks, and it has fewer training components and better feature representations for overcoming overfitting and domain shift problems. Our proposed method can efficiently reduce the annotation burden of radiologists in cross-modality image analysis.
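The multilayer contrastive component can be sketched generically: features sampled from several encoder layers are projected by domain-specific MLP heads and matched with an InfoNCE loss per layer. Layer count, head sizes, and the sampling of feature locations are illustrative assumptions, not BMCAN's actual configuration.

```python
# Sketch of multilayer contrastive matching with domain-specific MLP heads.
import torch
import torch.nn.functional as F

def info_nce(q, k, tau=0.07):
    q, k = F.normalize(q, dim=1), F.normalize(k, dim=1)
    logits = q @ k.t() / tau                 # paired rows are positives
    return F.cross_entropy(logits, torch.arange(q.shape[0]))

dims = [64, 128, 256]                        # channels per encoder layer
heads_a = [torch.nn.Linear(d, 128) for d in dims]  # domain-A projection MLPs
heads_b = [torch.nn.Linear(d, 128) for d in dims]  # domain-B projection MLPs

feats_a = [torch.randn(32, d) for d in dims]       # sampled feature locations
feats_b = [torch.randn(32, d) for d in dims]
loss = sum(info_nce(ha(fa), hb(fb))
           for ha, hb, fa, fb in zip(heads_a, heads_b, feats_a, feats_b))
```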
Affiliations
- Hong Liu: Center for Biomedical Imaging and Bioinformatics, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
- Yuzhou Zhuang: Institute of Artificial Intelligence, Huazhong University of Science and Technology, Wuhan, 430074, China
- Enmin Song: Center for Biomedical Imaging and Bioinformatics, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
- Xiangyang Xu: Center for Biomedical Imaging and Bioinformatics, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
- Chih-Cheng Hung: Center for Machine Vision and Security Research, Kennesaw State University, Marietta, GA 30060, USA
17. Liu X, Sanchez P, Thermos S, O'Neil AQ, Tsaftaris SA. Learning disentangled representations in the imaging domain. Med Image Anal 2022; 80:102516. [PMID: 35751992] [DOI: 10.1016/j.media.2022.102516]
Abstract
Disentangled representation learning has been proposed as an approach to learning general representations even in the absence of, or with limited, supervision. A good general representation can be fine-tuned for new target tasks using modest amounts of data, or used directly in unseen domains achieving remarkable performance in the corresponding task. This alleviation of the data and annotation requirements offers tantalising prospects for applications in computer vision and healthcare. In this tutorial paper, we motivate the need for disentangled representations, revisit key concepts, and describe practical building blocks and criteria for learning such representations. We survey applications in medical imaging emphasising choices made in exemplar key works, and then discuss links to computer vision applications. We conclude by presenting limitations, challenges, and opportunities.
Affiliations
- Xiao Liu: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK
- Pedro Sanchez: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK
- Spyridon Thermos: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK
- Alison Q O'Neil: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK; Canon Medical Research Europe, Edinburgh EH6 5NP, UK
- Sotirios A Tsaftaris: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK; The Alan Turing Institute, London NW1 2DB, UK
18. Chen Y, Yang XH, Wei Z, Heidari AA, Zheng N, Li Z, Chen H, Hu H, Zhou Q, Guan Q. Generative Adversarial Networks in Medical Image Augmentation: A review. Comput Biol Med 2022; 144:105382. [PMID: 35276550] [DOI: 10.1016/j.compbiomed.2022.105382]
Abstract
OBJECTIVE With the development of deep learning, the number of training samples required for medical image-based diagnosis and treatment models is increasing. Generative Adversarial Networks (GANs) have attracted attention in medical image processing due to their excellent image generation capabilities and have been widely used in data augmentation. In this paper, a comprehensive and systematic review and analysis of medical image augmentation work is carried out, covering its research status and development prospects. METHOD This paper reviews 105 medical image augmentation papers, mainly collected from Elsevier, IEEE Xplore, and Springer between 2018 and 2021. We grouped these papers according to the organs depicted in the images, and catalogued the medical image datasets that appeared in them, the loss functions used in model training, and the quantitative evaluation metrics for image augmentation. At the same time, we briefly introduce the literature collected from three journals and three conferences that have received attention in medical image processing. RESULT First, we summarize the advantages of various augmentation models, loss functions, and evaluation metrics. Researchers can use this information as a reference when designing augmentation tasks. Second, we explore the relationship between augmentation models and the size of the training set, and tease out the role that augmentation models may play when the quality of the training set is limited. Third, the number of published papers shows that the development momentum of this research field remains strong. Furthermore, we discuss the existing limitations of this type of model and suggest possible research directions. CONCLUSION We discuss GAN-based medical image augmentation work in detail. This approach effectively alleviates the challenge of limited training samples for medical image diagnosis and treatment models. It is hoped that this review will benefit researchers interested in this field.
Affiliations
- Yizhou Chen: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China
- Xu-Hua Yang: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China
- Zihan Wei: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China
- Ali Asghar Heidari: School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran, Iran; Department of Computer Science, School of Computing, National University of Singapore, Singapore, Singapore
- Nenggan Zheng: Qiushi Academy for Advanced Studies, Zhejiang University, Hangzhou, Zhejiang, China
- Zhicheng Li: Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Huiling Chen: College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, Zhejiang, 325035, China
- Haigen Hu: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China
- Qianwei Zhou: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China
- Qiu Guan: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China