1
Kumari S, Singh P. Deep learning for unsupervised domain adaptation in medical imaging: Recent advancements and future perspectives. Comput Biol Med 2024; 170:107912. [PMID: 38219643] [DOI: 10.1016/j.compbiomed.2023.107912]
Abstract
Deep learning has demonstrated remarkable performance across various tasks in medical imaging. However, these approaches primarily rely on supervised learning and assume that the training and testing data are drawn from the same distribution, an assumption that may not hold in practice. To address this issue, unsupervised domain adaptation (UDA) techniques have been developed to transfer knowledge from a labeled domain to a related but unlabeled domain. In recent years, significant advancements have been made in UDA, resulting in a wide range of methodologies, including feature alignment, image translation, self-supervision, and disentangled representation methods. In this paper, we provide a comprehensive literature review of recent deep UDA approaches in medical imaging from a technical perspective. Specifically, we categorize current UDA research in medical imaging into six groups and further divide them into finer subcategories based on the tasks they perform. We also discuss the datasets used in these studies to assess the divergence between the different domains. Finally, we highlight emerging areas and offer insights on future research directions to conclude this survey.
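Among the methodology families this survey names, "feature alignment" is perhaps the simplest to illustrate: a divergence penalty pulls source and target feature distributions together. The sketch below shows one common choice, a squared Maximum Mean Discrepancy with an RBF kernel; it is a generic illustration of the family, not code from the surveyed works, and the `gamma` bandwidth is an arbitrary assumption.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    # Pairwise RBF kernel matrix between rows of a and rows of b.
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def mmd2(source, target, gamma=1.0):
    """Squared Maximum Mean Discrepancy between two feature batches.

    Minimizing a penalty of this form is one way feature-alignment UDA
    methods encourage source and target encodings to share a
    distribution (illustrative sketch only).
    """
    k_ss = rbf_kernel(source, source, gamma).mean()
    k_tt = rbf_kernel(target, target, gamma).mean()
    k_st = rbf_kernel(source, target, gamma).mean()
    return k_ss + k_tt - 2.0 * k_st
```

In practice such a term would be added to the supervised source loss and backpropagated through the feature extractor; identical batches give a discrepancy of zero, and a domain shift makes it grow.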
Affiliation(s)
- Suruchi Kumari
- Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India.
- Pravendra Singh
- Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India.
2
Tiwary P, Bhattacharyya K, A P P. Cycle consistent twin energy-based models for image-to-image translation. Med Image Anal 2024; 91:103031. [PMID: 37988920] [DOI: 10.1016/j.media.2023.103031]
Abstract
Domain shift refers to a change in distributional characteristics between the training (source) and testing (target) datasets of a learning task, leading to a drop in performance. For tasks involving medical images, domain shift may arise from several factors, such as changes in the underlying imaging modalities, measuring devices, and staining mechanisms. Recent approaches address this issue via generative models based on the principles of adversarial learning, although these suffer from issues such as difficulty in training and lack of diversity. Motivated by these observations, we adapt an alternative class of deep generative models, Energy-Based Models (EBMs), to the task of unpaired image-to-image translation of medical images. Specifically, we propose a novel method called Cycle Consistent Twin EBMs (CCT-EBM), which employs a pair of EBMs in the latent space of an auto-encoder trained on the source data. While one EBM translates the source to the target domain, the other does the reverse; a novel consistency loss ensures translation symmetry and coupling between the domains. We theoretically analyze the proposed method and show that our design leads to better translation between the domains with fewer Langevin mixing steps. We demonstrate the efficacy of our method through detailed quantitative and qualitative experiments on image segmentation tasks on three different datasets vis-à-vis state-of-the-art methods.
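The coupling idea in this abstract — two translators, one per direction, penalized for round-trip error — can be sketched generically. The snippet below is only an illustration of a cycle-consistency penalty between two latent-space mappings; the actual CCT-EBM translators are EBM-driven samplers, and `f`, `g` here stand in as arbitrary placeholder functions.

```python
import numpy as np

def cycle_consistency_loss(z_s, z_t, f, g):
    """Cycle-consistency penalty for a pair of latent-space translators.

    f maps source latents toward the target domain and g maps the
    reverse direction; penalizing round-trip reconstruction error
    couples the two directions, in the spirit of CCT-EBM's consistency
    loss (hypothetical sketch, not the paper's implementation).
    """
    round_s = g(f(z_s))            # source -> target -> source
    round_t = f(g(z_t))            # target -> source -> target
    return np.mean((round_s - z_s) ** 2) + np.mean((round_t - z_t) ** 2)
```

If `f` and `g` are exact inverses the penalty vanishes; any asymmetry between the two directions is what the loss punishes.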
Affiliation(s)
- Piyush Tiwary
- Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore, Karnataka 560012, India.
- Kinjawl Bhattacharyya
- Department of Electrical Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India.
- Prathosh A P
- Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore, Karnataka 560012, India.
3
Zhao S, Wang J, Wang X, Wang Y, Zheng H, Chen B, Zeng A, Wei F, Al-Kindi S, Li S. Attractive deep morphology-aware active contour network for vertebral body contour extraction with extensions to heterogeneous and semi-supervised scenarios. Med Image Anal 2023; 89:102906. [PMID: 37499333] [DOI: 10.1016/j.media.2023.102906]
Abstract
Automatic vertebral body contour extraction (AVBCE) from heterogeneous spinal MRI is indispensable for the comprehensive diagnosis and treatment of spinal diseases. However, AVBCE is challenging due to data heterogeneity, complex image characteristics, and variations in vertebral body morphology, which may cause morphology errors in semantic segmentation. Deep active contour-based (deep ACM-based) methods provide a promising complement for tackling morphology errors by directly parameterizing the contour coordinates. Extending the target contours' capture range and providing morphology-aware parameter maps are crucial for deep ACM-based methods. For this purpose, we propose a novel Attractive Deep Morphology-aware actIve contouR nEtwork (ADMIRE) that embeds an elaborated contour attraction term (CAT) and a comprehensive contour quality (CCQ) loss into the deep ACM-based framework. The CAT adaptively extends the target contours' capture range by designing an all-to-all force field that enables the target contours' energy to contribute to farther locations. Furthermore, the CCQ loss is carefully designed to generate morphology-aware active contour parameters by simultaneously supervising the contour shape, tension, and smoothness. These designs, in cooperation with the deep ACM-based framework, enable robustness to data heterogeneity, complex image characteristics, and variations in target contour morphology. Furthermore, the deep ACM-based ADMIRE cooperates well with semi-supervised strategies such as mean teacher, which enables its use in semi-supervised scenarios. ADMIRE is trained and evaluated on four challenging datasets, including three spinal datasets with more than 1,000 heterogeneous images and more than 10,000 vertebral bodies, as well as a cardiac dataset with both normal and pathological cases. Results show that ADMIRE achieves state-of-the-art performance on all datasets, demonstrating its accuracy, robustness, and generalization ability.
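The CCQ loss is described as supervising contour shape, tension, and smoothness. The classic discrete snake internal-energy terms below show what "tension" (first differences) and "smoothness" (second differences) mean for a closed contour; this is textbook active-contour machinery offered for orientation, not ADMIRE's actual loss, and the `alpha`/`beta` weights are arbitrary assumptions.

```python
import numpy as np

def contour_internal_energy(pts, alpha=1.0, beta=1.0):
    """Tension + smoothness energy of a closed 2-D contour.

    pts: (N, 2) array of contour points, ordered, with wrap-around.
    Tension penalizes stretched segments (first differences);
    smoothness penalizes sharp bends (second differences).
    """
    d1 = np.roll(pts, -1, axis=0) - pts                              # first difference
    d2 = np.roll(pts, -1, axis=0) - 2 * pts + np.roll(pts, 1, axis=0)  # second difference
    tension = (d1 ** 2).sum(axis=1).mean()
    smoothness = (d2 ** 2).sum(axis=1).mean()
    return alpha * tension + beta * smoothness
```

A smooth circle scores low on both terms, while a jagged contour of the same overall size scores high, which is the behavior a morphology-aware supervision signal exploits.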
Affiliation(s)
- Shen Zhao
- Department of Artificial Intelligence, Sun Yat-sen University, Guangzhou 510006, China.
- Jinhong Wang
- Department of Artificial Intelligence, Sun Yat-sen University, Guangzhou 510006, China.
- Xinxin Wang
- Department of Artificial Intelligence, Sun Yat-sen University, Guangzhou 510006, China.
- Yikang Wang
- Department of Artificial Intelligence, Sun Yat-sen University, Guangzhou 510006, China.
- Hanying Zheng
- Department of Artificial Intelligence, Sun Yat-sen University, Guangzhou 510006, China.
- Bin Chen
- Affiliated Hangzhou First People's Hospital, Zhejiang University School of Medicine, Zhejiang, China.
- An Zeng
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, China.
- Fuxin Wei
- Department of Orthopedics, the Seventh Affiliated Hospital of Sun Yat-sen University, Shenzhen, China.
- Sadeer Al-Kindi
- School of Medicine, Case Western Reserve University, Cleveland, USA.
- Shuo Li
- School of Medicine, Case Western Reserve University, Cleveland, USA.
4
Kudo S, Chen Z, Zhou X, Izu LT, Chen-Izu Y, Zhu X, Tamura T, Kanaya S, Huang M. A training pipeline of an arrhythmia classifier for atrial fibrillation detection using photoplethysmography signal. Front Physiol 2023; 14:1084837. [PMID: 36744032] [PMCID: PMC9892629] [DOI: 10.3389/fphys.2023.1084837]
Abstract
The photoplethysmography (PPG) signal is potentially suitable for atrial fibrillation (AF) detection given its convenience of use and its similarity in physiological origin to the electrocardiogram (ECG). A few preceding studies have shown the possibility of using the peak-to-peak interval of the PPG signal (PPIp) for AF detection. However, for a generalized model, accuracy must be pursued on the one hand; on the other, generalizability deserves attention in view of individual differences in the PPG manifestation of even the same arrhythmia and the existence of sub-types. Moreover, a binary classifier for atrial fibrillation versus normal sinus rhythm is not convincing enough, given the similarity between AF and ectopic beats. In this study, we formulate atrial fibrillation detection as a multi-class classification and propose a training pipeline that benefits both the accuracy and the generalizability of the classifier by designing and determining the configurable options of the pipeline in terms of input format, deep learning model (with hyperparameter optimization), and transfer learning scheme. Through a rigorous comparison of the possible combinations of the configurable components in the pipeline, we confirmed that the first-order difference of the heartbeat sequence as the input format, a 2-layer CNN-1-layer Transformer hybrid model as the learning model, and whole-model fine-tuning as the transfer learning scheme is the best combination for the pipeline (F1 value: 0.80, overall accuracy: 0.87).
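The winning input format — the first-order difference of the peak-to-peak interval sequence — is a two-step transform that is easy to make concrete. The helper below sketches it from raw peak timestamps; the function name and the timestamp representation are assumptions for illustration, not the study's code.

```python
import numpy as np

def ppi_first_difference(peak_times):
    """Turn PPG peak timestamps into the first-order difference of the
    peak-to-peak interval (PPIp) sequence, the input format the study
    found most effective (signal-preparation sketch)."""
    ppi = np.diff(np.asarray(peak_times, dtype=float))  # peak-to-peak intervals
    return np.diff(ppi)                                  # first-order difference
```

Differencing removes the subject-specific baseline heart rate, leaving only beat-to-beat variability, which plausibly helps the cross-subject generalizability the abstract emphasizes.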
Affiliation(s)
- Sota Kudo
- Computational Systems Biology Lab, Graduate School of Science and Technology, Nara Institute of Science and Technology, Ikoma, Japan.
- Xue Zhou
- Computational Systems Biology Lab, Graduate School of Science and Technology, Nara Institute of Science and Technology, Ikoma, Japan.
- Leighton T. Izu
- Department of Pharmacology, University of California, Davis, Davis, CA, United States.
- Ye Chen-Izu
- Department of Biomedical Engineering, University of California, Davis, Davis, CA, United States.
- Xin Zhu
- Biomedical Information Engineering Lab, The University of Aizu, Aizu-Wakamatsu, Japan.
- Toshiyo Tamura
- Future Robotics Organization, Waseda University, Tokyo, Japan.
- Shigehiko Kanaya
- Computational Systems Biology Lab, Graduate School of Science and Technology, Nara Institute of Science and Technology, Ikoma, Japan.
- Ming Huang
- Computational Systems Biology Lab, Graduate School of Science and Technology, Nara Institute of Science and Technology, Ikoma, Japan. *Correspondence: Ming Huang.
5
Liu H, Zhuang Y, Song E, Xu X, Hung CC. A bidirectional multilayer contrastive adaptation network with anatomical structure preservation for unpaired cross-modality medical image segmentation. Comput Biol Med 2022; 149:105964. [PMID: 36007288] [DOI: 10.1016/j.compbiomed.2022.105964]
Abstract
Multi-modal medical image segmentation has achieved great success through supervised deep learning networks. However, because of domain shift and limited annotation information, unpaired cross-modality segmentation tasks remain challenging. Unsupervised domain adaptation (UDA) methods can alleviate the performance degradation of cross-modality segmentation through knowledge transfer between different domains, but current methods still suffer from model collapse, adversarial training instability, and mismatch of anatomical structures. To tackle these issues, we propose a bidirectional multilayer contrastive adaptation network (BMCAN) for unpaired cross-modality segmentation. A shared encoder is first adopted for learning modality-invariant encoding representations in image synthesis and segmentation simultaneously. Secondly, to retain anatomical structure consistency in cross-modality image synthesis, we present a structure-constrained cross-modality image translation approach for image alignment. Thirdly, we construct a bidirectional multilayer contrastive learning approach to preserve anatomical structures and enhance encoding representations, which utilizes two groups of domain-specific multilayer perceptron (MLP) networks to learn modality-specific features. Finally, a semantic information adversarial learning approach is designed to learn structural similarities of semantic outputs for output-space alignment. Our proposed method was tested on three different cross-modality segmentation tasks: brain tissue, brain tumor, and cardiac substructure segmentation. Compared with other UDA methods, experimental results show that our proposed BMCAN achieves state-of-the-art segmentation performance on the above three tasks, with fewer training components and better feature representations for overcoming overfitting and domain shift. Our proposed method can efficiently reduce the annotation burden of radiologists in cross-modality image analysis.
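The "multilayer contrastive learning" component pulls matching feature pairs together while pushing mismatched pairs apart. A generic InfoNCE-style loss, sketched below, conveys that mechanism; BMCAN's exact objective, projection heads, and temperature are not given in this abstract, so everything here beyond the general contrastive idea is an assumption.

```python
import numpy as np

def info_nce(anchor, positive, temperature=0.1):
    """InfoNCE-style contrastive loss between paired feature batches.

    Row i of `positive` is treated as the match for row i of `anchor`;
    every other row serves as a negative. Lower loss means matched
    pairs are more similar than mismatched ones (generic sketch, not
    BMCAN's objective).
    """
    a = anchor / np.linalg.norm(anchor, axis=1, keepdims=True)
    p = positive / np.linalg.norm(positive, axis=1, keepdims=True)
    logits = a @ p.T / temperature              # cosine similarities, scaled
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))          # cross-entropy toward the diagonal
```

Applying such a loss at several encoder layers (the "multilayer" part) would constrain intermediate representations as well as the final embedding.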
Affiliation(s)
- Hong Liu
- Center for Biomedical Imaging and Bioinformatics, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China.
- Yuzhou Zhuang
- Institute of Artificial Intelligence, Huazhong University of Science and Technology, Wuhan, 430074, China.
- Enmin Song
- Center for Biomedical Imaging and Bioinformatics, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China.
- Xiangyang Xu
- Center for Biomedical Imaging and Bioinformatics, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China.
- Chih-Cheng Hung
- Center for Machine Vision and Security Research, Kennesaw State University, Marietta, GA 30060, USA.