1
de Boisredon d'Assier MA, Portafaix A, Vorontsov E, Le WT, Kadoury S. Image-level supervision and self-training for transformer-based cross-modality tumor segmentation. Med Image Anal 2024;97:103287. PMID: 39111265. DOI: 10.1016/j.media.2024.103287.
Abstract
Deep neural networks are commonly used for automated medical image segmentation, but models frequently struggle to generalize across imaging modalities. This issue is particularly problematic because annotated data are limited in both the target and the source modality, making it difficult to deploy these models at scale. To overcome these challenges, we propose a new semi-supervised training strategy called MoDATTS. Our approach is designed for accurate cross-modality 3D tumor segmentation on unpaired bi-modal datasets. An image-to-image translation strategy between modalities is used to produce synthetic yet annotated images and labels in the desired modality and improve generalization to the unannotated target modality. We also use powerful vision transformer architectures for both the image translation (TransUNet) and segmentation (Medformer) tasks, and introduce an iterative self-training procedure in the latter task to further close the domain gap between modalities, thus also training on unlabeled images in the target modality. MoDATTS additionally makes it possible to exploit image-level labels with a semi-supervised objective that encourages the model to disentangle tumors from the background. This semi-supervised methodology helps in particular to maintain downstream segmentation performance when pixel-level labels are also scarce in the source modality dataset, or when the source dataset contains healthy controls. The proposed model achieves superior performance compared to methods from participating teams in the CrossMoDA 2022 vestibular schwannoma (VS) segmentation challenge, as evidenced by its reported top Dice score of 0.87±0.04 for VS segmentation. MoDATTS also yields consistent improvements in Dice scores over baselines on a cross-modality adult brain glioma segmentation task composed of four different contrasts from the BraTS 2020 challenge dataset, where 95% of a target supervised model's performance is reached when no target modality annotations are available. We report that 99% and 100% of this maximum performance can be attained if 20% and 50% of the target data is additionally annotated, which further demonstrates that MoDATTS can be leveraged to reduce the annotation burden.
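A minimal sketch of the confidence-thresholded pseudo-labeling at the heart of such an iterative self-training procedure is shown below, in PyTorch. The threshold value and the pixel-masking policy are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def self_training_step(model, target_images, threshold=0.9):
    # Generate pseudo-labels on unlabeled target-modality images,
    # keeping only pixels where the model is confident.
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(target_images), dim=1)  # (B, C, H, W)
        conf, pseudo = probs.max(dim=1)                     # per-pixel confidence and labels
        mask = conf > threshold                             # assumed confidence cutoff

    # Retrain on the confident pseudo-labels only.
    model.train()
    logits = model(target_images)
    loss = F.cross_entropy(logits, pseudo, reduction="none")
    return (loss * mask).sum() / mask.sum().clamp(min=1)
```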
Affiliation(s)
- Aloys Portafaix
- Polytechnique Montreal, Montreal, QC, Canada; Centre de Recherche du Centre Hospitalier de l'Université de Montréal, Montreal, QC, Canada
- William Trung Le
- Polytechnique Montreal, Montreal, QC, Canada; Centre de Recherche du Centre Hospitalier de l'Université de Montréal, Montreal, QC, Canada
- Samuel Kadoury
- Polytechnique Montreal, Montreal, QC, Canada; Centre de Recherche du Centre Hospitalier de l'Université de Montréal, Montreal, QC, Canada.
2
Zheng B, Zhang R, Diao S, Zhu J, Yuan Y, Cai J, Shao L, Li S, Qin W. Dual domain distribution disruption with semantics preservation: Unsupervised domain adaptation for medical image segmentation. Med Image Anal 2024;97:103275. PMID: 39032395. DOI: 10.1016/j.media.2024.103275.
Abstract
Recent unsupervised domain adaptation (UDA) methods in medical image segmentation commonly utilize Generative Adversarial Networks (GANs) for domain translation. However, the translated images often exhibit a distribution deviation from the ideal due to the inherent instability of GANs, leading to challenges such as visual inconsistency and incorrect style, and consequently causing the segmentation model to settle into a fixed erroneous pattern. To address this problem, we propose a novel UDA framework known as Dual Domain Distribution Disruption with Semantics Preservation (DDSP). Departing from the idea of generating images conforming to the target domain distribution in GAN-based UDA methods, we make the model domain-agnostic and focus on anatomical structural information by leveraging semantic information as constraints to guide the model to adapt to images with disrupted distributions in both source and target domains. Furthermore, we introduce an inter-channel similarity feature alignment based on domain-invariant structural prior information, which facilitates the shared pixel-wise classifier in achieving robust performance on target domain features by aligning the source and target domain features across channels. Our method significantly outperforms existing state-of-the-art UDA methods on three public datasets (i.e., the heart dataset, the brain dataset, and the prostate dataset). The code is available at https://github.com/MIXAILAB/DDSPSeg.
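One plausible reading of the inter-channel similarity alignment is to compare channel-by-channel similarity matrices of the source and target features, which depend only on the channel count and are therefore robust to spatial differences between domains. The cosine similarity and L1 distance below are our choices for illustration, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def inter_channel_similarity(feat):
    # Cosine similarity between every pair of channels of a (B, C, H, W) map;
    # the result is (B, C, C) and independent of spatial size.
    b, c, h, w = feat.shape
    flat = F.normalize(feat.view(b, c, h * w), dim=2)
    return torch.bmm(flat, flat.transpose(1, 2))

def channel_alignment_loss(feat_src, feat_tgt):
    # Align the channel-relationship structure of source and target features.
    return F.l1_loss(inter_channel_similarity(feat_src),
                     inter_channel_similarity(feat_tgt))
```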
Affiliation(s)
- Boyun Zheng
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
- Ranran Zhang
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Songhui Diao
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
- Jingke Zhu
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
- Yixuan Yuan
- Department of Electronic Engineering, The Chinese University of Hong Kong, 999077, Hong Kong, China
- Jing Cai
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, 999077, Hong Kong, China
- Liang Shao
- Department of Cardiology, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang 330013, China
- Shuo Li
- Department of Biomedical Engineering, Department of Computer and Data Science, Case Western Reserve University, Cleveland, United States.
- Wenjian Qin
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China.
3
Chen L, Bian Y, Zeng J, Meng Q, Zhu W, Shi F, Shao C, Chen X, Xiang D. Style Consistency Unsupervised Domain Adaptation Medical Image Segmentation. IEEE Trans Image Process 2024;33:4882-4895. PMID: 39236126. DOI: 10.1109/tip.2024.3451934.
Abstract
Unsupervised domain adaptation medical image segmentation aims to segment unlabeled target domain images with labeled source domain images. However, different medical imaging modalities lead to a large domain shift between their images, such that models well trained on one imaging modality often fail to segment images from another. In this paper, to mitigate the domain shift between source and target domains, a style consistency unsupervised domain adaptation image segmentation method is proposed. First, a local phase-enhanced style fusion method is designed to mitigate domain shift and produce locally enhanced organs of interest. Second, a phase consistency discriminator is constructed to distinguish the phase consistency of domain-invariant features between source and target domains, so as to enhance the disentanglement of the domain-invariant and style encoders and the removal of domain-specific features from the domain-invariant encoder. Third, a style consistency estimation method is proposed to obtain inconsistency maps from intermediate synthesized target domain images with different styles to identify difficult regions, mitigate the domain shift between synthesized and real target domain images, and improve the integrity of organs of interest. Fourth, style consistency entropy is defined for target domain images to further improve the integrity of organs of interest by concentrating on the inconsistent regions. Comprehensive experiments have been performed with an in-house dataset and a publicly available dataset. The experimental results demonstrate the superiority of our framework over state-of-the-art methods.
4
Yang M, Wu Z, Zheng H, Huang L, Ding W, Pan L, Yin L. Cross-Modality Medical Image Segmentation via Enhanced Feature Alignment and Cross Pseudo Supervision Learning. Diagnostics (Basel) 2024;14:1751. PMID: 39202240. PMCID: PMC11353479. DOI: 10.3390/diagnostics14161751.
Abstract
Given the diversity of medical images, traditional image segmentation models face the issue of domain shift. Unsupervised domain adaptation (UDA) methods have emerged as a pivotal strategy for cross-modality analysis. These methods typically utilize generative adversarial networks (GANs) for both image-level and feature-level domain adaptation through the transformation and reconstruction of images, assuming the features between domains are well aligned. However, this assumption falters with significant gaps between different medical image modalities, such as MRI and CT. These gaps hinder the effective training of segmentation networks with cross-modality images and can lead to misleading training guidance and instability. To address these challenges, this paper introduces a novel approach comprising a cross-modality feature alignment sub-network and a cross-pseudo-supervised dual-stream segmentation sub-network. These components work together to bridge domain discrepancies more effectively and ensure a stable training environment. The feature alignment sub-network performs bidirectional alignment of features between the source and target domains, incorporating a self-attention module to aid in learning structurally consistent and relevant information. The segmentation sub-network leverages an enhanced cross-pseudo-supervised loss to harmonize the outputs of the two segmentation networks, assessing pseudo-distances between domains to improve pseudo-label quality and thus the overall learning efficiency of the framework. The method's effectiveness is demonstrated by notable gains in segmentation precision across target domains on abdominal and brain tasks.
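The cross-pseudo-supervised coupling of the dual-stream networks can be illustrated with the basic cross pseudo supervision loss, where each network is trained on the other's hard pseudo-labels; the paper's enhanced version further weights this by pseudo-distances between domains, which this sketch omits.

```python
import torch
import torch.nn.functional as F

def cross_pseudo_supervision(logits_a, logits_b):
    # Hard pseudo-labels from each stream, detached so gradients
    # only flow through the stream being supervised.
    pseudo_a = logits_a.argmax(dim=1).detach()
    pseudo_b = logits_b.argmax(dim=1).detach()
    loss_a = F.cross_entropy(logits_a, pseudo_b)  # stream A learns from B
    loss_b = F.cross_entropy(logits_b, pseudo_a)  # stream B learns from A
    return loss_a + loss_b
```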
Affiliation(s)
- Mingjing Yang
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China; (M.Y.); (Z.W.); (H.Z.); (L.H.)
- Zhicheng Wu
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China; (M.Y.); (Z.W.); (H.Z.); (L.H.)
- Hanyu Zheng
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China; (M.Y.); (Z.W.); (H.Z.); (L.H.)
- Liqin Huang
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China; (M.Y.); (Z.W.); (H.Z.); (L.H.)
- Wangbin Ding
- School of Medical Imaging, Fujian Medical University, Fuzhou 350122, China
- Lin Pan
- College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China; (M.Y.); (Z.W.); (H.Z.); (L.H.)
- Lei Yin
- The Departments of Radiology, Shengli Clinical Medical College of Fujian Medical University, Fuzhou 350001, China
- Fujian Provincial Hospital, Fuzhou 350001, China
- Fuzhou University Affiliated Provincial Hospital, Fuzhou 350001, China
5
Cui H, Li Y, Wang Y, Xu D, Wu LM, Xia Y. Toward Accurate Cardiac MRI Segmentation With Variational Autoencoder-Based Unsupervised Domain Adaptation. IEEE Trans Med Imaging 2024;43:2924-2936. PMID: 38546999. DOI: 10.1109/tmi.2024.3382624.
Abstract
Accurate myocardial segmentation is crucial in the diagnosis and treatment of myocardial infarction (MI), especially in Late Gadolinium Enhancement (LGE) cardiac magnetic resonance (CMR) images, where the infarcted myocardium exhibits greater brightness. However, segmentation annotations for LGE images are usually not available. Although knowledge gained from CMR images of other modalities with ample annotations, such as balanced-Steady State Free Precession (bSSFP), can be transferred to LGE images, the difference in image distribution between the two modalities (i.e., domain shift) usually results in a significant degradation in model performance. To alleviate this, an end-to-end Variational autoencoder based feature Alignment Module Combining Explicit and Implicit features (VAMCEI) is proposed. We first re-derive the Kullback-Leibler (KL) divergence between the posterior distributions of the two domains as a measure of the global distribution distance. Second, we calculate a prototype contrastive loss between the two domains, bringing the prototypes of the same category closer across domains and pushing apart the prototypes of different categories within or across domains. Finally, a domain discriminator is added to the output space, which indirectly aligns the feature distribution and forces the extracted features to be more favorable for segmentation. In addition, by combining CycleGAN and VAMCEI, we propose a more refined multi-stage unsupervised domain adaptation (UDA) framework for myocardial structure segmentation. We conduct extensive experiments on the MSCMRSeg 2019, MyoPS 2020 and MM-WHS 2017 datasets. The experimental results demonstrate that our framework achieves superior performance compared with state-of-the-art methods.
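The global distribution distance above rests on the closed-form KL divergence between the diagonal Gaussian posteriors that a VAE parameterizes with means and log-variances. A sketch of that standard closed form follows; how VAMCEI aggregates it over the two domains is not reproduced here.

```python
import torch

def kl_diag_gaussians(mu_p, logvar_p, mu_q, logvar_q):
    # KL(p || q) for diagonal Gaussians, summed over latent dimensions
    # and averaged over the batch; inputs are (B, D) tensors.
    var_p, var_q = logvar_p.exp(), logvar_q.exp()
    kl = 0.5 * (logvar_q - logvar_p + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)
    return kl.sum(dim=-1).mean()
```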
6
Guo Z, Feng J, Lu W, Yin Y, Yang G, Zhou J. Cross-modality cerebrovascular segmentation based on pseudo-label generation via paired data. Comput Med Imaging Graph 2024;115:102393. PMID: 38704993. DOI: 10.1016/j.compmedimag.2024.102393.
Abstract
Accurate segmentation of cerebrovascular structures from Computed Tomography Angiography (CTA), Magnetic Resonance Angiography (MRA), and Digital Subtraction Angiography (DSA) is crucial for the clinical diagnosis of cranial vascular diseases. Recent advances in deep convolutional neural networks (CNNs) have significantly improved the segmentation process. However, training segmentation networks for all modalities requires extensive data labeling for each modality, which is often expensive and time-consuming. To circumvent this limitation, we introduce an approach for training a cross-modality cerebrovascular segmentation network based on paired data from source and target domains. Our approach involves training a universal vessel segmentation network with manually labeled source domain data, which automatically produces initial labels for target domain training images. We improve the initial labels of target domain training images by fusing paired images, which are then used to refine the target domain segmentation network. A series of experimental arrangements is presented to assess the efficacy of our method in various practical application scenarios. The experiments conducted on an MRA-CTA dataset and a DSA-CTA dataset demonstrate that the proposed method is effective for cross-modality cerebrovascular segmentation and achieves state-of-the-art performance.
Affiliation(s)
- Zhanqiang Guo
- Department of Automation, BNRist, Tsinghua University, Beijing, China
- Jianjiang Feng
- Department of Automation, BNRist, Tsinghua University, Beijing, China.
- Wangsheng Lu
- UnionStrong (Beijing) Technology Co.Ltd, Beijing, China
- Yin Yin
- UnionStrong (Beijing) Technology Co.Ltd, Beijing, China
- Jie Zhou
- Department of Automation, BNRist, Tsinghua University, Beijing, China
7
Pei C, Wu F, Yang M, Pan L, Ding W, Dong J, Huang L, Zhuang X. Multi-Source Domain Adaptation for Medical Image Segmentation. IEEE Trans Med Imaging 2024;43:1640-1651. PMID: 38133966. DOI: 10.1109/tmi.2023.3346285.
Abstract
Unsupervised domain adaptation (UDA) aims to mitigate the performance drop of models tested on the target domain due to the domain shift between the source and target domains. Most UDA segmentation methods focus on the scenario of a single source domain. However, in practical situations, gold-standard data may be available from multiple sources (domains), and such multi-source training data could provide more information for knowledge transfer. How to utilize them to achieve better domain adaptation remains underexplored. This work investigates multi-source UDA and proposes a new framework for medical image segmentation. Firstly, we employ a multi-level adversarial learning scheme to adapt features at different levels between each of the source domains and the target, to improve the segmentation performance. Then, we propose a multi-model consistency loss to transfer the learned multi-source knowledge to the target domain simultaneously. Finally, we validated the proposed framework on two applications, i.e., multi-modality cardiac segmentation and cross-modality liver segmentation. The results showed our method delivered promising performance and compared favorably to state-of-the-art approaches.
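A simple instantiation of the multi-model consistency idea is to penalize pairwise disagreement between the target-domain predictions of the models adapted from each source; the pairwise MSE below is our assumption rather than the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def multi_model_consistency(logits_list):
    # logits_list: one (B, C, H, W) prediction per source-specific model.
    probs = [torch.softmax(l, dim=1) for l in logits_list]
    loss, pairs = 0.0, 0
    for i in range(len(probs)):
        for j in range(i + 1, len(probs)):
            loss = loss + F.mse_loss(probs[i], probs[j])
            pairs += 1
    return loss / max(pairs, 1)
```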
8
Kumari S, Singh P. Deep learning for unsupervised domain adaptation in medical imaging: Recent advancements and future perspectives. Comput Biol Med 2024;170:107912. PMID: 38219643. DOI: 10.1016/j.compbiomed.2023.107912.
Abstract
Deep learning has demonstrated remarkable performance across various tasks in medical imaging. However, these approaches primarily focus on supervised learning, assuming that the training and testing data are drawn from the same distribution. Unfortunately, this assumption may not always hold true in practice. To address these issues, unsupervised domain adaptation (UDA) techniques have been developed to transfer knowledge from a labeled domain to a related but unlabeled domain. In recent years, significant advancements have been made in UDA, resulting in a wide range of methodologies, including feature alignment, image translation, self-supervision, and disentangled representation methods, among others. In this paper, we provide a comprehensive literature review of recent deep UDA approaches in medical imaging from a technical perspective. Specifically, we categorize current UDA research in medical imaging into six groups and further divide them into finer subcategories based on the different tasks they perform. We also discuss the respective datasets used in the studies to assess the divergence between the different domains. Finally, we discuss emerging areas and provide insights and discussions on future research directions to conclude this survey.
Affiliation(s)
- Suruchi Kumari
- Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India.
- Pravendra Singh
- Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India.
9
Zhang Y, Wang Y, Xu L, Yao Y, Qian W, Qi L. ST-GAN: A Swin Transformer-Based Generative Adversarial Network for Unsupervised Domain Adaptation of Cross-Modality Cardiac Segmentation. IEEE J Biomed Health Inform 2024;28:893-904. PMID: 38019618. DOI: 10.1109/jbhi.2023.3336965.
Abstract
Unsupervised domain adaptation (UDA) methods have shown great potential in cross-modality medical image segmentation tasks, where target domain labels are unavailable. However, the domain shift among different image modalities remains challenging, because conventional UDA methods are based on convolutional neural networks (CNNs), which tend to focus on the texture of images and cannot establish the global semantic relevance of features due to the locality of CNNs. This paper proposes a novel end-to-end Swin Transformer-based generative adversarial network (ST-GAN) for cross-modality cardiac segmentation. In the generator of ST-GAN, we utilize the local receptive fields of CNNs to capture spatial information and introduce the Swin Transformer to extract global semantic information, which enables the generator to better extract domain-invariant features in UDA tasks. In addition, we design a multi-scale feature fuser to sufficiently fuse the features acquired at different stages and improve the robustness of the UDA network. We extensively evaluated our method on two cross-modality cardiac segmentation tasks using the MS-CMR 2019 dataset and the M&Ms dataset. The results on both tasks demonstrate the effectiveness of ST-GAN compared with state-of-the-art cross-modality cardiac image segmentation methods.
10
Wu J, Wang G, Gu R, Lu T, Chen Y, Zhu W, Vercauteren T, Ourselin S, Zhang S. UPL-SFDA: Uncertainty-Aware Pseudo Label Guided Source-Free Domain Adaptation for Medical Image Segmentation. IEEE Trans Med Imaging 2023;42:3932-3943. PMID: 37738202. DOI: 10.1109/tmi.2023.3318364.
Abstract
Domain Adaptation (DA) is important for deep learning-based medical image segmentation models to deal with testing images from a new target domain. As the source-domain data are usually unavailable when a trained model is deployed at a new center, Source-Free Domain Adaptation (SFDA) is appealing for data- and annotation-efficient adaptation to the target domain. However, existing SFDA methods have limited performance due to the lack of sufficient supervision, with source-domain images unavailable and target-domain images unlabeled. We propose a novel Uncertainty-aware Pseudo Label guided (UPL) SFDA method for medical image segmentation. Specifically, we propose Target Domain Growing (TDG) to enhance the diversity of predictions in the target domain by duplicating the pre-trained model's prediction head multiple times with perturbations. The different predictions in these duplicated heads are used to obtain pseudo labels for unlabeled target-domain images and their uncertainty, which identifies reliable pseudo labels. We also propose a Twice Forward pass Supervision (TFS) strategy that uses reliable pseudo labels obtained in one forward pass to supervise predictions in the next forward pass. The adaptation is further regularized by a mean prediction-based entropy minimization term that encourages confident and consistent results in the different prediction heads. UPL-SFDA was validated with a multi-site heart MRI segmentation dataset, a cross-modality fetal brain segmentation dataset, and a 3D fetal tissue segmentation dataset. It improved the average Dice by 5.54, 5.01 and 6.89 percentage points for the three tasks, respectively, compared with the baseline, and outperformed several state-of-the-art SFDA methods.
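The pseudo-label generation described above amounts to fusing the softmax outputs of the duplicated heads and keeping only low-entropy pixels as reliable supervision. A minimal sketch, in which the entropy threshold is an assumed value:

```python
import torch

def uncertainty_aware_pseudo_labels(head_probs, ent_threshold=0.4):
    # head_probs: list of K softmax outputs, each (B, C, H, W),
    # from the K perturbed copies of the prediction head.
    mean_prob = torch.stack(head_probs, dim=0).mean(dim=0)
    pseudo = mean_prob.argmax(dim=1)  # fused hard pseudo-labels
    entropy = -(mean_prob * mean_prob.clamp_min(1e-8).log()).sum(dim=1)
    reliable = entropy < ent_threshold  # (B, H, W) reliability mask
    return pseudo, reliable
```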
11
Ding W, Li L, Qiu J, Wang S, Huang L, Chen Y, Yang S, Zhuang X. Aligning Multi-Sequence CMR Towards Fully Automated Myocardial Pathology Segmentation. IEEE Trans Med Imaging 2023;42:3474-3486. PMID: 37347625. DOI: 10.1109/tmi.2023.3288046.
Abstract
Myocardial pathology segmentation (MyoPS) is critical for the risk stratification and treatment planning of myocardial infarction (MI). Multi-sequence cardiac magnetic resonance (MS-CMR) images can provide valuable information. For instance, balanced steady-state free precession cine sequences present clear anatomical boundaries, while late gadolinium enhancement and T2-weighted CMR sequences visualize myocardial scar and edema of MI, respectively. Existing methods usually fuse anatomical and pathological information from different CMR sequences for MyoPS, but assume that these images have been spatially aligned. However, MS-CMR images are usually unaligned due to respiratory motion in clinical practice, which poses additional challenges for MyoPS. This work presents an automatic MyoPS framework for unaligned MS-CMR images. Specifically, we design a combined computing model for simultaneous image registration and information fusion, which aggregates multi-sequence features into a common space to extract anatomical structures (i.e., myocardium). Consequently, we can highlight the informative regions in the common space via the extracted myocardium to improve MyoPS performance, considering the spatial relationship between myocardial pathologies and myocardium. Experiments on a private MS-CMR dataset and a public dataset from the MYOPS2020 challenge show that our framework could achieve promising performance for fully automatic MyoPS.
12
Li D, Peng Y, Sun J, Guo Y. Unsupervised deep consistency learning adaptation network for cardiac cross-modality structural segmentation. Med Biol Eng Comput 2023;61:2713-2732. PMID: 37450212. DOI: 10.1007/s11517-023-02833-y.
Abstract
Deep neural networks have recently been successful in the field of medical image segmentation; however, they typically suffer performance degradation when well-trained models are tested in a new domain with a different data distribution. Given that annotated cross-domain images may be inaccessible, unsupervised domain adaptation methods that transfer learnable information from annotated source domains to unannotated target domains with different distributions have attracted substantial attention. Many methods leverage image-level or pixel-level translation networks to align domain-invariant information and mitigate domain shift issues. However, these methods rarely perform well when there is a large domain gap. A new unsupervised deep consistency learning adaptation network, which adopts input-space consistency learning and output-space consistency learning to realize unsupervised domain adaptation and cardiac structural segmentation, is introduced in this paper. The framework mainly includes a domain translation path and a cross-modality segmentation path. In the domain translation path, a symmetric alignment generator network with attention to cross-modality features and anatomy is introduced to align bidirectional domain features. In the segmentation path, entropy map minimization, output probability map minimization and segmentation prediction minimization are leveraged to align the output-space features. The model conducts supervised learning to extract source domain features and unsupervised deep consistency learning to extract target domain features. Through experimental testing on two challenging cross-modality segmentation tasks, our method shows robust performance compared with previous methods. Furthermore, ablation experiments confirm the effectiveness of our framework.
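The entropy map minimization used in the segmentation path has a compact standard form: minimize the mean per-pixel Shannon entropy of the softmax output so that target-domain predictions become confident. A generic sketch:

```python
import torch

def entropy_map_loss(logits):
    # Per-pixel Shannon entropy of the softmax output, averaged over
    # all pixels; minimizing it sharpens target-domain predictions.
    p = torch.softmax(logits, dim=1)
    entropy_map = -(p * p.clamp_min(1e-8).log()).sum(dim=1)  # (B, H, W)
    return entropy_map.mean()
```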
Affiliation(s)
- Dapeng Li
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, Shandong, China
- Yanjun Peng
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, Shandong, China.
- Shandong Province Key Laboratory of Wisdom Mining Information Technology, Shandong University of Science and Technology, Qingdao, 266590, Shandong, China.
- Jindong Sun
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, Shandong, China
- Yanfei Guo
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, Shandong, China
13
Gu R, Wang G, Lu J, Zhang J, Lei W, Chen Y, Liao W, Zhang S, Li K, Metaxas DN, Zhang S. CDDSA: Contrastive domain disentanglement and style augmentation for generalizable medical image segmentation. Med Image Anal 2023;89:102904. PMID: 37506556. DOI: 10.1016/j.media.2023.102904.
Abstract
Generalization to previously unseen images with potential domain shifts is essential for clinically applicable medical image segmentation. Disentangling domain-specific and domain-invariant features is key for Domain Generalization (DG). However, existing DG methods struggle to achieve effective disentanglement. To address this problem, we propose an efficient framework called Contrastive Domain Disentanglement and Style Augmentation (CDDSA) for generalizable medical image segmentation. First, a disentanglement network decomposes the image into a domain-invariant anatomical representation and a domain-specific style code, where the former is sent for further segmentation that is not affected by domain shift, and the disentanglement is regularized by a decoder that combines the anatomical representation and style code to reconstruct the original image. Second, to achieve better disentanglement, a contrastive loss is proposed to encourage the style codes from the same domain and different domains to be compact and divergent, respectively. Finally, to further improve generalizability, we propose a style augmentation strategy to synthesize images with various unseen styles in real time while maintaining anatomical information. Comprehensive experiments on a public multi-site fundus image dataset and an in-house multi-site Nasopharyngeal Carcinoma Magnetic Resonance Image (NPC-MRI) dataset show that the proposed CDDSA achieves remarkable generalizability across different domains and outperforms several state-of-the-art methods in generalizable segmentation. Code is available at https://github.com/HiLab-git/DAG4MIA.
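The style-code regularizer can be illustrated with a simple pairwise contrastive formulation that pulls same-domain style codes together and pushes different-domain codes at least a margin apart; this margin-based form is our assumption, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def style_contrastive_loss(style_codes, domain_ids, margin=1.0):
    # style_codes: (N, D) codes from the style encoder; domain_ids: (N,).
    # Assumes the batch mixes codes from at least two domains.
    dist = torch.cdist(style_codes, style_codes)              # pairwise L2
    same = domain_ids.unsqueeze(0) == domain_ids.unsqueeze(1)
    eye = torch.eye(len(domain_ids), dtype=torch.bool, device=same.device)
    compact = dist[same & ~eye].pow(2).mean()                 # same domain: close
    divergent = F.relu(margin - dist[~same]).pow(2).mean()    # cross domain: far
    return compact + divergent
```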
Affiliation(s)
- Ran Gu
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Guotai Wang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Shanghai AI Lab, Shanghai, China.
- Jiangshan Lu
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Jingyang Zhang
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; School of Biomedical Engineering, ShanghaiTech University, Shanghai, China
- Wenhui Lei
- School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai AI Lab, Shanghai, China
- Yinan Chen
- SenseTime Research, Shanghai, China; West China Hospital-SenseTime Joint Lab, West China Biomedical Big Data Center, Sichuan University, Chengdu, China
- Wenjun Liao
- Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, University of Electronic Science and Technology of China, Chengdu, China
- Shichuan Zhang
- Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, University of Electronic Science and Technology of China, Chengdu, China
- Kang Li
- West China Hospital-SenseTime Joint Lab, West China Biomedical Big Data Center, Sichuan University, Chengdu, China
- Dimitris N Metaxas
- Department of Computer Science, Rutgers University, Piscataway NJ 08854, USA
- Shaoting Zhang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; SenseTime Research, Shanghai, China; Shanghai AI Lab, Shanghai, China.
14
Wang R, Zhou Q, Zheng G. EDRL: Entropy-guided disentangled representation learning for unsupervised domain adaptation in semantic segmentation. Comput Methods Programs Biomed 2023;240:107729. PMID: 37531690. DOI: 10.1016/j.cmpb.2023.107729.
Abstract
BACKGROUND AND OBJECTIVE: Deep learning-based approaches are excellent at learning from large amounts of data, but can be poor at generalizing the learned knowledge to testing datasets with domain shift, i.e., when there exists a distribution discrepancy between the training dataset (source domain) and the testing dataset (target domain). In this paper, we investigate unsupervised domain adaptation (UDA) techniques to train a cross-domain segmentation method which is robust to domain shift, eliminating the requirement of any annotations on the target domain.
METHODS: To this end, we propose an Entropy-guided Disentangled Representation Learning, referred to as EDRL, for UDA in semantic segmentation. Concretely, we synergistically integrate image alignment via disentangled representation learning with feature alignment via entropy-based adversarial learning into one network, which is trained end-to-end. We additionally introduce a dynamic feature selection mechanism via soft gating, which helps to further enhance task-specific feature alignment. We validate the proposed method on two publicly available datasets: the CT-MR dataset and the multi-sequence cardiac MR (MS-CMR) dataset.
RESULTS: On both datasets, our method achieved better results than state-of-the-art (SOTA) methods. Specifically, on the CT-MR dataset, our method achieved an average DSC of 84.8% when taking CT as the source domain and MR as the target domain, and an average DSC of 84.0% when taking MR as the source domain and CT as the target domain.
CONCLUSIONS: Results from comprehensive experiments demonstrate the efficacy of the proposed EDRL model for cross-domain medical image segmentation.
Affiliation(s)
- Runze Wang
- Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Shanghai, 200240, China
- Qin Zhou
- Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Shanghai, 200240, China
- Guoyan Zheng
- Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Shanghai, 200240, China.
15
Liu H, Zhuang Y, Song E, Xu X, Ma G, Cetinkaya C, Hung CC. A modality-collaborative convolution and transformer hybrid network for unpaired multi-modal medical image segmentation with limited annotations. Med Phys 2023;50:5460-5478. PMID: 36864700. DOI: 10.1002/mp.16338.
Abstract
BACKGROUND: Multi-modal learning is widely adopted to learn the latent complementary information between different modalities in multi-modal medical image segmentation tasks. Nevertheless, traditional multi-modal learning methods require spatially well-aligned and paired multi-modal images for supervised training, and thus cannot leverage unpaired multi-modal images with spatial misalignment and modality discrepancy. For training accurate multi-modal segmentation networks using easily accessible and low-cost unpaired multi-modal images in clinical practice, unpaired multi-modal learning has received comprehensive attention recently.
PURPOSE: Existing unpaired multi-modal learning methods usually focus on the intensity distribution gap but ignore the scale variation problem between different modalities. Moreover, shared convolutional kernels are frequently employed in existing methods to capture common patterns in all modalities, but they are typically inefficient at learning global contextual information. On the other hand, existing methods rely heavily on a large number of labeled unpaired multi-modal scans for training, ignoring the practical scenario where labeled data are limited. To solve these problems, we propose a modality-collaborative convolution and transformer hybrid network (MCTHNet) using semi-supervised learning for unpaired multi-modal segmentation with limited annotations, which not only collaboratively learns modality-specific and modality-invariant representations but can also automatically leverage extensive unlabeled scans to improve performance.
METHODS: We make three main contributions in the proposed method. First, to alleviate the intensity distribution gap and scale variation problems across modalities, we develop a modality-specific scale-aware convolution (MSSC) module that can adaptively adjust the receptive field sizes and feature normalization parameters according to the input. Second, we propose a modality-invariant vision transformer (MIViT) module as the shared bottleneck layer for all modalities, which implicitly combines convolution-like local operations with the global processing of transformers for learning generalizable modality-invariant representations. Third, we design a multi-modal cross pseudo supervision (MCPS) method for semi-supervised learning, which enforces consistency between the pseudo segmentation maps generated by two perturbed networks to acquire abundant annotation information from unlabeled unpaired multi-modal scans.
RESULTS: Extensive experiments were performed on two unpaired CT and MR segmentation datasets, including a cardiac substructure dataset derived from the MMWHS-2017 dataset and an abdominal multi-organ dataset consisting of the BTCV and CHAOS datasets. Experimental results show that our proposed method significantly outperforms other existing state-of-the-art methods under various labeling ratios, and achieves segmentation performance close to that of single-modal methods with fully labeled data by leveraging only a small portion of labeled data. Specifically, at a labeling ratio of 25%, our method achieves overall mean DSC values of 78.56% and 76.18% in cardiac and abdominal segmentation, respectively, improving the average DSC value of the two tasks by 12.84% compared with single-modal U-Net models.
CONCLUSIONS: Our proposed method is beneficial for reducing the annotation burden of unpaired multi-modal medical images in clinical applications.
Affiliation(s)
- Hong Liu
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Yuzhou Zhuang
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Enmin Song
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Xiangyang Xu
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Guangzhi Ma
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Coskun Cetinkaya
- Center for Machine Vision and Security Research, Kennesaw State University, Kennesaw, Georgia, USA
- Chih-Cheng Hung
- Center for Machine Vision and Security Research, Kennesaw State University, Kennesaw, Georgia, USA
16
Li L, Ding W, Huang L, Zhuang X, Grau V. Multi-modality cardiac image computing: A survey. Med Image Anal 2023;88:102869. PMID: 37384950. DOI: 10.1016/j.media.2023.102869.
Abstract
Multi-modality cardiac imaging plays a key role in the management of patients with cardiovascular diseases. It allows a combination of complementary anatomical, morphological and functional information, increases diagnostic accuracy, and improves the efficacy of cardiovascular interventions and clinical outcomes. Fully-automated processing and quantitative analysis of multi-modality cardiac images could have a direct impact on clinical research and evidence-based patient management. However, these require overcoming significant challenges, including inter-modality misalignment and finding optimal methods to integrate information from different modalities. This paper aims to provide a comprehensive review of multi-modality imaging in cardiology, the computing methods, the validation strategies, the related clinical workflows and future perspectives. For the computing methodologies, we focus particularly on three tasks, i.e., registration, fusion and segmentation, which generally involve multi-modality imaging data, either combining information from different modalities or transferring information across modalities. The review highlights that multi-modality cardiac imaging data has the potential for wide applicability in the clinic, such as trans-aortic valve implantation guidance, myocardial viability assessment, and catheter ablation therapy and its patient selection. Nevertheless, many challenges remain unsolved, such as missing modality, modality selection, combination of imaging and non-imaging data, and uniform analysis and representation of different modalities. There is also work to do in defining how the well-developed techniques fit into clinical workflows and how much additional and relevant information they introduce. These problems are likely to remain an active field of research, and the questions they raise will need to be answered in the future.
Affiliation(s)
- Lei Li
- Department of Engineering Science, University of Oxford, Oxford, UK.
- Wangbin Ding
- College of Physics and Information Engineering, Fuzhou University, Fuzhou, China
- Liqin Huang
- College of Physics and Information Engineering, Fuzhou University, Fuzhou, China
- Xiahai Zhuang
- School of Data Science, Fudan University, Shanghai, China
- Vicente Grau
- Department of Engineering Science, University of Oxford, Oxford, UK
17
Wang Y, Chen Y, Zhang Y, Zhu H. Rethinking Disentanglement in Unsupervised Domain Adaptation for Medical Image Segmentation. Annu Int Conf IEEE Eng Med Biol Soc 2023;2023:1-6. PMID: 38082792. DOI: 10.1109/embc40787.2023.10341077.
Abstract
Domain adaptation has become an important topic because neural networks trained on the source domain generally perform poorly in the target domain due to domain shift, especially in medical image analysis. Previous DA methods mainly focus on disentangling domain features. However, such disentanglement assumes feature independence, which often cannot be guaranteed in reality. In this work, we present a new DA approach called Dimension-based Disentangled Dilated Domain Adaptation (D4A), which disentangles the storage locations of features to tackle the problem of domain shift in medical image segmentation tasks without annotations of the target domain. We use Adaptive Instance Normalization (AdaIN) to encourage the content information to be stored in the spatial dimension and the style information to be stored in the channel dimension. In addition, we apply dilated convolution to preserve anatomical information, avoiding the information loss caused by downsampling. We validate the proposed method on cross-modality medical image segmentation tasks on two public datasets, and the comparison experiments and ablation studies demonstrate the effectiveness of our method, which outperforms state-of-the-art methods.
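AdaIN, which D4A relies on to route content into the spatial dimension and style into the channel dimension, has a well-known definition: re-standardize the content feature per channel, then impose the style feature's channel-wise statistics. A standard sketch:

```python
import torch

def adain(content, style, eps=1e-5):
    # content, style: (B, C, H, W). Normalize the content per channel,
    # then apply the style's channel-wise mean and standard deviation.
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True) + eps
    return s_std * (content - c_mean) / c_std + s_mean
```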
18
van Tulder G, de Bruijne M. Unpaired, unsupervised domain adaptation assumes your domains are already similar. Med Image Anal 2023;87:102825. PMID: 37116296. DOI: 10.1016/j.media.2023.102825.
Abstract
Unsupervised domain adaptation is a popular method in medical image analysis, but it can be tricky to make it work: without labels to link the domains, domains must be matched using feature distributions. If there is no additional information, this often leaves a choice between multiple possibilities to map the data that may be equally likely but not equally correct. In this paper we explore the fundamental problems that may arise in unsupervised domain adaptation, and discuss conditions that might still make it work. Focusing on medical image analysis, we argue that images from different domains may have similar class balance, similar intensities, similar spatial structure, or similar textures. We demonstrate how these implicit conditions can affect domain adaptation performance in experiments with synthetic data, MNIST digits, and medical images. We observe that practical success of unsupervised domain adaptation relies on existing similarities in the data, and is anything but guaranteed in the general case. Understanding these implicit assumptions is a key step in identifying potential problems in domain adaptation and improving the reliability of the results.
Affiliation(s)
- Gijs van Tulder
- Data Science group, Faculty of Science, Radboud University, Postbus 9010, 6500 GL Nijmegen, The Netherlands; Biomedical Imaging Group, Erasmus MC, Postbus 2040, 3000 CA Rotterdam, The Netherlands.
- Marleen de Bruijne
- Biomedical Imaging Group, Erasmus MC, Postbus 2040, 3000 CA Rotterdam, The Netherlands; Department of Computer Science, University of Copenhagen, Universitetsparken 1, 2100 Copenhagen, Denmark.
19
Shen Z, Cao P, Yang J, Zaiane OR. WS-LungNet: A two-stage weakly-supervised lung cancer detection and diagnosis network. Comput Biol Med 2023;154:106587. PMID: 36709519. DOI: 10.1016/j.compbiomed.2023.106587.
Abstract
Computer-aided lung cancer diagnosis (CAD) systems on computed tomography (CT) help radiologists guide preoperative planning and prognosis assessment. The flexibility and scalability of deep learning methods are limited in lung CAD. In essence, two significant challenges must be solved: (1) label scarcity, due to the costly annotation of CT images by experienced domain experts, and (2) label inconsistency between the observed nodule malignancy and the patients' pathology evaluation. These two issues can be considered weak-label problems. We address them in this paper by introducing a weakly-supervised lung cancer detection and diagnosis network (WS-LungNet), consisting of a semi-supervised computer-aided detection (Semi-CADe) component that can segment 3D pulmonary nodules based on unlabeled data through adversarial learning to reduce label scarcity, as well as a cross-nodule attention computer-aided diagnosis (CNA-CADx) component for evaluating malignancy at the patient level by modeling correlations between nodules via cross-attention mechanisms, thereby eliminating label inconsistency. Through extensive evaluations on the LIDC-IDRI public database, we show that our proposed method achieves an 82.99% competition performance metric (CPM) on pulmonary nodule detection and an 88.63% area under the curve (AUC) on lung cancer diagnosis. Extensive experiments demonstrate the advantage of WS-LungNet on nodule detection and malignancy evaluation tasks. Our promising results demonstrate the benefits and flexibility of semi-supervised segmentation with adversarial learning and nodule instance correlation learning with the attention mechanism. The results also suggest that making use of unlabeled data and taking the relationships among nodules in a case into account are essential for lung cancer detection and diagnosis.
Affiliation(s)
- Zhiqiang Shen
- College of Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Shenyang, China
- Peng Cao
- College of Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Shenyang, China.
- Jinzhu Yang
- College of Computer Science and Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Shenyang, China
- Osmar R Zaiane
- Alberta Machine Intelligence Institute, University of Alberta, Canada
20
Qiu J, Li L, Wang S, Zhang K, Chen Y, Yang S, Zhuang X. MyoPS-Net: Myocardial pathology segmentation with flexible combination of multi-sequence CMR images. Med Image Anal 2023;84:102694. PMID: 36495601. DOI: 10.1016/j.media.2022.102694.
Abstract
Myocardial pathology segmentation (MyoPS) can be a prerequisite for the accurate diagnosis and treatment planning of myocardial infarction. However, achieving this segmentation is challenging, mainly due to the inadequate and indistinct information from an image. In this work, we develop an end-to-end deep neural network, referred to as MyoPS-Net, to flexibly combine five-sequence cardiac magnetic resonance (CMR) images for MyoPS. To extract precise and adequate information, we design an effective yet flexible architecture to extract and fuse cross-modal features. This architecture can tackle different numbers of CMR images and complex combinations of modalities, with output branches targeting specific pathologies. To impose anatomical knowledge on the segmentation results, we first propose a module to regularize myocardium consistency and localize the pathologies, and then introduce an inclusiveness loss to utilize relations between myocardial scars and edema. We evaluated the proposed MyoPS-Net on two datasets, i.e., a private one consisting of 50 paired multi-sequence CMR images and a public one from the MICCAI2020 MyoPS Challenge. Experimental results showed that MyoPS-Net could achieve state-of-the-art performance in various scenarios. Note that in clinical practice, subjects may not have full sequences, such as missing LGE CMR or mapping CMR scans. We therefore conducted extensive experiments to investigate the performance of the proposed method in dealing with such complex combinations of different CMR sequences. Results proved the superiority and generalizability of MyoPS-Net and, more importantly, indicated a practical clinical application. The code has been released via https://github.com/QJYBall/MyoPS-Net.
Affiliation(s)
- Junyi Qiu
- School of Data Science, Fudan University, Shanghai, China
- Lei Li
- Institute of Biomedical Engineering, University of Oxford, Oxford, UK
- Sihan Wang
- School of Data Science, Fudan University, Shanghai, China
- Ke Zhang
- School of Data Science, Fudan University, Shanghai, China
- Yinyin Chen
- Department of Radiology, Zhongshan Hospital, Fudan University, Shanghai, China; Department of Medical Imaging, Shanghai Medical School, Fudan University and Shanghai Institute of Medical Imaging, Shanghai, China
- Shan Yang
- Department of Radiology, Zhongshan Hospital, Fudan University, Shanghai, China; Department of Medical Imaging, Shanghai Medical School, Fudan University and Shanghai Institute of Medical Imaging, Shanghai, China
- Xiahai Zhuang
- School of Data Science, Fudan University, Shanghai, China.
21
Jafari M, Francis S, Garibaldi JM, Chen X. LMISA: A lightweight multi-modality image segmentation network via domain adaptation using gradient magnitude and shape constraint. Med Image Anal 2022;81:102536. PMID: 35870297. DOI: 10.1016/j.media.2022.102536.
Abstract
In medical image segmentation, supervised machine learning models trained using one image modality (e.g. computed tomography (CT)) are often prone to failure when applied to another image modality (e.g. magnetic resonance imaging (MRI)) even for the same organ. This is due to the significant intensity variations of different image modalities. In this paper, we propose a novel end-to-end deep neural network to achieve multi-modality image segmentation, where image labels of only one modality (source domain) are available for model training and the image labels for the other modality (target domain) are not available. In our method, a multi-resolution locally normalized gradient magnitude approach is firstly applied to images of both domains for minimizing the intensity discrepancy. Subsequently, a dual task encoder-decoder network including image segmentation and reconstruction is utilized to effectively adapt a segmentation network to the unlabeled target domain. Additionally, a shape constraint is imposed by leveraging adversarial learning. Finally, images from the target domain are segmented, as the network learns a consistent latent feature representation with shape awareness from both domains. We implement both 2D and 3D versions of our method, in which we evaluate CT and MRI images for kidney and cardiac tissue segmentation. For kidney, a public CT dataset (KiTS19, MICCAI 2019) and a local MRI dataset were utilized. The cardiac dataset was from the Multi-Modality Whole Heart Segmentation (MMWHS) challenge 2017. Experimental results reveal that our proposed method achieves significantly higher performance with a much lower model complexity in comparison with other state-of-the-art methods. More importantly, our method is also capable of producing superior segmentation results than other methods for images of an unseen target domain without model retraining. The code is available at GitHub (https://github.com/MinaJf/LMISA) to encourage method comparison and further research.
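The modality-reducing preprocessing described above, a locally normalized gradient magnitude, can be sketched at a single resolution as follows; LMISA applies it over a multi-resolution pyramid, and the window size here is an assumed value.

```python
import torch
import torch.nn.functional as F

def local_normalized_gradient(img, win=15, eps=1e-6):
    # img: (B, 1, H, W). Gradient magnitude divided by its local average,
    # which suppresses modality-specific intensity scaling.
    kx = torch.tensor([[[[-1., 0., 1.]]]], device=img.device, dtype=img.dtype)
    ky = kx.transpose(2, 3)                                 # vertical gradient kernel
    gx = F.conv2d(img, kx, padding=(0, 1))
    gy = F.conv2d(img, ky, padding=(1, 0))
    mag = torch.sqrt(gx ** 2 + gy ** 2 + eps)
    local_mean = F.avg_pool2d(mag, win, stride=1, padding=win // 2)
    return mag / (local_mean + eps)
```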
Affiliation(s)
- Mina Jafari: Intelligent Modeling and Analysis Group, School of Computer Science, University of Nottingham, UK
- Susan Francis: The Sir Peter Mansfield Imaging Centre, University of Nottingham, UK
- Jonathan M Garibaldi: Intelligent Modeling and Analysis Group, School of Computer Science, University of Nottingham, UK
- Xin Chen: Intelligent Modeling and Analysis Group, School of Computer Science, University of Nottingham, UK
|
22
|
Dual attention-guided and learnable spatial transformation data augmentation multi-modal unsupervised medical image segmentation. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103849] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
23
|
Liu H, Zhuang Y, Song E, Xu X, Hung CC. A bidirectional multilayer contrastive adaptation network with anatomical structure preservation for unpaired cross-modality medical image segmentation. Comput Biol Med 2022; 149:105964. [PMID: 36007288 DOI: 10.1016/j.compbiomed.2022.105964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2022] [Revised: 07/16/2022] [Accepted: 08/13/2022] [Indexed: 11/03/2022]
Abstract
Multi-modal medical image segmentation has achieved great success with supervised deep learning networks. However, because of domain shift and limited annotations, unpaired cross-modality segmentation remains challenging. Unsupervised domain adaptation (UDA) methods can alleviate performance degradation in cross-modality segmentation by transferring knowledge between domains, but current methods still suffer from model collapse, unstable adversarial training, and mismatched anatomical structures. To tackle these issues, we propose a bidirectional multilayer contrastive adaptation network (BMCAN) for unpaired cross-modality segmentation. A shared encoder is first adopted to learn modality-invariant representations for image synthesis and segmentation simultaneously. Secondly, to retain anatomical structure consistency in cross-modality image synthesis, we present a structure-constrained cross-modality translation approach for image alignment. Thirdly, we construct a bidirectional multilayer contrastive learning approach that preserves anatomical structures and enhances encoding representations, using two groups of domain-specific multilayer perceptron (MLP) networks to learn modality-specific features. Finally, a semantic adversarial learning approach is designed to learn structural similarities of semantic outputs for output-space alignment. The proposed method was tested on three cross-modality segmentation tasks: brain tissue, brain tumor, and cardiac substructure segmentation. Compared with other UDA methods, experimental results show that BMCAN achieves state-of-the-art performance on all three tasks, with fewer training components and better feature representations for overcoming overfitting and domain shift. Our method can efficiently reduce the annotation burden of radiologists in cross-modality image analysis.
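The multilayer contrastive component can be illustrated with a standard InfoNCE objective applied to per-layer features projected by domain-specific MLPs, as the abstract describes. The layer dimensions, MLP sizes, and the exact BMCAN objective are assumptions in this sketch.

```python
# Simplified multilayer contrastive alignment: per-layer pooled features from
# two domains are projected by separate MLPs and matched with InfoNCE.
import torch
import torch.nn as nn
import torch.nn.functional as F

def info_nce(z_a: torch.Tensor, z_b: torch.Tensor, tau: float = 0.07):
    """z_a, z_b: (N, D) projections of corresponding samples from two domains."""
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / tau          # (N, N) similarity matrix
    targets = torch.arange(z_a.size(0))   # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

class MultilayerContrast(nn.Module):
    def __init__(self, feat_dims=(64, 128, 256), proj_dim=128):
        super().__init__()
        # One projection MLP per encoder layer and per domain.
        self.proj_src = nn.ModuleList([
            nn.Sequential(nn.Linear(d, proj_dim), nn.ReLU(), nn.Linear(proj_dim, proj_dim))
            for d in feat_dims])
        self.proj_tgt = nn.ModuleList([
            nn.Sequential(nn.Linear(d, proj_dim), nn.ReLU(), nn.Linear(proj_dim, proj_dim))
            for d in feat_dims])

    def forward(self, src_feats, tgt_feats):
        # src_feats/tgt_feats: lists of (N, C_l) globally pooled features per layer.
        losses = [info_nce(ps(f_s), pt(f_t))
                  for ps, pt, f_s, f_t in zip(self.proj_src, self.proj_tgt,
                                              src_feats, tgt_feats)]
        return torch.stack(losses).mean()

mc = MultilayerContrast()
loss = mc([torch.rand(8, d) for d in (64, 128, 256)],
          [torch.rand(8, d) for d in (64, 128, 256)])
```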
Affiliation(s)
- Hong Liu: Center for Biomedical Imaging and Bioinformatics, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
- Yuzhou Zhuang: Institute of Artificial Intelligence, Huazhong University of Science and Technology, Wuhan, 430074, China
- Enmin Song: Center for Biomedical Imaging and Bioinformatics, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
- Xiangyang Xu: Center for Biomedical Imaging and Bioinformatics, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
- Chih-Cheng Hung: Center for Machine Vision and Security Research, Kennesaw State University, Marietta, GA 30060, USA
|
24
|
Liu X, Sanchez P, Thermos S, O'Neil AQ, Tsaftaris SA. Learning disentangled representations in the imaging domain. Med Image Anal 2022; 80:102516. [PMID: 35751992 DOI: 10.1016/j.media.2022.102516] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2021] [Revised: 04/05/2022] [Accepted: 06/10/2022] [Indexed: 12/12/2022]
Abstract
Disentangled representation learning has been proposed as an approach to learning general representations even in the absence of, or with limited, supervision. A good general representation can be fine-tuned for new target tasks using modest amounts of data, or used directly in unseen domains achieving remarkable performance in the corresponding task. This alleviation of the data and annotation requirements offers tantalising prospects for applications in computer vision and healthcare. In this tutorial paper, we motivate the need for disentangled representations, revisit key concepts, and describe practical building blocks and criteria for learning such representations. We survey applications in medical imaging emphasising choices made in exemplar key works, and then discuss links to computer vision applications. We conclude by presenting limitations, challenges, and opportunities.
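As a concrete illustration of the tutorial's central idea, the toy model below factorizes an image into a spatial content code and a global style code and reconstructs from both. All architectural choices are illustrative and not taken from the paper.

```python
# Toy content/style disentangled autoencoder: a spatial "content" map, a
# global "style" vector, and a decoder that must use both to reconstruct.
import torch
import torch.nn as nn

class DisentangledAE(nn.Module):
    def __init__(self, ch=16, style_dim=8):
        super().__init__()
        self.content_enc = nn.Sequential(      # keeps spatial layout
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1))
        self.style_enc = nn.Sequential(        # collapses to a global vector
            nn.Conv2d(1, ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(ch, style_dim))
        self.decoder = nn.Sequential(
            nn.Conv2d(ch + style_dim, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1))

    def forward(self, x):
        c = self.content_enc(x)                       # (B, ch, H, W)
        s = self.style_enc(x)                         # (B, style_dim)
        # Broadcast the style vector over space and decode from both codes.
        s_map = s[:, :, None, None].expand(-1, -1, *c.shape[-2:])
        return self.decoder(torch.cat([c, s_map], dim=1)), c, s

model = DisentangledAE()
recon, content, style = model(torch.rand(2, 1, 64, 64))
```

In practice such a model is trained with a reconstruction loss plus additional objectives (e.g. swapping or adversarial terms) that force the two codes to capture complementary factors; the tutorial surveys these design choices.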
Affiliation(s)
- Xiao Liu: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK
- Pedro Sanchez: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK
- Spyridon Thermos: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK
- Alison Q O'Neil: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK; Canon Medical Research Europe, Edinburgh EH6 5NP, UK
- Sotirios A Tsaftaris: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK; The Alan Turing Institute, London NW1 2DB, UK
|
25
|
Zhao SX, Chen Y, Yang KF, Luo Y, Ma BY, Li YJ. A Local and Global Feature Disentangled Network: Toward Classification of Benign-Malignant Thyroid Nodules From Ultrasound Image. IEEE Trans Med Imaging 2022; 41:1497-1509. [PMID: 34990353 DOI: 10.1109/tmi.2022.3140797] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Thyroid nodules are among the most common nodular lesions. The incidence of thyroid cancer has increased rapidly over the past three decades, making it one of the cancers with the highest incidence. As a non-invasive imaging modality, ultrasonography can distinguish benign from malignant thyroid nodules and can be used for large-scale screening. In this study, inspired by the domain knowledge sonographers apply when reading ultrasound images, a local and global feature disentangled network (LoGo-Net) is proposed to classify benign and malignant thyroid nodules. The model imitates the dual-pathway structure of human vision and establishes a new feature extraction method to improve nodule recognition. We use a tissue-anatomy disentangled (TAD) block to connect the two pathways, decoupling local and global feature cues through a self-attention mechanism. To verify the effectiveness of the model, we constructed a large-scale dataset and conducted extensive experiments. The results show that our method achieves an accuracy of 89.33%, indicating potential for clinical practice, including early cancer screening in remote or resource-poor areas.
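The dual-pathway idea can be sketched as a local convolutional branch and a global branch coupled by attention. The block below illustrates the general pattern only; it is not the published TAD block, and the cross-attention from local queries to global keys is an assumption.

```python
# Sketch of a dual-pathway block: a small-kernel "local" path, a wide-kernel
# "global" path, and attention that lets global context reweight local cues.
import torch
import torch.nn as nn

class DualPathwayBlock(nn.Module):
    def __init__(self, ch=32, heads=4):
        super().__init__()
        self.local = nn.Conv2d(ch, ch, 3, padding=1)   # fine texture cues
        self.glob = nn.Conv2d(ch, ch, 7, padding=3)    # wider context
        self.attn = nn.MultiheadAttention(ch, heads, batch_first=True)

    def forward(self, x):
        l = self.local(x)
        g = self.glob(x)
        B, C, H, W = l.shape
        lq = l.flatten(2).transpose(1, 2)   # (B, HW, C) queries from local path
        gk = g.flatten(2).transpose(1, 2)   # keys/values from global path
        fused, _ = self.attn(lq, gk, gk)
        return fused.transpose(1, 2).reshape(B, C, H, W) + x   # residual fusion

out = DualPathwayBlock()(torch.rand(1, 32, 32, 32))
```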
|
26
|
Li L, Zimmer VA, Schnabel JA, Zhuang X. Medical image analysis on left atrial LGE MRI for atrial fibrillation studies: A review. Med Image Anal 2022; 77:102360. [PMID: 35124370 PMCID: PMC7614005 DOI: 10.1016/j.media.2022.102360] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2021] [Revised: 11/04/2021] [Accepted: 01/10/2022] [Indexed: 02/08/2023]
Abstract
Late gadolinium enhancement magnetic resonance imaging (LGE MRI) is commonly used to visualize and quantify left atrial (LA) scars. The position and extent of LA scars provide important information on the pathophysiology and progression of atrial fibrillation (AF). Hence, LA LGE MRI computing and analysis are essential for computer-assisted diagnosis and treatment stratification of AF patients. Since manual delineations can be time-consuming and subject to intra- and inter-expert variability, automating this computation is highly desirable, yet it remains challenging and under-researched. This paper provides a systematic review of computing methods for LA cavity, wall, scar, and ablation gap segmentation and quantification from LGE MRI, together with the related literature on AF studies. Specifically, we first summarize AF-related imaging techniques, particularly LGE MRI. Then, we review the methodologies of the four computing tasks in detail and summarize the validation strategies applied in each task, as well as state-of-the-art results on public datasets. Finally, possible future developments are outlined, with a brief survey of the potential clinical applications of the aforementioned methods. The review indicates that research on this topic is still in its early stages. Although several methods have been proposed, especially for LA cavity segmentation, there remains large scope for further algorithmic development owing to the high variability of enhancement appearance and differences in image acquisition.
Affiliation(s)
- Lei Li: School of Data Science, Fudan University, Shanghai, China; School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK
- Veronika A Zimmer: School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK; Department of Informatics, Technical University of Munich, Germany
- Julia A Schnabel: School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK; Department of Informatics, Technical University of Munich, Germany; Helmholtz Center Munich, Germany
- Xiahai Zhuang: School of Data Science, Fudan University, Shanghai, China
|
27
|
Li D, Peng Y, Guo Y, Sun J. TAUNet: a triple-attention-based multi-modality MRI fusion U-Net for cardiac pathology segmentation. Complex Intell Syst 2022. [DOI: 10.1007/s40747-022-00660-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
Automated segmentation of cardiac pathology in MRI plays a significant role in the diagnosis and treatment of cardiac disease. In clinical practice, multi-modality MRI is widely used to improve cardiac pathology segmentation because it provides multiple, complementary sources of information. Recently, deep learning methods have shown impressive performance in multi-modality medical image segmentation. However, effectively fusing the underlying multi-modality information to segment pathologies with irregular shapes and small regions at random locations remains a challenging task. In this paper, a triple-attention-based multi-modality MRI fusion U-Net is proposed to learn the complex relationships between modalities and to pay more attention to shape information, thereby improving pathology segmentation. First, three independent encoders and one fusion encoder are applied to extract modality-specific and fused features. Secondly, we concatenate the modality feature maps and use channel attention to fuse modality-specific information at every stage of the three dedicated independent encoders; the three single-modality feature maps and the channel-attention feature maps are then concatenated along the decoder path. Spatial attention is adopted in the decoder path to capture correlations between positions, and shape attention is employed to focus on shape-dependent information. Lastly, training is made efficient by introducing a deep supervision mechanism with an object contextual representations block to ensure precise boundary prediction. Our proposed network was evaluated on the public MICCAI 2020 Myocardial Pathology Segmentation dataset, which involves patients suffering from myocardial infarction. Experiments on the dataset's three modalities demonstrate the effectiveness of our model's fusion scheme and show that the attention mechanisms integrate the modality information well. We demonstrate that such a deep learning approach can better fuse complementary information to improve the segmentation of cardiac pathology.
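The channel-attention fusion step described above admits a compact sketch: per-modality feature maps are concatenated and a squeeze-and-excitation style gate reweights the channels. The real TAUNet adds spatial and shape attention plus deep supervision, which are omitted here; all sizes are illustrative.

```python
# Channel-attention fusion of per-modality feature maps (SE-style gating).
import torch
import torch.nn as nn

class ChannelAttentionFusion(nn.Module):
    def __init__(self, ch_per_modality=32, n_modalities=3, reduction=8):
        super().__init__()
        ch = ch_per_modality * n_modalities
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(ch, ch // reduction), nn.ReLU(),
            nn.Linear(ch // reduction, ch), nn.Sigmoid())

    def forward(self, feats):
        """feats: list of (B, C, H, W) maps, one per MRI sequence."""
        x = torch.cat(feats, dim=1)         # stack modalities on channels
        w = self.gate(x)[:, :, None, None]  # per-channel weights in [0, 1]
        return x * w                        # reweighted fused features

fused = ChannelAttentionFusion()([torch.rand(2, 32, 48, 48) for _ in range(3)])
```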
|
28
|
Bruns S, Wolterink JM, van den Boogert TPW, Runge JH, Bouma BJ, Henriques JP, Baan J, Viergever MA, Planken RN, Išgum I. Deep learning-based whole-heart segmentation in 4D contrast-enhanced cardiac CT. Comput Biol Med 2021; 142:105191. [PMID: 35026571 DOI: 10.1016/j.compbiomed.2021.105191] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2021] [Revised: 12/27/2021] [Accepted: 12/27/2021] [Indexed: 11/17/2022]
Abstract
Automatic cardiac chamber and left ventricular (LV) myocardium segmentation over the cardiac cycle significantly extends the utility of contrast-enhanced cardiac CT, potentially enabling in-depth assessment of cardiac function. We therefore evaluate an automatic method for cardiac chamber and LV myocardium segmentation in 4D cardiac CT. In this study, 4D contrast-enhanced cardiac CT scans of 1509 patients selected for transcatheter aortic valve implantation, comprising 21,605 3D images, were divided into a development set (N = 12) and a test set (N = 1497). 3D convolutional neural networks were trained with end-systolic (ES) and end-diastolic (ED) images. The Dice similarity coefficient (DSC) and average symmetric surface distance (ASSD) were computed for 3D segmentations at ES and ED in the development set via cross-validation, and for 2D segmentations in four cardiac phases for 81 test-set patients. Segmentation quality in the full test set of 1497 patients was assessed visually on a three-point scale per structure, based on estimated overlap with the ground truth. Automatic segmentation yielded a mean DSC of 0.89 ± 0.10 and ASSD of 1.43 ± 1.45 mm in 12 patients in 3D, and a DSC of 0.89 ± 0.08 and ASSD of 1.86 ± 1.20 mm in 81 patients in 2D. The qualitative evaluation of the whole test set showed that automatic segmentations were assigned grade 1 (clinically useful) in 98.5%, 92.2%, 83.1%, 96.3%, and 91.6% of cases for the LV cavity, LV myocardium, right ventricle, left atrium, and right atrium, respectively. Our convolutional neural network method performed clinically useful segmentation across the cardiac cycle in a large set of 4D cardiac CT images, potentially enabling in-depth assessment of cardiac function.
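The two reported metrics have standard definitions that are easy to state in code. The sketch below computes Dice overlap and ASSD from binary masks via distance transforms, assuming isotropic 1 mm voxels; surface-extraction and spacing conventions vary between papers.

```python
# Dice and average symmetric surface distance (ASSD) for binary 3D masks.
import numpy as np
from scipy import ndimage

def dice(a: np.ndarray, b: np.ndarray) -> float:
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def surface(mask: np.ndarray) -> np.ndarray:
    """Boundary voxels: the mask minus its erosion."""
    return mask & ~ndimage.binary_erosion(mask)

def assd(a: np.ndarray, b: np.ndarray) -> float:
    sa, sb = surface(a.astype(bool)), surface(b.astype(bool))
    # Distance of every surface voxel of one mask to the other mask's surface.
    da = ndimage.distance_transform_edt(~sb)[sa]
    db = ndimage.distance_transform_edt(~sa)[sb]
    return float(np.concatenate([da, db]).mean())

gt = np.zeros((64, 64, 64), bool); gt[20:40, 20:40, 20:40] = True
pred = np.zeros_like(gt); pred[22:42, 20:40, 20:40] = True
print(dice(gt, pred), assd(gt, pred))
```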
Affiliation(s)
- Steffen Bruns: Department of Biomedical Engineering and Physics, Amsterdam UMC, University of Amsterdam, Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands; Amsterdam Cardiovascular Sciences, Amsterdam UMC, Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands
- Jelmer M Wolterink: Department of Applied Mathematics, Technical Medical Centre, University of Twente, Drienerlolaan 5, 7522 NB, Enschede, the Netherlands
- Thomas P W van den Boogert: Heart Centre, Academic Medical Centre, Amsterdam Cardiovascular Sciences, University of Amsterdam, Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands
- Jurgen H Runge: Department of Radiology and Nuclear Medicine, Amsterdam UMC, Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands
- Berto J Bouma: Department of Cardiology, Amsterdam UMC, Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands
- José P Henriques: Heart Centre, Academic Medical Centre, Amsterdam Cardiovascular Sciences, University of Amsterdam, Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands
- Jan Baan: Department of Cardiology, Amsterdam UMC, Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands
- Max A Viergever: Image Sciences Institute, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, the Netherlands
- R Nils Planken: Department of Radiology and Nuclear Medicine, Amsterdam UMC, Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands
- Ivana Išgum: Department of Biomedical Engineering and Physics, Amsterdam UMC, University of Amsterdam, Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands; Amsterdam Cardiovascular Sciences, Amsterdam UMC, Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands; Department of Radiology and Nuclear Medicine, Amsterdam UMC, Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands
|
29
|
Decomposing normal and abnormal features of medical images for content-based image retrieval of glioma imaging. Med Image Anal 2021; 74:102227. [PMID: 34543911 DOI: 10.1016/j.media.2021.102227] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2021] [Revised: 09/05/2021] [Accepted: 09/06/2021] [Indexed: 11/20/2022]
Abstract
In medical imaging, characteristics purely derived from a disease should reflect the extent to which abnormal findings deviate from normal features. Indeed, physicians often need corresponding images without the abnormal findings of interest or, conversely, images containing similar abnormal findings regardless of the normal anatomical context. This is called comparative diagnostic reading of medical images and is essential for a correct diagnosis. To support it, content-based image retrieval (CBIR) that can selectively use normal and abnormal features in medical images as two separable semantic components would be useful. In this study, we propose a neural network architecture that decomposes the semantic components of medical images into two latent codes: a normal anatomy code and an abnormal anatomy code. The normal anatomy code represents the counterfactual normal anatomy that would have existed if the sample were healthy, whereas the abnormal anatomy code captures abnormal changes that reflect deviation from this normal baseline. By computing similarity based on the normal code, the abnormal code, or a combination of the two, our algorithm can retrieve images according to the selected semantic component from a dataset of brain magnetic resonance images of gliomas. Moreover, it can use a synthetic query vector combining the normal and abnormal anatomy codes of two different query images. To evaluate whether the retrieved images match the targeted semantic component, the overlap of ground-truth labels is calculated as a metric of semantic consistency. Our algorithm provides a flexible CBIR framework by handling the decomposed features, with qualitatively and quantitatively remarkable results.
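The retrieval step as described admits a simple sketch: rank database images by cosine similarity on the normal code, the abnormal code, or a weighted mix, with a synthetic query built from two images' codes. The weighting scheme and code dimensions below are assumptions.

```python
# Latent-code retrieval: rank a database by a convex combination of the
# normal-code and abnormal-code cosine similarities.
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def retrieve(query_normal, query_abnormal, db_normal, db_abnormal, alpha=0.5, k=5):
    """db_normal/db_abnormal: (N, D) arrays of precomputed latent codes.
    alpha weights the normal-code similarity; 1 - alpha the abnormal one."""
    scores = [alpha * cosine(query_normal, n) + (1 - alpha) * cosine(query_abnormal, a)
              for n, a in zip(db_normal, db_abnormal)]
    return np.argsort(scores)[::-1][:k]   # indices of the top-k matches

rng = np.random.default_rng(0)
db_n, db_a = rng.normal(size=(100, 64)), rng.normal(size=(100, 64))
# Synthetic query: normal code from image 3, abnormal code from image 7.
print(retrieve(db_n[3], db_a[7], db_n, db_a, alpha=0.3))
```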
|