1. Jiao J, Zhou J, Li X, Xia M, Huang Y, Huang L, Wang N, Zhang X, Zhou S, Wang Y, Guo Y. USFM: A universal ultrasound foundation model generalized to tasks and organs towards label efficient image analysis. Med Image Anal 2024; 96:103202. [PMID: 38788326 DOI: 10.1016/j.media.2024.103202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 03/29/2024] [Accepted: 05/11/2024] [Indexed: 05/26/2024]
Abstract
Inadequate generality across different organs and tasks constrains the application of ultrasound (US) image analysis methods in smart healthcare. Building a universal US foundation model holds the potential to address these issues. Nevertheless, the development of such foundation models encounters challenges intrinsic to US analysis, i.e., insufficient databases, low image quality, and ineffective features. In this paper, we present a universal US foundation model, named USFM, generalized to diverse tasks and organs towards label-efficient US image analysis. First, a large-scale Multi-organ, Multi-center, and Multi-device US database was built, containing over two million US images. Organ-balanced sampling was employed for unbiased learning. Then, USFM was pre-trained on this database in a self-supervised manner. To extract effective features from low-quality US images, we propose a spatial-frequency dual masked image modeling method. A spatial noise addition-recovery approach was designed to learn meaningful US information robustly, while a novel frequency band-stop masking learning approach was employed to extract complex, implicit grayscale distribution and textural variations. Extensive experiments were conducted on segmentation, classification, and image enhancement tasks across diverse organs and diseases. Comparisons with representative US image analysis models illustrate the universality and effectiveness of USFM. The label efficiency experiments suggest that USFM achieves robust performance with only 20% annotation, laying the groundwork for the rapid development of US models in clinical practice.
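As a rough illustration of what "frequency band-stop masking" can mean, the following sketch removes one ring of spatial frequencies from an image using NumPy's FFT. It is not the USFM implementation: the function name `band_stop_mask`, the radii `r_low` and `r_high`, and the random test image are all assumptions made for this example, and the abstract does not specify how bands are chosen or how the masked image is used during training.

```python
import numpy as np

def band_stop_mask(image, r_low, r_high):
    """Zero out spatial frequencies whose radius lies in [r_low, r_high).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    h, w = image.shape
    # Move the zero-frequency component to the center of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.ogrid[:h, :w]
    radius = np.hypot(yy - h // 2, xx - w // 2)
    # Keep everything outside the stopped band.
    keep = (radius < r_low) | (radius >= r_high)
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * keep))
    # The mask is symmetric, so the imaginary part is numerical noise.
    return filtered.real

rng = np.random.default_rng(0)
image = rng.random((64, 64))          # stand-in for a grayscale US image
masked = band_stop_mask(image, 8, 16)  # assumed band limits
```

In a masked image modeling setup, a network would then be trained to recover `image` from `masked`; that training loop is outside the scope of this sketch.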
Affiliation(s)
- Jing Jiao
  - Department of Electronic Engineering, School of Information Science and Technology, Fudan University, Shanghai, China
- Jin Zhou
  - Fudan University Shanghai Cancer Center, Shanghai, China
- Xiaokang Li
  - Department of Electronic Engineering, School of Information Science and Technology, Fudan University, Shanghai, China
- Menghua Xia
  - Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA
- Yi Huang
  - Department of Electronic Engineering, School of Information Science and Technology, Fudan University, Shanghai, China
- Lihong Huang
  - Department of Electronic Engineering, School of Information Science and Technology, Fudan University, Shanghai, China
- Na Wang
  - Department of Electronic Engineering, School of Information Science and Technology, Fudan University, Shanghai, China; SenseTime Research, Shanghai, China
- Xiaofan Zhang
  - Shanghai Artificial Intelligence Laboratory, Shanghai, China
- Shichong Zhou
  - Fudan University Shanghai Cancer Center, Shanghai, China
- Yuanyuan Wang
  - Department of Electronic Engineering, School of Information Science and Technology, Fudan University, Shanghai, China; Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Shanghai, China
- Yi Guo
  - Department of Electronic Engineering, School of Information Science and Technology, Fudan University, Shanghai, China; Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Shanghai, China
2. Zheng B, Zhang R, Diao S, Zhu J, Yuan Y, Cai J, Shao L, Li S, Qin W. Dual domain distribution disruption with semantics preservation: Unsupervised domain adaptation for medical image segmentation. Med Image Anal 2024; 97:103275. [PMID: 39032395 DOI: 10.1016/j.media.2024.103275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Revised: 06/14/2024] [Accepted: 07/10/2024] [Indexed: 07/23/2024]
Abstract
Recent unsupervised domain adaptation (UDA) methods in medical image segmentation commonly utilize Generative Adversarial Networks (GANs) for domain translation. However, because of the inherent instability of GANs, the translated images often deviate from the ideal distribution, which leads to visual inconsistency and incorrect style and consequently causes the segmentation model to fall into a fixed, incorrect pattern. To address this problem, we propose a novel UDA framework known as Dual Domain Distribution Disruption with Semantics Preservation (DDSP). Departing from the idea in GAN-based UDA methods of generating images that conform to the target domain distribution, we make the model domain-agnostic and focus it on anatomical structural information: semantic information serves as a constraint that guides the model to adapt to images with disrupted distributions in both the source and target domains. Furthermore, we introduce inter-channel similarity feature alignment based on domain-invariant structural prior information, which helps the shared pixel-wise classifier achieve robust performance on target domain features by aligning the source and target domain features across channels. Our method significantly outperforms existing state-of-the-art UDA methods on three public datasets (i.e., the heart dataset, the brain dataset, and the prostate dataset). The code is available at https://github.com/MIXAILAB/DDSPSeg.
Affiliation(s)
- Boyun Zheng
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
- Ranran Zhang
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Songhui Diao
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
- Jingke Zhu
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
- Yixuan Yuan
  - Department of Electronic Engineering, The Chinese University of Hong Kong, 999077, Hong Kong, China
- Jing Cai
  - Department of Health Technology and Informatics, The Hong Kong Polytechnic University, 999077, Hong Kong, China
- Liang Shao
  - Department of Cardiology, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang 330013, China
- Shuo Li
  - Department of Biomedical Engineering, Department of Computer and Data Science, Case Western Reserve University, Cleveland, United States
- Wenjian Qin
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
3. Huang L, Zhou J, Jiao J, Zhou S, Chang C, Wang Y, Guo Y. Standardization of ultrasound images across various centers: M2O-DiffGAN bridging the gaps among unpaired multi-domain ultrasound images. Med Image Anal 2024; 95:103187. [PMID: 38705056 DOI: 10.1016/j.media.2024.103187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 02/20/2024] [Accepted: 04/22/2024] [Indexed: 05/07/2024]
Abstract
Domain shift is commonplace in ultrasound image analysis because of differing imaging settings and diverse medical centers, and it leads to poor generalizability of deep learning-based methods. Multi-Source Domain Transformation (MSDT) provides a promising way to tackle the performance degradation caused by domain shift, and it is more practical and challenging than conventional single-source transformation tasks. An effective unsupervised domain combination strategy is required to handle multiple domains without annotations. The fidelity and quality of generated images are also important to ensure the accuracy of computer-aided diagnosis. However, existing MSDT approaches underperform in these two areas. In this paper, an efficient domain transformation model named M2O-DiffGAN is introduced to achieve a unified mapping from multiple unlabeled source domains to the target domain. A cycle-consistent "many-to-one" adversarial learning architecture is introduced to model various unlabeled domains jointly. A conditional adversarial diffusion process is employed to generate high-fidelity images, combined with an adversarial projector that captures reverse transition probabilities over large step sizes to accelerate sampling. Considering the limited perceptual information of ultrasound images, an ultrasound-specific content loss helps to capture more perceptual features for synthesizing high-quality ultrasound images. Extensive comparisons on six clinical datasets covering thyroid, carotid, and breast demonstrate the superiority of M2O-DiffGAN over state-of-the-art algorithms in bridging domain gaps and improving the generalization of downstream analysis methods. It improves the mean MI, Bhattacharyya coefficient, Dice, and IoU scores by 0.390, 0.120, 0.245, and 0.250, respectively, indicating promise for clinical application.
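For readers unfamiliar with the Bhattacharyya coefficient reported above, the following is a minimal sketch of how it can be computed between the grayscale histograms of two images. The helper name `bhattacharyya_coefficient`, the bin count, and the assumption that intensities are scaled to [0, 1] are choices made for this example; the abstract does not state how the authors computed the score.

```python
import numpy as np

def bhattacharyya_coefficient(img_a, img_b, bins=32):
    """Overlap between two intensity histograms: 0 = disjoint, 1 = identical.

    Assumes intensities lie in [0, 1]; hypothetical helper, not from the paper.
    """
    hist_a, _ = np.histogram(img_a, bins=bins, range=(0.0, 1.0))
    hist_b, _ = np.histogram(img_b, bins=bins, range=(0.0, 1.0))
    # Normalize counts into discrete probability distributions.
    p = hist_a / hist_a.sum()
    q = hist_b / hist_b.sum()
    return float(np.sum(np.sqrt(p * q)))

rng = np.random.default_rng(0)
source = rng.random((64, 64))        # stand-in for a source-domain image
target = rng.random((64, 64)) ** 2   # stand-in with a darker distribution
score = bhattacharyya_coefficient(source, target)
```

A value closer to 1 indicates that the two intensity distributions overlap more, which is why such a score can serve as one indicator of how well a translated image matches the target domain.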
Affiliation(s)
- Lihong Huang
  - Department of Electronic Engineering, School of Information Science and Technology, Fudan University, Shanghai, China
- Jin Zhou
  - Fudan University Shanghai Cancer Center, Shanghai, China
- Jing Jiao
  - Department of Electronic Engineering, School of Information Science and Technology, Fudan University, Shanghai, China
- Shichong Zhou
  - Fudan University Shanghai Cancer Center, Shanghai, China
- Cai Chang
  - Fudan University Shanghai Cancer Center, Shanghai, China
- Yuanyuan Wang
  - Department of Electronic Engineering, School of Information Science and Technology, Fudan University, Shanghai, China; Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Shanghai, China
- Yi Guo
  - Department of Electronic Engineering, School of Information Science and Technology, Fudan University, Shanghai, China; Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Shanghai, China
4. Vafaeezadeh M, Behnam H, Gifani P. Ultrasound Image Analysis with Vision Transformers-Review. Diagnostics (Basel) 2024; 14:542. [PMID: 38473014 DOI: 10.3390/diagnostics14050542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 02/22/2024] [Accepted: 02/29/2024] [Indexed: 03/14/2024] [Open Access]
Abstract
Ultrasound (US) has become a widely used imaging modality in clinical practice, characterized by its rapidly evolving technology, its advantages, and its unique challenges, such as low imaging quality and high variability. There is a need to develop advanced automatic US image analysis methods to enhance its diagnostic accuracy and objectivity. Vision transformers, a recent innovation in machine learning, have demonstrated significant potential in various research fields, including general image analysis and computer vision, due to their capacity to process large datasets and learn complex patterns. Their suitability for automatic US image analysis tasks, such as classification, detection, and segmentation, has been recognized. This review provides an introduction to vision transformers and discusses their applications in specific US image analysis tasks, while also addressing the open challenges and potential future trends in their application in medical US image analysis. Vision transformers have shown promise in enhancing the accuracy and efficiency of ultrasound image analysis and are expected to play an increasingly important role in the diagnosis and treatment of medical conditions using ultrasound imaging as technology progresses.
Affiliation(s)
- Majid Vafaeezadeh
  - Biomedical Engineering Department, School of Electrical Engineering, Iran University of Science and Technology, Tehran 1311416846, Iran
- Hamid Behnam
  - Biomedical Engineering Department, School of Electrical Engineering, Iran University of Science and Technology, Tehran 1311416846, Iran
- Parisa Gifani
  - Medical Sciences and Technologies Department, Science and Research Branch, Islamic Azad University, Tehran 1477893855, Iran
5. Kumari S, Singh P. Deep learning for unsupervised domain adaptation in medical imaging: Recent advancements and future perspectives. Comput Biol Med 2024; 170:107912. [PMID: 38219643 DOI: 10.1016/j.compbiomed.2023.107912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 11/02/2023] [Accepted: 12/24/2023] [Indexed: 01/16/2024]
Abstract
Deep learning has demonstrated remarkable performance across various tasks in medical imaging. However, these approaches primarily focus on supervised learning, assuming that the training and testing data are drawn from the same distribution. Unfortunately, this assumption may not always hold true in practice. To address these issues, unsupervised domain adaptation (UDA) techniques have been developed to transfer knowledge from a labeled domain to a related but unlabeled domain. In recent years, significant advancements have been made in UDA, resulting in a wide range of methodologies, including feature alignment, image translation, self-supervision, and disentangled representation methods, among others. In this paper, we provide a comprehensive literature review of recent deep UDA approaches in medical imaging from a technical perspective. Specifically, we categorize current UDA research in medical imaging into six groups and further divide them into finer subcategories based on the different tasks they perform. We also discuss the respective datasets used in the studies to assess the divergence between the different domains. Finally, we discuss emerging areas and provide insights and discussions on future research directions to conclude this survey.
Affiliation(s)
- Suruchi Kumari
  - Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India
- Pravendra Singh
  - Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India
6. Chen T, Xia M, Huang Y, Jiao J, Wang Y. Cross-Domain Echocardiography Segmentation with Multi-Space Joint Adaptation. Sensors (Basel) 2023; 23:1479. [PMID: 36772517 PMCID: PMC9921139 DOI: 10.3390/s23031479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Revised: 01/18/2023] [Accepted: 01/26/2023] [Indexed: 06/18/2023]
Abstract
The segmentation of the left ventricle endocardium (LVendo) and the left ventricle epicardium (LVepi) in echocardiography plays an important role in clinical diagnosis. Recently, deep neural networks have been the most commonly used approach for echocardiography segmentation. However, the performance of a well-trained segmentation network may degrade on datasets from unseen domains due to the distribution shift of the data. Adaptation algorithms can improve the generalization of deep neural networks to different domains. In this paper, we present a multi-space adaptation-segmentation-joint framework, named MACS, for cross-domain echocardiography segmentation. It adopts a generative adversarial architecture; the generator fulfills the segmentation task and the multi-space discriminators align the two domains in both the feature space and the output space. We evaluated the MACS method on two echocardiography datasets from different medical centers and vendors: the publicly available CAMUS dataset and our self-acquired dataset. The experimental results indicated that MACS could handle unseen domain datasets well without requiring manual annotations, improving the generalization performance by 2.2% in the Dice metric.
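The Dice metric cited above, together with the closely related IoU, can be computed from two binary masks as in the sketch below. The function name `dice_and_iou` and the toy masks are assumptions for illustration; the abstract does not describe the authors' evaluation code, and conventions for empty masks vary (this sketch treats two empty masks as a perfect match).

```python
import numpy as np

def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary segmentation masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    # Two empty masks agree perfectly by the convention assumed here.
    dice = 2.0 * intersection / total if total else 1.0
    iou = intersection / union if union else 1.0
    return float(dice), float(iou)

# Toy masks: the prediction covers two pixels, the ground truth covers one.
pred_mask = [[1, 1], [0, 0]]
true_mask = [[1, 0], [0, 0]]
dice, iou = dice_and_iou(pred_mask, true_mask)
```

For the toy masks, the overlap is one pixel out of three labeled pixels in total, so Dice is 2/3 and IoU is 1/2.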
Affiliation(s)
- Tongwaner Chen
  - Department of Electronic Engineering, Fudan University, Shanghai 200433, China
- Menghua Xia
  - Department of Electronic Engineering, Fudan University, Shanghai 200433, China
- Yi Huang
  - Department of Electronic Engineering, Fudan University, Shanghai 200433, China
- Jing Jiao
  - Department of Electronic Engineering, Fudan University, Shanghai 200433, China
- Yuanyuan Wang
  - Department of Electronic Engineering, Fudan University, Shanghai 200433, China
  - Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Shanghai 200032, China