1. Santhirasekaram A, Winkler M, Rockall A, Glocker B. A geometric approach to robust medical image segmentation. Med Image Anal 2024; 97:103260. [PMID: 38970862] [DOI: 10.1016/j.media.2024.103260] [Received: 12/16/2023] [Revised: 06/12/2024] [Accepted: 06/26/2024]
Abstract
Robustness of deep learning segmentation models is crucial for their safe incorporation into clinical practice. However, these models can falter when faced with distributional changes. This challenge is evident in magnetic resonance imaging (MRI) scans due to the diverse acquisition protocols across various domains, leading to differences in image characteristics such as textural appearance. We posit that the restricted anatomical variation between subjects can be harnessed to refine the latent space into a set of shape components, such that the learned set encompasses the relevant anatomical shape variation found within the patient population. We explore this by utilising multiple MRI sequences to learn texture-invariant and shape-equivariant features, which are used to construct a shape dictionary using vector quantisation. We investigate shape equivariance with respect to a number of different types of groups. We hypothesise and prove that the greater the group order, i.e., the denser the constraint, the better the model robustness becomes. We achieve shape equivariance either with a contrastive-based approach or by imposing equivariant constraints on the convolutional kernels. The resulting shape-equivariant dictionary is then sampled to compose the segmentation output. Our method achieves state-of-the-art performance on single-domain generalisation for prostate and cardiac MRI segmentation. Code is available at https://github.com/AinkaranSanthi/A_Geometric_Perspective_For_Robust_Segmentation.
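The shape dictionary described above is built with vector quantisation: each encoder feature is mapped to its nearest entry in a learned codebook of shape components. A minimal numpy sketch of that nearest-neighbour lookup (the codebook and feature values below are illustrative, not from the paper):

```python
import numpy as np

def vector_quantise(features, codebook):
    """Map each feature vector to its nearest codebook entry (L2 distance).

    features: (N, D) array of encoder features.
    codebook: (K, D) array of learned shape components.
    Returns (quantised features, indices of chosen entries).
    """
    # Pairwise squared distances between all features and codebook entries.
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # (N, K)
    idx = d.argmin(axis=1)
    return codebook[idx], idx

# Toy example: two codebook entries, three features.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
feats = np.array([[0.1, -0.1], [0.9, 1.2], [0.0, 0.2]])
quantised, idx = vector_quantise(feats, codebook)
print(idx.tolist())  # nearest shape component per feature
```

In the full method the codebook is learned jointly with the encoder; here it is fixed for clarity.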
Affiliation(s)
- Mathias Winkler
- Department of Surgery and Cancer, Imperial College London, United Kingdom
- Andrea Rockall
- Department of Surgery and Cancer, Imperial College London, United Kingdom
- Ben Glocker
- Department of Computing, Imperial College London, United Kingdom
2. Zhang Y, Balestra G, Zhang K, Wang J, Rosati S, Giannini V. MultiTrans: Multi-branch transformer network for medical image segmentation. Comput Methods Programs Biomed 2024; 254:108280. [PMID: 38878361] [DOI: 10.1016/j.cmpb.2024.108280] [Received: 01/29/2024] [Revised: 05/13/2024] [Accepted: 06/06/2024]
Abstract
BACKGROUND AND OBJECTIVE The Transformer, notable for its ability to model global context, has been used to remedy the shortcomings of convolutional neural networks (CNNs) and challenge their dominance in medical image segmentation. However, the self-attention module is both memory- and computationally inefficient, so many methods have to build their Transformer branch upon heavily downsampled feature maps or adopt tokenized image patches to fit their model into accessible GPUs. This patch-wise operation restricts the network from extracting pixel-level intrinsic structure or dependencies inside each patch, hurting performance on pixel-level classification tasks. METHODS To tackle these issues, we propose a memory- and computation-efficient self-attention module that enables reasoning on relatively high-resolution features, promoting the efficiency of learning global information while effectively grasping fine spatial details. Furthermore, we design a novel Multi-Branch Transformer (MultiTrans) architecture to provide hierarchical features for handling objects with variable shapes and sizes in medical images. By building four parallel Transformer branches on different levels of a CNN, our hybrid network aggregates both multi-scale global contexts and multi-scale local features. RESULTS MultiTrans achieves the highest segmentation accuracy on three medical image datasets with different modalities: Synapse, ACDC and M&Ms. Compared to Standard Self-Attention (SSA), the proposed Efficient Self-Attention (ESA) can largely reduce training memory and computational complexity while even slightly improving accuracy. Specifically, the training memory cost, FLOPs and parameter count of our ESA are 18.77%, 20.68% and 74.07% of those of the SSA. CONCLUSIONS Experiments on three medical image datasets demonstrate the generality and robustness of the designed network. The ablation study shows the efficiency and effectiveness of our proposed ESA.
Code is available at: https://github.com/Yanhua-Zhang/MultiTrans-extension.
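The abstract does not spell out the ESA mechanism; one common way to cut self-attention's memory from O(N²) to O(N·N/r) is to spatially pool the keys and values before computing attention. The numpy sketch below illustrates that general idea only, not the paper's exact module:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def efficient_self_attention(x, ratio=4):
    """Self-attention with spatially reduced keys/values.

    x: (N, D) tokens, with N divisible by `ratio`. Keys/values are
    average-pooled by `ratio`, shrinking the attention map from
    N x N down to N x (N // ratio).
    """
    n, d = x.shape
    kv = x.reshape(n // ratio, ratio, d).mean(axis=1)  # pooled keys/values
    attn = softmax(x @ kv.T / np.sqrt(d))              # (N, N // ratio)
    return attn @ kv

x = np.random.default_rng(0).normal(size=(16, 8))
out = efficient_self_attention(x, ratio=4)
print(out.shape)  # same token shape as the input
```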
Affiliation(s)
- Yanhua Zhang
- Department of Electronics and Telecommunications, Politecnico di Torino, Corso Duca degli Abruzzi 24, Turin, 10129, Italy; School of Astronautics, Northwestern Polytechnical University, 127 West Youyi Road, Xi'an, 710072, China.
- Gabriella Balestra
- Department of Electronics and Telecommunications, Politecnico di Torino, Corso Duca degli Abruzzi 24, Turin, 10129, Italy.
- Ke Zhang
- School of Astronautics, Northwestern Polytechnical University, 127 West Youyi Road, Xi'an, 710072, China.
- Jingyu Wang
- School of Astronautics, Northwestern Polytechnical University, 127 West Youyi Road, Xi'an, 710072, China.
- Samanta Rosati
- Department of Electronics and Telecommunications, Politecnico di Torino, Corso Duca degli Abruzzi 24, Turin, 10129, Italy.
- Valentina Giannini
- Department of Surgical Sciences, University of Turin, Turin, 10124, Italy; Radiology Unit, Candiolo Cancer Institute, FPO-IRCCS, Candiolo, 10060, Italy.
3. Crawley R, Amirrajab S, Lustermans D, Holtackers RJ, Plein S, Veta M, Breeuwer M, Chiribiri A, Scannell CM. Automated cardiovascular MR myocardial scar quantification with unsupervised domain adaptation. Eur Radiol Exp 2024; 8:93. [PMID: 39143405] [PMCID: PMC11324636] [DOI: 10.1186/s41747-024-00497-3] [Received: 01/03/2024] [Accepted: 07/15/2024]
Abstract
Quantification of myocardial scar from late gadolinium enhancement (LGE) cardiovascular magnetic resonance (CMR) images can be facilitated by automated artificial intelligence (AI)-based analysis. However, AI models are susceptible to domain shifts, in which model performance degrades when applied to data with different characteristics than the original training data. In this study, CycleGAN models were trained to translate local hospital data to the appearance of a public LGE CMR dataset. After domain adaptation, an AI scar quantification pipeline including myocardium segmentation, scar segmentation, and computation of scar burden, previously developed on the public dataset, was evaluated on an external test set of 44 patients clinically assessed for ischemic scar. The mean ± standard deviation Dice similarity coefficients between the manual and AI-predicted segmentations were similar to those previously reported: 0.76 ± 0.05 for myocardium, 0.75 ± 0.32 for scar across all patients, and 0.41 ± 0.12 for scar in scans with pathological findings. Bland-Altman analysis showed a mean bias in scar burden percentage of -0.62% with limits of agreement from -8.4% to 7.17%. These results show the feasibility of deploying AI models, trained with public data, for LGE CMR quantification on local clinical data using unsupervised CycleGAN-based domain adaptation. RELEVANCE STATEMENT: Our study demonstrates that AI models trained on public databases can be applied to patient data acquired at a specific institution with different acquisition settings, without additional manual labor to obtain further training labels.
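CycleGAN-based adaptation rests on cycle consistency: an image translated to the public-dataset appearance and back should match the original. A toy sketch of that loss with stand-in "generators" (a brightness shift and its inverse; the real generators are trained CNNs):

```python
import numpy as np

def cycle_consistency_loss(x, g_ab, g_ba):
    """L1 cycle loss: translate A -> B -> A and compare with the input.

    g_ab, g_ba: callables standing in for the two CycleGAN generators.
    """
    reconstructed = g_ba(g_ab(x))
    return np.abs(x - reconstructed).mean()

# Toy 'generators': a brightness shift and its inverse.
g_ab = lambda img: img + 0.5   # local-hospital -> public-dataset appearance
g_ba = lambda img: img - 0.5   # and back
img = np.linspace(0.0, 1.0, 8)
print(cycle_consistency_loss(img, g_ab, g_ba))  # near zero for a perfect inverse
```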
Affiliation(s)
- Richard Crawley
- School of Biomedical Engineering & Imaging Sciences, King's College London, London, UK
- Sina Amirrajab
- Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, the Netherlands
- Didier Lustermans
- Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, the Netherlands
- Robert J Holtackers
- Cardiovascular Research Institute Maastricht (CARIM), Maastricht University, Maastricht, the Netherlands
- Department of Radiology and Nuclear Medicine, Maastricht University Medical Center, Maastricht, the Netherlands
- Sven Plein
- School of Biomedical Engineering & Imaging Sciences, King's College London, London, UK
- Leeds Institute of Cardiovascular and Metabolic Medicine, University of Leeds, Leeds, UK
- Mitko Veta
- Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, the Netherlands
- Marcel Breeuwer
- Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, the Netherlands
- Amedeo Chiribiri
- School of Biomedical Engineering & Imaging Sciences, King's College London, London, UK
- Cian M Scannell
- School of Biomedical Engineering & Imaging Sciences, King's College London, London, UK.
- Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, the Netherlands.
4. Qiu Y, Guo H, Wang S, Yang S, Peng X, Xiayao D, Chen R, Yang J, Liu J, Li M, Li Z, Chen H, Chen M. Deep learning-based multimodal fusion of the surface ECG and clinical features in prediction of atrial fibrillation recurrence following catheter ablation. BMC Med Inform Decis Mak 2024; 24:225. [PMID: 39118118] [PMCID: PMC11308714] [DOI: 10.1186/s12911-024-02616-x] [Received: 01/23/2024] [Accepted: 07/22/2024]
Abstract
BACKGROUND Despite improvements in treatment strategies for atrial fibrillation (AF), a significant proportion of patients still experience recurrence after ablation. This study proposes a novel Transformer-based algorithm that uses surface electrocardiogram (ECG) signals and clinical features to predict AF recurrence. METHODS Between October 2018 and December 2021, patients who underwent index radiofrequency ablation for AF with at least one standard 10-second surface ECG during sinus rhythm were enrolled. An end-to-end deep learning framework based on the Transformer and a fusion module was used to predict AF recurrence from ECG and clinical features. Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, accuracy and F1-score. RESULTS A total of 920 patients (median age 61 [IQR 14] years, 66.3% male) were included. After a median follow-up of 24 months, 253 patients (27.5%) experienced AF recurrence. Deep learning on ECG signals alone identified AF recurrence with an AUROC of 0.769, sensitivity of 75.5%, specificity of 61.1%, F1 score of 55.6% and overall accuracy of 65.2%. Combining ECG signals and clinical features increased the AUROC to 0.899, sensitivity to 81.1%, specificity to 81.7%, F1 score to 71.7%, and overall accuracy to 81.5%. CONCLUSIONS The Transformer algorithm demonstrated excellent performance in predicting AF recurrence. Integrating ECG and clinical features enhanced the model's performance and may help identify patients at low risk of AF recurrence after index ablation.
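The fusion step can be sketched as late fusion: an embedding of the ECG (in the paper, produced by a Transformer encoder) is concatenated with clinical features and passed to a classifier. All values below are hypothetical stand-ins, not the paper's learned model:

```python
import numpy as np

def fuse_and_predict(ecg_embedding, clinical, w, b):
    """Late-fusion sketch: concatenate an ECG encoder embedding with
    clinical features, then apply a linear classifier + sigmoid."""
    z = np.concatenate([ecg_embedding, clinical])
    logit = z @ w + b
    return 1.0 / (1.0 + np.exp(-logit))  # probability of AF recurrence

ecg_embedding = np.array([0.2, -1.0, 0.5])   # stand-in Transformer output
clinical = np.array([1.0, 0.0])              # e.g. scaled age, sex
w = np.array([0.4, -0.3, 0.1, 0.8, -0.2])    # hypothetical trained weights
p = fuse_and_predict(ecg_embedding, clinical, w, b=-0.5)
print(round(p, 3))  # a probability in (0, 1)
```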
Affiliation(s)
- Yue Qiu
- Department of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Guangzhou Road, Nanjing, 210029, Jiangsu Province, China
- Hongcheng Guo
- State Key Lab of Software Development Environment, Beihang University, Beijing, 100191, China
- Shixin Wang
- Department of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Guangzhou Road, Nanjing, 210029, Jiangsu Province, China
- Shu Yang
- Department of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Guangzhou Road, Nanjing, 210029, Jiangsu Province, China
- Xiafeng Peng
- Department of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Guangzhou Road, Nanjing, 210029, Jiangsu Province, China
- Dongqin Xiayao
- Department of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Guangzhou Road, Nanjing, 210029, Jiangsu Province, China
- Renjie Chen
- State Key Lab of Software Development Environment, Beihang University, Beijing, 100191, China
- Jian Yang
- State Key Lab of Software Development Environment, Beihang University, Beijing, 100191, China
- Jiaheng Liu
- State Key Lab of Software Development Environment, Beihang University, Beijing, 100191, China
- Mingfang Li
- Department of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Guangzhou Road, Nanjing, 210029, Jiangsu Province, China
- Zhoujun Li
- State Key Lab of Software Development Environment, Beihang University, Beijing, 100191, China
- Hongwu Chen
- Department of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Guangzhou Road, Nanjing, 210029, Jiangsu Province, China
- Minglong Chen
- Department of Cardiology, The First Affiliated Hospital of Nanjing Medical University, Guangzhou Road, Nanjing, 210029, Jiangsu Province, China.
5. Hua Y, Xu K, Yang X. Variational image registration with learned prior using multi-stage VAEs. Comput Biol Med 2024; 178:108785. [PMID: 38925089] [DOI: 10.1016/j.compbiomed.2024.108785] [Received: 11/23/2023] [Revised: 05/16/2024] [Accepted: 06/15/2024]
Abstract
Variational Autoencoders (VAEs) are an efficient variational inference technique coupled with a generative network. Owing to the uncertainty estimates provided by variational inference, VAEs have been applied to medical image registration. However, a critical problem in VAEs is that a simple prior cannot provide suitable regularization, which leads to a mismatch between the variational posterior and the prior. An optimal prior can close the gap between the true posterior and the variational posterior. In this paper, we propose a multi-stage VAE to learn the optimal prior, which is the aggregated posterior. A lightweight VAE is used to generate the aggregated posterior as a whole, an effective way to estimate the distribution of the high-dimensional aggregated posterior that commonly arises in VAE-based medical image registration. A factorized telescoping classifier is trained to estimate the density ratio between a simple given prior and the aggregated posterior, aiming to calculate the KL divergence between the variational and aggregated posteriors more accurately. We analyze the KL divergence and find that the finer the factorization, the smaller the KL divergence; however, too fine a partition is not conducive to registration accuracy. Moreover, the diagonal hypothesis on the variational posterior's covariance ignores the relationships between latent variables in image registration. To address this issue, we learn a covariance matrix with low-rank structure to enable correlations between the dimensions of the variational posterior. The covariance matrix is further used as a measure to reduce the uncertainty of deformation fields. Experimental results on four public medical image datasets demonstrate that our proposed method outperforms other methods in negative log-likelihood (NLL) and achieves better registration accuracy.
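The low-rank covariance idea can be sketched with the reparameterisation trick: adding a rank-1 factor to a diagonal Gaussian induces correlations between latent dimensions that a purely diagonal posterior ignores. This is an illustrative sketch, not the paper's exact parameterisation:

```python
import numpy as np

def sample_low_rank_gaussian(mu, diag_std, u, rng):
    """Reparameterised sample from N(mu, diag(diag_std**2) + u u^T).

    The rank-1 factor `u` correlates latent dimensions; with u = 0 this
    reduces to the usual diagonal-covariance VAE posterior sample.
    """
    eps_d = rng.normal(size=mu.shape)  # per-dimension noise
    eps_r = rng.normal()               # shared noise along the rank-1 factor
    return mu + diag_std * eps_d + u * eps_r

rng = np.random.default_rng(0)
mu = np.zeros(3)
diag_std = np.array([0.1, 0.1, 0.1])
u = np.array([1.0, 1.0, 0.0])          # dims 0 and 1 become correlated
samples = np.stack([sample_low_rank_gaussian(mu, diag_std, u, rng)
                    for _ in range(5000)])
corr01 = np.corrcoef(samples[:, 0], samples[:, 1])[0, 1]
print(round(corr01, 2))  # close to 1: dims 0 and 1 co-vary
```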
Affiliation(s)
- Yong Hua
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, Guangdong, China
- Kangrong Xu
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, Guangdong, China
- Xuan Yang
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, Guangdong, China.
6. Ni R, Han K, Haibe-Kains B, Rink A. Generalizability of deep learning in organ-at-risk segmentation: A transfer learning study in cervical brachytherapy. Radiother Oncol 2024; 197:110332. [PMID: 38763356] [DOI: 10.1016/j.radonc.2024.110332] [Received: 01/16/2024] [Revised: 05/02/2024] [Accepted: 05/03/2024]
Abstract
PURPOSE Deep learning can automate delineation in radiation therapy, reducing time and variability. Yet, its efficacy varies across institutions, scanners, and settings, emphasizing the need for adaptable and robust models in clinical environments. Our study demonstrates the effectiveness of transfer learning (TL) in enhancing the generalizability of deep learning models for auto-segmentation of organs-at-risk (OARs) in cervical brachytherapy. METHODS A pre-trained model was developed using 120 scans with ring and tandem applicator on a 3T magnetic resonance (MR) scanner (RT3). Four OARs were segmented and evaluated. Segmentation performance was evaluated by volumetric Dice similarity coefficient (vDSC), 95% Hausdorff distance (HD95), surface DSC, and added path length (APL). The model was fine-tuned on three out-of-distribution target groups. Pre- and post-TL outcomes, and the influence of the number of fine-tuning scans, were compared. A model trained on one group (Single) and a model trained on all four groups (Mixed) were evaluated on both seen and unseen data distributions. RESULTS TL enhanced segmentation accuracy across target groups, matching the pre-trained model's performance. The first five fine-tuning scans led to the most noticeable improvements, with performance plateauing with more data. TL outperformed training from scratch given the same training data. The Mixed model performed similarly to the Single model on RT3 scans but demonstrated superior performance on unseen data. CONCLUSIONS TL can improve a model's generalizability for OAR segmentation in MR-guided cervical brachytherapy, requiring less fine-tuning data and reduced training time. These results provide a foundation for developing adaptable models to accommodate diverse clinical settings.
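Reduced to its core, the TL recipe is: keep the pre-trained weights fixed and fit only a small part of the model on the target-domain scans. A toy numpy sketch with a single frozen ReLU layer and an MSE-trained linear head (the real models are full segmentation networks; all shapes and values here are illustrative):

```python
import numpy as np

def fine_tune_head(x, y, w_frozen, w_head, lr=0.05, steps=2000):
    """Transfer-learning sketch: freeze the pre-trained layer, update the head.

    Features come from the frozen ReLU layer; the linear head is fit to
    target-domain labels with plain gradient descent on the MSE loss.
    """
    feats = np.maximum(x @ w_frozen, 0.0)               # frozen pre-trained layer
    for _ in range(steps):
        grad = feats.T @ (feats @ w_head - y) / len(y)  # gradient w.r.t. head only
        w_head = w_head - lr * grad
    return w_head

rng = np.random.default_rng(1)
x = rng.normal(size=(64, 4))
w_frozen = rng.normal(size=(4, 3))                      # stands in for pre-training
y = np.maximum(x @ w_frozen, 0.0) @ np.array([1.0, -2.0, 0.5])
w_head = fine_tune_head(x, y, w_frozen, w_head=np.zeros(3))
print(np.round(w_head, 2))  # close to the generating head [1, -2, 0.5]
```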
Affiliation(s)
- Ruiyan Ni
- Department of Medical Biophysics, University of Toronto, Toronto, Canada
- Kathy Han
- Princess Margaret Cancer Center, University Health Network, Toronto, Canada; Department of Radiation Oncology, University of Toronto, Toronto, Canada
- Benjamin Haibe-Kains
- Department of Medical Biophysics, University of Toronto, Toronto, Canada; Princess Margaret Cancer Center, University Health Network, Toronto, Canada; Vector Institute, Toronto, Canada.
- Alexandra Rink
- Department of Medical Biophysics, University of Toronto, Toronto, Canada; Princess Margaret Cancer Center, University Health Network, Toronto, Canada; Department of Radiation Oncology, University of Toronto, Toronto, Canada.
7. Nerella S, Bandyopadhyay S, Zhang J, Contreras M, Siegel S, Bumin A, Silva B, Sena J, Shickel B, Bihorac A, Khezeli K, Rashidi P. Transformers and large language models in healthcare: A review. Artif Intell Med 2024; 154:102900. [PMID: 38878555] [DOI: 10.1016/j.artmed.2024.102900] [Received: 09/01/2023] [Revised: 05/28/2024] [Accepted: 05/30/2024]
Abstract
With Artificial Intelligence (AI) increasingly permeating various aspects of society, including healthcare, the adoption of the Transformer neural network architecture is rapidly changing many applications. The Transformer is a deep learning architecture initially developed for general-purpose Natural Language Processing (NLP) tasks that has subsequently been adapted in many fields, including healthcare. In this survey paper, we provide an overview of how this architecture has been adopted to analyze various forms of healthcare data, including clinical NLP, medical imaging, structured Electronic Health Records (EHR), social media, bio-physiological signals, and biomolecular sequences. Furthermore, we also include articles that used the Transformer architecture for generating surgical instructions and predicting adverse outcomes after surgery, under the umbrella of critical care. Across diverse settings, these models have been used for clinical diagnosis, report generation, data reconstruction, and drug/protein synthesis. Finally, we discuss the benefits and limitations of using Transformers in healthcare and examine issues such as computational cost, model interpretability, fairness, alignment with human values, ethical implications, and environmental impact.
Affiliation(s)
- Subhash Nerella
- Department of Biomedical Engineering, University of Florida, Gainesville, United States
- Jiaqing Zhang
- Department of Electrical and Computer Engineering, University of Florida, Gainesville, United States
- Miguel Contreras
- Department of Biomedical Engineering, University of Florida, Gainesville, United States
- Scott Siegel
- Department of Biomedical Engineering, University of Florida, Gainesville, United States
- Aysegul Bumin
- Department of Computer and Information Science and Engineering, University of Florida, Gainesville, United States
- Brandon Silva
- Department of Computer and Information Science and Engineering, University of Florida, Gainesville, United States
- Jessica Sena
- Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Benjamin Shickel
- Department of Medicine, University of Florida, Gainesville, United States
- Azra Bihorac
- Department of Medicine, University of Florida, Gainesville, United States
- Kia Khezeli
- Department of Biomedical Engineering, University of Florida, Gainesville, United States
- Parisa Rashidi
- Department of Biomedical Engineering, University of Florida, Gainesville, United States.
8. Zheng H, Liu X, Huang Z, Ren Y, Fu B, Shi T, Liu L, Guo Q, Tian C, Liang D, Wang R, Chen J, Hu Z. Deep learning for intracranial aneurysm segmentation using CT angiography. Phys Med Biol 2024; 69:155024. [PMID: 39008990] [DOI: 10.1088/1361-6560/ad6372] [Received: 01/15/2024] [Accepted: 07/15/2024]
Abstract
Objective. This study aimed to employ a two-stage deep learning method to accurately detect small aneurysms (4-10 mm in size) in computed tomography angiography images. Approach. This study included 956 patients from 6 hospitals and a public dataset obtained with 6 CT scanners from different manufacturers. The proposed method consists of two components: a lightweight and fast head region selection (HRS) algorithm and an adaptive 3D nnU-Net, which is used as the main architecture for segmenting aneurysms. Segments generated by the deep neural network were compared with expert-generated manual segmentations and assessed using Dice scores. Main Results. The area under the curve (AUC) exceeded 79% across all datasets. In particular, the precision and AUC reached 85.2% and 87.6%, respectively, on certain datasets. The experimental results demonstrated the promising performance of this approach, which reduced the inference time by more than 50% compared to direct inference without HRS. Significance. The deep learning approach we developed can accurately segment aneurysms by automatically localizing brain regions and, compared with a model without HRS, can accelerate aneurysm inference by more than 50%.
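The Dice score used to assess the segmentations is twice the overlap between the two masks divided by their total size. A small sketch on toy binary masks:

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

# Toy masks: 4 predicted voxels, 4 true voxels, 3 overlapping.
pred = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 0]])
target = np.array([[1, 1, 0], [1, 0, 0], [0, 0, 1]])
print(round(dice_score(pred, target), 3))  # 2*3 / (4+4) = 0.75
```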
Affiliation(s)
- Huizhong Zheng
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, People's Republic of China
- Xinfeng Liu
- Department of Radiology, Guizhou Provincial People's Hospital, Guiyang 550002, People's Republic of China
- Zhenxing Huang
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, People's Republic of China
- Yan Ren
- AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, Shenzhen 518055, People's Republic of China
- School of Electronic and Computer Engineering, Peking University Shenzhen Graduate School, Shenzhen 518005, People's Republic of China
- Bin Fu
- AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, Shenzhen 518055, People's Republic of China
- School of Electronic and Computer Engineering, Peking University Shenzhen Graduate School, Shenzhen 518005, People's Republic of China
- Tianliang Shi
- Department of Radiology, Tongren Municipal People's Hospital, Tongren, Guizhou 554300, People's Republic of China
- Lu Liu
- Department of Radiology, The Second People's Hospital of Guiyang, Guiyang, Guizhou 550002, People's Republic of China
- Qiping Guo
- Department of Radiology, Xingyi Municipal People's Hospital, Xingyi, Guizhou 562400, People's Republic of China
- Chong Tian
- Department of Radiology, Guizhou Provincial People's Hospital, Guiyang 550002, People's Republic of China
- Dong Liang
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, People's Republic of China
- Rongpin Wang
- Department of Radiology, Guizhou Provincial People's Hospital, Guiyang 550002, People's Republic of China
- Jie Chen
- AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, Shenzhen 518055, People's Republic of China
- School of Electronic and Computer Engineering, Peking University Shenzhen Graduate School, Shenzhen 518005, People's Republic of China
- Peng Cheng Laboratory, Shenzhen 518005, People's Republic of China
- Zhanli Hu
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, People's Republic of China
9. Pham TV, Vu TN, Le HMQ, Pham VT, Tran TT. CapNet: An Automatic Attention-Based with Mixer Model for Cardiovascular Magnetic Resonance Image Segmentation. J Imaging Inform Med 2024. [PMID: 38980628] [DOI: 10.1007/s10278-024-01191-x] [Received: 12/24/2023] [Revised: 05/21/2024] [Accepted: 05/22/2024]
Abstract
Deep neural networks have shown excellent performance in medical image segmentation, especially for cardiac images. Transformer-based models, though advantageous over convolutional neural networks due to their ability to learn long-range dependencies, still have shortcomings such as a large number of parameters and high computational cost. Additionally, for better results, they are often pretrained on larger data, thus requiring large memory and increasing resource expenses. In this study, we propose a new lightweight but efficient model, namely CapNet, based on convolutions and mixing modules, for cardiac segmentation from magnetic resonance images (MRI) that can be trained from scratch with a small number of parameters. To handle the varying sizes and shapes that often occur across cardiac systolic and diastolic phases, we propose attention modules for pooling, spatial, and channel information. We also propose a novel loss, the Tversky Shape Power Distance, based on the shape dissimilarity between labels and predictions, which shows promising performance compared to other losses. Experiments on three public datasets, the ACDC benchmark, the Sunnybrook data, and the MS-CMR challenge, are conducted and compared with the state of the art (SOTA). For binary segmentation, the proposed CapNet obtained Dice similarity coefficients (DSC) of 94% and 95.93% for the Endocardium and Epicardium regions, respectively, on the Sunnybrook dataset, and 94.49% for Endocardium and 96.82% for Epicardium on the ACDC data. In the multiclass case, the average DSC of CapNet is 93.05% on the ACDC data, and the DSC scores on the MS-CMR are 94.59%, 92.22%, and 93.99% for the bSSFP, T2-SPAIR, and LGE sequences, respectively.
Moreover, statistical significance tests (p-value < 0.05) against Transformer-based methods and several CNN-based approaches demonstrated that CapNet, though having fewer training parameters, achieves statistically significant improvements. The evaluation metrics show comparable results in both Dice and IoU indices to SOTA CNN-based and Transformer-based architectures.
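The exact Tversky Shape Power Distance is not reproduced in the abstract, but it builds on the Tversky index family, which generalises Dice by weighting false positives and false negatives separately. A sketch of the standard Tversky loss on soft predictions (parameter values are illustrative):

```python
import numpy as np

def tversky_loss(pred, target, alpha=0.7, beta=0.3, eps=1e-7):
    """Standard Tversky loss (1 - Tversky index) on soft predictions.

    alpha weights false positives, beta false negatives;
    alpha = beta = 0.5 recovers the Dice loss.
    """
    tp = (pred * target).sum()
    fp = (pred * (1 - target)).sum()
    fn = ((1 - pred) * target).sum()
    return 1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)

pred = np.array([0.9, 0.8, 0.1, 0.2])    # soft mask probabilities
target = np.array([1.0, 1.0, 0.0, 0.0])  # ground-truth labels
print(round(tversky_loss(pred, target), 4))  # 1 - 1.7/2.0 = 0.15
```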
Affiliation(s)
- Tien Viet Pham
- Department of Automation Engineering, School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
- Tu Ngoc Vu
- Department of Automation Engineering, School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
- Hoang-Minh-Quang Le
- Department of Automation Engineering, School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
- Van-Truong Pham
- Department of Automation Engineering, School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
- Thi-Thao Tran
- Department of Automation Engineering, School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam.
10. Jafari R, Verma R, Aggarwal V, Gupta RK, Singh A. Deep learning-based segmentation of left ventricular myocardium on dynamic contrast-enhanced MRI: a comprehensive evaluation across temporal frames. Int J Comput Assist Radiol Surg 2024. [PMID: 38965165] [DOI: 10.1007/s11548-024-03221-z] [Received: 01/10/2024] [Accepted: 06/24/2024]
Abstract
PURPOSE Cardiac perfusion MRI is vital for disease diagnosis, treatment planning, and risk stratification, with anomalies serving as markers of underlying ischemic pathologies. AI-assisted methods and tools enable accurate and efficient left ventricular (LV) myocardium segmentation on all DCE-MRI timeframes, offering a solution to the challenges posed by the multidimensional nature of the data. This study aims to develop and assess an automated method for LV myocardial segmentation on DCE-MRI data from a local hospital. METHODS The study consists of retrospective DCE-MRI data from 55 subjects acquired at the local hospital using a 1.5 T MRI scanner. The dataset included subjects with and without cardiac abnormalities. The timepoint of the reference frame (post-contrast LV myocardium) was identified using the standard deviation across the temporal sequences. Iterative image registration of the other temporal images with respect to this reference image was performed using Maxwell's demons algorithm. The registered stack was fed to a model built on the U-Net framework to predict the LV myocardium at all timeframes of the DCE-MRI. RESULTS The mean ± standard deviation of the Dice similarity coefficient (DSC) for myocardial segmentation using the pre-trained network Net_cine is 0.78 ± 0.04, and for the fine-tuned network Net_dyn, which predicts masks on all timeframes individually, it is 0.78 ± 0.03. The DSC for Net_dyn ranged from 0.71 to 0.93. The average DSC achieved for the reference frame is 0.82 ± 0.06. CONCLUSION The study proposed a fast and fully automated AI-assisted method to segment the LV myocardium on all timeframes of DCE-MRI data. The method is robust, its performance is independent of intra-temporal sequence registration, and it can easily accommodate timeframes with potential registration errors.
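One plausible reading of the reference-frame step (an assumption on our part; the abstract does not give the exact formula) is to pick the timeframe whose spatial standard deviation is largest, as a proxy for peak contrast enhancement:

```python
import numpy as np

def select_reference_frame(frames):
    """Pick the reference timeframe as the one with the highest spatial
    standard deviation (assumed proxy for peak myocardial enhancement).

    frames: (T, H, W) DCE-MRI time series.
    """
    return int(np.argmax(frames.std(axis=(1, 2))))

# Toy series: frame 2 has the strongest intensity variation.
t = np.stack([np.zeros((4, 4)),
              np.eye(4),
              np.eye(4) * 5.0])
print(select_reference_frame(t))  # 2
```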
Affiliation(s)
- Raufiya Jafari: Centre for Biomedical Engineering, Indian Institute of Technology Delhi, New Delhi, Delhi, 110016, India
- Radhakrishan Verma: Department of Radiology, Fortis Memorial Research Institute, Gurugram, India
- Vinayak Aggarwal: Department of Cardiology, Fortis Memorial Research Institute, Gurugram, India
- Rakesh Kumar Gupta: Department of Radiology, Fortis Memorial Research Institute, Gurugram, India
- Anup Singh: Centre for Biomedical Engineering, Indian Institute of Technology Delhi, New Delhi, Delhi, 110016, India; Department of Biomedical Engineering, All India Institute of Medical Sciences, New Delhi, Delhi, India; Yardi School of Artificial Intelligence, Indian Institute of Technology Delhi, New Delhi, Delhi, India
11
Guan H, Yap PT, Bozoki A, Liu M. Federated learning for medical image analysis: A survey. Pattern Recognition 2024; 151:110424. [PMID: 38559674] [PMCID: PMC10976951] [DOI: 10.1016/j.patcog.2024.110424]
Abstract
Machine learning in medical imaging often faces a fundamental dilemma, namely, the small sample size problem. Many recent studies suggest using multi-domain data pooled from different acquisition sites/centers to improve statistical power. However, medical images from different sites cannot be easily shared to build large datasets for model training due to privacy protection reasons. As a promising solution, federated learning, which enables collaborative training of machine learning models based on data from different sites without cross-site data sharing, has attracted considerable attention recently. In this paper, we conduct a comprehensive survey of the recent development of federated learning methods in medical image analysis. We have systematically gathered research papers on federated learning and its applications in medical image analysis published between 2017 and 2023. Our search and compilation were conducted using databases from IEEE Xplore, ACM Digital Library, Science Direct, Springer Link, Web of Science, Google Scholar, and PubMed. In this survey, we first introduce the background of federated learning for dealing with privacy protection and collaborative learning issues. We then present a comprehensive review of recent advances in federated learning methods for medical image analysis. Specifically, existing methods are categorized based on three critical aspects of a federated learning system, including client end, server end, and communication techniques. In each category, we summarize the existing federated learning methods according to specific research problems in medical image analysis and also provide insights into the motivations of different approaches. In addition, we provide a review of existing benchmark medical imaging datasets and software platforms for current federated learning research. We also conduct an experimental study to empirically evaluate typical federated learning methods for medical image analysis. This survey can help to better understand the current research status, challenges, and potential research opportunities in this promising research field.
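The server-end aggregation step at the core of most surveyed systems is federated averaging (FedAvg); a minimal sketch of the weighted parameter average, assuming each client reports its model weights and local sample count (a generic illustration, not any specific surveyed method):

```python
def fedavg(client_weights, client_sizes):
    """Weighted average of client parameter vectors (FedAvg server step).

    client_weights: list of equal-length parameter lists, one per client/site.
    client_sizes:   number of local training samples at each site.
    No raw images leave the sites; only parameters are exchanged.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    global_w = [0.0] * n_params
    for w, n in zip(client_weights, client_sizes):
        for i in range(n_params):
            global_w[i] += (n / total) * w[i]
    return global_w

# Two hospitals with different dataset sizes (toy 2-parameter models)
w = fedavg([[1.0, 2.0], [3.0, 4.0]], client_sizes=[100, 300])
```

The larger site contributes proportionally more to the global model, which is then broadcast back to the clients for the next local training round.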
Affiliation(s)
- Hao Guan: Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Pew-Thian Yap: Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Andrea Bozoki: Department of Neurology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Mingxia Liu: Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
12
Aghapanah H, Rasti R, Kermani S, Tabesh F, Banaem HY, Aliakbar HP, Sanei H, Segars WP. CardSegNet: An adaptive hybrid CNN-vision transformer model for heart region segmentation in cardiac MRI. Comput Med Imaging Graph 2024; 115:102382. [PMID: 38640619] [DOI: 10.1016/j.compmedimag.2024.102382]
Abstract
Cardiovascular MRI (CMRI) is a non-invasive imaging technique for assessing the structure and function of the blood circulatory system. Precise image segmentation is required to measure cardiac parameters and diagnose abnormalities from CMRI data. Because of anatomical heterogeneity and image variations, cardiac image segmentation is a challenging task. Quantification of cardiac parameters requires high-performance segmentation of the left ventricle (LV), right ventricle (RV), and LV myocardium from the background. The most direct solution is manual segmentation of the regions, which is a time-consuming and error-prone procedure. Many semi- or fully automatic solutions have therefore been proposed recently, among which deep learning-based methods have shown high performance in segmenting regions in CMRI data. In this study, a self-adaptive multi-attention (SMA) module is introduced to adaptively leverage multiple attention mechanisms for better segmentation. The SMA integrates convolution-based position and channel attention mechanisms with a patch tokenization-based vision transformer (ViT) attention mechanism in a hybrid, end-to-end manner. The CNN- and ViT-based attentions mine short- and long-range dependencies for more precise segmentation. The SMA module is applied in an encoder-decoder structure with a ResNet50 backbone, named CardSegNet. Furthermore, a deep supervision method with multiple loss functions is introduced to the CardSegNet optimizer to reduce overfitting and enhance the model's performance. The proposed model is validated on the ACDC2017 (n=100), M&Ms (n=321), and a local dataset (n=22) using 10-fold cross-validation, with promising segmentation results demonstrating that it outperforms its counterparts.
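Deep supervision of the kind used here combines segmentation losses computed at several decoder depths into one objective; a hedged sketch of such a weighted multi-loss aggregation (the weighting scheme below is illustrative, not the paper's exact configuration):

```python
def deep_supervision_loss(stage_losses, weights=None):
    """Combine per-stage segmentation losses into one training objective.

    stage_losses[0] is the full-resolution head; later entries are coarser
    auxiliary heads. Coarser stages typically get smaller weights so the
    final output dominates the gradient signal.
    """
    if weights is None:
        # Halve the weight at each coarser stage, then normalise to sum to 1
        raw = [0.5 ** i for i in range(len(stage_losses))]
        s = sum(raw)
        weights = [r / s for r in raw]
    return sum(w * l for w, l in zip(weights, stage_losses))

# Losses from the full-resolution head and two auxiliary decoder heads
total = deep_supervision_loss([0.2, 0.4, 0.8])
```

Supervising intermediate decoder stages gives earlier layers a direct gradient path, which is what accelerates convergence and discourages overfitting to the final layer alone.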
Affiliation(s)
- Hamed Aghapanah: School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Reza Rasti: Department of Biomedical Engineering, Faculty of Engineering, University of Isfahan, Isfahan, Iran; Department of Biomedical Engineering, Duke University, Durham, NC 27708, USA
- Saeed Kermani: School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Faezeh Tabesh: Cardiovascular Research Institute, Isfahan University of Medical Sciences, Isfahan, Iran
- Hossein Yousefi Banaem: Skull Base Research Center, Loghman Hakim Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Hamidreza Pour Aliakbar: Rajaie Cardiovascular Medical and Research Center, Iran University of Medical Sciences, Tehran, Iran
- Hamid Sanei: Cardiovascular Research Institute, Isfahan University of Medical Sciences, Isfahan, Iran
- William Paul Segars: Department of Biomedical Engineering, Faculty of Engineering, University of Isfahan, Isfahan, Iran
13
Haberl D, Spielvogel CP, Jiang Z, Orlhac F, Iommi D, Carrió I, Buvat I, Haug AR, Papp L. Multicenter PET image harmonization using generative adversarial networks. Eur J Nucl Med Mol Imaging 2024; 51:2532-2546. [PMID: 38696130] [PMCID: PMC11224088] [DOI: 10.1007/s00259-024-06708-8]
Abstract
PURPOSE To improve the reproducibility and predictive performance of PET radiomic features in multicentric studies via cycle-consistent generative adversarial network (GAN) harmonization approaches. METHODS GAN-harmonization was developed to harmonize whole-body PET scans by performing image style and texture translation between different centers and scanners. It was evaluated on two retrospectively collected open datasets and two tasks. First, GAN-harmonization was performed on a dual-center lung cancer cohort (127 female, 138 male), where the reproducibility of radiomic features in healthy liver tissue was evaluated. Second, GAN-harmonization was applied to a head and neck cancer cohort (43 female, 154 male) acquired from three centers. Here, the clinical impact of GAN-harmonization was analyzed by predicting the development of distant metastases using a logistic regression model incorporating first-order statistics and texture features from baseline 18F-FDG PET before and after harmonization. RESULTS Image quality remained high (structural similarity: left kidney ≥ 0.800, right kidney ≥ 0.806, liver ≥ 0.780, lung ≥ 0.838, spleen ≥ 0.793, whole-body ≥ 0.832) after image harmonization across all utilized datasets. With GAN-harmonization, inter-site reproducibility of radiomic features in healthy liver tissue increased by at least 5 ± 14% (first-order), 16 ± 7% (GLCM), 19 ± 5% (GLRLM), 16 ± 8% (GLSZM), 17 ± 6% (GLDM), and 23 ± 14% (NGTDM). In the head and neck cancer cohort, outcome prediction improved from AUC 0.68 (95% CI 0.66-0.71) to AUC 0.73 (0.71-0.75) with GAN-harmonization. CONCLUSIONS GANs can perform image harmonization and increase the reproducibility and predictive performance of radiomic features derived from different centers and scanners.
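Cycle-consistent GAN harmonization of this kind typically optimises the standard CycleGAN objective, in which generators $G_{AB}$ and $G_{BA}$ translate between the image styles of center $A$ and center $B$ while a cycle term preserves the underlying anatomy; a reconstruction in the usual notation (our sketch of the standard form, not the authors' exact formulation):

```latex
% Standard CycleGAN objective: adversarial terms plus cycle-consistency
\mathcal{L}(G_{AB}, G_{BA}, D_A, D_B) =
    \mathcal{L}_{\mathrm{GAN}}(G_{AB}, D_B)
  + \mathcal{L}_{\mathrm{GAN}}(G_{BA}, D_A)
  + \lambda\,\mathcal{L}_{\mathrm{cyc}},
\qquad
\mathcal{L}_{\mathrm{cyc}} =
    \mathbb{E}_{x \sim p_A}\!\bigl[\lVert G_{BA}(G_{AB}(x)) - x \rVert_1\bigr]
  + \mathbb{E}_{y \sim p_B}\!\bigl[\lVert G_{AB}(G_{BA}(y)) - y \rVert_1\bigr]
```

The cycle term $\mathcal{L}_{\mathrm{cyc}}$ is what allows training without paired scans of the same patient at both centers: a scan translated to the other center's style and back must reconstruct the original.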
Affiliation(s)
- David Haberl: Division of Nuclear Medicine, Medical University of Vienna, Währinger Gürtel 18-20/E4L, A-1090, Vienna, Austria
- Clemens P Spielvogel: Division of Nuclear Medicine, Medical University of Vienna, Währinger Gürtel 18-20/E4L, A-1090, Vienna, Austria; Christian Doppler Laboratory for Applied Metabolomics, Medical University of Vienna, Vienna, Austria
- Zewen Jiang: Division of Nuclear Medicine, Medical University of Vienna, Währinger Gürtel 18-20/E4L, A-1090, Vienna, Austria; Christian Doppler Laboratory for Applied Metabolomics, Medical University of Vienna, Vienna, Austria
- Fanny Orlhac: LITO Laboratory, U1288 Inserm, Institut Curie, University Paris-Saclay, Orsay, France
- David Iommi: Division of Nuclear Medicine, Medical University of Vienna, Währinger Gürtel 18-20/E4L, A-1090, Vienna, Austria
- Ignasi Carrió: Department of Nuclear Medicine, Hospital Sant Pau and Autonomous University of Barcelona, Barcelona, Spain
- Irène Buvat: LITO Laboratory, U1288 Inserm, Institut Curie, University Paris-Saclay, Orsay, France
- Alexander R Haug: Division of Nuclear Medicine, Medical University of Vienna, Währinger Gürtel 18-20/E4L, A-1090, Vienna, Austria; Christian Doppler Laboratory for Applied Metabolomics, Medical University of Vienna, Vienna, Austria
- Laszlo Papp: Center for Medical Physics and Biomedical Engineering, Medical University of Vienna, Vienna, Austria
14
Zhang Q, Fotaki A, Ghadimi S, Wang Y, Doneva M, Wetzl J, Delfino JG, O'Regan DP, Prieto C, Epstein FH. Improving the efficiency and accuracy of cardiovascular magnetic resonance with artificial intelligence-review of evidence and proposition of a roadmap to clinical translation. J Cardiovasc Magn Reson 2024; 26:101051. [PMID: 38909656] [PMCID: PMC11331970] [DOI: 10.1016/j.jocmr.2024.101051]
Abstract
BACKGROUND Cardiovascular magnetic resonance (CMR) is an important imaging modality for the assessment of heart disease; however, limitations of CMR include long exam times and high complexity compared to other cardiac imaging modalities. Recent advancements in artificial intelligence (AI) technology have shown great potential to address many CMR limitations. While these developments are remarkable, the translation of AI-based methods into real-world CMR clinical practice remains at a nascent stage, and much work lies ahead to realize the full potential of AI for CMR. METHODS Herein we review recent cutting-edge and representative examples demonstrating how AI can advance CMR in areas such as exam planning, accelerated image reconstruction, post-processing, quality control, classification, and diagnosis. RESULTS These advances can be applied to speed up and simplify essentially every application, including cine, strain, late gadolinium enhancement, parametric mapping, 3D whole heart, flow, perfusion, and others. AI is a unique technology based on training models using data. Beyond reviewing the literature, this paper discusses important AI-specific issues in the context of CMR, including (1) properties and characteristics of datasets for training and validation, (2) previously published guidelines for reporting CMR AI research, (3) considerations around clinical deployment, (4) responsibilities of clinicians and the need for multi-disciplinary teams in the development and deployment of AI in CMR, (5) industry considerations, and (6) regulatory perspectives. CONCLUSIONS Understanding and consideration of all these factors will contribute to the effective and ethical deployment of AI to improve clinical CMR.
Affiliation(s)
- Qiang Zhang: Oxford Centre for Clinical Magnetic Resonance Research, Division of Cardiovascular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK; Big Data Institute, University of Oxford, Oxford, UK
- Anastasia Fotaki: School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK; Royal Brompton Hospital, Guy's and St Thomas' NHS Foundation Trust, London, UK
- Sona Ghadimi: Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
- Yu Wang: Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
- Jens Wetzl: Siemens Healthineers AG, Erlangen, Germany
- Jana G Delfino: US Food and Drug Administration, Center for Devices and Radiological Health (CDRH), Office of Science and Engineering Laboratories (OSEL), Silver Spring, MD, USA
- Declan P O'Regan: MRC Laboratory of Medical Sciences, Imperial College London, London, UK
- Claudia Prieto: School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK; School of Engineering, Pontificia Universidad Católica de Chile, Santiago, Chile
- Frederick H Epstein: Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
15
Li K, Zhu Y, Yu L, Heng PA. A Dual Enrichment Synergistic Strategy to Handle Data Heterogeneity for Domain Incremental Cardiac Segmentation. IEEE Trans Med Imaging 2024; 43:2279-2290. [PMID: 38345948] [DOI: 10.1109/tmi.2024.3364240]
Abstract
Building on remarkable progress in cardiac image segmentation, contemporary studies seek to further upgrade model functionality by progressively exploring sequentially delivered datasets over time through domain incremental learning. Existing works have mainly concentrated on addressing heterogeneous style variations, but have overlooked the critical shape variations across domains hidden behind discrepancies in sub-disease composition. Because an updated model may catastrophically forget sub-diseases that were learned in past domains but are no longer present in subsequent domains, we propose a dual enrichment synergistic strategy to incrementally broaden model competence for a growing number of sub-diseases. The data-enriched scheme diversifies the shape composition of the current training data via displacement-aware shape encoding and decoding, to gradually build robustness against cross-domain shape variations. Meanwhile, the model-enriched scheme strengthens model capabilities by progressively appending and consolidating the latest expertise into a dynamically expanded multi-expert network, to gradually cultivate generalization over style-varied domains. The two schemes work in synergy to upgrade model capabilities in a two-pronged manner. We extensively evaluated our network on the ACDC and M&Ms datasets in single-domain and compound-domain incremental learning settings. Our approach outperformed other competing methods and achieved results comparable to the upper bound.
16
Liu X, Sanchez P, Thermos S, O'Neil AQ, Tsaftaris SA. Compositionally Equivariant Representation Learning. IEEE Trans Med Imaging 2024; 43:2169-2179. [PMID: 38277249] [DOI: 10.1109/tmi.2024.3358955]
Abstract
Deep learning models often need sufficient supervision (i.e., labelled data) to be trained effectively. By contrast, humans can swiftly learn to identify important anatomy in medical images such as MRI and CT scans with minimal guidance. This recognition capability readily generalises to new images from different medical facilities and to new tasks in different settings. Such rapid and generalisable learning is largely attributed to the compositional structure of image patterns in the human brain, which is not well represented in current medical models. In this paper, we study the use of compositionality for learning more interpretable and generalisable representations for medical image segmentation. Overall, we propose that the underlying generative factors used to generate the medical images satisfy a compositional equivariance property, where each factor is compositional (e.g., corresponds to human anatomy) and also equivariant to the task. Hence, a good representation that approximates the ground-truth factors well has to be compositionally equivariant. By modelling the compositional representations with learnable von Mises-Fisher (vMF) kernels, we explore how different design and learning biases can be used to enforce representations that are more compositionally equivariant under un-, weakly-, and semi-supervised settings. Extensive results show that our methods achieve the best performance over several strong baselines on the task of semi-supervised domain-generalised medical image segmentation. Code will be made publicly available upon acceptance at https://github.com/vios-s.
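A vMF kernel scores how well a pixel's feature vector aligns with a learned direction on the unit hypersphere; a minimal sketch of the (normalising-constant-free) vMF response, with hypothetical feature values and concentration parameter (illustrative, not the authors' implementation):

```python
import math

def vmf_activation(z, mu, kappa=10.0):
    """Unnormalised von Mises-Fisher kernel response exp(kappa * <z, mu>).

    z and mu are normalised to unit length; a high response means the
    feature points in the kernel's direction (e.g. "looks like myocardium").
    kappa controls how sharply the response is concentrated around mu.
    """
    def normalise(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v]
    z, mu = normalise(z), normalise(mu)
    cos_sim = sum(a * b for a, b in zip(z, mu))
    return math.exp(kappa * cos_sim)

aligned = vmf_activation([1.0, 0.0], [2.0, 0.0])    # same direction
opposite = vmf_activation([1.0, 0.0], [-1.0, 0.0])  # opposite direction
```

A bank of such kernels, one per compositional factor, yields a per-pixel response map that can be made equivariant to the segmentation task.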
17
Jeong H, Lim H, Yoon C, Won J, Lee GY, de la Rosa E, Kirschke JS, Kim B, Kim N, Kim C. Robust Ensemble of Two Different Multimodal Approaches to Segment 3D Ischemic Stroke Segmentation Using Brain Tumor Representation Among Multiple Center Datasets. J Imaging Inform Med 2024. [PMID: 38693333] [DOI: 10.1007/s10278-024-01099-6]
Abstract
Ischemic stroke segmentation at the acute stage is vital for assessing the severity of patients' impairment and guiding therapeutic decision-making for reperfusion. Although many deep learning studies have shown attractive performance in medical segmentation, it is difficult to use models trained on public data with private hospitals' datasets. Here, we demonstrate an ensemble model that employs two different multimodal approaches for generalization, a more effective way to perform on external datasets. First, a segmentation model is jointly trained on diffusion-weighted imaging (DWI) and apparent diffusion coefficient (ADC) MR modalities and then inferred on the DWI images. Second, a channel-wise segmentation model is trained by concatenating the DWI and ADC images as input, and is then inferred using both MR modalities. Before training with ischemic stroke data, we used BraTS 2021, a public brain tumor dataset, for transfer learning. An extensive ablation study evaluates which strategy learns better representations for ischemic stroke segmentation. nnU-Net, well known for its robustness, is selected as our baseline model. Our proposed method is evaluated on three different datasets: the Asan Medical Center (AMC) I and II, and the 2022 Ischemic Stroke Lesion Segmentation (ISLES) challenge. Our experiments are validated on a large, multi-center, multi-scanner dataset comprising 846 scans. Not only can stroke lesion models benefit from transfer learning using brain tumor data, but combining the MR modalities with different training schemes also substantially improves segmentation performance. The method achieved a top-1 ranking in the ongoing ISLES'22 challenge and performed particularly well on lesion-wise metrics of interest to neuroradiologists, achieving a Dice coefficient of 78.69% and a lesion-wise F1 score of 82.46%. The method was also relatively robust on the AMC I (Dice, 60.35%; lesion-wise F1, 68.30%) and II (Dice, 74.12%; lesion-wise F1, 67.53%) datasets in different settings. The high segmentation accuracy of our proposed method could improve radiologists' ability to detect ischemic stroke lesions in MRI images. Our model weights and inference code are available at https://github.com/MDOpx/ISLES22-model-inference.
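Ensembles of this kind are commonly fused at the probability level; a hedged sketch of averaging the per-voxel lesion probabilities from the two multimodal models before thresholding (toy values, not the released inference code, which may fuse differently):

```python
def ensemble_predictions(probs_a, probs_b, threshold=0.5):
    """Average two models' per-voxel lesion probabilities, then binarise."""
    fused = [(a + b) / 2.0 for a, b in zip(probs_a, probs_b)]
    return [1 if p >= threshold else 0 for p in fused]

# Joint DWI+ADC model vs channel-wise concatenation model (toy voxels)
joint_model = [0.9, 0.4, 0.2, 0.6]
channel_model = [0.7, 0.8, 0.1, 0.3]
mask = ensemble_predictions(joint_model, channel_model)
```

Averaging softens the idiosyncratic errors of either training scheme, which is what makes the ensemble more robust on external, multi-scanner data.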
Affiliation(s)
- Hyunsu Jeong: Graduate School of Artificial Intelligence (GSAI), Department of Electrical Engineering, Medical Science and Engineering, and Medical Device Innovation Center, Convergence IT Engineering, Mechanical Engineering, Pohang University of Science and Technology (POSTECH), Pohang, South Korea
- Hyunseok Lim: Graduate School of Artificial Intelligence (GSAI), Department of Electrical Engineering, Medical Science and Engineering, and Medical Device Innovation Center, Convergence IT Engineering, Mechanical Engineering, Pohang University of Science and Technology (POSTECH), Pohang, South Korea
- Chiho Yoon: Graduate School of Artificial Intelligence (GSAI), Department of Electrical Engineering, Medical Science and Engineering, and Medical Device Innovation Center, Convergence IT Engineering, Mechanical Engineering, Pohang University of Science and Technology (POSTECH), Pohang, South Korea
- Jongjun Won: Department of Medical Science, Asan Medical Center, Asan Medical Institute of Convergence Science and Technology, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Grace Yoojin Lee: Department of Medical Science, Asan Medical Center, Asan Medical Institute of Convergence Science and Technology, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Ezequiel de la Rosa: icometrix, Leuven, Belgium; Department of Informatics, Technical University of Munich, Munich, Germany
- Jan S Kirschke: Department of Informatics, Technical University of Munich, Munich, Germany; Department of Diagnostic and Interventional Neuroradiology, School of Medicine, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Bumjoon Kim: Department of Biomedical Engineering, Convergence Medicine, Radiology, Neurology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, South Korea
- Namkug Kim: Department of Biomedical Engineering, Convergence Medicine, Radiology, Neurology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, South Korea
- Chulhong Kim: Graduate School of Artificial Intelligence (GSAI), Department of Electrical Engineering, Medical Science and Engineering, and Medical Device Innovation Center, Convergence IT Engineering, Mechanical Engineering, Pohang University of Science and Technology (POSTECH), Pohang, South Korea
18
Zhu Z, Ma X, Wang W, Dong S, Wang K, Wu L, Luo G, Wang G, Li S. Boosting knowledge diversity, accuracy, and stability via tri-enhanced distillation for domain continual medical image segmentation. Med Image Anal 2024; 94:103112. [PMID: 38401270] [DOI: 10.1016/j.media.2024.103112]
Abstract
Domain continual medical image segmentation plays a crucial role in clinical settings. This approach enables segmentation models to continually learn from a sequential data stream across multiple domains. However, it faces the challenge of catastrophic forgetting. Existing methods based on knowledge distillation show potential to address this challenge via a three-stage process: distillation, transfer, and fusion. Yet, each stage presents its unique issues that, collectively, amplify the problem of catastrophic forgetting. To address these issues at each stage, we propose a tri-enhanced distillation framework. (1) Stochastic Knowledge Augmentation reduces redundancy in knowledge, thereby increasing both the diversity and volume of knowledge derived from the old network. (2) Adaptive Knowledge Transfer selectively captures critical information from the old knowledge, facilitating a more accurate knowledge transfer. (3) Global Uncertainty-Guided Fusion introduces a global uncertainty view of the dataset to fuse the old and new knowledge with reduced bias, promoting a more stable knowledge fusion. Our experimental results not only validate the feasibility of our approach, but also demonstrate its superior performance compared to state-of-the-art methods. We suggest that our innovative tri-enhanced distillation framework may establish a robust benchmark for domain continual medical image segmentation.
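The distillation that the three stages build on keeps the new network's predictions close to the old network's softened outputs; a minimal sketch of the generic KL-based distillation term (plain knowledge distillation, not the tri-enhanced variants, with hypothetical logits):

```python
import math

def softmax(logits, temperature=1.0):
    exps = [math.exp(l / temperature) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(old_logits, new_logits, temperature=2.0):
    """KL(old || new) on temperature-softened class distributions.

    The old network acts as teacher; minimising this term discourages the
    new network from forgetting what the old one predicted.
    """
    p = softmax(old_logits, temperature)  # teacher (old domain expertise)
    q = softmax(new_logits, temperature)  # student (updated network)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

identical = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
shifted = distillation_loss([2.0, 0.5, -1.0], [0.5, 2.0, -1.0])
```

The loss is zero when the new network reproduces the old predictions exactly and grows as they diverge, which is the lever the paper's augmentation, transfer, and fusion stages each refine.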
Affiliation(s)
- Zhanshi Zhu: Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Xinghua Ma: Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Wei Wang: Faculty of Computing, Harbin Institute of Technology, Shenzhen, China
- Suyu Dong: College of Computer and Control Engineering, Northeast Forestry University, Harbin, China
- Kuanquan Wang: Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Lianming Wu: Department of Radiology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Gongning Luo: Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Guohua Wang: College of Computer and Control Engineering, Northeast Forestry University, Harbin, China
- Shuo Li: Department of Biomedical Engineering, Case Western Reserve University, Cleveland, OH 44106, USA
19
Li B, Xu Y, Wang Y, Zhang B. DECTNet: Dual Encoder Network combined convolution and Transformer architecture for medical image segmentation. PLoS One 2024; 19:e0301019. [PMID: 38573957] [PMCID: PMC10994332] [DOI: 10.1371/journal.pone.0301019]
Abstract
Automatic and accurate segmentation of medical images plays an essential role in disease diagnosis and treatment planning. Convolutional neural networks have achieved remarkable results in medical image segmentation in the past decade, and deep learning models based on the Transformer architecture have also succeeded tremendously in this domain. However, due to the ambiguity of medical image boundaries and the high complexity of anatomical structures, effective structure extraction and accurate segmentation remain open problems. In this paper, we propose a novel Dual Encoder Network, named DECTNet, to alleviate this problem. Specifically, DECTNet comprises four components: a convolution-based encoder, a Transformer-based encoder, a feature fusion decoder, and a deep supervision module. The convolutional encoder extracts fine spatial contextual details, while the Transformer encoder, designed using a hierarchical Swin Transformer architecture, models global contextual information. The novel feature fusion decoder integrates the multi-scale representations from the two encoders and, via a channel attention mechanism, selects the features most relevant to the segmentation task. Further, a deep supervision module is used to accelerate the convergence of the proposed method. Extensive experiments demonstrate that, compared to seven other models, the proposed method achieves state-of-the-art results on four segmentation tasks: skin lesion segmentation, polyp segmentation, COVID-19 lesion segmentation, and MRI cardiac segmentation.
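Channel attention in such fusion decoders re-weights feature channels by a learned gate on their global statistics; a squeeze-and-excitation-style sketch in plain Python (the scalar gate weights stand in for the learned fully connected layers and are hypothetical):

```python
import math

def channel_attention(feature_maps, gate_weights):
    """Scale each channel by sigmoid(w * mean(channel)) — an SE-style gate.

    feature_maps: list of channels, each a flat list of activations.
    gate_weights: one scalar per channel, a stand-in for the learned
                  excitation layers that decide which channels matter.
    """
    out = []
    for channel, w in zip(feature_maps, gate_weights):
        squeezed = sum(channel) / len(channel)        # global average pool
        gate = 1.0 / (1.0 + math.exp(-w * squeezed))  # sigmoid excitation
        out.append([x * gate for x in channel])
    return out

features = [[1.0, 3.0], [2.0, 2.0]]  # two channels, two spatial positions
scaled = channel_attention(features, gate_weights=[10.0, -10.0])
```

The gate leaves the first channel nearly untouched and suppresses the second, which is how the decoder emphasises fused channels that carry segmentation-relevant information.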
Affiliation(s)
- Boliang Li: Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Yaming Xu: Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Yan Wang: Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Bo Zhang: Sergeant Schools of Army Academy of Armored Forces, Changchun, Jilin, China
20
Pan NY, Huang TY, Yu JJ, Peng HH, Chuang TC, Lin YR, Chung HW, Wu MT. Virtual MOLLI Target: Generative Adversarial Networks Toward Improved Motion Correction in MRI Myocardial T1 Mapping. J Magn Reson Imaging 2024. [PMID: 38563660] [DOI: 10.1002/jmri.29373]
Abstract
BACKGROUND The modified Look-Locker inversion recovery (MOLLI) sequence is commonly used for myocardial T1 mapping. However, it acquires images with different inversion times, which causes difficulty in motion correction for respiratory-induced misregistration to a given target image. HYPOTHESIS Using a generative adversarial network (GAN) to produce virtual MOLLI images with consistent heart positions can reduce respiratory-induced misregistration of MOLLI datasets. STUDY TYPE Retrospective. POPULATION 1071 MOLLI datasets from 392 human participants. FIELD STRENGTH/SEQUENCE Modified Look-Locker inversion recovery sequence at 3 T. ASSESSMENT A GAN model with a single inversion time image as input was trained to generate virtual MOLLI target (VMT) images at different inversion times which were subsequently used in an image registration algorithm. Four VMT models were investigated and the best performing model compared with the standard vendor-provided motion correction (MOCO) technique. STATISTICAL TESTS The effectiveness of the motion correction technique was assessed using the fitting quality index (FQI), mutual information (MI), and Dice coefficients of motion-corrected images, plus subjective quality evaluation of T1 maps by three independent readers using Likert score. Wilcoxon signed-rank test with Bonferroni correction for multiple comparison. Significance levels were defined as P < 0.01 for highly significant differences and P < 0.05 for significant differences. RESULTS The best performing VMT model with iterative registration demonstrated significantly better performance (FQI 0.88 ± 0.03, MI 1.78 ± 0.20, Dice 0.84 ± 0.23, quality score 2.26 ± 0.95) compared to other approaches, including the vendor-provided MOCO method (FQI 0.86 ± 0.04, MI 1.69 ± 0.25, Dice 0.80 ± 0.27, quality score 2.16 ± 1.01). DATA CONCLUSION Our GAN model generating VMT images improved motion correction, which may assist reliable T1 mapping in the presence of respiratory motion. 
Its robust performance, even with considerable respiratory-induced heart displacements, may be beneficial for patients with difficulties in breath-holding. LEVEL OF EVIDENCE 3 TECHNICAL EFFICACY: Stage 1.
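The registration metrics reported above (FQI, MI, Dice) are standard quantities. As an illustration of one of them, a minimal pure-Python sketch of histogram-based mutual information — function and variable names are ours, not the authors':

```python
import math
import random

def mutual_information(img_a, img_b, bins=8):
    """Histogram-based mutual information between two images.

    img_a, img_b: flat sequences of intensities in [0, 1).
    A common registration-quality metric: MI rises as one image
    becomes more predictable from the other.
    """
    n = len(img_a)
    joint = {}
    for a, b in zip(img_a, img_b):
        key = (int(a * bins), int(b * bins))
        joint[key] = joint.get(key, 0) + 1
    # Marginal bin counts for each image.
    px, py = {}, {}
    for (i, j), c in joint.items():
        px[i] = px.get(i, 0) + c
        py[j] = py.get(j, 0) + c
    # MI = sum p(x,y) * log( p(x,y) / (p(x) p(y)) ), in nats.
    mi = 0.0
    for (i, j), c in joint.items():
        mi += (c / n) * math.log(c * n / (px[i] * py[j]))
    return mi

random.seed(0)
img = [random.random() for _ in range(4096)]
noise = [random.random() for _ in range(4096)]
# An image compared with itself carries maximal shared information.
assert mutual_information(img, img) > mutual_information(img, noise)
```

In practice the metric is computed between each motion-corrected frame and the registration target; a well-aligned pair yields a higher MI than a misaligned one.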
Affiliation(s)
- Nai-Yu Pan
- Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Teng-Yi Huang
- Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Jui-Jung Yu
- Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Hsu-Hsia Peng
- Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan
- Tzu-Chao Chuang
- Department of Electrical Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan
- Yi-Ru Lin
- Department of Electronic and Computer Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Hsiao-Wen Chung
- Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan
- Ming-Ting Wu
- Department of Radiology, Kaohsiung Veterans General Hospital, Kaohsiung, Taiwan
- School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan

21
Svanera M, Savardi M, Signoroni A, Benini S, Muckli L. Fighting the scanner effect in brain MRI segmentation with a progressive level-of-detail network trained on multi-site data. Med Image Anal 2024; 93:103090. [PMID: 38241763 DOI: 10.1016/j.media.2024.103090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/16/2023] [Revised: 10/30/2023] [Accepted: 01/12/2024] [Indexed: 01/21/2024]
Abstract
Many clinical and research studies of the human brain require accurate structural MRI segmentation. While traditional atlas-based methods can be applied to volumes from any acquisition site, recent deep learning algorithms ensure high accuracy only when tested on data from the same sites exploited in training (i.e., internal data). Performance degradation experienced on external data (i.e., unseen volumes from unseen sites) is due to the inter-site variability in intensity distributions, and to unique artefacts caused by different MR scanner models and acquisition parameters. To mitigate this site-dependency, often referred to as the scanner effect, we propose LOD-Brain, a 3D convolutional neural network with progressive levels-of-detail (LOD), able to segment brain data from any site. Coarser network levels are responsible for learning a robust anatomical prior helpful in identifying brain structures and their locations, while finer levels refine the model to handle site-specific intensity distributions and anatomical variations. We ensure robustness across sites by training the model on an unprecedentedly rich dataset aggregating data from open repositories: almost 27,000 T1w volumes from around 160 acquisition sites, at 1.5 - 3T, from a population spanning from 8 to 90 years old. Extensive tests demonstrate that LOD-Brain produces state-of-the-art results, with no significant difference in performance between internal and external sites, and robust to challenging anatomical variations. Its portability paves the way for large-scale applications across different healthcare institutions, patient populations, and imaging technology manufacturers. Code, model, and demo are available on the project website.
Affiliation(s)
- Michele Svanera
- Center for Cognitive Neuroimaging at the School of Psychology & Neuroscience, University of Glasgow, UK.
- Mattia Savardi
- Department of Medical and Surgical Specialties, Radiological Sciences, and Public Health, University of Brescia, Italy
- Alberto Signoroni
- Department of Medical and Surgical Specialties, Radiological Sciences, and Public Health, University of Brescia, Italy
- Sergio Benini
- Department of Information Engineering, University of Brescia, Italy
- Lars Muckli
- Center for Cognitive Neuroimaging at the School of Psychology & Neuroscience, University of Glasgow, UK

22
Bonny T, Al-Ali A, Al-Ali M, Alsaadi R, Al Nassan W, Obaideen K, AlMallahi M. Dental bitewing radiographs segmentation using deep learning-based convolutional neural network algorithms. Oral Radiol 2024; 40:165-177. [PMID: 38047985 DOI: 10.1007/s11282-023-00717-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2023] [Accepted: 10/11/2023] [Indexed: 12/05/2023]
Abstract
OBJECTIVES Dental radiographs, particularly bitewing radiographs, are widely used in dental diagnosis and treatment. Dental image segmentation is difficult for various reasons, such as intricate structures, low contrast, noise, roughness, and unclear borders, resulting in poor image quality. Recent developments in deep learning models have improved performance in analyzing dental images. In this research, our primary objective is to determine the most effective segmentation technique for bitewing radiographs based on different metrics: accuracy, training time, and the number of training parameters as a reflection of architectural cost. METHODS We employ several deep learning models, namely Resnet-18, Resnet-50, Xception, Inception Resnet v2, and Mobilenetv2, to segment bitewing radiographs. The process begins by importing the radiographs into MATLAB® (MathWorks Inc), where the images are first enhanced, then segmented using the region-based graph cut method to produce a binary mask that distinguishes the background from the original X-ray. RESULTS The deep learning models were trained on 298 training and 99 validation radiographs and were evaluated using 99 images from the testing set. We also compare the segmentation models using several criteria, including accuracy, speed, and size, to determine which network is superior. Furthermore, we compare our findings with prior research to provide a comprehensive understanding of the advancements made in dental image segmentation. Segmentation accuracies of 93.67% and 94.42% were achieved by the Resnet-18 and Resnet-50 models, respectively. CONCLUSION This research advances dental image analysis and facilitates more accurate diagnoses and treatment planning by determining the best segmentation technique. The outcomes of this study can guide researchers and practitioners in selecting appropriate segmentation methods for practical dental image analysis.
Affiliation(s)
- Talal Bonny
- Department of Computer Engineering, University of Sharjah, Sharjah, United Arab Emirates.
- Abdelaziz Al-Ali
- Department of Computer Engineering, University of Sharjah, Sharjah, United Arab Emirates
- Mohammed Al-Ali
- Department of Computer Engineering, University of Sharjah, Sharjah, United Arab Emirates
- Rashid Alsaadi
- Electrical and Electronics Engineering, University of Sharjah, Sharjah, United Arab Emirates
- Wafaa Al Nassan
- Department of Computer Engineering, University of Sharjah, Sharjah, United Arab Emirates
- Khaled Obaideen
- Research Institute of Science and Technology, University of Sharjah, Sharjah, United Arab Emirates
- Maryam AlMallahi
- Industrial Engineering and Engineering Management Department, University of Sharjah, Sharjah, United Arab Emirates

23
Miao J, Zhou SP, Zhou GQ, Wang KN, Yang M, Zhou S, Chen Y. SC-SSL: Self-Correcting Collaborative and Contrastive Co-Training Model for Semi-Supervised Medical Image Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:1347-1364. [PMID: 37995173 DOI: 10.1109/tmi.2023.3336534] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/25/2023]
Abstract
Image segmentation has achieved significant improvements with deep neural networks, on the premise of large-scale labeled training data, which is laborious to assure in medical image tasks. Recently, semi-supervised learning (SSL) has shown great potential in medical image segmentation. However, the quality of the learning target for unlabeled data is usually neglected in these SSL methods. Therefore, this study proposes a novel self-correcting co-training scheme to learn a better target, more similar to ground-truth labels, from collaborative network outputs. Our work has three highlights. First, we cast learning-target generation as a learning task, improving the learning confidence for unannotated data with a self-correcting module. Second, we impose a structure constraint to further encourage shape similarity between the improved learning target and the collaborative network outputs. Finally, we propose a pixel-wise contrastive learning loss to boost representation capacity under the guidance of the improved learning target, thus exploiting unlabeled data more efficiently with awareness of semantic context. We extensively evaluated our method against state-of-the-art semi-supervised approaches on four publicly available datasets: the ACDC, M&Ms, Pancreas-CT, and Task_07 CT datasets. Experimental results at different labeled-data ratios show our method's superiority over existing methods, demonstrating its effectiveness in semi-supervised medical image segmentation.
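The pixel-wise contrastive loss mentioned in the abstract belongs to the InfoNCE family. The sketch below shows the generic InfoNCE form for a single pixel embedding, not the authors' exact formulation; all names are illustrative:

```python
import math

def pixel_infonce(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style contrastive loss for one pixel embedding.

    anchor/positive: embedding vectors (lists) that should agree;
    negatives: embeddings that should not. The loss is low when the
    anchor is closer to the positive than to every negative.
    """
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv + 1e-8)

    pos = math.exp(cos(anchor, positive) / tau)
    neg = sum(math.exp(cos(anchor, v) / tau) for v in negatives)
    return -math.log(pos / (pos + neg))

anchor = [1.0, 0.0]
# Aligned positive -> small loss; misaligned positive -> large loss.
good = pixel_infonce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
bad = pixel_infonce(anchor, [-1.0, 0.0], [[0.0, 1.0], [0.9, 0.1]])
assert good < bad
```

In semantic-context-aware variants, positives are typically pixels of the same (pseudo-labeled) class and negatives are pixels of other classes.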
24
Xu Z, Lu D, Luo J, Zheng Y, Tong RKY. Separated collaborative learning for semi-supervised prostate segmentation with multi-site heterogeneous unlabeled MRI data. Med Image Anal 2024; 93:103095. [PMID: 38310678 DOI: 10.1016/j.media.2024.103095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2023] [Revised: 09/11/2023] [Accepted: 01/24/2024] [Indexed: 02/06/2024]
Abstract
Segmenting prostate from magnetic resonance imaging (MRI) is a critical procedure in prostate cancer staging and treatment planning. Considering the nature of labeled data scarcity for medical images, semi-supervised learning (SSL) becomes an appealing solution since it can simultaneously exploit limited labeled data and a large amount of unlabeled data. However, SSL relies on the assumption that the unlabeled images are abundant, which may not be satisfied when the local institute has limited image collection capabilities. An intuitive solution is to seek support from other centers to enrich the unlabeled image pool. However, this further introduces data heterogeneity, which can impede SSL that works under identical data distribution with certain model assumptions. Aiming at this under-explored yet valuable scenario, in this work, we propose a separated collaborative learning (SCL) framework for semi-supervised prostate segmentation with multi-site unlabeled MRI data. Specifically, on top of the teacher-student framework, SCL exploits multi-site unlabeled data by: (i) Local learning, which advocates local distribution fitting, including the pseudo label learning that reinforces confirmation of low-entropy easy regions and the cyclic propagated real label learning that leverages class prototypes to regularize the distribution of intra-class features; (ii) External multi-site learning, which aims to robustly mine informative clues from external data, mainly including the local-support category mutual dependence learning, which takes the spirit that mutual information can effectively measure the amount of information shared by two variables even from different domains, and the stability learning under strong adversarial perturbations to enhance robustness to heterogeneity. 
Extensive experiments on prostate MRI data from six different clinical centers show that our method can effectively generalize SSL on multi-site unlabeled data and significantly outperform other semi-supervised segmentation methods. Besides, we validate the extensibility of our method on the multi-class cardiac MRI segmentation task with data from four different clinical centers.
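SCL is built "on top of the teacher-student framework"; in the common mean-teacher instantiation of that framework, the teacher's weights track an exponential moving average (EMA) of the student's. A minimal sketch of that standard update — illustrative only, not necessarily the paper's exact scheme:

```python
def ema_update(teacher, student, alpha=0.99):
    """Mean-teacher style update, applied per parameter:
    teacher <- alpha * teacher + (1 - alpha) * student."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]

teacher, student = [0.0, 0.0], [1.0, 2.0]
for _ in range(300):
    teacher = ema_update(teacher, student)
# After many steps the teacher has drifted most of the way to the student,
# while remaining a smoothed (more stable) version of it.
```

The smoothed teacher then produces the pseudo labels that the student is trained against on unlabeled images.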
Affiliation(s)
- Zhe Xu
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, NT, Hong Kong, China.
- Donghuan Lu
- Tencent Jarvis Research Center, Youtu Lab, Shenzhen, China.
- Jie Luo
- Massachusetts General Hospital, Harvard Medical School, Boston, USA
- Yefeng Zheng
- Tencent Jarvis Research Center, Youtu Lab, Shenzhen, China
- Raymond Kai-Yu Tong
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, NT, Hong Kong, China.

25
Lekadir K. A deep learning solution to detect left ventricular structural abnormalities with chest X-rays: towards trustworthy AI in cardiology. Eur Heart J 2024:ehad775. [PMID: 38527415 DOI: 10.1093/eurheartj/ehad775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 03/27/2024] Open
Affiliation(s)
- Karim Lekadir
- University of Barcelona, Department of Mathematics and Computer Science, Artificial Intelligence in Medicine Lab (BCN-AIM), Barcelona, Spain

26
Zhang Y, Chen Z, Yang X. Light-M: An efficient lightweight medical image segmentation framework for resource-constrained IoMT. Comput Biol Med 2024; 170:108088. [PMID: 38320339 DOI: 10.1016/j.compbiomed.2024.108088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2023] [Revised: 12/22/2023] [Accepted: 01/27/2024] [Indexed: 02/08/2024]
Abstract
The Internet of Medical Things (IoMT) is being incorporated into current healthcare systems. This technology intends to connect patients, IoMT devices, and hospitals over mobile networks, allowing for more secure, quick, and convenient health monitoring and intelligent healthcare services. However, existing intelligent healthcare applications typically rely on large-scale AI models, and standard IoMT devices have significant resource constraints. To alleviate this paradox, in this paper, we propose a Knowledge Distillation (KD)-based IoMT end-edge-cloud orchestrated architecture for medical image segmentation tasks, called Light-M, aiming to deploy a lightweight medical model in resource-constrained IoMT devices. Specifically, Light-M trains a large teacher model in the cloud server and employs computation in local nodes through imitation of the performance of the teacher model using knowledge distillation. Light-M contains two KD strategies: (1) active exploration and passive transfer (AEPT) and (2) self-attention-based inter-class feature variation (AIFV) distillation for the medical image segmentation task. The AEPT encourages the student model to learn undiscovered knowledge/features of the teacher model without additional feature layers, aiming to explore new features and outperform the teacher. To improve the distinguishability of the student for different classes, the student learns the self-attention-based feature variation (AIFV) between classes. Since the proposed AEPT and AIFV only appear in the training process, our framework does not involve any additional computation burden for a student model during the segmentation task deployment. Extensive experiments on cardiac images and public real-scene datasets demonstrate that our approach improves student model learning representations and outperforms state-of-the-art methods by combining two knowledge distillation strategies. 
Moreover, when deployed on an IoT device, the distilled student model takes only 29.6 ms per sample at inference.
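Setting aside the paper-specific AEPT and AIFV terms, the backbone of any such framework is the classic temperature-softened distillation loss. A minimal sketch, assuming standard Hinton-style knowledge distillation (our naming, not the authors' code):

```python
import math

def softmax(logits, T=1.0):
    """Temperature-softened softmax; higher T flattens the distribution."""
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Distillation loss: KL(teacher || student) on softened class
    distributions, scaled by T^2 to keep gradients comparable."""
    p = softmax(teacher_logits, T)   # soft teacher targets
    q = softmax(student_logits, T)   # student predictions
    return T * T * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [2.0, 0.5, -1.0]
assert kd_loss(teacher, teacher) < 1e-9        # identical logits: no loss
assert kd_loss([0.0, 2.0, -1.0], teacher) > 0  # mismatch is penalized
```

For segmentation, this loss is typically applied per pixel over the class dimension and averaged; the KD terms exist only at training time, so the deployed student pays no extra inference cost — the property the abstract emphasizes.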
Affiliation(s)
- Yifan Zhang
- Shenzhen University, 3688 Nanhai Ave., Shenzhen, 518060, Guangdong, China
- Zhuangzhuang Chen
- Shenzhen University, 3688 Nanhai Ave., Shenzhen, 518060, Guangdong, China
- Xuan Yang
- Shenzhen University, 3688 Nanhai Ave., Shenzhen, 518060, Guangdong, China.

27
Brown AL, Sexton ZA, Hu Z, Yang W, Marsden AL. Computational approaches for mechanobiology in cardiovascular development and diseases. Curr Top Dev Biol 2024; 156:19-50. [PMID: 38556423 DOI: 10.1016/bs.ctdb.2024.01.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/02/2024]
Abstract
Cardiovascular development in vertebrates evolves in response to genetic and mechanical cues. The dynamic interplay among mechanics, cell biology, and anatomy continually shapes the hydraulic networks, characterized by complex, non-linear changes in anatomical structure and blood flow dynamics. To better understand this interplay, a diverse set of molecular and computational tools has been used to comprehensively study cardiovascular mechanobiology. With the continual advancement of computational capacity and numerical techniques, cardiovascular simulation is increasingly vital both in basic science research for understanding developmental mechanisms and disease etiologies, and in clinical studies aimed at enhancing treatment outcomes. This review provides an overview of computational cardiovascular modeling. It begins with the fundamental concepts of computational cardiovascular modeling, then turns to applications of such modeling in investigating mechanobiology during cardiac development. Next, the article illustrates the utility of computational hemodynamic modeling in the context of treatment planning for congenital heart diseases. It then delves into the predictive potential of computational models for elucidating tissue growth and remodeling processes. In closing, we outline prevailing challenges and future prospects, underscoring the transformative impact of computational cardiovascular modeling in reshaping cardiovascular science and clinical practice.
Affiliation(s)
- Aaron L Brown
- Department of Mechanical Engineering, Stanford University, Stanford, CA, United States
- Zachary A Sexton
- Department of Bioengineering, Stanford University, Stanford, CA, United States
- Zinan Hu
- Department of Mechanical Engineering, Stanford University, Stanford, CA, United States
- Weiguang Yang
- Department of Pediatrics, Stanford University, Stanford, CA, United States
- Alison L Marsden
- Department of Bioengineering, Stanford University, Stanford, CA, United States; Department of Pediatrics, Stanford University, Stanford, CA, United States.

28
Hognon C, Conze PH, Bourbonne V, Gallinato O, Colin T, Jaouen V, Visvikis D. Contrastive image adaptation for acquisition shift reduction in medical imaging. Artif Intell Med 2024; 148:102747. [PMID: 38325919 DOI: 10.1016/j.artmed.2023.102747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2022] [Revised: 10/21/2023] [Accepted: 12/10/2023] [Indexed: 02/09/2024]
Abstract
The domain shift, or acquisition shift in medical imaging, is responsible for potentially harmful differences between the development and deployment conditions of medical image analysis techniques. There is a growing need in the community for advanced methods that could mitigate this issue better than conventional approaches. In this paper, we consider configurations in which we can expose a learning-based pixel-level adaptor to a large variability of unlabeled images during its training, i.e., sufficient to span the acquisition shift expected during the training or testing of a downstream task model. We leverage the ability of convolutional architectures to efficiently learn domain-agnostic features and train a many-to-one unsupervised mapping between a source collection of heterogeneous images from multiple unknown domains subjected to the acquisition shift and a homogeneous subset of this source set of lower cardinality, potentially constituted of a single image. To this end, we propose a new cycle-free image-to-image architecture based on a combination of three loss functions: a contrastive PatchNCE loss, an adversarial loss, and an edge-preserving loss, allowing for rich domain adaptation to the target image even under strong domain imbalance and in low-data regimes. Experiments demonstrate the value of the proposed contrastive image adaptation approach for regularizing downstream deep supervised segmentation and cross-modality synthesis models.
Affiliation(s)
- Clément Hognon
- UMR U1101 Inserm LaTIM, IMT Atlantique, Université de Bretagne Occidentale, France; SOPHiA Genetics, Pessac, France
- Pierre-Henri Conze
- UMR U1101 Inserm LaTIM, IMT Atlantique, Université de Bretagne Occidentale, France
- Vincent Bourbonne
- UMR U1101 Inserm LaTIM, IMT Atlantique, Université de Bretagne Occidentale, France
- Vincent Jaouen
- UMR U1101 Inserm LaTIM, IMT Atlantique, Université de Bretagne Occidentale, France.
- Dimitris Visvikis
- UMR U1101 Inserm LaTIM, IMT Atlantique, Université de Bretagne Occidentale, France

29
Jiao R, Zhang Y, Ding L, Xue B, Zhang J, Cai R, Jin C. Learning with limited annotations: A survey on deep semi-supervised learning for medical image segmentation. Comput Biol Med 2024; 169:107840. [PMID: 38157773 DOI: 10.1016/j.compbiomed.2023.107840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2023] [Revised: 10/30/2023] [Accepted: 12/07/2023] [Indexed: 01/03/2024]
Abstract
Medical image segmentation is a fundamental and critical step in many image-guided clinical approaches. Recent success of deep learning-based segmentation methods usually relies on a large amount of labeled data, which is particularly difficult and costly to obtain, especially in the medical imaging domain where only experts can provide reliable and accurate annotations. Semi-supervised learning has emerged as an appealing strategy and been widely applied to medical image segmentation tasks to train deep models with limited annotations. In this paper, we present a comprehensive review of recently proposed semi-supervised learning methods for medical image segmentation and summarize both the technical novelties and empirical results. Furthermore, we analyze and discuss the limitations and several unsolved problems of existing approaches. We hope this review can inspire the research community to explore solutions to this challenge and further advance the field of medical image segmentation.
Affiliation(s)
- Rushi Jiao
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; School of Engineering Medicine, Beihang University, Beijing, 100191, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China.
- Yichi Zhang
- School of Data Science, Fudan University, Shanghai, 200433, China; Artificial Intelligence Innovation and Incubation Institute, Fudan University, Shanghai, 200433, China.
- Le Ding
- School of Biological Science and Medical Engineering, Beihang University, Beijing, 100191, China.
- Bingsen Xue
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China.
- Jicong Zhang
- School of Biological Science and Medical Engineering, Beihang University, Beijing, 100191, China; Hefei Innovation Research Institute, Beihang University, Hefei, 230012, China.
- Rong Cai
- School of Engineering Medicine, Beihang University, Beijing, 100191, China; Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beihang University, Beijing, 100191, China.
- Cheng Jin
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China; Beijing Anding Hospital, Capital Medical University, Beijing, 100088, China.

30
Fan L, Gong X, Zheng C, Li J. Data pyramid structure for optimizing EUS-based GISTs diagnosis in multi-center analysis with missing label. Comput Biol Med 2024; 169:107897. [PMID: 38171262 DOI: 10.1016/j.compbiomed.2023.107897] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2023] [Revised: 12/04/2023] [Accepted: 12/23/2023] [Indexed: 01/05/2024]
Abstract
This study introduces the Data Pyramid Structure (DPS) to address data sparsity and missing labels in medical image analysis. The DPS optimizes multi-task learning and enables sustainable expansion of multi-center data analysis. Specifically, it facilitates attribute prediction and malignant tumor diagnosis tasks by implementing a segmentation and aggregation strategy on data with absent attribute labels. To leverage multi-center data, we propose the Unified Ensemble Learning Framework (UELF) and the Unified Federated Learning Framework (UFLF), which incorporate strategies for data transfer and incremental learning in scenarios with missing labels. The proposed method was evaluated on a challenging EUS patient dataset from five centers, achieving promising diagnostic performance. The average accuracy was 0.984 with an AUC of 0.927 for multi-center analysis, surpassing state-of-the-art approaches. The interpretability of the predictions further highlights the potential clinical relevance of our method.
Affiliation(s)
- Lin Fan
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan 611756, China; Manufacturing Industry Chains Collaboration and Information Support Technology Key Laboratory of Sichuan Province, China; Engineering Research Center of Sustainable Urban Intelligent Transportation, Ministry of Education, China; National Engineering Laboratory of Integrated Transportation Big Data Application Technology, China
- Xun Gong
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan 611756, China; Manufacturing Industry Chains Collaboration and Information Support Technology Key Laboratory of Sichuan Province, China; Engineering Research Center of Sustainable Urban Intelligent Transportation, Ministry of Education, China; National Engineering Laboratory of Integrated Transportation Big Data Application Technology, China.
- Cenyang Zheng
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan 611756, China; Manufacturing Industry Chains Collaboration and Information Support Technology Key Laboratory of Sichuan Province, China; Engineering Research Center of Sustainable Urban Intelligent Transportation, Ministry of Education, China; National Engineering Laboratory of Integrated Transportation Big Data Application Technology, China
- Jiao Li
- Department of Gastroenterology, The Third People's Hospital of Chengdu, Affiliated Hospital of Southwest Jiaotong University, Chengdu 610031, China

31
Priya S, Dhruba DD, Perry SS, Aher PY, Gupta A, Nagpal P, Jacob M. Optimizing Deep Learning for Cardiac MRI Segmentation: The Impact of Automated Slice Range Classification. Acad Radiol 2024; 31:503-513. [PMID: 37541826 DOI: 10.1016/j.acra.2023.07.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2023] [Revised: 07/07/2023] [Accepted: 07/09/2023] [Indexed: 08/06/2023]
Abstract
RATIONALE AND OBJECTIVES Cardiac magnetic resonance imaging is crucial for diagnosing cardiovascular diseases, but lengthy postprocessing and manual segmentation can lead to observer bias. Deep learning (DL) has been proposed for automated cardiac segmentation; however, its effectiveness is limited by the slice range selection from base to apex. MATERIALS AND METHODS In this study, we integrated an automated slice range classification step to identify basal to apical short-axis slices before DL-based segmentation. We employed the publicly available Multi-Disease, Multi-View & Multi-Center Right Ventricular Segmentation in Cardiac MRI data set with short-axis cine data from 160 training, 40 validation, and 160 testing cases. Three classification and seven segmentation DL models were studied. The top-performing segmentation model was assessed with and without the classification model. Model validation to compare automated and manual segmentation was performed using Dice score, Hausdorff distance, and clinical indices (correlation score and Bland-Altman plots). RESULTS The combined classification (CBAM-integrated 2D-CNN) and segmentation model (2D-UNet with dilated convolution block) demonstrated superior performance, achieving Dice scores of 0.952 for left ventricle (LV), 0.933 for right ventricle (RV), and 0.875 for myocardium, compared to the stand-alone segmentation model (0.949 for LV, 0.925 for RV, and 0.867 for myocardium). The combined classification and segmentation model showed high correlation (0.92-0.99) with manual segmentation for biventricular volumes, ejection fraction, and myocardial mass. The mean absolute difference (2.8-8.3 mL) for clinical parameters between automated and manual segmentation was within the interobserver variability range, indicating comparable performance to manual annotation.
CONCLUSION Integrating an initial automated slice range classification step into the segmentation process improves the performance of DL-based cardiac chamber segmentation.
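The Dice scores used above to compare automated and manual segmentation come from a simple overlap formula; a minimal sketch with binary masks flattened to lists (names are illustrative):

```python
def dice(pred, gt):
    """Dice similarity coefficient between two binary masks
    (flat sequences of 0/1): 2|A∩B| / (|A| + |B|)."""
    inter = sum(1 for p, g in zip(pred, gt) if p and g)
    total = sum(pred) + sum(gt)
    return 2.0 * inter / total if total else 1.0  # empty masks agree

# Two 4-pixel masks overlapping in 2 pixels -> Dice = 2*2 / (4+4) = 0.5
a = [1, 1, 1, 1, 0, 0]
b = [0, 0, 1, 1, 1, 1]
assert dice(a, b) == 0.5
assert dice(a, a) == 1.0
```

A score of 1.0 means perfect overlap; the LV/RV/myocardium scores reported above (0.875-0.952) indicate close but not exact agreement with manual contours.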
Affiliation(s)
- Sarv Priya
- Department of Radiology, University of Iowa Carver College of Medicine, Iowa City, Iowa (S.P.).
- Durjoy D Dhruba
- Department of Electrical and Computer Engineering, University of Iowa, Iowa City, Iowa (D.D.D., M.J.)
- Sarah S Perry
- Department of Biostatistics, University of Iowa, Iowa City, Iowa (S.S.P.)
- Pritish Y Aher
- Department of Radiology, University of Miami, Miller School of Medicine, Miami, Florida (P.Y.A.)
- Amit Gupta
- Department of Radiology, University Hospital Cleveland Medical Center, Cleveland, Ohio (A.G.)
- Prashant Nagpal
- Department of Radiology, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin (P.N.)
- Mathews Jacob
- Department of Electrical and Computer Engineering, University of Iowa, Iowa City, Iowa (D.D.D., M.J.)

32
Huang Y, Yang X, Liu L, Zhou H, Chang A, Zhou X, Chen R, Yu J, Chen J, Chen C, Liu S, Chi H, Hu X, Yue K, Li L, Grau V, Fan DP, Dong F, Ni D. Segment anything model for medical images? Med Image Anal 2024; 92:103061. [PMID: 38086235 DOI: 10.1016/j.media.2023.103061] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2023] [Revised: 09/28/2023] [Accepted: 12/05/2023] [Indexed: 01/12/2024]
Abstract
The Segment Anything Model (SAM) is the first foundation model for general image segmentation. It has achieved impressive results on various natural image segmentation tasks. However, medical image segmentation (MIS) is more challenging because of the complex modalities, fine anatomical structures, uncertain and complex object boundaries, and wide-range object scales. To fully validate SAM's performance on medical data, we collected and sorted 53 open-source datasets and built a large medical segmentation dataset with 18 modalities, 84 objects, 125 object-modality paired targets, 1050K 2D images, and 6033K masks. We comprehensively analyzed different models and strategies on the so-called COSMOS 1050K dataset. Our findings mainly include the following: (1) SAM showed remarkable performance on some specific objects but was unstable, imperfect, or even totally failed in other situations. (2) SAM with the large ViT-H showed better overall performance than that with the small ViT-B. (3) SAM performed better with manual hints, especially box prompts, than in the Everything mode. (4) SAM could help human annotation with high labeling quality and less time. (5) SAM was sensitive to randomness in the center point and tight box prompts, and may suffer a serious performance drop. (6) SAM performed better than interactive methods with one or a few points, but will be outpaced as the number of points increases. (7) SAM's performance correlated with different factors, including boundary complexity, intensity differences, etc. (8) Finetuning SAM on specific medical tasks could improve its average DICE performance by 4.39% and 6.68% for ViT-B and ViT-H, respectively. Codes and models are available at: https://github.com/yuhoo0302/Segment-Anything-Model-for-Medical-Images. We hope that this comprehensive report can help researchers explore the potential of SAM applications in MIS, and guide how to appropriately use and develop SAM.
Affiliation(s)
- Yuhao Huang, Xin Yang, Lian Liu, Han Zhou, Ao Chang, Xinrui Zhou, Rusi Chen, Junxuan Yu, Jiongquan Chen, Chaoyu Chen, Sijing Liu, Dong Ni: National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China; Medical UltraSound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China; Marshall Laboratory of Biomedical Engineering, Shenzhen University, Shenzhen, China
- Xindi Hu: Shenzhen RayShape Medical Technology Co., Ltd, Shenzhen, China
- Kejuan Yue: Hunan First Normal University, Changsha, China
- Lei Li, Vicente Grau: Department of Engineering Science, University of Oxford, Oxford, UK
- Deng-Ping Fan: Computer Vision Lab (CVL), ETH Zurich, Zurich, Switzerland
- Fajin Dong: Ultrasound Department, the Second Clinical Medical College, Jinan University, China; First Affiliated Hospital, Southern University of Science and Technology, Shenzhen People's Hospital, Shenzhen, China
33
Zhang Y, Wang Y, Xu L, Yao Y, Qian W, Qi L. ST-GAN: A Swin Transformer-Based Generative Adversarial Network for Unsupervised Domain Adaptation of Cross-Modality Cardiac Segmentation. IEEE J Biomed Health Inform 2024; 28:893-904. [PMID: 38019618 DOI: 10.1109/jbhi.2023.3336965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/01/2023]
Abstract
Unsupervised domain adaptation (UDA) methods have shown great potential in cross-modality medical image segmentation tasks, where target domain labels are unavailable. However, the domain shift among different image modalities remains challenging, because conventional UDA methods are based on convolutional neural networks (CNNs), which tend to focus on image texture and, due to their locality, cannot establish the global semantic relevance of features. This paper proposes a novel end-to-end Swin Transformer-based generative adversarial network (ST-GAN) for cross-modality cardiac segmentation. In the generator of ST-GAN, we utilize the local receptive fields of CNNs to capture spatial information and introduce the Swin Transformer to extract global semantic information, which enables the generator to better extract domain-invariant features in UDA tasks. In addition, we design a multi-scale feature fuser to sufficiently fuse the features acquired at different stages and improve the robustness of the UDA network. We extensively evaluated our method on two cross-modality cardiac segmentation tasks using the MS-CMR 2019 dataset and the M&Ms dataset. The results on both tasks demonstrate the effectiveness of ST-GAN compared with state-of-the-art cross-modality cardiac image segmentation methods.
34
Wang Z, Li B, Yu H, Zhang Z, Ran M, Xia W, Yang Z, Lu J, Chen H, Zhou J, Shan H, Zhang Y. Promoting fast MR imaging pipeline by full-stack AI. iScience 2024; 27:108608. [PMID: 38174317 PMCID: PMC10762466 DOI: 10.1016/j.isci.2023.108608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2023] [Revised: 10/17/2023] [Accepted: 11/29/2023] [Indexed: 01/05/2024] Open
Abstract
Magnetic resonance imaging (MRI) is a widely used imaging modality in clinics for medical disease diagnosis, staging, and follow-up. Deep learning has been extensively used to accelerate k-space data acquisition, enhance MR image reconstruction, and automate tissue segmentation. However, these three tasks are usually treated as independent tasks and optimized for evaluation by radiologists, thus ignoring the strong dependencies among them; this may be suboptimal for downstream intelligent processing. Here, we present a novel paradigm, full-stack learning (FSL), which can simultaneously solve these three tasks by considering the overall imaging process and leverage the strong dependence among them to further improve each task, significantly boosting the efficiency and efficacy of practical MRI workflows. Experimental results obtained on multiple open MR datasets validate the superiority of FSL over existing state-of-the-art methods on each task. FSL has great potential to optimize the practical workflow of MRI for medical diagnosis and radiotherapy.
Affiliation(s)
- Zhiwen Wang, Bowen Li, Hui Yu, Zhongzhou Zhang, Maosong Ran, Wenjun Xia, Ziyuan Yang, Hu Chen, Jiliu Zhou: School of Computer Science, Sichuan University, Chengdu, Sichuan, China
- Jingfeng Lu, Yi Zhang: School of Cyber Science and Engineering, Sichuan University, Chengdu, Sichuan, China
- Hongming Shan: Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai, China
35
Chen X, Xia Y, Dall'Armellina E, Ravikumar N, Frangi AF. Joint shape/texture representation learning for cardiovascular disease diagnosis from magnetic resonance imaging. EUROPEAN HEART JOURNAL. IMAGING METHODS AND PRACTICE 2024; 2:qyae042. [PMID: 39045211 PMCID: PMC11195696 DOI: 10.1093/ehjimp/qyae042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/23/2023] [Accepted: 04/09/2024] [Indexed: 07/25/2024]
Abstract
Aims Cardiovascular diseases (CVDs) are the leading cause of mortality worldwide. Cardiac images and meshes are two primary modalities for presenting the shape and structure of the heart and have been demonstrated to be efficient for CVD prediction and diagnosis. However, previous research has generally focussed on a single modality (image or mesh), and few studies have jointly considered the image and mesh representations of the heart. To obtain efficient and explainable biomarkers for CVD prediction and diagnosis, both representations need to be considered jointly. Methods and results We design a novel multi-channel variational auto-encoder, the mesh-image variational auto-encoder, to learn a joint representation of paired mesh and image. After training, the shape-aware image representation (SAIR) can be learned directly from the raw images and applied for further CVD prediction and diagnosis. We demonstrate our method on data from the UK Biobank study and two other datasets via extensive experiments. In acute myocardial infarction prediction, SAIR achieves 81.43% accuracy, significantly higher than traditional biomarkers like metadata and clinical indices (left-ventricle and right-ventricle clinical indices of cardiac function such as chamber volume, mass, and ejection fraction). Conclusion Our mesh-image variational auto-encoder provides a novel approach for 3D cardiac mesh reconstruction from images. The extraction of SAIR is fast and requires no segmentation masks, and the regions it focuses on can be visualized in the corresponding cardiac meshes. SAIR achieves better performance than traditional biomarkers and can be applied as an efficient supplement to them, which is of significant potential in CVD analysis.
Affiliation(s)
- Xiang Chen, Yan Xia, Nishant Ravikumar: School of Computing, University of Leeds, Woodhouse, LS2 9JT Leeds, UK
- Alejandro F Frangi: Christabel Pankhurst Institute, The University of Manchester, Oxford Rd, M13 9PL Manchester, UK; Department of Computer Science, School of Engineering, The University of Manchester, Oxford Rd, M13 9PL Manchester, UK; Division of Informatics, Imaging, and Data Sciences, School of Health Sciences, The University of Manchester, Oxford Rd, M13 9PL Manchester, UK; NIHR Manchester Biomedical Research Centre, Manchester Academic Health Science Centre, Oxford Rd, M13 9PL Manchester, UK; Medical Imaging Research Center (MIRC), University Hospital Gasthuisberg, UZ Herestraat 49 - bus 7003, 3000 Leuven, Belgium; Department of Cardiovascular Sciences, KU Leuven, UZ Herestraat 49 - box 911, 3000 Leuven, Belgium; Department of Electrical Engineering, KU Leuven, Kasteelpark Arenberg 10 postbus 2440, 3001 Leuven, Belgium; Alan Turing Institute, British Library, 96 Euston Rd., NW1 2DB London, UK
36
Morales MA, Manning WJ, Nezafat R. Present and Future Innovations in AI and Cardiac MRI. Radiology 2024; 310:e231269. [PMID: 38193835 PMCID: PMC10831479 DOI: 10.1148/radiol.231269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2023] [Revised: 10/21/2023] [Accepted: 10/26/2023] [Indexed: 01/10/2024]
Abstract
Cardiac MRI is used to diagnose and treat patients with a multitude of cardiovascular diseases. Despite the growth of clinical cardiac MRI, complicated image prescriptions and long acquisition protocols limit the specialty and restrain its impact on the practice of medicine. Artificial intelligence (AI)-the ability to mimic human intelligence in learning and performing tasks-will impact nearly all aspects of MRI. Deep learning (DL) primarily uses an artificial neural network to learn a specific task from example data sets. Self-driving scanners are increasingly available, where AI automatically controls cardiac image prescriptions. These scanners offer faster image collection with higher spatial and temporal resolution, eliminating the need for cardiac triggering or breath holding. In the future, fully automated inline image analysis will most likely provide all contour drawings and initial measurements to the reader. Advanced analysis using radiomic or DL features may provide new insights and information not typically extracted in the current analysis workflow. AI may further help integrate these features with clinical, genetic, wearable-device, and "omics" data to improve patient outcomes. This article presents an overview of AI and its application in cardiac MRI, including in image acquisition, reconstruction, and processing, and opportunities for more personalized cardiovascular care through extraction of novel imaging markers.
Affiliation(s)
- Manuel A. Morales, Warren J. Manning, Reza Nezafat: From the Department of Medicine, Cardiovascular Division (M.A.M., W.J.M., R.N.), and Department of Radiology (W.J.M.), Beth Israel Deaconess Medical Center and Harvard Medical School, 330 Brookline Ave, Boston, MA 02215
37
Kolasa K, Admassu B, Hołownia-Voloskova M, Kędzior KJ, Poirrier JE, Perni S. Systematic reviews of machine learning in healthcare: a literature review. Expert Rev Pharmacoecon Outcomes Res 2024; 24:63-115. [PMID: 37955147 DOI: 10.1080/14737167.2023.2279107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2023] [Accepted: 10/31/2023] [Indexed: 11/14/2023]
Abstract
INTRODUCTION The increasing availability of data and computing power has made machine learning (ML) a viable approach to faster, more efficient healthcare delivery. METHODS A systematic literature review (SLR) of published SLRs evaluating ML applications in healthcare settings published between 1 January 2010 and 27 March 2023 was conducted. RESULTS In total, 220 SLRs covering 10,462 ML algorithms were reviewed. The main application of AI in medicine related to clinical prediction and disease prognosis in oncology and neurology with the use of imaging data. Accuracy, specificity, and sensitivity were provided in 56%, 28%, and 25% of SLRs, respectively. Internal and external validation was reported in 53% and less than 1% of the cases, respectively. The most common modeling approach was neural networks (2,454 ML algorithms), followed by support vector machines and random forests/decision trees (1,578 and 1,522 ML algorithms, respectively). EXPERT OPINION The review indicated considerable reporting gaps in terms of ML performance, both internal and external validation. Greater accessibility of healthcare data to developers can ensure faster adoption of ML algorithms into clinical practice.
Affiliation(s)
- Katarzyna Kolasa, Bisrat Admassu: Division of Health Economics and Healthcare Management, Kozminski University, Warsaw, Poland
38
Wang J, Zhang N, Wang S, Liang W, Zhao H, Xia W, Zhu J, Zhang Y, Zhang W, Chai S. AI approach to biventricular function assessment in cine-MRI: an ultra-small training dataset and multivendor study. Phys Med Biol 2023; 68:245025. [PMID: 37918023 DOI: 10.1088/1361-6560/ad0903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2023] [Accepted: 11/01/2023] [Indexed: 11/04/2023]
Abstract
Objective. Training an excellent and generalized model on an ultra-small dataset of multi-orientation cardiac cine magnetic resonance imaging (MRI) images is a great challenge. We develop a 3D deep learning method based on an ultra-small training set of multi-orientation cine MRI images and assess its performance for automated biventricular structure segmentation and function assessment across vendors. Approach. We trained and tested our deep learning networks using heart datasets of only 150 cases (90 for training and 60 for testing). The datasets were obtained from three different MRI vendors, and each subject included two phases of the cardiac cycle and three cine sequences. A 3D deep learning algorithm combining Transformers and U-Net was trained. Segmentation performance was evaluated using the Dice metric and Hausdorff distance (HD). On this basis, manual and automatic cardiac function parameters were compared across vendors using Pearson correlation, intraclass correlation coefficient (ICC), and Bland-Altman analysis. Main results. The three sequences achieved average Dice scores of 0.92, 0.92, and 0.94 and HD95 values of 2.50, 1.36, and 1.37. Automatic and manual results for seven parameters were excellently correlated, with r2 values ranging from 0.824 to 0.983. The ICC (0.908-0.989, P < 0.001) showed that the results were highly consistent. Bland-Altman analysis with 95% limits of agreement showed no significant differences except for RVESV (P = 0.005) and LVM (P < 0.001). Significance. The model had high segmentation accuracy and excellent correlation and consistency in function assessment. It provides a fast and effective method for studying cardiac MRI and heart disease.
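The Bland-Altman analysis used above to compare manual and automatic function parameters computes the mean bias between paired measurements and its 95% limits of agreement (bias ± 1.96 SD of the differences). A minimal sketch; the paired values below are invented for illustration, not taken from the study:

```python
import numpy as np

def bland_altman(a, b):
    """Mean bias and 95% limits of agreement between paired measurements."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)          # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical manual vs. automatic volumes (mL) for four subjects.
manual = [120.0, 95.0, 150.0, 110.0]
auto   = [118.0, 97.0, 149.0, 112.0]
bias, lo, hi = bland_altman(manual, auto)
```

A method pair "agrees" in this sense when the bias is near zero and the limits of agreement are clinically acceptable, which is the criterion applied to the seven function parameters above.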
Affiliation(s)
- Jing Wang: School of Information Science and Engineering, University of Jinan, People's Republic of China
- Nan Zhang: Department of Radiology, Beijing Anzhen Hospital, Capital Medical University, People's Republic of China
- Shuyu Wang, Haiyue Zhao: Department of Electric Information Engineering, Shandong Youth University of Political Science, People's Republic of China
- Wei Liang: Department of Ecological Environment Statistics, Department of Ecological Environment of Shandong, People's Republic of China
- Weili Xia: Department of Science and Education, Shandong Mental Health Center, People's Republic of China
- Jianlei Zhu: Department of Neuromodulation Center, Shandong Mental Health Center, People's Republic of China
- Yan Zhang: Department of Radiology, Shandong Mental Health Center, People's Republic of China
- Wei Zhang: Department of Medical Ultrasound, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, People's Republic of China
- Senchun Chai: School of Automation, Beijing Institute of Technology, People's Republic of China
39
Ruthven M, Peplinski AM, Adams DM, King AP, Miquel ME. Real-time speech MRI datasets with corresponding articulator ground-truth segmentations. Sci Data 2023; 10:860. [PMID: 38042857 PMCID: PMC10693552 DOI: 10.1038/s41597-023-02766-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2023] [Accepted: 11/20/2023] [Indexed: 12/04/2023] Open
Abstract
The use of real-time magnetic resonance imaging (rt-MRI) of speech is increasing in clinical practice and speech science research. Analysis of such images often requires segmentation of articulators and the vocal tract, and the community is turning to deep-learning-based methods to perform this segmentation. While there are publicly available rt-MRI datasets of speech, these do not include ground-truth (GT) segmentations, a key requirement for the development of deep-learning-based segmentation methods. To begin to address this barrier, this work presents rt-MRI speech datasets of five healthy adult volunteers with corresponding GT segmentations and velopharyngeal closure patterns. The images were acquired using standard clinical MRI scanners, coils and sequences to facilitate acquisition of similar images in other centres. The datasets include manually created GT segmentations of six anatomical features including the tongue, soft palate and vocal tract. In addition, this work makes code and instructions to implement a current state-of-the-art deep-learning-based method to segment rt-MRI speech datasets publicly available, thus providing the community and others with a starting point for developing such methods.
Affiliation(s)
- Matthieu Ruthven: Clinical Physics, Barts Health NHS Trust, West Smithfield, London, EC1A 7BE, UK; School of Biomedical Engineering & Imaging Sciences, King's College London, King's Health Partners, St Thomas' Hospital, London, SE1 7EH, UK
- David M Adams: Clinical Physics, Barts Health NHS Trust, West Smithfield, London, EC1A 7BE, UK
- Andrew P King: School of Biomedical Engineering & Imaging Sciences, King's College London, King's Health Partners, St Thomas' Hospital, London, SE1 7EH, UK
- Marc Eric Miquel: Clinical Physics, Barts Health NHS Trust, West Smithfield, London, EC1A 7BE, UK; Digital Environment Research Institute (DERI), Empire House, 67-75 New Road, Queen Mary University of London, London, E1 1HH, UK; Advanced Cardiovascular Imaging, Barts NIHR BRC, Queen Mary University of London, London, EC1M 6BQ, UK
40
Wu J, Wang G, Gu R, Lu T, Chen Y, Zhu W, Vercauteren T, Ourselin S, Zhang S. UPL-SFDA: Uncertainty-Aware Pseudo Label Guided Source-Free Domain Adaptation for Medical Image Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:3932-3943. [PMID: 37738202 DOI: 10.1109/tmi.2023.3318364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/24/2023]
Abstract
Domain Adaptation (DA) is important for deep learning-based medical image segmentation models to deal with testing images from a new target domain. As the source-domain data are usually unavailable when a trained model is deployed at a new center, Source-Free Domain Adaptation (SFDA) is appealing for data- and annotation-efficient adaptation to the target domain. However, existing SFDA methods have limited performance because supervision is insufficient when source-domain images are unavailable and target-domain images are unlabeled. We propose a novel Uncertainty-aware Pseudo Label guided (UPL) SFDA method for medical image segmentation. Specifically, we propose Target Domain Growing (TDG) to enhance the diversity of predictions in the target domain by duplicating the pre-trained model's prediction head multiple times with perturbations. The different predictions in these duplicated heads are used to obtain pseudo labels for unlabeled target-domain images and their uncertainty, which identifies reliable pseudo labels. We also propose a Twice Forward pass Supervision (TFS) strategy that uses reliable pseudo labels obtained in one forward pass to supervise predictions in the next forward pass. The adaptation is further regularized by a mean prediction-based entropy minimization term that encourages confident and consistent results in different prediction heads. UPL-SFDA was validated with a multi-site heart MRI segmentation dataset, a cross-modality fetal brain segmentation dataset, and a 3D fetal tissue segmentation dataset. It improved the average Dice by 5.54, 5.01, and 6.89 percentage points for the three tasks compared with the baseline, respectively, and outperformed several state-of-the-art SFDA methods.
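The uncertainty-aware pseudo-label selection described here boils down to keeping only pixels whose (head-averaged) prediction has low entropy. A minimal numpy sketch of that idea; the threshold `tau` and the tiny probability map are illustrative assumptions, not the paper's actual settings:

```python
import numpy as np

def reliable_pseudo_labels(probs, tau=0.5):
    """probs: (H, W, C) softmax map, e.g. averaged over duplicated heads.
    Returns argmax pseudo labels and a mask of low-entropy (reliable) pixels."""
    eps = 1e-8
    entropy = -(probs * np.log(probs + eps)).sum(axis=-1)  # per-pixel entropy
    labels = probs.argmax(axis=-1)
    return labels, entropy < tau

# Two pixels, two classes: one confident, one near-uniform.
probs = np.array([[[0.95, 0.05],
                   [0.55, 0.45]]])
labels, reliable = reliable_pseudo_labels(probs, tau=0.5)
```

Only the reliable pixels would contribute to the supervision loss; the near-uniform pixel is excluded because its entropy exceeds the threshold.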
41
Beetz M, Banerjee A, Ossenberg-Engels J, Grau V. Multi-class point cloud completion networks for 3D cardiac anatomy reconstruction from cine magnetic resonance images. Med Image Anal 2023; 90:102975. [PMID: 37804586 DOI: 10.1016/j.media.2023.102975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2022] [Revised: 07/08/2023] [Accepted: 09/18/2023] [Indexed: 10/09/2023]
Abstract
Cine magnetic resonance imaging (MRI) is the current gold standard for the assessment of cardiac anatomy and function. However, it typically only acquires a set of two-dimensional (2D) slices of the underlying three-dimensional (3D) anatomy of the heart, thus limiting the understanding and analysis of both healthy and pathological cardiac morphology and physiology. In this paper, we propose a novel fully automatic surface reconstruction pipeline capable of reconstructing multi-class 3D cardiac anatomy meshes from raw cine MRI acquisitions. Its key component is a multi-class point cloud completion network (PCCN) capable of correcting both the sparsity and misalignment issues of the 3D reconstruction task in a unified model. We first evaluate the PCCN on a large synthetic dataset of biventricular anatomies and observe Chamfer distances between reconstructed and gold standard anatomies below or similar to the underlying image resolution for multiple levels of slice misalignment. Furthermore, we find a reduction in reconstruction error compared to a benchmark 3D U-Net by 32% and 24% in terms of Hausdorff distance and mean surface distance, respectively. We then apply the PCCN as part of our automated reconstruction pipeline to 1000 subjects from the UK Biobank study in a cross-domain transfer setting and demonstrate its ability to reconstruct accurate and topologically plausible biventricular heart meshes with clinical metrics comparable to the previous literature. Finally, we investigate the robustness of our proposed approach and observe its capacity to successfully handle multiple common outlier conditions.
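The Chamfer distance used to compare reconstructed and gold-standard anatomies averages nearest-neighbour distances between two point clouds, in both directions. A brute-force sketch on toy clouds (not the paper's implementation, which would use larger clouds and an efficient nearest-neighbour search):

```python
import numpy as np

def chamfer_distance(p: np.ndarray, q: np.ndarray) -> float:
    """Symmetric Chamfer distance between point clouds p (N, 3) and q (M, 3)."""
    # Pairwise Euclidean distances via broadcasting: shape (N, M).
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

p = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
q = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
print(chamfer_distance(p, q))  # 0.5 + 0.5 = 1.0
```

A value near the voxel spacing, as reported above, means reconstructed surface points sit about one image-resolution unit from the gold standard on average.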
Affiliation(s)
- Marcel Beetz, Julius Ossenberg-Engels, Vicente Grau: Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford OX3 7DQ, UK
- Abhirup Banerjee: Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford OX3 7DQ, UK; Division of Cardiovascular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford OX3 9DU, UK
42
Zhang M, Wu Y, Zhang H, Qin Y, Zheng H, Tang W, Arnold C, Pei C, Yu P, Nan Y, Yang G, Walsh S, Marshall DC, Komorowski M, Wang P, Guo D, Jin D, Wu Y, Zhao S, Chang R, Zhang B, Lu X, Qayyum A, Mazher M, Su Q, Wu Y, Liu Y, Zhu Y, Yang J, Pakzad A, Rangelov B, Estepar RSJ, Espinosa CC, Sun J, Yang GZ, Gu Y. Multi-site, Multi-domain Airway Tree Modeling. Med Image Anal 2023; 90:102957. [PMID: 37716199 DOI: 10.1016/j.media.2023.102957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2023] [Revised: 06/07/2023] [Accepted: 09/04/2023] [Indexed: 09/18/2023]
Abstract
Open international challenges are becoming the de facto standard for assessing computer vision and image analysis algorithms. In recent years, new methods have extended the reach of pulmonary airway segmentation closer to the limit of image resolution. Since the EXACT'09 pulmonary airway segmentation challenge, however, limited effort has been directed to the quantitative comparison of newly emerged algorithms, despite the maturity of deep-learning-based approaches and extensive clinical efforts to resolve finer details of distal airways for early intervention in pulmonary diseases. Thus far, public annotated datasets are extremely limited, hindering the development of data-driven methods and detailed performance evaluation of new algorithms. To provide a benchmark for the medical imaging community, we organized the Multi-site, Multi-domain Airway Tree Modeling challenge (ATM'22), which was held as an official challenge event during the MICCAI 2022 conference. ATM'22 provides large-scale CT scans with detailed pulmonary airway annotation, including 500 CT scans (300 for training, 50 for validation, and 150 for testing). The dataset was collected from different sites and further includes a portion of noisy COVID-19 CTs with ground-glass opacity and consolidation. Twenty-three teams participated in the entire phase of the challenge, and the algorithms of the top ten teams are reviewed in this paper. Both quantitative and qualitative results revealed that deep learning models embedded with topological continuity enhancement achieved superior performance in general. The ATM'22 challenge remains open-call; the training data and gold-standard evaluation are available upon successful registration via its homepage (https://atm22.grand-challenge.org/).
Affiliation(s)
- Minghui Zhang
- Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China; Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, 200240, China; Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yangqian Wu
- Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China; Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, 200240, China; Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China
- Hanxiao Zhang
- Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yulei Qin
- Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China
- Hao Zheng
- Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China
- Wen Tang
- InferVision Medical Technology Co., Ltd., Beijing, China
- Chenhao Pei
- InferVision Medical Technology Co., Ltd., Beijing, China
- Pengxin Yu
- InferVision Medical Technology Co., Ltd., Beijing, China
- Yang Nan
- Imperial College London, London, UK
- Puyang Wang
- Alibaba DAMO Academy, 969 West Wen Yi Road, Hangzhou, Zhejiang, China
- Dazhou Guo
- Alibaba DAMO Academy USA, 860 Washington Street, 8F, NY, USA
- Dakai Jin
- Alibaba DAMO Academy USA, 860 Washington Street, 8F, NY, USA
- Ya'nan Wu
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Shuiqing Zhao
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Runsheng Chang
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Boyu Zhang
- A.I. R&D Center, Sanmed Biotech Inc., No. 266 Tongchang Road, Xiangzhou District, Zhuhai, Guangdong, China
- Xing Lu
- A.I. R&D Center, Sanmed Biotech Inc., T220 Trade St., San Diego, CA, USA
- Abdul Qayyum
- ENIB, UMR CNRS 6285 LabSTICC, Brest, 29238, France
- Moona Mazher
- Department of Computer Engineering and Mathematics, University Rovira i Virgili, Tarragona, Spain
- Qi Su
- Shanghai Jiao Tong University, Shanghai, China
- Yonghuang Wu
- School of Information Science and Technology, Fudan University, Shanghai, China
- Ying'ao Liu
- University of Science and Technology of China, Hefei, Anhui, China
- Jiancheng Yang
- Dianei Technology, Shanghai, China; EPFL, Lausanne, Switzerland
- Ashkan Pakzad
- Medical Physics and Biomedical Engineering Department, University College London, London, UK
- Bojidar Rangelov
- Center for Medical Image Computing, University College London, London, UK
- Jiayuan Sun
- Department of Respiratory and Critical Care Medicine, Department of Respiratory Endoscopy, Shanghai Chest Hospital, Shanghai, China
- Guang-Zhong Yang
- Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yun Gu
- Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200240, China; Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, 200240, China; Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China
43
Chen Z, Zhuo W, Wang T, Cheng J, Xue W, Ni D. Semi-Supervised Representation Learning for Segmentation on Medical Volumes and Sequences. IEEE Trans Med Imaging 2023; 42:3972-3986. [PMID: 37756175] [DOI: 10.1109/tmi.2023.3319973]
Abstract
Benefiting from massive labeled samples, deep learning-based segmentation methods have achieved great success for two-dimensional natural images. However, segmenting high-dimensional medical volumes and sequences remains challenging, because large-scale annotation demands considerable clinical expertise. Self- and semi-supervised learning methods have been shown to improve performance by exploiting unlabeled data; however, they still underexploit local semantic discrimination and volume/sequence structure. In this work, we propose a semi-supervised representation learning method with two novel modules that enhance the features in the encoder and decoder, respectively. For the encoder, based on the continuity between slices/frames and the common spatial layout of organs across subjects, we propose an asymmetric network with an attention-guided predictor that enables prediction between feature maps of different slices of unlabeled data. For the decoder, based on the semantic consistency between labeled and unlabeled data, we introduce a novel semantic contrastive learning scheme to regularize the feature maps in the decoder. The two parts are trained jointly on both labeled and unlabeled volumes/sequences in a semi-supervised manner. When evaluated on three benchmark datasets of medical volumes and sequences, our model outperforms existing methods by a large margin, 7.3% DSC on ACDC, 6.5% on Prostate, and 3.2% on CAMUS, when only a few labeled samples are available. Further, results on the M&M dataset show that the proposed method yields improvements on data from unknown domains without using any domain adaptation techniques. Intensive evaluations reveal the effectiveness of representation mining and the superior performance of our method. The code is available at https://github.com/CcchenzJ/BootstrapRepresentation.
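The semantic contrastive regularization described above can be illustrated with a generic InfoNCE-style loss: an anchor feature is pulled toward a same-class (positive) feature and pushed away from other-class (negative) features. A minimal numpy sketch of the generic technique, not the paper's exact formulation:

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.1):
    """Generic InfoNCE contrastive loss for one anchor feature vector.
    Illustrative only; the paper applies such a loss to decoder feature maps."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    # positive similarity at index 0, negatives after it
    sims = np.array([cos(anchor, positive)] + [cos(anchor, n) for n in negatives])
    logits = sims / tau
    logits -= logits.max()                       # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                     # low when anchor matches positive
```

Minimizing this loss over labeled and unlabeled features encourages semantically consistent embeddings across volumes.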
44
Patil SS, Ramteke M, Verma M, Seth S, Bhargava R, Mittal S, Rathore AS. A Domain-Shift Invariant CNN Framework for Cardiac MRI Segmentation Across Unseen Domains. J Digit Imaging 2023; 36:2148-2163. [PMID: 37430062] [PMCID: PMC10501982] [DOI: 10.1007/s10278-023-00873-2]
Abstract
The emergence of various deep learning approaches in diagnostic medical image segmentation has made machines capable of accomplishing human-level accuracy. However, the generalizability of these architectures across patients from different countries, Magnetic Resonance Imaging (MRI) scans from distinct vendors, and varying imaging conditions remains questionable. In this work, we propose a translatable deep learning framework for diagnostic segmentation of cine MRI scans. This study aims to render the available state-of-the-art (SOTA) architectures domain-shift invariant by utilizing the heterogeneity of multi-sequence cardiac MRI. To develop and test our approach, we curated a diverse group of public datasets and a dataset obtained from a private source. We evaluated three SOTA convolutional neural network (CNN) architectures: U-Net, Attention U-Net, and Attention Res-U-Net. These architectures were first trained on a combination of three different cardiac MRI sequences. Next, we examined the M&M (multi-center and multi-vendor) challenge dataset to investigate the effect of different training sets on translatability. The U-Net architecture, trained on the multi-sequence dataset, proved to be the most generalizable across multiple datasets during validation on unseen domains. This model attained mean Dice scores of 0.81, 0.85, and 0.83 for myocardial wall segmentation when tested on the unseen MyoPS (Myocardial Pathology Segmentation) 2020 dataset, the AIIMS (All India Institute of Medical Sciences) dataset, and the M&M dataset, respectively. Our framework achieved Pearson's correlation values of 0.98, 0.99, and 0.95 between the observed and predicted values of end-diastole volume, end-systole volume, and ejection fraction, respectively, on the unseen Indian population dataset.
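The mean Dice scores reported above follow the standard Dice similarity coefficient, 2|A∩B| / (|A|+|B|), computed between a predicted and a reference binary mask. A small numpy sketch for reference:

```python
import numpy as np

def dice(pred, gt, eps=1e-8):
    """Dice similarity coefficient between two binary masks: 2|A∩B|/(|A|+|B|).
    Returns 1.0 for identical non-empty masks, 0.0 for disjoint masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)
```

The epsilon guards against division by zero when both masks are empty.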
Affiliation(s)
- Sanjeet S Patil
- Department of Chemical Engineering, Indian Institute of Technology, Delhi, New Delhi, India
- Manojkumar Ramteke
- Department of Chemical Engineering, Indian Institute of Technology, Delhi, New Delhi, India
- Yardi School of Artificial Intelligence, Indian Institute of Technology, Delhi, New Delhi, India
- Mansi Verma
- Department of Cardiology, All India Institute of Medical Sciences, Rishikesh, Uttarakhand, India
- Sandeep Seth
- Department of Cardiology, All India Institute of Medical Sciences, New Delhi, India
- Rohit Bhargava
- Departments of Bioengineering, Electrical & Computer Engineering, Mechanical Science & Engineering, Chemical and Biomolecular Engineering and Chemistry, Beckman Institute for Advanced Science and Technology, Cancer Center at Illinois, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Shachi Mittal
- Department of Laboratory Medicine and Pathology, School of Medicine, University of Washington, Seattle, WA, USA
- Anurag S Rathore
- Department of Chemical Engineering, Indian Institute of Technology, Delhi, New Delhi, India
- Yardi School of Artificial Intelligence, Indian Institute of Technology, Delhi, New Delhi, India
45
Chen B, Thandiackal K, Pati P, Goksel O. Generative appearance replay for continual unsupervised domain adaptation. Med Image Anal 2023; 89:102924. [PMID: 37597316] [DOI: 10.1016/j.media.2023.102924]
Abstract
Deep learning models can achieve high accuracy when trained on large amounts of labeled data. However, real-world scenarios often involve several challenges: Training data may become available in installments, may originate from multiple different domains, and may not contain labels for training. Certain settings, for instance medical applications, often involve further restrictions that prohibit retention of previously seen data due to privacy regulations. In this work, to address such challenges, we study unsupervised segmentation in continual learning scenarios that involve domain shift. To that end, we introduce GarDA (Generative Appearance Replay for continual Domain Adaptation), a generative-replay based approach that can adapt a segmentation model sequentially to new domains with unlabeled data. In contrast to single-step unsupervised domain adaptation (UDA), continual adaptation to a sequence of domains enables leveraging and consolidation of information from multiple domains. Unlike previous approaches in incremental UDA, our method does not require access to previously seen data, making it applicable in many practical scenarios. We evaluate GarDA on three datasets with different organs and modalities, where it substantially outperforms existing techniques. Our code is available at: https://github.com/histocartography/generative-appearance-replay.
Affiliation(s)
- Boqi Chen
- ETH AI Center, Zurich, Switzerland; Department of Computer Science, ETH Zurich, Switzerland
- Kevin Thandiackal
- IBM Research Europe, Zurich, Switzerland; Computer-Assisted Applications in Medicine, ETH Zurich, Zurich, Switzerland
- Orcun Goksel
- Computer-Assisted Applications in Medicine, ETH Zurich, Zurich, Switzerland; Department of Information Technology, Uppsala University, Uppsala, Sweden
46
Mariscal-Harana J, Asher C, Vergani V, Rizvi M, Keehn L, Kim RJ, Judd RM, Petersen SE, Razavi R, King AP, Ruijsink B, Puyol-Antón E. An artificial intelligence tool for automated analysis of large-scale unstructured clinical cine cardiac magnetic resonance databases. Eur Heart J Digit Health 2023; 4:370-383. [PMID: 37794871] [PMCID: PMC10545512] [DOI: 10.1093/ehjdh/ztad044]
Abstract
Aims: Artificial intelligence (AI) techniques have been proposed for automating analysis of short-axis (SAX) cine cardiac magnetic resonance (CMR), but no CMR analysis tool exists to automatically analyse large (unstructured) clinical CMR datasets. We develop and validate a robust AI tool for start-to-end automatic quantification of cardiac function from SAX cine CMR in large clinical databases. Methods and results: Our pipeline for processing and analysing CMR databases includes automated steps to identify the correct data, robust image pre-processing, an AI algorithm for biventricular segmentation of SAX CMR and estimation of functional biomarkers, and automated post-analysis quality control to detect and correct errors. The segmentation algorithm was trained on 2793 CMR scans from two NHS hospitals and validated on additional cases from this dataset (n = 414) and five external datasets (n = 6888), including scans of patients with a range of diseases acquired at 12 different centres using CMR scanners from all major vendors. Median absolute errors in cardiac biomarkers were within the range of inter-observer variability: <8.4 mL (left ventricle volume), <9.2 mL (right ventricle volume), <13.3 g (left ventricular mass), and <5.9% (ejection fraction) across all datasets. Stratification of cases according to phenotypes of cardiac disease and scanner vendors showed good performance across all groups. Conclusion: We show that our proposed tool, which combines image pre-processing steps, a domain-generalizable AI algorithm trained on a large-scale multi-domain CMR dataset and quality control steps, allows robust analysis of (clinical or research) databases from multiple centres, vendors, and cardiac diseases. This enables translation of our tool for use in fully automated processing of large multi-centre databases.
Affiliation(s)
- Jorge Mariscal-Harana
- School of Biomedical Engineering & Imaging Sciences, Rayne Institute, 4th Floor, Lambeth Wing, St Thomas' Hospital, Westminster Bridge Road, London SE1 7EH, UK
- Clint Asher
- School of Biomedical Engineering & Imaging Sciences, Rayne Institute, 4th Floor, Lambeth Wing, St Thomas' Hospital, Westminster Bridge Road, London SE1 7EH, UK
- Department of Adult and Paediatric Cardiology, Guy's and St Thomas' NHS Foundation Trust, Westminster Bridge Road, London SE1 7EH, UK
- Vittoria Vergani
- School of Biomedical Engineering & Imaging Sciences, Rayne Institute, 4th Floor, Lambeth Wing, St Thomas' Hospital, Westminster Bridge Road, London SE1 7EH, UK
- Maleeha Rizvi
- School of Biomedical Engineering & Imaging Sciences, Rayne Institute, 4th Floor, Lambeth Wing, St Thomas' Hospital, Westminster Bridge Road, London SE1 7EH, UK
- Department of Adult and Paediatric Cardiology, Guy's and St Thomas' NHS Foundation Trust, Westminster Bridge Road, London SE1 7EH, UK
- Louise Keehn
- Department of Clinical Pharmacology, King's College London British Heart Foundation Centre, St Thomas' Hospital, Westminster Bridge Road, London SE1 7EH, UK
- Raymond J Kim
- Division of Cardiology, Department of Medicine, Duke University, 40 Duke Medicine Circle, Durham, NC 27710, USA
- Robert M Judd
- Division of Cardiology, Department of Medicine, Duke University, 40 Duke Medicine Circle, Durham, NC 27710, USA
- Steffen E Petersen
- William Harvey Research Institute, NIHR Barts Biomedical Research Centre, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, UK
- Barts Heart Centre, St Bartholomew's Hospital, Barts Health NHS Trust, W Smithfield, London EC1A 7BE, UK
- Health Data Research UK, Gibbs Building, 215 Euston Rd., London NW1 2BE, UK
- Alan Turing Institute, 96 Euston Rd., London NW1 2DB, UK
- Reza Razavi
- School of Biomedical Engineering & Imaging Sciences, Rayne Institute, 4th Floor, Lambeth Wing, St Thomas' Hospital, Westminster Bridge Road, London SE1 7EH, UK
- Department of Adult and Paediatric Cardiology, Guy's and St Thomas' NHS Foundation Trust, Westminster Bridge Road, London SE1 7EH, UK
- Andrew P King
- School of Biomedical Engineering & Imaging Sciences, Rayne Institute, 4th Floor, Lambeth Wing, St Thomas' Hospital, Westminster Bridge Road, London SE1 7EH, UK
- Bram Ruijsink
- School of Biomedical Engineering & Imaging Sciences, Rayne Institute, 4th Floor, Lambeth Wing, St Thomas' Hospital, Westminster Bridge Road, London SE1 7EH, UK
- Department of Adult and Paediatric Cardiology, Guy's and St Thomas' NHS Foundation Trust, Westminster Bridge Road, London SE1 7EH, UK
- Department of Cardiology, Division Heart and Lungs, University Medical Center Utrecht, Utrecht University, 3584 CX Utrecht, The Netherlands
- Esther Puyol-Antón
- School of Biomedical Engineering & Imaging Sciences, Rayne Institute, 4th Floor, Lambeth Wing, St Thomas' Hospital, Westminster Bridge Road, London SE1 7EH, UK
47
Sheikhjafari A, Krishnaswamy D, Noga M, Ray N, Punithakumar K. Deep Learning Based Parameterization of Diffeomorphic Image Registration for Cardiac Image Segmentation. IEEE Trans Nanobioscience 2023; 22:800-807. [PMID: 37220045] [DOI: 10.1109/tnb.2023.3276867]
Abstract
Cardiac segmentation from magnetic resonance imaging (MRI) is one of the essential tasks in analyzing the anatomy and function of the heart for the assessment and diagnosis of cardiac diseases. However, cardiac MRI generates hundreds of images per scan, whose manual annotation is challenging and time-consuming, so processing these images automatically is of interest. This study proposes a novel end-to-end supervised cardiac MRI segmentation framework based on diffeomorphic deformable registration that can segment cardiac chambers from 2D and 3D images or volumes. To represent actual cardiac deformation, the method parameterizes the transformation using radial and rotational components computed via deep learning, with a set of paired images and segmentation masks used for training. The formulation guarantees invertible transformations and prevents mesh folding, which is essential for preserving the topology of the segmentation results. A physically plausible transformation is achieved by employing diffeomorphisms in computing the transformations and activation functions that constrain the range of the radial and rotational components. The method was evaluated on three different datasets and showed significant improvements over existing learning and non-learning based methods in terms of the Dice score and Hausdorff distance metrics.
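The radial/rotational parameterization and the no-folding guarantee can be illustrated with a toy, analytically defined deformation (a hypothetical sketch; in the paper a network predicts these components and bounds them via activation functions). A transformation is fold-free when its Jacobian determinant stays positive everywhere:

```python
import numpy as np

def polar_deformation(shape, radial, angle):
    """Toy deformation built from a radial scaling and a rotation about the
    image centre, mimicking the radial/rotational parameterization."""
    h, w = shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    cy, cx = (h - 1) / 2, (w - 1) / 2
    dy, dx = ys - cy, xs - cx
    r = np.hypot(dy, dx)
    theta = np.arctan2(dy, dx)
    r_new = r * (1 + radial)            # radial component (scaling)
    t_new = theta + angle               # rotational component
    return cy + r_new * np.sin(t_new), cx + r_new * np.cos(t_new)

def jacobian_det(map_y, map_x):
    """Finite-difference Jacobian determinant of a 2-D mapping;
    everywhere-positive values mean no mesh folding."""
    dyy, dyx = np.gradient(map_y)
    dxy, dxx = np.gradient(map_x)
    return dyy * dxx - dyx * dxy
```

For small, bounded radial and angular components the determinant stays positive, which is exactly the invertibility property the formulation enforces.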
48
Li P, Zhou R, He J, Zhao S, Tian Y. A global-frequency-domain network for medical image segmentation. Comput Biol Med 2023; 164:107290. [PMID: 37579584] [DOI: 10.1016/j.compbiomed.2023.107290]
Abstract
UNet-series networks have led the field of medical image segmentation since their introduction. However, the encoder and decoder structures of the traditional UNet series are complex, with large numbers of parameters and floating-point operations. This demands a large amount of training data, yet most medical datasets contain only limited numbers of samples. To address this issue, we propose a global frequency-domain UNet (GFUNet), a novel architecture for fast medical image segmentation. Inspired by recent Multi-Layer Perceptron (MLP)-like models, we combine the Fourier transform with the UNet structure to achieve more efficient and effective encoding and decoding. Meanwhile, a dual-domain encoding module is designed to improve the performance of the encoder and decoder by fully exploiting frequency-domain features. Furthermore, owing to the favourable properties of the Fourier transform and its optimization, our network greatly reduces the number of parameters compared to other UNets. We evaluate GFUNet on several medical segmentation tasks, achieving improved segmentation performance compared to state-of-the-art network architectures for medical image segmentation. Compared to the original UNet, the results show that we reduce the number of parameters by 46 times, reduce computational complexity by 114 times, and considerably improve the Dice score.
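The kind of frequency-domain mixing such networks build on can be sketched as a global filter layer in the style of GFNet: transform the feature map to the frequency domain, multiply elementwise by a learnable complex filter, and transform back. A minimal numpy sketch under that assumption (the filter `w` stands in for learned weights; this is not the paper's exact module):

```python
import numpy as np

def global_filter(x, w):
    """Global frequency-domain filter: real 2-D FFT -> elementwise
    (learnable) filter -> inverse FFT. One such layer mixes information
    across the whole spatial extent in O(n log n)."""
    X = np.fft.rfft2(x)                  # spectrum of shape (h, w//2 + 1)
    return np.fft.irfft2(X * w, s=x.shape)
```

With `w` identically one the layer is the identity; training shapes `w` into a filter with a global spatial receptive field, which is what lets a frequency-domain block replace heavier spatial mixing.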
Affiliation(s)
- Penghui Li
- School of Artificial Intelligence, Beijing Normal University, No. 19, Xinjiekouwai St, Haidian District, 100875, Beijing, PR China
- Rui Zhou
- School of Artificial Intelligence, Beijing Normal University, No. 19, Xinjiekouwai St, Haidian District, 100875, Beijing, PR China
- Jin He
- School of Artificial Intelligence, Beijing Normal University, No. 19, Xinjiekouwai St, Haidian District, 100875, Beijing, PR China
- Shifeng Zhao
- School of Artificial Intelligence, Beijing Normal University, No. 19, Xinjiekouwai St, Haidian District, 100875, Beijing, PR China
- Yun Tian
- School of Artificial Intelligence, Beijing Normal University, No. 19, Xinjiekouwai St, Haidian District, 100875, Beijing, PR China
49
Sander J, de Vos BD, Bruns S, Planken N, Viergever MA, Leiner T, Išgum I. Reconstruction and completion of high-resolution 3D cardiac shapes using anisotropic CMRI segmentations and continuous implicit neural representations. Comput Biol Med 2023; 164:107266. [PMID: 37494823] [DOI: 10.1016/j.compbiomed.2023.107266]
Abstract
Since the onset of computer-aided diagnosis in medical imaging, voxel-based segmentation has emerged as the primary methodology for automatic analysis of left ventricle (LV) function and morphology in cardiac magnetic resonance images (CMRI). In standard clinical practice, simultaneous multi-slice 2D cine short-axis MR imaging is performed under multiple breath-holds, resulting in highly anisotropic 3D images. Furthermore, sparse-view CMRI often lacks whole-heart coverage owing to large slice thickness, and often suffers from inter-slice misalignment induced by respiratory motion. These volumes therefore provide only limited information about the true 3D cardiac anatomy, which may hamper highly accurate assessment of functional and anatomical abnormalities. To address this, we propose a method that learns a continuous implicit function representing 3D LV shapes by training an auto-decoder. For training, high-resolution segmentations from cardiac CT angiography are used. The ability of our approach to reconstruct and complete high-resolution shapes from manually or automatically obtained sparse-view cardiac shape information is evaluated using paired high- and low-resolution CMRI LV segmentations. The results show that the reconstructed LV shapes have an unconstrained subvoxel resolution and appear smooth and plausible in the through-plane direction. Furthermore, Bland-Altman analysis reveals that reconstructed high-resolution ventricle volumes are closer to the corresponding reference volumes than reference low-resolution volumes, with biases [limits of agreement] of -3.51 [-18.87, 11.85] mL and 12.96 [-10.01, 35.92] mL, respectively. Finally, the results demonstrate that the proposed approach allows recovering missing shape information and can indirectly correct for limited motion-induced artifacts.
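The key property exploited here, a continuous implicit shape function that can be queried at arbitrary resolution, can be illustrated with an analytic signed distance function standing in for the learned auto-decoder (a hypothetical sketch; the actual function is a trained network conditioned on a per-shape latent code):

```python
import numpy as np

def sphere_sdf(pts, centre=np.zeros(3), radius=0.5):
    """Signed distance to a sphere: negative inside, positive outside.
    A stand-in for the learned implicit LV shape function."""
    return np.linalg.norm(pts - centre, axis=-1) - radius

def reconstruct(sdf, res):
    """Sample the continuous implicit function on a grid of any chosen
    resolution; nothing ties the output grid to the anisotropic input voxels."""
    g = np.linspace(-1.0, 1.0, res)
    pts = np.stack(np.meshgrid(g, g, g, indexing="ij"), axis=-1)
    return sdf(pts) < 0                  # binary occupancy volume
```

Because the function is continuous, the same shape can be extracted at subvoxel resolution simply by increasing `res`, which is what allows completing and smoothing the sparse through-plane direction.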
Affiliation(s)
- Jörg Sander
- Department of Biomedical Engineering and Physics, Amsterdam University Medical Center location University of Amsterdam, The Netherlands; Amsterdam Cardiovascular Sciences, Amsterdam, The Netherlands; Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands
- Bob D de Vos
- Department of Biomedical Engineering and Physics, Amsterdam University Medical Center location University of Amsterdam, The Netherlands
- Steffen Bruns
- Department of Biomedical Engineering and Physics, Amsterdam University Medical Center location University of Amsterdam, The Netherlands
- Nils Planken
- Department of Radiology and Nuclear Medicine, Amsterdam University Medical Center location University of Amsterdam, Amsterdam, The Netherlands
- Max A Viergever
- Image Sciences Institute, University Medical Center Utrecht, Utrecht, The Netherlands
- Tim Leiner
- Department of Radiology, Mayo Clinic, Rochester, MN, USA
- Ivana Išgum
- Department of Biomedical Engineering and Physics, Amsterdam University Medical Center location University of Amsterdam, The Netherlands; Amsterdam Cardiovascular Sciences, Amsterdam, The Netherlands; Department of Radiology and Nuclear Medicine, Amsterdam University Medical Center location University of Amsterdam, Amsterdam, The Netherlands; Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands
50
Liu Z, Lv Q, Yang Z, Li Y, Lee CH, Shen L. Recent progress in transformer-based medical image analysis. Comput Biol Med 2023; 164:107268. [PMID: 37494821] [DOI: 10.1016/j.compbiomed.2023.107268]
Abstract
The transformer was developed primarily for natural language processing. Recently, it has been adopted in the computer vision (CV) field, where it shows promise. Medical image analysis (MIA), as a critical branch of CV, also benefits greatly from this state-of-the-art technique. In this review, we first recap the core component of the transformer, the attention mechanism, and the transformer's detailed structure. After that, we survey the recent progress of the transformer in the field of MIA. We organize the applications by task: classification, segmentation, captioning, registration, detection, enhancement, localization, and synthesis. The mainstream classification and segmentation tasks are further divided into eleven medical image modalities. The large number of experiments studied in this review illustrates that transformer-based methods outperform existing methods under multiple evaluation metrics. Finally, we discuss open challenges and future opportunities in this field. This task-modality review, with its up-to-date content, detailed information, and comprehensive comparisons, may greatly benefit the broad MIA community.
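The attention mechanism this review recaps is scaled dot-product attention, softmax(QKᵀ/√d)V; a minimal single-head numpy sketch:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: each query attends to all keys and
    receives a weighted average of the values."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                  # (n_q, n_k) similarities
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)             # softmax over keys
    return w @ V
```

Multi-head attention, the form used in practice, runs several such maps in parallel on learned linear projections of Q, K, and V and concatenates the results.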
Affiliation(s)
- Zhaoshan Liu
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore
- Qiujie Lv
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore; School of Intelligent Systems Engineering, Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, 518107, China
- Ziduo Yang
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore; School of Intelligent Systems Engineering, Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, 518107, China
- Yifan Li
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore
- Chau Hung Lee
- Department of Radiology, Tan Tock Seng Hospital, 11 Jalan Tan Tock Seng, Singapore, 308433, Singapore
- Lei Shen
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore