1. Mahawar J, Paul A. Generalizable diagnosis of chest radiographs through attention-guided decomposition of images utilizing self-consistency loss. Comput Biol Med 2024;180:108922. [PMID: 39089108] [DOI: 10.1016/j.compbiomed.2024.108922]
Abstract
BACKGROUND Chest X-ray (CXR) is one of the most commonly performed imaging tests worldwide. Due to its wide usage, there is a growing need for automated and generalizable methods to accurately diagnose these images. Traditional methods for chest X-ray analysis often struggle with generalization across diverse datasets due to variations in imaging protocols, patient demographics, and the presence of overlapping anatomical structures. Therefore, there is a significant demand for advanced diagnostic tools that can consistently identify abnormalities across different patient populations and imaging settings. We propose a method that provides a generalizable diagnosis of chest X-rays. METHOD Our method utilizes an attention-guided decomposer network (ADSC) to extract disease maps from chest X-ray images. The ADSC employs one encoder and multiple decoders, incorporating a novel self-consistency loss to ensure consistent functionality across its modules. The attention-guided encoder captures salient features of abnormalities, while three distinct decoders generate a normal synthesized image, a disease map, and a reconstructed input image, respectively. A discriminator differentiates real from synthesized normal chest X-rays, enhancing the quality of the generated images. The disease map and the original chest X-ray image are fed to a DenseNet-121 classifier modified for multi-class classification of the input X-ray. RESULTS Experimental results on multiple publicly available datasets demonstrate the effectiveness of our approach. For multi-class classification, we achieve up to a 3% improvement in AUROC score for certain abnormalities compared to existing methods. For binary classification (normal versus abnormal), our method surpasses existing approaches across various datasets. To assess generalizability, we train our model on one dataset and test it on multiple datasets, using the standard deviation of AUROC scores across test datasets to measure the variability of performance. Our model exhibits superior generalization across datasets from diverse sources. CONCLUSIONS Our model shows promising results for the generalizable diagnosis of chest X-rays. The impact of the attention mechanism and the self-consistency loss in our method is evident from the results. In the future, we plan to incorporate Explainable AI techniques to provide explanations for model decisions. Additionally, we aim to design data augmentation techniques to reduce class imbalance in our model.
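The decomposition idea lends itself to a compact sketch. The following PyTorch fragment is a hypothetical sketch, not the authors' code: the encoder/decoder internals, loss weights, and the exact form of the self-consistency term are assumptions. It illustrates how one encoder and three decoders can be tied together so that re-encoding the synthesized normal image is pushed toward an empty disease map.

```python
# Hypothetical sketch of an attention-guided decomposer with a
# self-consistency loss, loosely following the abstract above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Decomposer(nn.Module):
    def __init__(self, enc, dec_normal, dec_disease, dec_recon):
        super().__init__()
        self.enc = enc                  # attention-guided encoder (placeholder)
        self.dec_normal = dec_normal    # synthesizes a "normal" image
        self.dec_disease = dec_disease  # predicts a disease map
        self.dec_recon = dec_recon      # reconstructs the input

    def forward(self, x):
        z = self.enc(x)
        return self.dec_normal(z), self.dec_disease(z), self.dec_recon(z)

def decomposition_losses(model, x):
    normal, disease, recon = model(x)
    # Reconstruction: the input should be recoverable from its encoding.
    l_recon = F.l1_loss(recon, x)
    # Self-consistency (assumed form): re-encoding the synthesized normal
    # image should yield an (approximately) empty disease map.
    _, disease2, _ = model(normal)
    l_selfcons = disease2.abs().mean()
    return l_recon, l_selfcons
```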
Affiliation(s)
- Jayant Mahawar
- Department of Computer Science and Engineering, Indian Institute of Technology Jodhpur, N.H. 62, Nagaur Road, Karwar, Jodhpur, 342030, Rajasthan, India.
- Angshuman Paul
- Department of Computer Science and Engineering, Indian Institute of Technology Jodhpur, N.H. 62, Nagaur Road, Karwar, Jodhpur, 342030, Rajasthan, India.
2. Batool A, Byun YC. Brain tumor detection with integrating traditional and computational intelligence approaches across diverse imaging modalities - Challenges and future directions. Comput Biol Med 2024;175:108412. [PMID: 38691914] [DOI: 10.1016/j.compbiomed.2024.108412]
Abstract
Brain tumor segmentation and classification play a crucial role in the diagnosis and treatment planning of brain tumors. Accurate and efficient methods for identifying tumor regions and classifying different tumor types are essential for guiding medical interventions. This study comprehensively reviews brain tumor segmentation and classification techniques, exploring various approaches based on image processing, machine learning, and deep learning. Furthermore, our study aims to review existing methodologies, discuss their advantages and limitations, and highlight recent advancements in this field. The impact of existing segmentation and classification techniques for automated brain tumor detection is also critically examined using various open-source Magnetic Resonance Imaging (MRI) datasets of different modalities. Moreover, our study highlights the challenges posed by segmentation and classification techniques and by datasets comprising various MRI modalities, to enable researchers to develop innovative and robust solutions for automated brain tumor detection. The results of this study contribute to the development of automated and robust solutions for analyzing brain tumors, ultimately aiding medical professionals in making informed decisions and providing better patient care.
Affiliation(s)
- Amreen Batool
- Department of Electronic Engineering, Institute of Information Science & Technology, Jeju National University, Jeju, 63243, South Korea
- Yung-Cheol Byun
- Department of Computer Engineering, Major of Electronic Engineering, Jeju National University, Institute of Information Science Technology, Jeju, 63243, South Korea.
3. Zhang Y, Chen Z, Yang X. Light-M: An efficient lightweight medical image segmentation framework for resource-constrained IoMT. Comput Biol Med 2024;170:108088. [PMID: 38320339] [DOI: 10.1016/j.compbiomed.2024.108088]
Abstract
The Internet of Medical Things (IoMT) is being incorporated into current healthcare systems. This technology intends to connect patients, IoMT devices, and hospitals over mobile networks, allowing for more secure, quick, and convenient health monitoring and intelligent healthcare services. However, existing intelligent healthcare applications typically rely on large-scale AI models, while standard IoMT devices have significant resource constraints. To resolve this tension, in this paper we propose a Knowledge Distillation (KD)-based IoMT end-edge-cloud orchestrated architecture for medical image segmentation tasks, called Light-M, aiming to deploy a lightweight medical model on resource-constrained IoMT devices. Specifically, Light-M trains a large teacher model in the cloud server and performs computation on local nodes with a student model that imitates the teacher through knowledge distillation. Light-M contains two KD strategies: (1) active exploration and passive transfer (AEPT) and (2) self-attention-based inter-class feature variation (AIFV) distillation for the medical image segmentation task. The AEPT encourages the student model to learn undiscovered knowledge/features of the teacher model without additional feature layers, aiming to explore new features and outperform the teacher. To improve the student's ability to distinguish between classes, the student also learns the self-attention-based feature variation (AIFV) between classes. Since the proposed AEPT and AIFV appear only in the training process, our framework adds no computational burden to the student model when the segmentation task is deployed. Extensive experiments on cardiac images and public real-scene datasets demonstrate that our approach, combining the two knowledge distillation strategies, improves the student model's learned representations and outperforms state-of-the-art methods. Moreover, when deployed on an IoT device, the distilled student model takes only 29.6 ms per sample at inference.
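The AEPT and AIFV terms are specific to the paper, but the teacher-student objective they extend is standard pixel-wise knowledge distillation. A minimal sketch of that base objective follows; the temperature and weighting values are illustrative assumptions, not values from the paper.

```python
# Generic pixel-wise knowledge distillation for segmentation, the kind of
# teacher-student objective Light-M builds on; the paper's AEPT and AIFV
# terms are not reproduced here.
import torch
import torch.nn.functional as F

def kd_segmentation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """student_logits, teacher_logits: (B, C, H, W); labels: (B, H, W)."""
    # Soft targets from the teacher, softened by temperature T.
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    kd = F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
    # Hard-label supervision on the student.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce
```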
Affiliation(s)
- Yifan Zhang
- Shenzhen University, 3688 Nanhai Ave., Shenzhen, 518060, Guangdong, China
- Zhuangzhuang Chen
- Shenzhen University, 3688 Nanhai Ave., Shenzhen, 518060, Guangdong, China
- Xuan Yang
- Shenzhen University, 3688 Nanhai Ave., Shenzhen, 518060, Guangdong, China.
4. Xing F, Yang X, Cornish TC, Ghosh D. Learning with limited target data to detect cells in cross-modality images. Med Image Anal 2023;90:102969. [PMID: 37802010] [DOI: 10.1016/j.media.2023.102969]
Abstract
Deep neural networks have achieved excellent cell or nucleus quantification performance in microscopy images, but they often suffer from performance degradation when applied to cross-modality imaging data. Unsupervised domain adaptation (UDA) based on generative adversarial networks (GANs) has recently improved the performance of cross-modality medical image quantification. However, current GAN-based UDA methods typically require abundant target data for model training, which is often very expensive or even impossible to obtain for real applications. In this paper, we study a more realistic yet challenging UDA situation in which (unlabeled) target training data is limited, a setting that previous work on cell identification has seldom explored. We first enhance a dual GAN with task-specific modeling, which provides additional supervision signals to assist with generator learning. We explore both single-directional and bidirectional task-augmented GANs for domain adaptation. Then, we further improve the GAN by introducing a differentiable, stochastic data augmentation module to explicitly reduce discriminator overfitting. We examine source-, target-, and dual-domain data augmentation for GAN enhancement, as well as joint task and data augmentation in a unified GAN-based UDA framework. We evaluate the framework for cell detection on multiple public and in-house microscopy image datasets, which are acquired with different imaging modalities, staining protocols and/or tissue preparations. The experiments demonstrate that our method significantly boosts performance when compared with the reference baseline, and it is superior to or on par with fully supervised models that are trained with real target annotations. In addition, our method outperforms recent state-of-the-art UDA approaches by a large margin on different datasets.
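The differentiable, stochastic augmentation module can be sketched in the style of DiffAugment: the same random, gradient-preserving transforms are applied to both real and synthesized images before the discriminator. The operations below (translation, cutout) are assumptions for illustration; the paper's exact set of transforms is not reproduced here.

```python
# DiffAugment-style differentiable, stochastic augmentation applied before
# the discriminator to curb overfitting on limited data.
import torch

def diff_augment(x, shift_frac=0.125, cutout_frac=0.25):
    b, _, h, w = x.shape
    # Random translation; torch.roll keeps the op differentiable w.r.t. x.
    dx = torch.randint(-int(w * shift_frac), int(w * shift_frac) + 1, (1,)).item()
    dy = torch.randint(-int(h * shift_frac), int(h * shift_frac) + 1, (1,)).item()
    x = torch.roll(x, shifts=(dy, dx), dims=(2, 3))
    # Random cutout: zero a rectangular patch at a random location.
    ch, cw = int(h * cutout_frac), int(w * cutout_frac)
    cy = torch.randint(0, h - ch + 1, (1,)).item()
    cx = torch.randint(0, w - cw + 1, (1,)).item()
    mask = torch.ones_like(x)
    mask[:, :, cy:cy + ch, cx:cx + cw] = 0
    return x * mask

# Usage: d_real = D(diff_augment(real)); d_fake = D(diff_augment(G(z)))
```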
Affiliation(s)
- Fuyong Xing
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, 13001 E 17th Pl, Aurora, CO 80045, USA.
- Xinyi Yang
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, 13001 E 17th Pl, Aurora, CO 80045, USA
- Toby C Cornish
- Department of Pathology, University of Colorado Anschutz Medical Campus, 13001 E 17th Pl, Aurora, CO 80045, USA
- Debashis Ghosh
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, 13001 E 17th Pl, Aurora, CO 80045, USA
5. Dorent R, Haouchine N, Kogl F, Joutard S, Juvekar P, Torio E, Golby A, Ourselin S, Frisken S, Vercauteren T, Kapur T, Wells WM. Unified Brain MR-Ultrasound Synthesis using Multi-Modal Hierarchical Representations. Med Image Comput Comput Assist Interv 2023;2023:448-458. [PMID: 38655383] [PMCID: PMC7615858] [DOI: 10.1007/978-3-031-43999-5_43]
Abstract
We introduce MHVAE, a deep hierarchical variational autoencoder (VAE) that synthesizes missing images from various modalities. Extending multi-modal VAEs with a hierarchical latent structure, we introduce a probabilistic formulation for fusing multi-modal images in a common latent representation while having the flexibility to handle incomplete image sets as input. Moreover, adversarial learning is employed to generate sharper images. Extensive experiments are performed on the challenging problem of joint intra-operative ultrasound (iUS) and Magnetic Resonance (MR) synthesis. Our model outperformed multi-modal VAEs, conditional GANs, and the current state-of-the-art unified method (ResViT) for synthesizing missing images, demonstrating the advantage of using a hierarchical latent representation and a principled probabilistic fusion operation. Our code is publicly available.
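Multi-modal VAEs typically realize this kind of principled probabilistic fusion with a product of experts over the available per-modality Gaussian posteriors; MHVAE's hierarchical formulation is more elaborate, so the following is only a base-case sketch of the fusion operation, with the prior-expert choice an assumption.

```python
# Product-of-experts fusion of per-modality Gaussian posteriors: a standard
# way to combine incomplete modality sets in multi-modal VAEs.
import torch

def poe_fusion(mus, logvars):
    """mus, logvars: lists of (B, D) tensors, one per *available* modality."""
    # Include a standard-normal prior expert so the fusion is defined even
    # when only a single modality is present.
    prior_mu = torch.zeros_like(mus[0])
    prior_logvar = torch.zeros_like(logvars[0])
    mus = [prior_mu] + list(mus)
    logvars = [prior_logvar] + list(logvars)
    precisions = [torch.exp(-lv) for lv in logvars]      # 1 / sigma^2
    precision = sum(precisions)
    mu = sum(m * p for m, p in zip(mus, precisions)) / precision
    logvar = -torch.log(precision)
    return mu, logvar
```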
Affiliation(s)
- Reuben Dorent
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- Nazim Haouchine
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- Fryderyk Kogl
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- Parikshit Juvekar
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- Erickson Torio
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- Alexandra Golby
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- Sarah Frisken
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- Tina Kapur
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- William M Wells
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- Massachusetts Institute of Technology, Cambridge, MA, USA
6. Gu R, Wang G, Lu J, Zhang J, Lei W, Chen Y, Liao W, Zhang S, Li K, Metaxas DN, Zhang S. CDDSA: Contrastive domain disentanglement and style augmentation for generalizable medical image segmentation. Med Image Anal 2023;89:102904. [PMID: 37506556] [DOI: 10.1016/j.media.2023.102904]
Abstract
Generalization to previously unseen images with potential domain shifts is essential for clinically applicable medical image segmentation. Disentangling domain-specific and domain-invariant features is key for Domain Generalization (DG). However, existing DG methods struggle to achieve effective disentanglement. To address this problem, we propose an efficient framework called Contrastive Domain Disentanglement and Style Augmentation (CDDSA) for generalizable medical image segmentation. First, a disentanglement network decomposes the image into a domain-invariant anatomical representation and a domain-specific style code; the former is passed to a segmentation branch that is unaffected by domain shift, and the disentanglement is regularized by a decoder that combines the anatomical representation and style code to reconstruct the original image. Second, to achieve better disentanglement, a contrastive loss is proposed to encourage the style codes from the same domain and from different domains to be compact and divergent, respectively. Finally, to further improve generalizability, we propose a style augmentation strategy to synthesize images with various unseen styles in real time while maintaining anatomical information. Comprehensive experiments on a public multi-site fundus image dataset and an in-house multi-site Nasopharyngeal Carcinoma Magnetic Resonance Image (NPC-MRI) dataset show that the proposed CDDSA achieves remarkable generalizability across different domains and outperforms several state-of-the-art methods in generalizable segmentation. Code is available at https://github.com/HiLab-git/DAG4MIA.
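The style-code contrast can be written as a supervised-contrastive objective in which domain labels define the positive pairs; whether CDDSA uses exactly this form (temperature, normalization, pooling of positives) is an assumption.

```python
# Supervised-contrastive objective on style codes: codes from the same
# domain are pulled together (compact), codes from different domains are
# pushed apart (divergent), mirroring the stated goal above.
import torch
import torch.nn.functional as F

def style_contrastive_loss(style, domain, tau=0.1):
    """style: (B, D) style codes; domain: (B,) integer domain labels."""
    z = F.normalize(style, dim=1)
    sim = z @ z.t() / tau                                    # (B, B) similarities
    mask_pos = (domain[:, None] == domain[None, :]).float()
    mask_pos.fill_diagonal_(0)                               # exclude self-pairs
    logits = sim - torch.eye(len(z), device=z.device) * 1e9  # mask self-similarity
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    # Average log-probability of positive pairs per anchor (SupCon style).
    denom = mask_pos.sum(1).clamp(min=1)
    return -(mask_pos * log_prob).sum(1).div(denom).mean()
```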
Affiliation(s)
- Ran Gu
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Guotai Wang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Shanghai AI Lab, Shanghai, China.
- Jiangshan Lu
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Jingyang Zhang
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; School of Biomedical Engineering, ShanghaiTech University, Shanghai, China
- Wenhui Lei
- School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai AI Lab, Shanghai, China
- Yinan Chen
- SenseTime Research, Shanghai, China; West China Hospital-SenseTime Joint Lab, West China Biomedical Big Data Center, Sichuan University, Chengdu, China
- Wenjun Liao
- Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, University of Electronic Science and Technology of China, Chengdu, China
- Shichuan Zhang
- Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, University of Electronic Science and Technology of China, Chengdu, China
- Kang Li
- West China Hospital-SenseTime Joint Lab, West China Biomedical Big Data Center, Sichuan University, Chengdu, China
- Dimitris N Metaxas
- Department of Computer Science, Rutgers University, Piscataway NJ 08854, USA
- Shaoting Zhang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; SenseTime Research, Shanghai, China; Shanghai AI Lab, Shanghai, China.
7. Zuo L, Liu Y, Xue Y, Dewey BE, Remedios SW, Hays SP, Bilgel M, Mowry EM, Newsome SD, Calabresi PA, Resnick SM, Prince JL, Carass A. HACA3: A unified approach for multi-site MR image harmonization. Comput Med Imaging Graph 2023;109:102285. [PMID: 37657151] [PMCID: PMC10592042] [DOI: 10.1016/j.compmedimag.2023.102285]
Abstract
The lack of standardization and consistency of acquisition is a prominent issue in magnetic resonance (MR) imaging. This often causes undesired contrast variations in the acquired images due to differences in hardware and acquisition parameters. In recent years, image synthesis-based MR harmonization with disentanglement has been proposed to compensate for the undesired contrast variations. The general idea is to disentangle anatomy and contrast information from MR images to achieve cross-site harmonization. Despite the success of existing methods, we argue that major improvements can be made from three aspects. First, most existing methods are built upon the assumption that multi-contrast MR images of the same subject share the same anatomy. This assumption is questionable, since different MR contrasts are specialized to highlight different anatomical features. Second, these methods often require a fixed set of MR contrasts for training (e.g., both T1-weighted and T2-weighted images), limiting their applicability. Lastly, existing methods are generally sensitive to imaging artifacts. In this paper, we present Harmonization with Attention-based Contrast, Anatomy, and Artifact Awareness (HACA3), a novel approach to address these three issues. HACA3 incorporates an anatomy fusion module that accounts for the inherent anatomical differences between MR contrasts. Furthermore, HACA3 can be trained and applied to any combination of MR contrasts and is robust to imaging artifacts. HACA3 is developed and evaluated on diverse MR datasets acquired from 21 sites with varying field strengths, scanner platforms, and acquisition protocols. Experiments show that HACA3 achieves state-of-the-art harmonization performance under multiple image quality metrics. We also demonstrate the versatility and potential clinical impact of HACA3 on downstream tasks including white matter lesion segmentation for people with multiple sclerosis and longitudinal volumetric analyses for normal aging subjects. Code is available at https://github.com/lianruizuo/haca3.
Affiliation(s)
- Lianrui Zuo
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA; Laboratory of Behavioral Neuroscience, National Institute on Aging, National Institutes of Health, Baltimore, MD 21224, USA.
- Yihao Liu
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Yuan Xue
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Blake E Dewey
- Department of Neurology, Johns Hopkins School of Medicine, Baltimore, MD 21287, USA
- Samuel W Remedios
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA; Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, Bethesda, MD 20892, USA
- Savannah P Hays
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Murat Bilgel
- Laboratory of Behavioral Neuroscience, National Institute on Aging, National Institutes of Health, Baltimore, MD 21224, USA
- Ellen M Mowry
- Department of Neurology, Johns Hopkins School of Medicine, Baltimore, MD 21287, USA
- Scott D Newsome
- Department of Neurology, Johns Hopkins School of Medicine, Baltimore, MD 21287, USA
- Peter A Calabresi
- Department of Neurology, Johns Hopkins School of Medicine, Baltimore, MD 21287, USA
- Susan M Resnick
- Laboratory of Behavioral Neuroscience, National Institute on Aging, National Institutes of Health, Baltimore, MD 21224, USA
- Jerry L Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Aaron Carass
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
8. Li L, Ding W, Huang L, Zhuang X, Grau V. Multi-modality cardiac image computing: A survey. Med Image Anal 2023;88:102869. [PMID: 37384950] [DOI: 10.1016/j.media.2023.102869]
Abstract
Multi-modality cardiac imaging plays a key role in the management of patients with cardiovascular diseases. It allows a combination of complementary anatomical, morphological and functional information, increases diagnosis accuracy, and improves the efficacy of cardiovascular interventions and clinical outcomes. Fully-automated processing and quantitative analysis of multi-modality cardiac images could have a direct impact on clinical research and evidence-based patient management. However, these require overcoming significant challenges, including inter-modality misalignment and finding optimal methods to integrate information from different modalities. This paper aims to provide a comprehensive review of multi-modality imaging in cardiology, the computing methods, the validation strategies, the related clinical workflows and future perspectives. For the computing methodologies, we focus on three tasks, i.e., registration, fusion and segmentation, which generally involve multi-modality imaging data, either combining information from different modalities or transferring information across modalities. The review highlights that multi-modality cardiac imaging data has the potential for wide applicability in the clinic, such as trans-aortic valve implantation guidance, myocardial viability assessment, and catheter ablation therapy and its patient selection. Nevertheless, many challenges remain unsolved, such as missing modality, modality selection, combination of imaging and non-imaging data, and uniform analysis and representation of different modalities. There is also work to do in defining how the well-developed techniques fit in clinical workflows and how much additional and relevant information they introduce. These problems are likely to remain an active field of research, and the questions they raise will need to be answered in the future.
Affiliation(s)
- Lei Li
- Department of Engineering Science, University of Oxford, Oxford, UK.
- Wangbin Ding
- College of Physics and Information Engineering, Fuzhou University, Fuzhou, China
- Liqin Huang
- College of Physics and Information Engineering, Fuzhou University, Fuzhou, China
- Xiahai Zhuang
- School of Data Science, Fudan University, Shanghai, China
- Vicente Grau
- Department of Engineering Science, University of Oxford, Oxford, UK
9. Buoso S, Joyce T, Schulthess N, Kozerke S. MRXCAT2.0: Synthesis of realistic numerical phantoms by combining left-ventricular shape learning, biophysical simulations and tissue texture generation. J Cardiovasc Magn Reson 2023;25:25. [PMID: 37076840] [PMCID: PMC10116689] [DOI: 10.1186/s12968-023-00934-z]
Abstract
BACKGROUND Standardised performance assessment of image acquisition, reconstruction and processing methods is limited by the absence of images paired with ground truth reference values. To this end, we propose MRXCAT2.0 to generate synthetic data, covering healthy and pathological function, using a biophysical model. We exemplify the approach by generating cardiovascular magnetic resonance (CMR) images of healthy, infarcted, dilated and hypertrophic left-ventricular (LV) function. METHOD In MRXCAT2.0, the XCAT torso phantom is coupled with a statistical shape model, describing population (patho)physiological variability, and a biophysical model, providing known and detailed functional ground truth of LV morphology and function. CMR balanced steady-state free precession images are generated using MRXCAT2.0, while realistic image appearance is ensured by assigning texturized tissue properties to the phantom labels. FINDINGS Paired CMR image and ground truth data of LV function were generated with a range of LV masses (85-140 g), ejection fractions (34-51%) and peak radial and circumferential strains (0.45 to 0.95 and -0.18 to -0.13, respectively). These ranges cover healthy and pathological cases, including infarction, dilated and hypertrophic cardiomyopathy. Generating the anatomy takes a few seconds, improving on current state-of-the-art models, in which pathological representation is not explicitly addressed. For the full simulation framework, the biophysical models require approximately two hours, while image generation requires a few minutes per slice. CONCLUSION MRXCAT2.0 offers synthesis of realistic images embedding population-based anatomical and functional variability and associated ground truth parameters to facilitate a standardized assessment of CMR acquisition, reconstruction and processing methods.
Affiliation(s)
- Stefano Buoso
- Institute for Biomedical Engineering, ETH Zurich and University Zurich, Zurich, Switzerland.
- Thomas Joyce
- Institute for Biomedical Engineering, ETH Zurich and University Zurich, Zurich, Switzerland
- Nico Schulthess
- Institute for Biomedical Engineering, ETH Zurich and University Zurich, Zurich, Switzerland
- Sebastian Kozerke
- Institute for Biomedical Engineering, ETH Zurich and University Zurich, Zurich, Switzerland
10. Yu Z, Han X, Zhang S, Feng J, Peng T, Zhang XY. MouseGAN++: Unsupervised Disentanglement and Contrastive Representation for Multiple MRI Modalities Synthesis and Structural Segmentation of Mouse Brain. IEEE Trans Med Imaging 2023;42:1197-1209. [PMID: 36449589] [DOI: 10.1109/tmi.2022.3225528]
Abstract
Segmenting the fine structure of the mouse brain on magnetic resonance (MR) images is critical for delineating morphological regions, analyzing brain function, and understanding their relationships. Compared to a single MRI modality, multimodal MRI data provide complementary tissue features that can be exploited by deep learning models, resulting in better segmentation results. However, multimodal mouse brain MRI data is often lacking, making automatic segmentation of mouse brain fine structure a very challenging task. To address this issue, it is necessary to fuse multimodal MRI data to produce distinguished contrasts in different brain structures. Hence, we propose a novel disentangled and contrastive GAN-based framework, named MouseGAN++, to synthesize multiple MR modalities from single ones in a structure-preserving manner, thus improving the segmentation performance by imputing missing modalities and multi-modality fusion. Our results demonstrate that the translation performance of our method outperforms the state-of-the-art methods. Using the subsequently learned modality-invariant information as well as the modality-translated images, MouseGAN++ can segment fine brain structures with average Dice coefficients of 90.0% (T2w) and 87.9% (T1w), respectively, achieving around a 10% performance improvement over state-of-the-art algorithms. Our results demonstrate that MouseGAN++, as a simultaneous image synthesis and segmentation method, can be used to fuse cross-modality information in an unpaired manner and yield more robust performance in the absence of multimodal data. We release our method as a mouse brain structural segmentation tool for free academic usage at https://github.com/yu02019.
11. Salih A, Boscolo Galazzo I, Gkontra P, Lee AM, Lekadir K, Raisi-Estabragh Z, Petersen SE. Explainable Artificial Intelligence and Cardiac Imaging: Toward More Interpretable Models. Circ Cardiovasc Imaging 2023;16:e014519. [PMID: 37042240] [DOI: 10.1161/circimaging.122.014519]
Abstract
Artificial intelligence applications have shown success in different medical and health care domains, and cardiac imaging is no exception. However, some machine learning models, especially deep learning, are considered black box as they do not provide an explanation or rationale for model outcomes. Complexity and vagueness in these models necessitate a transition to explainable artificial intelligence (XAI) methods to ensure that model results are both transparent and understandable to end users. In cardiac imaging studies, there are a limited number of papers that use XAI methodologies. This article provides a comprehensive literature review of state-of-the-art works using XAI methods for cardiac imaging. Moreover, it provides simple and comprehensive guidelines on XAI. Finally, open issues and directions for XAI in cardiac imaging are discussed.
Affiliation(s)
- Ahmed Salih
- William Harvey Research Institute, NIHR Barts Biomedical Research Centre, Queen Mary University of London, United Kingdom (A.S., A.M.L., Z.R.-E., S.E.P.)
- Polyxeni Gkontra
- Departament de Matemàtiques i Informàtica, University of Barcelona, Spain (P.G., K.L.)
- Aaron Mark Lee
- William Harvey Research Institute, NIHR Barts Biomedical Research Centre, Queen Mary University of London, United Kingdom (A.S., A.M.L., Z.R.-E., S.E.P.)
- Karim Lekadir
- Departament de Matemàtiques i Informàtica, University of Barcelona, Spain (P.G., K.L.)
- Zahra Raisi-Estabragh
- William Harvey Research Institute, NIHR Barts Biomedical Research Centre, Queen Mary University of London, United Kingdom (A.S., A.M.L., Z.R.-E., S.E.P.)
- Barts Heart Centre, St Bartholomew's Hospital, Barts Health NHS Trust, London, United Kingdom (Z.R.-E., S.E.P.)
- Steffen E Petersen
- William Harvey Research Institute, NIHR Barts Biomedical Research Centre, Queen Mary University of London, United Kingdom (A.S., A.M.L., Z.R.-E., S.E.P.)
- Barts Heart Centre, St Bartholomew's Hospital, Barts Health NHS Trust, London, United Kingdom (Z.R.-E., S.E.P.)
- Health Data Research UK, London (S.E.P.)
- Alan Turing Institute, London, United Kingdom (S.E.P.)
12. Attri-VAE: Attribute-based interpretable representations of medical images with variational autoencoders. Comput Med Imaging Graph 2023;104:102158. [PMID: 36638626] [DOI: 10.1016/j.compmedimag.2022.102158]
Abstract
Deep learning (DL) methods in which interpretability is intrinsically part of the model are needed to better understand the relationship of clinical and imaging-based attributes with DL outcomes, thus facilitating their use in the reasoning behind medical decisions. Latent space representations built with variational autoencoders (VAE) do not ensure individual control of data attributes. Attribute-based methods enforcing attribute disentanglement have been proposed in the literature for classical computer vision tasks on benchmark data. In this paper, we propose a VAE approach, the Attri-VAE, that includes an attribute regularization term to associate clinical and medical imaging attributes with different regularized dimensions in the generated latent space, enabling a better-disentangled interpretation of the attributes. Furthermore, the generated attention maps explain the attribute encoding in the regularized latent space dimensions. Using the Attri-VAE approach we analyzed healthy and myocardial infarction patients with clinical, cardiac morphology, and radiomics attributes. The proposed model provided an excellent trade-off between reconstruction fidelity, disentanglement, and interpretability, outperforming state-of-the-art VAE approaches according to several quantitative metrics. The resulting latent space allowed the generation of realistic synthetic data in the trajectory between two distinct input samples or along a specific attribute dimension to better interpret changes between different cardiac conditions.
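The attribute regularization term follows the pattern of attribute-regularized VAEs (Pati & Lerch's AR-VAE formulation): within a batch, the sign of pairwise differences along a designated latent dimension is matched to the sign of pairwise attribute differences, making that dimension increase monotonically with the attribute. A sketch of the term follows; the weighting and dimension assignment are placeholders.

```python
# Attribute-regularization term: encourage one latent dimension to vary
# monotonically with one clinical/imaging attribute.
import torch

def attribute_reg_loss(z_dim, attr, delta=1.0):
    """z_dim: (B,) values of one latent dimension; attr: (B,) attribute values."""
    dz = z_dim[:, None] - z_dim[None, :]        # pairwise latent differences
    da = attr[:, None] - attr[None, :]          # pairwise attribute differences
    return torch.abs(torch.tanh(delta * dz) - torch.sign(da)).mean()

# Added to the ELBO, e.g.:
#   loss = recon + beta * kl
#          + gamma * sum(attribute_reg_loss(z[:, d], attrs[:, i])
#                        for i, d in enumerate(regularized_dims))
```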
13. Fischer M, Hepp T, Gatidis S, Yang B. Self-supervised contrastive learning with random walks for medical image segmentation with limited annotations. Comput Med Imaging Graph 2023;104:102174. [PMID: 36640485] [DOI: 10.1016/j.compmedimag.2022.102174]
Abstract
Medical image segmentation has seen significant progress through the use of supervised deep learning. To date, large annotated datasets have been employed to reliably segment anatomical structures. To reduce the requirement for annotated training data, self-supervised pre-training strategies on non-annotated data have been designed. In particular, contrastive learning schemes operating on dense pixel-wise representations have proven an effective tool. In this work, we expand on this strategy and leverage inherent anatomical similarities in medical imaging data. We apply our approach to the task of semantic segmentation in a semi-supervised setting with limited amounts of annotated volumes. Trained alongside a segmentation loss in a single training stage, a contrastive loss helps differentiate between salient anatomical regions that conform to the available annotations. Our approach builds upon the work of Jabri et al. (2020), who proposed cyclical contrastive random walks (CCRW) for self-supervision on palindromes of video frames. We adapt this scheme to operate on entries of paired embedded image slices. Using paths of cyclical random walks bypasses the need for negative samples, as commonly used in contrastive approaches, enabling the algorithm to discriminate among relevant salient (anatomical) regions implicitly. Further, a multi-level supervision strategy is employed, ensuring adequate representations of local and global characteristics of anatomical structures. The effectiveness of reducing the amount of required annotations is shown on three MRI datasets. Median increases of 8.01 and 5.90 percentage points in the Dice Similarity Coefficient (DSC) over our baseline were achieved across all three datasets with one and two annotated examples per dataset, respectively.
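The cyclical contrastive random walk reduces to a simple computation on paired embeddings: transition probabilities from slice A to slice B and back are chained, and the round-trip matrix is pushed toward the identity, so each entry must find its own counterpart without explicit negatives. A minimal single-level sketch of the Jabri et al. (2020) objective follows; the paper's multi-level supervision is omitted, and the temperature is an illustrative assumption.

```python
# Cyclical contrastive random walk between two embedded slices: the
# round-trip transition matrix A->B->A should be (near) the identity.
import torch
import torch.nn.functional as F

def ccrw_loss(fa, fb, tau=0.07):
    """fa, fb: (N, D) embeddings of N corresponding entries per slice."""
    fa, fb = F.normalize(fa, dim=1), F.normalize(fb, dim=1)
    a2b = F.softmax(fa @ fb.t() / tau, dim=1)   # transition probs A -> B
    b2a = F.softmax(fb @ fa.t() / tau, dim=1)   # transition probs B -> A
    roundtrip = a2b @ b2a                       # (N, N) cycle probabilities
    targets = torch.arange(len(fa), device=fa.device)
    # Negative log-probability of returning to the starting entry.
    return F.nll_loss(torch.log(roundtrip + 1e-8), targets)
```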
Affiliation(s)
- Marc Fischer
- Institute of Signal Processing and System Theory, University of Stuttgart, 70550 Stuttgart, Germany.
- Tobias Hepp
- Max Planck Institute for Intelligent Systems, 72076 Tübingen, Germany
- Sergios Gatidis
- Max Planck Institute for Intelligent Systems, 72076 Tübingen, Germany
- Bin Yang
- Institute of Signal Processing and System Theory, University of Stuttgart, 70550 Stuttgart, Germany
14. Eddahmani I, Pham CH, Napoléon T, Badoc I, Fouefack JR, El-Bouz M. Unsupervised Learning of Disentangled Representation via Auto-Encoding: A Survey. Sensors (Basel) 2023;23:2362. [PMID: 36850960] [PMCID: PMC9960632] [DOI: 10.3390/s23042362]
Abstract
In recent years, the rapid development of deep learning approaches has paved the way to explore the underlying factors that explain the data. In particular, several methods have been proposed to learn to identify and disentangle these underlying explanatory factors in order to improve the learning process and model generalization. However, extracting this representation with little or no supervision remains a key challenge in machine learning. In this paper, we provide a theoretical outlook on recent advances in the field of unsupervised representation learning with a focus on auto-encoding-based approaches and on the most well-known supervised disentanglement metrics. We cover the current state-of-the-art methods for learning disentangled representation in an unsupervised manner while pointing out the connection between each method and its added value on disentanglement. Further, we discuss how to quantify disentanglement and present an in-depth analysis of associated metrics. We conclude by carrying out a comparative evaluation of these metrics according to three criteria: (i) modularity, (ii) compactness, and (iii) informativeness. Finally, we show that only the Mutual Information Gap score (MIG) meets all three criteria.
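The MIG score singled out above has a direct implementation: for each ground-truth factor, take the gap between the two latent dimensions with the highest mutual information and normalize by the factor's entropy, then average over factors. A sketch using histogram-binned mutual information follows; the bin count is an arbitrary choice.

```python
# Mutual Information Gap (MIG), estimated with discretized latents.
import numpy as np
from sklearn.metrics import mutual_info_score

def mig(z, v, n_bins=20):
    """z: (N, D) latent codes; v: (N, K) discrete ground-truth factors."""
    # Discretize each latent dimension by histogram binning.
    zb = np.stack([np.digitize(z[:, j], np.histogram_bin_edges(z[:, j], n_bins))
                   for j in range(z.shape[1])], axis=1)
    gaps = []
    for k in range(v.shape[1]):
        mi = np.array([mutual_info_score(v[:, k], zb[:, j])
                       for j in range(z.shape[1])])
        h = mutual_info_score(v[:, k], v[:, k])  # entropy H(v_k) = I(v_k; v_k)
        top2 = np.sort(mi)[-2:]                  # two most informative dims
        gaps.append((top2[1] - top2[0]) / h)
    return float(np.mean(gaps))
```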
Affiliation(s)
- Ikram Eddahmani
- L@bISEN, LSL Team, Yncrea Ouest, 29200 Brest, France
- Generix Group, 75012 Paris, France
- Chi-Hieu Pham
- L@bISEN, LSL Team, Yncrea Ouest, 29200 Brest, France
- LaTIM, INSERM UMR1101, University of Brest, 29200 Brest, France
- Marwa El-Bouz
- L@bISEN, LSL Team, Yncrea Ouest, 29200 Brest, France
15. Beetz M, Corral Acero J, Banerjee A, Eitel I, Zacur E, Lange T, Stiermaier T, Evertz R, Backhaus SJ, Thiele H, Bueno-Orovio A, Lamata P, Schuster A, Grau V. Interpretable cardiac anatomy modeling using variational mesh autoencoders. Front Cardiovasc Med 2022;9:983868. [PMID: 36620629] [PMCID: PMC9813669] [DOI: 10.3389/fcvm.2022.983868]
Abstract
Cardiac anatomy and function vary considerably across the human population with important implications for clinical diagnosis and treatment planning. Consequently, many computer-based approaches have been developed to capture this variability for a wide range of applications, including explainable cardiac disease detection and prediction, dimensionality reduction, cardiac shape analysis, and the generation of virtual heart populations. In this work, we propose a variational mesh autoencoder (mesh VAE) as a novel geometric deep learning approach to model such population-wide variations in cardiac shapes. It embeds multi-scale graph convolutions and mesh pooling layers in a hierarchical VAE framework to enable direct processing of surface mesh representations of the cardiac anatomy in an efficient manner. The proposed mesh VAE achieves low reconstruction errors on a dataset of 3D cardiac meshes from over 1,000 patients with acute myocardial infarction, with mean surface distances between input and reconstructed meshes below the underlying image resolution. We also find that it outperforms a voxelgrid-based deep learning benchmark in terms of both mean surface distance and Hausdorff distance while requiring considerably less memory. Furthermore, we explore the quality and interpretability of the mesh VAE's latent space and showcase its ability to improve the prediction of major adverse cardiac events over a clinical benchmark. Finally, we investigate the method's ability to generate realistic virtual populations of cardiac anatomies and find good alignment between the synthesized and gold standard mesh populations in terms of multiple clinical metrics.
Affiliation(s)
- Marcel Beetz
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, United Kingdom
- Jorge Corral Acero
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, United Kingdom
- Abhirup Banerjee
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, United Kingdom
- Division of Cardiovascular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, United Kingdom
- Ingo Eitel
- University Heart Center Lübeck, Medical Clinic II, Cardiology, Angiology, and Intensive Care Medicine, Lübeck, Germany
- University Hospital Schleswig-Holstein, Lübeck, Germany
- German Centre for Cardiovascular Research, Partner Site Lübeck, Lübeck, Germany
- Ernesto Zacur
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, United Kingdom
- Torben Lange
- Department of Cardiology and Pneumology, University Medical Center Göttingen, Georg-August University, Göttingen, Germany
- German Centre for Cardiovascular Research, Partner Site Göttingen, Göttingen, Germany
- Thomas Stiermaier
- University Heart Center Lübeck, Medical Clinic II, Cardiology, Angiology, and Intensive Care Medicine, Lübeck, Germany
- University Hospital Schleswig-Holstein, Lübeck, Germany
- German Centre for Cardiovascular Research, Partner Site Lübeck, Lübeck, Germany
- Ruben Evertz
- Department of Cardiology and Pneumology, University Medical Center Göttingen, Georg-August University, Göttingen, Germany
- German Centre for Cardiovascular Research, Partner Site Göttingen, Göttingen, Germany
- Sören J. Backhaus
- Department of Cardiology and Pneumology, University Medical Center Göttingen, Georg-August University, Göttingen, Germany
- German Centre for Cardiovascular Research, Partner Site Göttingen, Göttingen, Germany
- Holger Thiele
- Department of Internal Medicine/Cardiology, Heart Center Leipzig at University of Leipzig, Leipzig, Germany
- Leipzig Heart Institute, Leipzig, Germany
- Pablo Lamata
- Department of Biomedical Engineering, King's College London, London, United Kingdom
- Andreas Schuster
- Department of Cardiology and Pneumology, University Medical Center Göttingen, Georg-August University, Göttingen, Germany
- German Centre for Cardiovascular Research, Partner Site Göttingen, Göttingen, Germany
- Vicente Grau
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, United Kingdom
16. A Graphical Approach for Filter Pruning by Exploring the Similarity Relation between Feature Maps. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.12.028]
17. Hasan SMK, Linte CA. Learning Deep Representations of Cardiac Structures for 4D Cine MRI Image Segmentation through Semi-Supervised Learning. Appl Sci (Basel) 2022;12:12163. [PMID: 37125242] [PMCID: PMC10134910] [DOI: 10.3390/app122312163]
Abstract
Learning good data representations for medical imaging tasks ensures the preservation of relevant information and the removal of irrelevant information from the data to improve the interpretability of the learned features. In this paper, we propose a semi-supervised model, namely combine-all in semi-supervised learning (CqSL), to demonstrate the power of a simple combination of a disentanglement block, variational autoencoder (VAE), generative adversarial network (GAN), and a conditioning layer-based reconstructor for performing two important tasks in medical imaging: segmentation and reconstruction. Our work is motivated by recent progress in image segmentation using semi-supervised learning (SSL), which has shown good results with limited labeled data and large amounts of unlabeled data. A disentanglement block decomposes an input image into a domain-invariant spatial factor and a domain-specific non-spatial factor. We assume that medical images acquired using multiple scanners (different domain information) share a common spatial space but differ in non-spatial space (intensities, contrast, etc.). Hence, we utilize the spatial information to generate segmentation masks from unlabeled datasets using a generative adversarial network (GAN). Finally, to reconstruct the original image, our conditioning layer-based reconstruction block recombines spatial information with random non-spatial information sampled from the generative models. Our ablation study demonstrates the benefits of disentanglement in holding domain-invariant (spatial) as well as domain-specific (non-spatial) information with high accuracy. We further apply a structured L2 similarity (SL2SIM) loss along with a mutual information minimizer (MIM) to improve the adversarially trained generative models for better reconstruction. Experimental results achieved on the STACOM 2017 ACDC cine cardiac magnetic resonance (MR) dataset suggest that our proposed CqSL model outperforms fully supervised and semi-supervised models, achieving 83.2% accuracy even when using only 1% labeled data. We hypothesize that our proposed model has the potential to become an efficient semantic segmentation tool that may be used for domain adaptation in data-limited medical imaging scenarios, where annotations are expensive. Code and experimental configurations will be made publicly available.
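A conditioning layer-based reconstructor is commonly realized with FiLM-style feature modulation, in which the non-spatial code produces per-channel scale and shift parameters for the spatial features. The sketch below shows that general mechanism; whether CqSL uses exactly this layer is an assumption.

```python
# FiLM-style conditioning layer: recombine a spatial (domain-invariant)
# factor with a non-spatial (domain-specific) code for reconstruction.
import torch
import torch.nn as nn

class FiLM(nn.Module):
    def __init__(self, n_style, n_feat):
        super().__init__()
        self.to_gamma = nn.Linear(n_style, n_feat)
        self.to_beta = nn.Linear(n_style, n_feat)

    def forward(self, feat, style):
        """feat: (B, C, H, W) spatial features; style: (B, n_style) code."""
        gamma = self.to_gamma(style)[:, :, None, None]
        beta = self.to_beta(style)[:, :, None, None]
        return gamma * feat + beta   # modulate content with style

# Reconstruction: decoder(FiLM(spatial_factor, nonspatial_code)) ~ input image.
```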
Affiliation(s)
- S. M. Kamrul Hasan
- Center for Imaging Science, Rochester Institute of Technology, Rochester, NY 14623, USA
- Cristian A. Linte
- Center for Imaging Science, Rochester Institute of Technology, Rochester, NY 14623, USA
- Department of Biomedical Engineering, Rochester Institute of Technology, Rochester, NY 14623, USA
18. Liu Y, Carass A, Zuo L, He Y, Han S, Gregori L, Murray S, Mishra R, Lei J, Calabresi PA, Saidha S, Prince JL. Disentangled Representation Learning for OCTA Vessel Segmentation With Limited Training Data. IEEE Trans Med Imaging 2022;41:3686-3698. [PMID: 35862335] [PMCID: PMC9910788] [DOI: 10.1109/tmi.2022.3193029]
Abstract
Optical coherence tomography angiography (OCTA) is an imaging modality that can be used for analyzing retinal vasculature. Quantitative assessment of en face OCTA images requires accurate segmentation of the capillaries. Using deep learning approaches for this task faces two major challenges. First, acquiring sufficient manual delineations for training can take hundreds of hours. Second, OCTA images suffer from numerous contrast-related artifacts that are currently inherent to the modality and vary dramatically across scanners. We propose to solve both problems by learning a disentanglement of an anatomy component and a local contrast component from paired OCTA scans. With the contrast removed from the anatomy component, a deep learning model that takes the anatomy component as input can learn to segment vessels with a limited portion of the training images being manually labeled. Our method demonstrates state-of-the-art performance for OCTA vessel segmentation.
19. Painchaud N, Duchateau N, Bernard O, Jodoin PM. Echocardiography Segmentation With Enforced Temporal Consistency. IEEE Trans Med Imaging 2022;41:2867-2878. [PMID: 35533176] [DOI: 10.1109/tmi.2022.3173669]
Abstract
Convolutional neural networks (CNN) have demonstrated their ability to segment 2D cardiac ultrasound images. However, despite recent successes in which intra-observer variability on end-diastole and end-systole images has been reached, CNNs still struggle to leverage temporal information to provide accurate and temporally consistent segmentation maps across the whole cycle. Such consistency is required to accurately describe the cardiac function, a necessary step in diagnosing many cardiovascular diseases. In this paper, we propose a framework to learn the 2D+time apical long-axis cardiac shape such that segmented sequences can benefit from temporal and anatomical consistency constraints. Our method is a post-processing step that takes as input segmented echocardiographic sequences produced by any state-of-the-art method and processes them in two steps to (i) identify spatio-temporal inconsistencies according to the overall dynamics of the cardiac sequence and (ii) correct the inconsistencies. The identification and correction of cardiac inconsistencies rely on a constrained autoencoder trained to learn a physiologically interpretable embedding of cardiac shapes, in which we can both detect and fix anomalies. We tested our framework on 98 full-cycle sequences from the CAMUS dataset, which are available alongside this paper. Our temporal regularization method not only improves the accuracy of the segmentation across whole sequences, but also enforces temporal and anatomical consistency.
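The detect-and-fix loop over the latent trajectory can be sketched as follows. The outlier rule and linear interpolation used here are illustrative assumptions; the paper's actual identification and correction procedures in the constrained embedding are more involved.

```python
# Schematic post-processing: embed per-frame segmentations with a pretrained
# (constrained) autoencoder, flag temporal outliers in the latent trajectory,
# replace them by interpolation, and decode back to segmentation maps.
import torch

def smooth_sequence(encoder, decoder, seg_seq, z_thresh=3.0):
    """seg_seq: (T, C, H, W) per-frame segmentation maps of one cardiac cycle."""
    z = encoder(seg_seq)                    # (T, D) latent trajectory
    vel = (z[1:] - z[:-1]).norm(dim=1)      # (T-1,) frame-to-frame latent motion
    # Flag interior frames whose incoming AND outgoing jumps are abnormal.
    bad = torch.zeros(len(z), dtype=torch.bool)
    thresh = vel.mean() + z_thresh * vel.std()
    bad[1:-1] = (vel[:-1] > thresh) & (vel[1:] > thresh)
    # Replace flagged frames by linear interpolation of their neighbors.
    for t in torch.where(bad)[0]:
        z[t] = 0.5 * (z[t - 1] + z[t + 1])
    return decoder(z)                       # temporally consistent maps
```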
20. Zhan B, Zhou L, Li Z, Wu X, Pu Y, Zhou J, Wang Y, Shen D. D2FE-GAN: Decoupled dual feature extraction based GAN for MRI image synthesis. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109362]
21. Bercea CI, Wiestler B, Rueckert D, Albarqouni S. Federated disentangled representation learning for unsupervised brain anomaly detection. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00515-2]
22. Reaungamornrat S, Sari H, Catana C, Kamen A. Multimodal image synthesis based on disentanglement representations of anatomical and modality specific features, learned using uncooperative relativistic GAN. Med Image Anal 2022;80:102514. [PMID: 35717874] [PMCID: PMC9810205] [DOI: 10.1016/j.media.2022.102514]
Abstract
A growing number of methods for attenuation-coefficient map estimation from magnetic resonance (MR) images have recently been proposed because of the increasing interest in MR-guided radiotherapy and the introduction of positron emission tomography (PET) MR hybrid systems. We propose a deep-network ensemble incorporating stochastic-binary-anatomical encoders and imaging-modality variational autoencoders to disentangle image-latent spaces into a space of modality-invariant anatomical features and spaces of modality attributes. The ensemble integrates modality-modulated decoders to normalize features and image intensities based on imaging modality. Besides promoting disentanglement, the architecture fosters uncooperative learning, offering the ability to maintain anatomical structure in a cross-modality reconstruction. Introduction of a modality-invariant structural consistency constraint further enforces faithful embedding of anatomy. To improve training stability and the fidelity of synthesized modalities, the ensemble is trained in a relativistic generative adversarial framework incorporating multiscale discriminators. Analyses of priors and network architectures as well as performance validation were performed on computed tomography (CT) and MR pelvis datasets. The proposed method demonstrated robustness against intensity inhomogeneity, improved tissue-class differentiation, and offered synthetic CT in Hounsfield units with intensities consistent and smooth across slices compared to state-of-the-art approaches, achieving a median normalized mutual information of 1.28, normalized cross correlation of 0.97, and gradient cross correlation of 0.59 over 324 images.
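The relativistic adversarial framework refers to losses in which the discriminator scores real images relative to the average fake (and vice versa), rather than in absolute terms. A sketch of the standard relativistic average GAN losses follows; the multiscale discriminators and the rest of the ensemble are omitted for brevity.

```python
# Relativistic average GAN losses (Jolicoeur-Martineau style): the critic
# judges whether reals are more realistic than the average fake.
import torch
import torch.nn.functional as F

def ragan_d_loss(d_real, d_fake):
    """d_real, d_fake: raw discriminator logits on real/synthesized batches."""
    real_rel = d_real - d_fake.mean()
    fake_rel = d_fake - d_real.mean()
    return (F.binary_cross_entropy_with_logits(real_rel, torch.ones_like(real_rel))
            + F.binary_cross_entropy_with_logits(fake_rel, torch.zeros_like(fake_rel)))

def ragan_g_loss(d_real, d_fake):
    # The generator pushes fakes to look more realistic than the average real.
    real_rel = d_real - d_fake.mean()
    fake_rel = d_fake - d_real.mean()
    return (F.binary_cross_entropy_with_logits(fake_rel, torch.ones_like(fake_rel))
            + F.binary_cross_entropy_with_logits(real_rel, torch.zeros_like(real_rel)))
```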
Affiliation(s)
- Hasan Sari
- Harvard Medical School, Boston, MA 02115, USA
- Ali Kamen
- Siemens Healthineers, Digital Technology and Innovation, Princeton, NJ 08540, USA
23. Liu X, Sanchez P, Thermos S, O'Neil AQ, Tsaftaris SA. Learning disentangled representations in the imaging domain. Med Image Anal 2022;80:102516. [PMID: 35751992] [DOI: 10.1016/j.media.2022.102516]
Abstract
Disentangled representation learning has been proposed as an approach to learning general representations even in the absence of, or with limited, supervision. A good general representation can be fine-tuned for new target tasks using modest amounts of data, or used directly in unseen domains achieving remarkable performance in the corresponding task. This alleviation of the data and annotation requirements offers tantalising prospects for applications in computer vision and healthcare. In this tutorial paper, we motivate the need for disentangled representations, revisit key concepts, and describe practical building blocks and criteria for learning such representations. We survey applications in medical imaging emphasising choices made in exemplar key works, and then discuss links to computer vision applications. We conclude by presenting limitations, challenges, and opportunities.
Affiliation(s)
- Xiao Liu
- School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK.
- Pedro Sanchez
- School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK
- Spyridon Thermos
- School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK
- Alison Q O'Neil
- School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK; Canon Medical Research Europe, Edinburgh EH6 5NP, UK
- Sotirios A Tsaftaris
- School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK; The Alan Turing Institute, London NW1 2DB, UK
24
Barragán-Montero A, Bibal A, Dastarac MH, Draguet C, Valdés G, Nguyen D, Willems S, Vandewinckele L, Holmström M, Löfman F, Souris K, Sterpin E, Lee JA. Towards a safe and efficient clinical implementation of machine learning in radiation oncology by exploring model interpretability, explainability and data-model dependency. Phys Med Biol 2022; 67:10.1088/1361-6560/ac678a. [PMID: 35421855 PMCID: PMC9870296 DOI: 10.1088/1361-6560/ac678a] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2021] [Accepted: 04/14/2022] [Indexed: 01/26/2023]
Abstract
The interest in machine learning (ML) has grown tremendously in recent years, partly due to the performance leap that occurred with new techniques of deep learning, convolutional neural networks for images, increased computational power, and wider availability of large datasets. Most fields of medicine follow that popular trend and, notably, radiation oncology is one of those at the forefront, with already a long tradition of using digital images and fully computerized workflows. ML models are driven by data, and in contrast with many statistical or physical models, they can be very large and complex, with countless generic parameters. This inevitably raises two questions, namely, the tight dependence between the models and the datasets that feed them, and the interpretability of the models, which scales inversely with their complexity. Any problems in the data used to train the model will later be reflected in its performance. This, together with the low interpretability of ML models, makes their implementation into the clinical workflow particularly difficult. Building tools for risk assessment and quality assurance of ML models must therefore address two main points: interpretability and data-model dependency. After a joint introduction to both radiation oncology and ML, this paper reviews the main risks and current solutions when applying the latter to workflows in the former. Risks associated with data and models, as well as their interaction, are detailed. Next, the core concepts of interpretability, explainability, and data-model dependency are formally defined and illustrated with examples. Afterwards, a broad discussion goes through key applications of ML in workflows of radiation oncology as well as vendors' perspectives for the clinical implementation of ML.
Affiliation(s)
- Ana Barragán-Montero
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
- Adrien Bibal
- PReCISE, NaDI Institute, Faculty of Computer Science, UNamur and CENTAL, ILC, UCLouvain, Belgium
- Margerie Huet Dastarac
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
- Camille Draguet
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
- Department of Oncology, Laboratory of Experimental Radiotherapy, KU Leuven, Belgium
- Gilmer Valdés
- Department of Radiation Oncology, Department of Epidemiology and Biostatistics, University of California, San Francisco, United States of America
- Dan Nguyen
- Medical Artificial Intelligence and Automation (MAIA) Laboratory, Department of Radiation Oncology, UT Southwestern Medical Center, United States of America
- Siri Willems
- ESAT/PSI, KU Leuven, Belgium & MIRC, UZ Leuven, Belgium
- Kevin Souris
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
- Edmond Sterpin
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
- Department of Oncology, Laboratory of Experimental Radiotherapy, KU Leuven, Belgium
- John A Lee
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
25
Weine J, van Gorkum RJH, Stoeck CT, Vishnevskiy V, Kozerke S. Synthetically Trained Convolutional Neural Networks for Improved Tensor Estimation from Free-Breathing Cardiac DTI. Comput Med Imaging Graph 2022; 99:102075. [DOI: 10.1016/j.compmedimag.2022.102075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2021] [Revised: 03/15/2022] [Accepted: 05/05/2022] [Indexed: 10/18/2022]
26
Wang C, Yang G, Papanastasiou G. Unsupervised Image Registration towards Enhancing Performance and Explainability in Cardiac and Brain Image Analysis. Sensors (Basel) 2022; 22:2125. [PMID: 35336295 PMCID: PMC8951078 DOI: 10.3390/s22062125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/04/2022] [Revised: 03/01/2022] [Accepted: 03/07/2022] [Indexed: 02/04/2023]
Abstract
Magnetic Resonance Imaging (MRI) typically recruits multiple sequences (defined here as "modalities"). As each modality is designed to offer different anatomical and functional clinical information, there are evident disparities in the imaging content across modalities. Inter- and intra-modality affine and non-rigid image registration is an essential medical image analysis process in clinical imaging, for example before imaging biomarkers can be derived and clinically evaluated across different MRI modalities, time phases and slices. Although commonly needed in real clinical scenarios, affine and non-rigid image registration has not been extensively investigated using a single unsupervised model architecture. In our work, we present an unsupervised deep learning registration methodology that can accurately model affine and non-rigid transformations simultaneously. Moreover, inverse-consistency is a fundamental inter-modality registration property that is rarely considered in deep learning registration algorithms. To address inverse consistency, our methodology performs bi-directional cross-modality image synthesis to learn modality-invariant latent representations, and involves two factorised transformation networks (one per encoder-decoder channel) and an inverse-consistency loss to learn topology-preserving anatomical transformations. Overall, our model (named "FIRE") shows improved performance against the reference standard baseline method (i.e., Symmetric Normalization implemented using the ANTs toolbox) on multi-modality brain 2D and 3D MRI and intra-modality cardiac 4D MRI data experiments. We focus on explaining model-data components to enhance model explainability in medical image registration. In computational time experiments, we show that the FIRE model operates in a memory-saving mode, as it can inherently learn topology-preserving image registration directly in the training phase. We therefore demonstrate an efficient and versatile registration technique that can have merit in multi-modal image registration in the clinical setting.
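The inverse-consistency property can be made concrete: composing the forward (A to B) displacement field with the warped backward (B to A) field should approximate the identity. The PyTorch sketch below illustrates one common form of such a penalty; it is an assumption-laden illustration, not the FIRE implementation.

```python
import torch
import torch.nn.functional as F

def warp(img, flow):
    # Warp `img` (N,C,H,W) by displacements `flow` (N,2,H,W) expressed in
    # normalized [-1, 1] coordinates, via bilinear grid sampling.
    n, _, h, w = img.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    base = torch.stack((xs, ys), dim=-1).unsqueeze(0).to(img)  # (1,H,W,2)
    grid = base + flow.permute(0, 2, 3, 1)
    return F.grid_sample(img, grid, align_corners=True)

def inverse_consistency_loss(flow_ab, flow_ba):
    # The A->B field composed with the warped B->A field should cancel out.
    composed = flow_ab + warp(flow_ba, flow_ab)
    return composed.abs().mean()
```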
Affiliation(s)
- Chengjia Wang
- Edinburgh Imaging Facility QMRI, Centre for Cardiovascular Science, University of Edinburgh, Edinburgh EH16 4TJ, UK
- Guang Yang
- Faculty of Medicine, National Heart & Lung Institute, Imperial College London, London SW7 2BX, UK
- Giorgos Papanastasiou
- Edinburgh Imaging Facility QMRI, Centre for Cardiovascular Science, University of Edinburgh, Edinburgh EH16 4TJ, UK
- School of Computer Science and Electronic Engineering, University of Essex, Colchester CO4 3SQ, UK
27
GAN-based disentanglement learning for chest X-ray rib suppression. Med Image Anal 2022; 77:102369. [DOI: 10.1016/j.media.2022.102369] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2021] [Revised: 12/09/2021] [Accepted: 01/10/2022] [Indexed: 11/19/2022]
28
CyCMIS: Cycle-consistent Cross-domain Medical Image Segmentation via diverse image augmentation. Med Image Anal 2021; 76:102328. [PMID: 34920236 DOI: 10.1016/j.media.2021.102328] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2021] [Revised: 11/15/2021] [Accepted: 12/01/2021] [Indexed: 01/26/2023]
Abstract
Domain shift, a phenomenon in which there is a distribution discrepancy between the training dataset (source domain) and the test dataset (target domain), is very common in practical applications and may cause significant performance degradation, which hinders the effective deployment of deep learning models in clinical settings. Adaptation algorithms that improve model generalizability from the source domain to the target domain therefore have significant practical value. In this paper, we investigate an unsupervised domain adaptation (UDA) technique to train a cross-domain segmentation method that is robust to domain shift and does not require any annotations on the test domain. To this end, we propose Cycle-consistent Cross-domain Medical Image Segmentation, referred to as CyCMIS, integrating online diverse image translation via disentangled representation learning and semantic consistency regularization into one network. Different from learning a one-to-one mapping, our method characterizes the complex relationship between domains as a many-to-many mapping. A novel diverse inter-domain semantic consistency loss is then proposed to regularize the cross-domain segmentation process. We additionally introduce an intra-domain semantic consistency loss to encourage segmentation consistency between the original input and the image after cross-cycle reconstruction. We conduct comprehensive experiments on two publicly available datasets to evaluate the effectiveness of the proposed method. Results demonstrate the efficacy of the present approach.
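One plausible reading of such a semantic consistency term is a Dice-style agreement penalty between the segmentation of an input and the segmentation of its cross-domain translation. The sketch below shows that generic form (PyTorch, illustrative names, not the CyCMIS code).

```python
import torch
import torch.nn.functional as F

def semantic_consistency_loss(seg_model, x, x_translated, eps=1e-6):
    # Soft Dice agreement between the predictions for the source image
    # and for its translation into the other domain.
    p = torch.softmax(seg_model(x), dim=1)
    q = torch.softmax(seg_model(x_translated), dim=1)
    inter = (p * q).sum(dim=(2, 3))
    denom = p.sum(dim=(2, 3)) + q.sum(dim=(2, 3))
    dice = (2 * inter + eps) / (denom + eps)
    return 1 - dice.mean()
```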
29
Campello VM, Gkontra P, Izquierdo C, Martin-Isla C, Sojoudi A, Full PM, Maier-Hein K, Zhang Y, He Z, Ma J, Parreno M, Albiol A, Kong F, Shadden SC, Acero JC, Sundaresan V, Saber M, Elattar M, Li H, Menze B, Khader F, Haarburger C, Scannell CM, Veta M, Carscadden A, Punithakumar K, Liu X, Tsaftaris SA, Huang X, Yang X, Li L, Zhuang X, Vilades D, Descalzo ML, Guala A, Mura LL, Friedrich MG, Garg R, Lebel J, Henriques F, Karakas M, Cavus E, Petersen SE, Escalera S, Segui S, Rodriguez-Palomares JF, Lekadir K. Multi-Centre, Multi-Vendor and Multi-Disease Cardiac Segmentation: The M&Ms Challenge. IEEE Trans Med Imaging 2021; 40:3543-3554. [PMID: 34138702 DOI: 10.1109/tmi.2021.3090082] [Citation(s) in RCA: 94] [Impact Index Per Article: 31.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
The emergence of deep learning has considerably advanced the state-of-the-art in cardiac magnetic resonance (CMR) segmentation. Many techniques have been proposed over the last few years, bringing the accuracy of automated segmentation close to human performance. However, these models have all too often been trained and validated using cardiac imaging samples from single clinical centres or homogeneous imaging protocols. This has prevented the development and validation of models that are generalizable across different clinical centres, imaging conditions or scanner vendors. To promote further research and scientific benchmarking in the field of generalizable deep learning for cardiac segmentation, this paper presents the results of the Multi-Centre, Multi-Vendor and Multi-Disease Cardiac Segmentation (M&Ms) Challenge, which was recently organized as part of the MICCAI 2020 Conference. A total of 14 teams submitted different solutions to the problem, combining various baseline models, data augmentation strategies, and domain adaptation techniques. The obtained results indicate the importance of intensity-driven data augmentation, as well as the need for further research to improve generalizability towards unseen scanner vendors or new imaging protocols. Furthermore, we present a new resource of 375 heterogeneous CMR datasets acquired using scanners from four different vendors in six hospitals and three different countries (Spain, Canada and Germany), which we provide as open access for the community to enable future research in the field.
30
Zuo L, Dewey BE, Liu Y, He Y, Newsome SD, Mowry EM, Resnick SM, Prince JL, Carass A. Unsupervised MR harmonization by learning disentangled representations using information bottleneck theory. Neuroimage 2021; 243:118569. [PMID: 34506916 PMCID: PMC10473284 DOI: 10.1016/j.neuroimage.2021.118569] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2021] [Revised: 08/11/2021] [Accepted: 09/07/2021] [Indexed: 01/19/2023]
Abstract
In magnetic resonance (MR) imaging, a lack of standardization in acquisition often causes pulse sequence-based contrast variations in MR images from site to site, which impedes consistent measurements in automatic analyses. In this paper, we propose an unsupervised MR image harmonization approach, CALAMITI (Contrast Anatomy Learning and Analysis for MR Intensity Translation and Integration), which aims to alleviate contrast variations in multi-site MR imaging. Designed using information bottleneck theory, CALAMITI learns a globally disentangled latent space containing both anatomical and contrast information, which permits harmonization. In contrast to supervised harmonization methods, our approach does not need a sample population to be imaged across sites. Unlike traditional unsupervised harmonization approaches which often suffer from geometry shifts, CALAMITI better preserves anatomy by design. The proposed method is also able to adapt to a new testing site with a straightforward fine-tuning process. Experiments on MR images acquired from ten sites show that CALAMITI achieves superior performance compared with other harmonization approaches.
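Information-bottleneck-style objectives of this kind are often written as a reconstruction term plus a weighted KL penalty that limits how much the contrast code can carry. The snippet below is a generic sketch under that assumption, not the CALAMITI objective.

```python
import torch

def bottleneck_loss(x, x_recon, mu, logvar, beta=0.1):
    # Reconstruction fidelity plus a beta-weighted KL term that squeezes
    # the (contrast) code toward a standard normal prior.
    recon = (x - x_recon).pow(2).mean()
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl
```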
Affiliation(s)
- Lianrui Zuo
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD 21218, USA; Laboratory of Behavioral Neuroscience, National Institute on Aging, National Institutes of Health, Baltimore, MD 20892, USA.
- Blake E Dewey
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD 21218, USA
- Yihao Liu
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD 21218, USA
- Yufan He
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD 21218, USA
- Scott D Newsome
- Department of Neurology, The Johns Hopkins School of Medicine, Baltimore, MD 21287, USA
- Ellen M Mowry
- Department of Neurology, The Johns Hopkins School of Medicine, Baltimore, MD 21287, USA
- Susan M Resnick
- Laboratory of Behavioral Neuroscience, National Institute on Aging, National Institutes of Health, Baltimore, MD 20892, USA
- Jerry L Prince
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD 21218, USA
- Aaron Carass
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD 21218, USA
31
Xing F, Cornish TC, Bennett TD, Ghosh D. Bidirectional Mapping-Based Domain Adaptation for Nucleus Detection in Cross-Modality Microscopy Images. IEEE Trans Med Imaging 2021; 40:2880-2896. [PMID: 33284750 PMCID: PMC8543886 DOI: 10.1109/tmi.2020.3042789] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Cell or nucleus detection is a fundamental task in microscopy image analysis and has recently achieved state-of-the-art performance by using deep neural networks. However, training supervised deep models such as convolutional neural networks (CNNs) usually requires sufficient annotated image data, which is prohibitively expensive or unavailable in some applications. Additionally, when applying a CNN to new datasets, it is common to annotate individual cells/nuclei in those target datasets for model re-learning, leading to inefficient and low-throughput image analysis. To tackle these problems, we present a bidirectional, adversarial domain adaptation method for nucleus detection on cross-modality microscopy image data. Specifically, the method learns a deep regression model for individual nucleus detection with both source-to-target and target-to-source image translation. In addition, we explicitly extend this unsupervised domain adaptation method to a semi-supervised learning situation and further boost the nucleus detection performance. We evaluate the proposed method on three cross-modality microscopy image datasets, which cover a wide variety of microscopy imaging protocols or modalities, and obtain a significant improvement in nucleus detection compared to reference baseline approaches. In addition, our semi-supervised method is very competitive with recent fully supervised learning models trained with all real target training labels.
32
Decomposing normal and abnormal features of medical images for content-based image retrieval of glioma imaging. Med Image Anal 2021; 74:102227. [PMID: 34543911 DOI: 10.1016/j.media.2021.102227] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2021] [Revised: 09/05/2021] [Accepted: 09/06/2021] [Indexed: 11/20/2022]
Abstract
In medical imaging, the characteristics purely derived from a disease should reflect the extent to which abnormal findings deviate from normal features. Indeed, physicians often need corresponding images without abnormal findings of interest or, conversely, images that contain similar abnormal findings regardless of normal anatomical context. This is called comparative diagnostic reading of medical images, which is essential for a correct diagnosis. To support comparative diagnostic reading, content-based image retrieval (CBIR) that can selectively utilize normal and abnormal features in medical images as two separable semantic components will be useful. In this study, we propose a neural network architecture to decompose the semantic components of medical images into two latent codes: a normal anatomy code and an abnormal anatomy code. The normal anatomy code represents the counterfactual normal anatomy that should have existed if the sample were healthy, whereas the abnormal anatomy code captures abnormal changes that reflect deviation from the normal baseline. By calculating the similarity based on either the normal or abnormal anatomy code or the combination of the two, our algorithm can retrieve images according to the selected semantic component from a dataset consisting of brain magnetic resonance images of gliomas. Moreover, it can utilize a synthetic query vector combining normal and abnormal anatomy codes from two different query images. To evaluate whether the retrieved images are acquired according to the targeted semantic component, the overlap of the ground-truth labels is calculated as a metric of semantic consistency. Our algorithm provides a flexible CBIR framework with remarkable qualitative and quantitative results.
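Once the codes are decomposed, retrieval reduces to similarity search in whichever latent space is selected; a minimal sketch (hypothetical function, PyTorch) follows. A combined query could be formed by concatenating a normal code from one image with an abnormal code from another.

```python
import torch
import torch.nn.functional as F

def retrieve(query_code, db_codes, k=5):
    # Rank a database of codes (N,D) by cosine similarity to a query (D,),
    # where the codes may be normal, abnormal, or concatenated variants.
    sims = F.cosine_similarity(query_code.unsqueeze(0), db_codes, dim=1)
    return sims.topk(k).indices
```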
33
Wu Y, Tang Z, Li B, Firmin D, Yang G. Recent Advances in Fibrosis and Scar Segmentation From Cardiac MRI: A State-of-the-Art Review and Future Perspectives. Front Physiol 2021; 12:709230. [PMID: 34413789 PMCID: PMC8369509 DOI: 10.3389/fphys.2021.709230] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2021] [Accepted: 06/28/2021] [Indexed: 12/03/2022]
Abstract
Segmentation of cardiac fibrosis and scars is essential for clinical diagnosis and can provide invaluable guidance for the treatment of cardiac diseases. Late gadolinium enhancement (LGE) cardiovascular magnetic resonance (CMR) has reliably guided clinical diagnosis and treatment. For LGE CMR, many methods have demonstrated success in accurately segmenting scarring regions. Co-registration with other non-contrast-agent (non-CA) modalities [e.g., balanced steady-state free precession (bSSFP) cine magnetic resonance imaging (MRI)] can further enhance the efficacy of automated segmentation of cardiac anatomies. Many conventional methods have been proposed to provide automated or semi-automated segmentation of scars. With the development of deep learning in recent years, more advanced methods have emerged that provide more accurate and efficient segmentations. This paper conducts a state-of-the-art review of conventional and current state-of-the-art approaches utilizing different modalities for accurate cardiac fibrosis and scar segmentation.
Affiliation(s)
- Yinzhe Wu
- National Heart and Lung Institute, Faculty of Medicine, Imperial College London, London, United Kingdom; Department of Bioengineering, Faculty of Engineering, Imperial College London, London, United Kingdom
- Zeyu Tang
- National Heart and Lung Institute, Faculty of Medicine, Imperial College London, London, United Kingdom; Department of Bioengineering, Faculty of Engineering, Imperial College London, London, United Kingdom
- Binghuan Li
- Department of Bioengineering, Faculty of Engineering, Imperial College London, London, United Kingdom
- David Firmin
- National Heart and Lung Institute, Faculty of Medicine, Imperial College London, London, United Kingdom; Cardiovascular Biomedical Research Unit, Royal Brompton Hospital, London, United Kingdom
- Guang Yang
- National Heart and Lung Institute, Faculty of Medicine, Imperial College London, London, United Kingdom; Cardiovascular Biomedical Research Unit, Royal Brompton Hospital, London, United Kingdom
34
Artificial Intelligence in Computer Vision: Cardiac MRI and Multimodality Imaging Segmentation. Curr Cardiovasc Risk Rep 2021; 15. [PMID: 35693045 PMCID: PMC9187294 DOI: 10.1007/s12170-021-00678-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]
Abstract
Purpose of Review: Anatomical segmentation has played a major role within clinical cardiology. Novel techniques through artificial intelligence-based computer vision have revolutionized this process through both automation and novel applications. This review discusses the history and clinical context of cardiac segmentation to provide a framework for a survey of recent manuscripts in artificial intelligence and cardiac segmentation. We aim to clarify for the reader the clinical question of "Why do we segment?" in order to understand the question of "Where is current research and where should it be?" Recent Findings: There has been increasing research in cardiac segmentation in recent years. Segmentation models are most frequently based on a U-Net structure. Multiple innovations have been added in terms of pre-processing or connection to analysis pipelines. Cardiac MRI is the most frequently segmented modality, due in part to the presence of publicly available, moderately sized, computer vision competition datasets. Further progress in data availability, model explanation, and clinical integration is being pursued. Summary: The task of cardiac anatomical segmentation has experienced massive strides forward within the past five years due to convolutional neural networks. These advances provide a basis for streamlining image analysis, and a foundation for further analysis by both computer and human systems. While technical advances are clear, clinical benefit remains nascent. Novel approaches may improve measurement precision by decreasing inter-reader variability and appear to also have the potential for larger-reaching effects in the future within integrated analysis pipelines.
35
Valvano G, Leo A, Tsaftaris SA. Learning to Segment From Scribbles Using Multi-Scale Adversarial Attention Gates. IEEE Trans Med Imaging 2021; 40:1990-2001. [PMID: 33784616 DOI: 10.1109/tmi.2021.3069634] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Large, fine-grained image segmentation datasets, annotated at pixel-level, are difficult to obtain, particularly in medical imaging, where annotations also require expert knowledge. Weakly-supervised learning can train models by relying on weaker forms of annotation, such as scribbles. Here, we learn to segment using scribble annotations in an adversarial game. With unpaired segmentation masks, we train a multi-scale GAN to generate realistic segmentation masks at multiple resolutions, while we use scribbles to learn their correct position in the image. Central to the model's success is a novel attention gating mechanism, which we condition with adversarial signals to act as a shape prior, resulting in better object localization at multiple scales. Subject to adversarial conditioning, the segmentor learns attention maps that are semantic, suppress the noisy activations outside the objects, and reduce the vanishing gradient problem in the deeper layers of the segmentor. We evaluated our model on several medical (ACDC, LVSC, CHAOS) and non-medical (PPSS) datasets, and we report performance levels matching those achieved by models trained with fully annotated segmentation masks. We also demonstrate extensions in a variety of settings: semi-supervised learning; combining multiple scribble sources (a crowdsourcing scenario) and multi-task learning (combining scribble and mask supervision). We release expert-made scribble annotations for the ACDC dataset, and the code used for the experiments, at https://vios-s.github.io/multiscale-adversarial-attention-gates.
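In simplified form, an attention gate of this kind predicts a soft mask from intermediate features and uses it to suppress activations outside the objects. The module below is a stripped-down illustration of the pattern, not the released code.

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    def __init__(self, in_ch, n_classes):
        super().__init__()
        self.to_mask = nn.Conv2d(in_ch, n_classes, 1)

    def forward(self, feat):
        mask = torch.softmax(self.to_mask(feat), dim=1)
        # Suppress activations where the background class dominates
        # (channel 0 is assumed to be background).
        attention = 1.0 - mask[:, :1]
        return feat * attention, mask
```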
36
Sermesant M, Delingette H, Cochet H, Jaïs P, Ayache N. Applications of artificial intelligence in cardiovascular imaging. Nat Rev Cardiol 2021; 18:600-609. [PMID: 33712806 DOI: 10.1038/s41569-021-00527-2] [Citation(s) in RCA: 55] [Impact Index Per Article: 18.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 02/08/2021] [Indexed: 01/31/2023]
Abstract
Research into artificial intelligence (AI) has made tremendous progress over the past decade. In particular, the AI-powered analysis of images and signals has reached human-level performance in many applications owing to the efficiency of modern machine learning methods, in particular deep learning using convolutional neural networks. Research into the application of AI to medical imaging is now very active, especially in the field of cardiovascular imaging because of the challenges associated with acquiring and analysing images of this dynamic organ. In this Review, we discuss the clinical questions in cardiovascular imaging that AI can be used to address and the principal methodological AI approaches that have been developed to solve the related image analysis problems. Some approaches are purely data-driven and rely mainly on statistical associations, whereas others integrate anatomical and physiological information through additional statistical, geometric and biophysical models of the human heart. In a structured manner, we provide representative examples of each of these approaches, with particular attention to the underlying computational imaging challenges. Finally, we discuss the remaining limitations of AI approaches in cardiovascular imaging (such as generalizability and explainability) and how they can be overcome.
Affiliation(s)
- Hubert Cochet
- IHU Liryc, CHU Bordeaux, Université Bordeaux, Inserm 1045, Pessac, France
- Pierre Jaïs
- IHU Liryc, CHU Bordeaux, Université Bordeaux, Inserm 1045, Pessac, France
37
Guo S, Xu L, Feng C, Xiong H, Gao Z, Zhang H. Multi-level semantic adaptation for few-shot segmentation on cardiac image sequences. Med Image Anal 2021; 73:102170. [PMID: 34380105 DOI: 10.1016/j.media.2021.102170] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2021] [Revised: 06/04/2021] [Accepted: 07/12/2021] [Indexed: 01/01/2023]
Abstract
Obtaining manual labels is time-consuming and labor-intensive on cardiac image sequences. Few-shot segmentation can utilize limited labels to learn new tasks. However, it suffers from two challenges: spatial-temporal distribution bias and long-term information bias. These challenges derive from the impact of the time dimension on cardiac image sequences, resulting in serious over-adaptation. In this paper, we propose the multi-level semantic adaptation (MSA) for few-shot segmentation on cardiac image sequences. The MSA addresses the two biases by exploring the domain adaptation and the weight adaptation on the semantic features in multiple levels, including sequence-level, frame-level, and pixel-level. First, the MSA proposes the dual-level feature adjustment for domain adaptation in spatial and temporal directions. This adjustment explicitly aligns the frame-level feature and the sequence-level feature to improve the model adaptation on diverse modalities. Second, the MSA explores the hierarchical attention metric for weight adaptation in the frame-level feature and the pixel-level feature. This metric focuses on the similar frame and the target region to promote the model discrimination on the border features. The extensive experiments demonstrate that our MSA is effective in few-shot segmentation on cardiac image sequences with three modalities, i.e. MR, CT, and Echo (e.g. the average Dice is 0.9243), as well as superior to the ten state-of-the-art methods.
Affiliation(s)
- Saidi Guo
- School of Biomedical Engineering, Sun Yat-sen University, China
- Lin Xu
- General Hospital of the Southern Theatre Command, PLA, Guangdong, China; The First School of Clinical Medicine, Southern Medical University, Guangdong, China
- Cheng Feng
- Department of Ultrasound, The Third People's Hospital of Shenzhen, Guangdong, China
- Huahua Xiong
- Department of Ultrasound, The First Affiliated Hospital of Shenzhen University, Shenzhen Second People's Hospital, Guangdong, China
- Zhifan Gao
- School of Biomedical Engineering, Sun Yat-sen University, China
- Heye Zhang
- School of Biomedical Engineering, Sun Yat-sen University, China
38
Vesal S, Gu M, Maier A, Ravikumar N. Spatio-Temporal Multi-Task Learning for Cardiac MRI Left Ventricle Quantification. IEEE J Biomed Health Inform 2021; 25:2698-2709. [PMID: 33351771 DOI: 10.1109/jbhi.2020.3046449] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Quantitative assessment of cardiac left ventricle (LV) morphology is essential to assess cardiac function and improve the diagnosis of different cardiovascular diseases. In current clinical practice, LV quantification depends on the measurement of myocardial shape indices, which is usually achieved by manual contouring of the endo- and epicardial borders. However, this process is subject to inter- and intra-observer variability, and it is a time-consuming and tedious task. In this article, we propose a spatio-temporal multi-task learning approach to obtain a complete set of measurements quantifying cardiac LV morphology and regional wall thickness (RWT), while additionally detecting the cardiac phase (systole and diastole) for a given 3D cine magnetic resonance (MR) image sequence. We first segment cardiac LVs using an encoder-decoder network and then introduce a multi-task framework to regress 11 LV indices and classify the cardiac phase as parallel tasks during model optimization. The proposed deep learning model is based on 3D spatio-temporal convolutions, which extract spatial and temporal features from MR images. We demonstrate the efficacy of the proposed method using cine-MR sequences of 145 subjects and compare its performance with other state-of-the-art quantification methods. The proposed method obtained high prediction accuracy, with an average mean absolute error (MAE) of 129 mm², 1.23 mm, and 1.76 mm and a Pearson correlation coefficient (PCC) of 96.4%, 87.2%, and 97.5% for the LV and myocardium (Myo) cavity regions, the 6 RWTs, and the 3 LV dimensions, respectively, and an error rate of 9.0% for phase classification. The experimental results highlight the robustness of the proposed method despite varying degrees of cardiac morphology, image appearance, and low contrast in the cardiac MR sequences.
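The joint objective can be summarised as a regression loss over the LV indices plus a classification loss for the cardiac phase; the sketch below shows that combination under simple assumptions (L1 regression, cross-entropy classification), not the authors' exact weighting.

```python
import torch.nn.functional as F

def multitask_loss(pred_indices, true_indices, phase_logits, phase_labels, w=1.0):
    # L1 regression over the 11 LV indices plus cross-entropy for the
    # systole/diastole phase, optimized as parallel tasks.
    reg = F.l1_loss(pred_indices, true_indices)
    cls = F.cross_entropy(phase_logits, phase_labels)
    return reg + w * cls
```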
39
Xia T, Chartsias A, Wang C, Tsaftaris SA. Learning to synthesise the ageing brain without longitudinal data. Med Image Anal 2021; 73:102169. [PMID: 34311421 DOI: 10.1016/j.media.2021.102169] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2020] [Revised: 07/01/2021] [Accepted: 07/09/2021] [Indexed: 12/21/2022]
Abstract
How will my face look when I get older? Or, for a more challenging question: How will my brain look when I get older? To answer this question one must devise (and learn from data) a multivariate auto-regressive function which given an image and a desired target age generates an output image. While collecting data for faces may be easier, collecting longitudinal brain data is not trivial. We propose a deep learning-based method that learns to simulate subject-specific brain ageing trajectories without relying on longitudinal data. Our method synthesises images conditioned on two factors: age (a continuous variable), and status of Alzheimer's Disease (AD, an ordinal variable). With an adversarial formulation we learn the joint distribution of brain appearance, age and AD status, and define reconstruction losses to address the challenging problem of preserving subject identity. We compare with several benchmarks using two widely used datasets. We evaluate the quality and realism of synthesised images using ground-truth longitudinal data and a pre-trained age predictor. We show that, despite the use of cross-sectional data, our model learns patterns of gray matter atrophy in the middle temporal gyrus in patients with AD. To demonstrate generalisation ability, we train on one dataset and evaluate predictions on the other. In conclusion, our model shows an ability to separate age, disease influence and anatomy using only 2D cross-sectional data that should be useful in large studies into neurodegenerative disease, that aim to combine several data sources. To facilitate such future studies by the community at large our code is made available at https://github.com/xiat0616/BrainAgeing.
Affiliation(s)
- Tian Xia
- Institute for Digital Communications, School of Engineering, University of Edinburgh, West Mains Rd, Edinburgh EH9 3FB, UK.
- Agisilaos Chartsias
- Institute for Digital Communications, School of Engineering, University of Edinburgh, West Mains Rd, Edinburgh EH9 3FB, UK
- Chengjia Wang
- The BHF Centre for Cardiovascular Science, Edinburgh EH16 4TJ, UK
- Sotirios A Tsaftaris
- Institute for Digital Communications, School of Engineering, University of Edinburgh, West Mains Rd, Edinburgh EH9 3FB, UK; The Alan Turing Institute, London NW1 2DB, UK
40
Cheng J, Gao M, Liu J, Yue H, Kuang H, Liu J, Wang J. Multimodal Disentangled Variational Autoencoder with Game Theoretic Interpretability for Glioma grading. IEEE J Biomed Health Inform 2021; 26:673-684. [PMID: 34236971 DOI: 10.1109/jbhi.2021.3095476] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
Effective fusion of multimodal magnetic resonance imaging (MRI) is of great significance to boost the accuracy of glioma grading thanks to the complementary information provided by different imaging modalities. However, how to extract the common and distinctive information from MRI to achieve complementarity is still an open problem in information fusion research. In this study, we propose a deep neural network model termed as multimodal disentangled variational autoencoder (MMD-VAE) for glioma grading based on radiomics features extracted from preoperative multimodal MRI images. Specifically, the radiomics features are quantized and extracted from the region of interest for each modality. Then, the latent representations of variational autoencoder for these features are disentangled into common and distinctive representations to obtain the shared and complementary data among modalities. Afterward, cross-modality reconstruction loss and common-distinctive loss are designed to ensure the effectiveness of the disentangled representations. Finally, the disentangled common and distinctive representations are fused to predict the glioma grades, and SHapley Additive exPlanations (SHAP) is adopted to quantitatively interpret and analyze the contribution of the important features to grading. Experimental results on two benchmark datasets demonstrate that the proposed MMD-VAE model achieves encouraging predictive performance (AUC:0.9939) on a public dataset, and good generalization performance (AUC:0.9611) on a cross-institutional private dataset. These quantitative results and interpretations may help radiologists understand gliomas better and make better treatment decisions for improving clinical outcomes.
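A common-distinctive constraint of the kind described can be sketched as pulling the common codes of two modalities together while pushing their distinctive codes apart up to a margin; the snippet below is one illustrative form, not the MMD-VAE code.

```python
import torch.nn.functional as F

def common_distinctive_loss(c1, c2, d1, d2, margin=1.0):
    # Common codes (N,D) should agree across modalities; distinctive codes
    # should stay at least `margin` apart.
    pull = F.mse_loss(c1, c2)
    push = F.relu(margin - (d1 - d2).norm(dim=1)).mean()
    return pull + push
```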
41
Kläser K, Varsavsky T, Markiewicz P, Vercauteren T, Hammers A, Atkinson D, Thielemans K, Hutton B, Cardoso MJ, Ourselin S. Imitation learning for improved 3D PET/MR attenuation correction. Med Image Anal 2021; 71:102079. [PMID: 33951598 PMCID: PMC7611431 DOI: 10.1016/j.media.2021.102079] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2020] [Revised: 04/01/2021] [Accepted: 04/06/2021] [Indexed: 12/24/2022]
Abstract
The quality of synthesised/pseudo computed tomography (pCT) images is commonly measured by intensity-wise similarity between the ground-truth CT and the pCT. However, when using the pCT as an attenuation map (μ-map) for PET reconstruction in positron emission tomography/magnetic resonance imaging (PET/MRI), minimising the error between pCT and CT neglects the main objective: predicting a pCT that, when used as a μ-map, reconstructs a pseudo PET (pPET) as similar as possible to the gold-standard CT-derived PET reconstruction. This observation motivated us to propose a novel multi-hypothesis deep learning framework explicitly aimed at the PET reconstruction application. A convolutional neural network (CNN) synthesises pCTs by minimising a combination of the pixel-wise error between pCT and CT and a novel metric-loss, itself defined by a CNN, that aims to minimise the consequent PET residuals. Training is performed on a database of twenty 3D MR/CT/PET brain image pairs. Quantitative results on a fully independent dataset of twenty-three 3D MR/CT/PET image pairs show that the network synthesises more accurate pCTs, with a mean absolute error of 110.98 ± 19.22 HU compared to a baseline CNN (172.12 ± 19.61 HU) and a multi-atlas propagation approach (153.40 ± 18.68 HU), and subsequently leads to a significant improvement in the PET reconstruction error (4.74% ± 1.52%, compared to 13.72% ± 2.48% for the baseline and 6.68% ± 2.06% for multi-atlas propagation).
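The combined objective can be sketched as a pixel-wise term plus a learned metric term, where a second network (here called metric_net, an assumption) scores the PET residual a candidate pCT would induce:

```python
def pct_synthesis_loss(pct, ct, metric_net, weight=1.0):
    # Pixel-wise fidelity plus a CNN-defined "metric loss" approximating
    # the consequent PET reconstruction error (metric_net is hypothetical).
    pixel_term = (pct - ct).abs().mean()
    metric_term = metric_net(pct, ct).mean()
    return pixel_term + weight * metric_term
```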
Affiliation(s)
- Kerstin Kläser
- Department of Medical Physics & Biomedical Engineering, University College London, London WC1E 6BT, UK; School of Biomedical Engineering & Imaging Sciences, King's College London, London SE1 7EH, UK.
- Thomas Varsavsky
- Department of Medical Physics & Biomedical Engineering, University College London, London WC1E 6BT, UK; School of Biomedical Engineering & Imaging Sciences, King's College London, London SE1 7EH, UK
- Pawel Markiewicz
- Department of Medical Physics & Biomedical Engineering, University College London, London WC1E 6BT, UK; School of Biomedical Engineering & Imaging Sciences, King's College London, London SE1 7EH, UK
- Tom Vercauteren
- School of Biomedical Engineering & Imaging Sciences, King's College London, London SE1 7EH, UK
- Alexander Hammers
- School of Biomedical Engineering & Imaging Sciences, King's College London, London SE1 7EH, UK; King's College London & GSTT PET Centre, St Thomas' Hospital, London, UK
- David Atkinson
- Centre for Medical Imaging, University College London, London W1W 7TS, UK
- Kris Thielemans
- Institute of Nuclear Medicine, University College London, London NW1 2BU, UK
- Brian Hutton
- Institute of Nuclear Medicine, University College London, London NW1 2BU, UK
- M J Cardoso
- School of Biomedical Engineering & Imaging Sciences, King's College London, London SE1 7EH, UK
- Sébastien Ourselin
- School of Biomedical Engineering & Imaging Sciences, King's College London, London SE1 7EH, UK
42
Ren M, Dey N, Fishbaugh J, Gerig G. Segmentation-Renormalized Deep Feature Modulation for Unpaired Image Harmonization. IEEE Trans Med Imaging 2021; 40:1519-1530. [PMID: 33591913 PMCID: PMC8294062 DOI: 10.1109/tmi.2021.3059726] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/03/2023]
Abstract
Deep networks are now ubiquitous in large-scale multi-center imaging studies. However, the direct aggregation of images across sites is contraindicated for downstream statistical and deep learning-based image analysis due to inconsistent contrast, resolution, and noise. To this end, in the absence of paired data, variations of Cycle-consistent Generative Adversarial Networks have been used to harmonize image sets between a source and target domain. Importantly, these methods are prone to instability, contrast inversion, intractable manipulation of pathology, and steganographic mappings which limit their reliable adoption in real-world medical imaging. In this work, based on an underlying assumption that morphological shape is consistent across imaging sites, we propose a segmentation-renormalized image translation framework to reduce inter-scanner heterogeneity while preserving anatomical layout. We replace the affine transformations used in the normalization layers within generative networks with trainable scale and shift parameters conditioned on jointly learned anatomical segmentation embeddings to modulate features at every level of translation. We evaluate our methodologies against recent baselines across several imaging modalities (T1w MRI, FLAIR MRI, and OCT) on datasets with and without lesions. Segmentation-renormalization for translation GANs yields superior image harmonization as quantified by Inception distances, demonstrates improved downstream utility via post-hoc segmentation accuracy, and improved robustness to translation perturbation and self-adversarial attacks.
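The renormalization idea is closely related to SPADE-style conditional normalization, where per-pixel scale and shift maps are predicted from a segmentation embedding rather than learned as fixed affine parameters. The module below is a generic sketch of that pattern (PyTorch, illustrative sizes), not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SegRenorm(nn.Module):
    def __init__(self, feat_ch, seg_ch, hidden=64):
        super().__init__()
        self.norm = nn.InstanceNorm2d(feat_ch, affine=False)
        self.shared = nn.Sequential(
            nn.Conv2d(seg_ch, hidden, 3, padding=1), nn.ReLU())
        self.gamma = nn.Conv2d(hidden, feat_ch, 3, padding=1)
        self.beta = nn.Conv2d(hidden, feat_ch, 3, padding=1)

    def forward(self, feat, seg):
        # Modulate normalized features with scale/shift maps predicted
        # from the (resized) segmentation embedding.
        seg = F.interpolate(seg, size=feat.shape[2:], mode="nearest")
        h = self.shared(seg)
        return self.norm(feat) * (1 + self.gamma(h)) + self.beta(h)
```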
43
Havaei M, Mao X, Wang Y, Lao Q. Conditional generation of medical images via disentangled adversarial inference. Med Image Anal 2021; 72:102106. [PMID: 34153625 DOI: 10.1016/j.media.2021.102106] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2020] [Revised: 03/30/2021] [Accepted: 05/12/2021] [Indexed: 02/05/2023]
Abstract
Synthetic medical image generation has huge potential for improving healthcare through many applications, from data augmentation for training machine learning systems to preserving patient privacy. Conditional generative adversarial networks (cGANs) use a conditioning factor to generate images and have shown great success in recent years. Intuitively, the information in an image can be divided into two parts: 1) content, which is presented through the conditioning vector, and 2) style, which is the undiscovered information missing from the conditioning vector. Current practices in using cGANs for medical image generation use only a single variable for image generation (i.e., content) and therefore do not provide much flexibility or control over the generated image. In this work we propose DRAI, a dual adversarial inference framework with augmented disentanglement constraints, to learn from the image itself disentangled representations of style and content, and use this information to impose control over the generation process. In this framework, style is learned in a fully unsupervised manner, while content is learned through both supervised learning (using the conditioning vector) and unsupervised learning (with the inference mechanism). We apply two novel regularization steps to ensure content-style disentanglement. First, we minimize the shared information between content and style by introducing a novel application of the gradient reversal layer (GRL); second, we introduce a self-supervised regularization method to further separate information in the content and style variables. For evaluation, we consider two types of baselines: single latent variable models that infer a single variable, and double latent variable models that infer two variables (style and content). We conduct extensive qualitative and quantitative assessments on two publicly available medical imaging datasets (LIDC and HAM10000) and test for conditional image generation, image retrieval and style-content disentanglement. We show that, in general, two latent variable models achieve better performance and give more control over the generated image. We also show that our proposed model (DRAI) achieves the best disentanglement score and has the best overall performance.
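The gradient reversal layer is a standard construct: identity on the forward pass, negated (and scaled) gradient on the backward pass. A minimal PyTorch version, for orientation only:

```python
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Flip and scale the gradient; no gradient for `lam` itself.
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)
```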
Affiliation(s)
- Ximeng Mao
- Montréal Institute for Learning Algorithms (MILA), Université de Montréal, Canada
- Qicheng Lao
- Imagia, Canada; Montréal Institute for Learning Algorithms (MILA), Université de Montréal, Canada; West China Biomedical Big Data Center, West China Hospital of Sichuan University, Chengdu, China
44
Zhou SK, Greenspan H, Davatzikos C, Duncan JS, van Ginneken B, Madabhushi A, Prince JL, Rueckert D, Summers RM. A review of deep learning in medical imaging: Imaging traits, technology trends, case studies with progress highlights, and future promises. Proc IEEE 2021; 109:820-838. [PMID: 37786449 PMCID: PMC10544772 DOI: 10.1109/jproc.2021.3054390] [Citation(s) in RCA: 176] [Impact Index Per Article: 58.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/04/2023]
Abstract
Since its renaissance, deep learning has been widely used in various medical imaging tasks and has achieved remarkable success in many medical imaging applications, thereby propelling us into the so-called artificial intelligence (AI) era. It is known that the success of AI is mostly attributed to the availability of big data with annotations for a single task and the advances in high performance computing. However, medical imaging presents unique challenges that confront deep learning approaches. In this survey paper, we first present traits of medical imaging, highlight both clinical needs and technical challenges in medical imaging, and describe how emerging trends in deep learning are addressing these issues. We cover the topics of network architecture, sparse and noisy labels, federated learning, interpretability, uncertainty quantification, etc. Then, we present several case studies that are commonly found in clinical practice, including digital pathology and chest, brain, cardiovascular, and abdominal imaging. Rather than presenting an exhaustive literature survey, we instead describe some prominent research highlights related to these case study applications. We conclude with a discussion and presentation of promising future directions.
Affiliation(s)
- S Kevin Zhou
- School of Biomedical Engineering, University of Science and Technology of China and Institute of Computing Technology, Chinese Academy of Sciences
- Hayit Greenspan
- Biomedical Engineering Department, Tel-Aviv University, Israel
- Christos Davatzikos
- Radiology Department and Electrical and Systems Engineering Department, University of Pennsylvania, USA
- James S Duncan
- Departments of Biomedical Engineering and Radiology & Biomedical Imaging, Yale University
- Anant Madabhushi
- Department of Biomedical Engineering, Case Western Reserve University and Louis Stokes Cleveland Veterans Administration Medical Center, USA
- Jerry L Prince
- Electrical and Computer Engineering Department, Johns Hopkins University, USA
- Daniel Rueckert
- Klinikum rechts der Isar, TU Munich, Germany and Department of Computing, Imperial College, UK
45
Chen X, Lian C, Wang L, Deng H, Kuang T, Fung SH, Gateno J, Shen D, Xia JJ, Yap PT. Diverse data augmentation for learning image segmentation with cross-modality annotations. Med Image Anal 2021; 71:102060. [PMID: 33957558 DOI: 10.1016/j.media.2021.102060] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2020] [Revised: 03/20/2021] [Accepted: 03/29/2021] [Indexed: 10/21/2022]
Abstract
The dearth of annotated data is a major hurdle in building reliable image segmentation models. Manual annotation of medical images is tedious, time-consuming, and significantly variable across imaging modalities. The need for annotation can be ameliorated by leveraging an annotation-rich source modality in learning a segmentation model for an annotation-poor target modality. In this paper, we introduce a diverse data augmentation generative adversarial network (DDA-GAN) to train a segmentation model for an unannotated target image domain by borrowing information from an annotated source image domain. This is achieved by generating diverse augmented data for the target domain by one-to-many source-to-target translation. The DDA-GAN uses unpaired images from the source and target domains and is an end-to-end convolutional neural network that (i) explicitly disentangles domain-invariant structural features related to segmentation from domain-specific appearance features, (ii) combines structural features from the source domain with appearance features randomly sampled from the target domain for data augmentation, and (iii) trains the segmentation model with the augmented data in the target domain and the annotations from the source domain. The effectiveness of our method is demonstrated both qualitatively and quantitatively in comparison with the state of the art for segmentation of craniomaxillofacial bony structures via MRI and cardiac substructures via CT.
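The one-to-many augmentation step can be sketched as pairing each source structure code with a randomly drawn target-domain appearance code before decoding; the snippet below is an illustrative fragment (decoder omitted, names assumed), not the DDA-GAN code.

```python
import torch

def sample_augmented_pairs(struct_codes_src, app_codes_tgt):
    # For each source structure code, draw a random target-domain
    # appearance code; a decoder would then synthesize the augmented image.
    idx = torch.randint(0, app_codes_tgt.size(0), (struct_codes_src.size(0),))
    return struct_codes_src, app_codes_tgt[idx]
```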
Affiliation(s)
- Xu Chen
- Department of Radiology and Biomedical Research Imaging Center (BRIC), University of North Carolina, Chapel Hill, NC, USA
- Chunfeng Lian
- Department of Radiology and Biomedical Research Imaging Center (BRIC), University of North Carolina, Chapel Hill, NC, USA
- Li Wang
- Department of Radiology and Biomedical Research Imaging Center (BRIC), University of North Carolina, Chapel Hill, NC, USA
- Hannah Deng
- Department of Oral and Maxillofacial Surgery, Houston Methodist Research Institute, TX, USA
- Tianshu Kuang
- Department of Oral and Maxillofacial Surgery, Houston Methodist Research Institute, TX, USA
- Steve H Fung
- Department of Radiology, Houston Methodist Hospital, TX, USA
- Jaime Gateno
- Department of Oral and Maxillofacial Surgery, Houston Methodist Research Institute, TX, USA
- Dinggang Shen
- Department of Radiology and Biomedical Research Imaging Center (BRIC), University of North Carolina, Chapel Hill, NC, USA
- James J Xia
- Department of Oral and Maxillofacial Surgery, Houston Methodist Research Institute, TX, USA
- Pew-Thian Yap
- Department of Radiology and Biomedical Research Imaging Center (BRIC), University of North Carolina, Chapel Hill, NC, USA
46
Wang C, Yang G, Papanastasiou G, Tsaftaris SA, Newby DE, Gray C, Macnaught G, MacGillivray TJ. DiCyc: GAN-based deformation invariant cross-domain information fusion for medical image synthesis. Inf Fusion 2021; 67:147-160. [PMID: 33658909 PMCID: PMC7763495 DOI: 10.1016/j.inffus.2020.10.015] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/04/2020] [Revised: 10/19/2020] [Accepted: 10/21/2020] [Indexed: 05/22/2023]
Abstract
Cycle-consistent generative adversarial network (CycleGAN) has been widely used for cross-domain medical image synthesis tasks, particularly due to its ability to deal with unpaired data. However, most CycleGAN-based synthesis methods cannot achieve good alignment between the synthesized images and data from the source domain, even with additional image alignment losses. This is because the CycleGAN generator network can encode the relative deformations and noise associated with different domains. This can be detrimental for downstream applications that rely on the synthesized images, such as generating pseudo-CT for PET-MR attenuation correction. In this paper, we present a deformation-invariant cycle-consistency model that can filter out these domain-specific deformations. The deformation is globally parameterized by thin-plate-spline (TPS), and locally learned by modified deformable convolutional layers. Robustness to domain-specific deformations has been evaluated through experiments on multi-sequence brain MR data and multi-modality abdominal CT and MR data. Experimental results demonstrated that our method can achieve better alignment between the source and target data while maintaining superior image quality compared to several state-of-the-art CycleGAN-based methods.
Affiliation(s)
- Chengjia Wang
- BHF Centre for Cardiovascular Science, University of Edinburgh, Edinburgh, UK (Corresponding author)
- Guang Yang
- National Heart and Lung Institute, Imperial College London, London, UK
- Sotirios A. Tsaftaris
- Institute for Digital Communications, School of Engineering, University of Edinburgh, Edinburgh, UK
- David E. Newby
- BHF Centre for Cardiovascular Science, University of Edinburgh, Edinburgh, UK
- Calum Gray
- Edinburgh Imaging Facility QMRI, University of Edinburgh, Edinburgh, UK
- Gillian Macnaught
- Edinburgh Imaging Facility QMRI, University of Edinburgh, Edinburgh, UK
47
Chartsias A, Papanastasiou G, Wang C, Semple S, Newby DE, Dharmakumar R, Tsaftaris SA. Disentangle, Align and Fuse for Multimodal and Semi-Supervised Image Segmentation. IEEE Trans Med Imaging 2021; 40:781-792. [PMID: 33156786 PMCID: PMC8011298 DOI: 10.1109/tmi.2020.3036584] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/22/2023]
Abstract
Magnetic resonance (MR) protocols rely on several sequences to assess pathology and organ status properly. Despite advances in image analysis, we tend to treat each sequence, here termed modality, in isolation. Taking advantage of the common information shared between modalities (an organ's anatomy) is beneficial for multi-modality processing and learning. However, we must overcome inherent anatomical misregistrations and disparities in signal intensity across the modalities to obtain this benefit. We present a method that offers improved segmentation accuracy of the modality of interest (over a single-input model) by learning to leverage information present in other modalities, even if few (semi-supervised) or no (unsupervised) annotations are available for this specific modality. Core to our method is learning a disentangled decomposition into anatomical and imaging factors. Shared anatomical factors from the different inputs are jointly processed and fused to extract more accurate segmentation masks. Image misregistrations are corrected with a Spatial Transformer Network, which non-linearly aligns the anatomical factors. The imaging factor captures signal intensity characteristics across different modality data and is used for image reconstruction, enabling semi-supervised learning. Temporal and slice pairings between inputs are learned dynamically. We demonstrate applications in Late Gadolinium Enhanced (LGE) and Blood Oxygenation Level Dependent (BOLD) cardiac segmentation, as well as in T2 abdominal segmentation. Code is available at https://github.com/vios-s/multimodal_segmentation.
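As a rough illustration of the disentangle-align-fuse pattern (the released code at the repository above is the authoritative reference), the PyTorch sketch below factorizes an image into a spatial anatomy factor and a vector imaging factor, reconstructs the image from both (which is what enables the semi-supervised reconstruction loss), and fuses aligned anatomy factors. All module names are illustrative, and the Spatial Transformer alignment step is reduced to a stated assumption in the `fuse` comment.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AnatomyEncoder(nn.Module):
    """Maps an image to a spatial, channel-wise softmax 'anatomy' factor."""
    def __init__(self, ch=8):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, ch, 3, padding=1))

    def forward(self, x):
        return torch.softmax(self.conv(x), dim=1)

class Decoder(nn.Module):
    """Reconstructs the image from anatomy plus an imaging/intensity code;
    the reconstruction loss needs no labels, hence semi-supervision."""
    def __init__(self, ch=8, z_dim=4):
        super().__init__()
        self.film = nn.Linear(z_dim, 2 * ch)  # per-channel scale and shift
        self.out = nn.Conv2d(ch, 1, 3, padding=1)

    def forward(self, anatomy, z):
        scale, shift = self.film(z).chunk(2, dim=1)
        h = anatomy * scale[..., None, None] + shift[..., None, None]
        return self.out(F.relu(h))

def fuse(anatomy_target, anatomy_aux_aligned):
    # anatomy_aux_aligned is assumed to be already warped onto the target
    # (the paper uses a Spatial Transformer Network for this non-linear
    # alignment); an elementwise max keeps the strongest evidence per channel.
    return torch.maximum(anatomy_target, anatomy_aux_aligned)
```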
48
Vu QD, Kim K, Kwak JT. Unsupervised Tumor Characterization via Conditional Generative Adversarial Networks. IEEE J Biomed Health Inform 2021; 25:348-357. [PMID: 32396112 DOI: 10.1109/jbhi.2020.2993560] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Cancer grading, based upon the degree of cancer differentiation, plays a major role in describing the characteristics and behavior of the cancer and determining the treatment plan for patients. The grade is determined by a subjective and qualitative assessment of tissues under a microscope, which suffers from high inter- and intra-observer variability among pathologists. Digital pathology offers an alternative means to automate the procedure as well as to improve the accuracy and robustness of cancer grading. However, most such methods tend to mimic or reproduce cancer grades determined by human experts. Herein, we propose an alternative, quantitative means of assessing and characterizing cancers in an unsupervised manner. The proposed method utilizes conditional generative adversarial networks to characterize tissues. It is evaluated using whole slide images (WSIs) and tissue microarrays (TMAs) of colorectal cancer specimens. The results suggest that the proposed method holds potential for quantifying cancer characteristics and improving cancer pathology.
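Since the abstract leaves the architecture open, here is a deliberately small conditional GAN sketch in PyTorch showing only the conditioning mechanism such a method builds on: the condition vector is concatenated with the generator's noise input and with the discriminator's flattened image input. The layer sizes and MLP design are assumptions for brevity; characterization scores would then be derived from how well the trained model accounts for a given tissue patch.

```python
import torch
import torch.nn as nn

class CondGenerator(nn.Module):
    def __init__(self, z_dim=64, c_dim=8, img=32):
        super().__init__()
        self.img = img
        self.net = nn.Sequential(
            nn.Linear(z_dim + c_dim, 256), nn.ReLU(),
            nn.Linear(256, img * img), nn.Tanh())

    def forward(self, z, c):
        # Noise and condition are fused at the input.
        out = self.net(torch.cat([z, c], dim=1))
        return out.view(-1, 1, self.img, self.img)

class CondDiscriminator(nn.Module):
    def __init__(self, c_dim=8, img=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(img * img + c_dim, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1))

    def forward(self, x, c):
        # The discriminator judges image-condition pairs, not images alone.
        return self.net(torch.cat([x.flatten(1), c], dim=1))

# Both networks see the same condition, tying generated patches to it.
z, c = torch.randn(4, 64), torch.randn(4, 8)
fake = CondGenerator()(z, c)          # (4, 1, 32, 32)
score = CondDiscriminator()(fake, c)  # (4, 1)
```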
49
Meng Q. Mutual Information-Based Disentangled Neural Networks for Classifying Unseen Categories in Different Domains: Application to Fetal Ultrasound Imaging. IEEE Trans Med Imaging 2021; 40:722-734. [PMID: 33141662 PMCID: PMC7116845 DOI: 10.1109/tmi.2020.3035424] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Deep neural networks exhibit limited generalizability across images with different entangled domain and categorical features. Learning generalizable features that can form universal categorical decision boundaries across domains is an interesting and difficult challenge. This problem occurs frequently in medical imaging when deep learning models are deployed across different image acquisition devices or acquisition parameters, or when some classes are unavailable in new training databases. To address this problem, we propose Mutual Information-based Disentangled Neural Networks (MIDNet), which extract generalizable categorical features to transfer knowledge to unseen categories in a target domain. MIDNet adopts a semi-supervised learning paradigm to alleviate the dependency on labeled data, which is important for real-world applications where data annotation is time-consuming, costly, and requires training and expertise. We extensively evaluate the proposed method on fetal ultrasound datasets for two image classification tasks in which domain features are defined by shadow artifacts and image acquisition devices, respectively. Experimental results show that the proposed method outperforms the state of the art on the classification of unseen categories in a target domain with sparsely labeled training data.
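A common way to impose the independence MIDNet aims for is to minimize an estimated mutual information between the categorical and domain feature vectors. The sketch below uses a MINE-style (Donsker-Varadhan) lower bound as one plausible estimator; whether this matches the paper's exact estimator is an assumption, and `StatisticsNet` is a hypothetical critic network.

```python
import math
import torch
import torch.nn as nn

class StatisticsNet(nn.Module):
    """Critic T(a, b) for the Donsker-Varadhan bound on mutual information."""
    def __init__(self, d_cat, d_dom):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_cat + d_dom, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, a, b):
        return self.net(torch.cat([a, b], dim=1))

def mi_lower_bound(stat_net, f_cat, f_dom):
    # Joint pairs versus product-of-marginals pairs (shuffled pairing):
    # MI >= E_joint[T] - log E_marginal[exp(T)].
    joint = stat_net(f_cat, f_dom).mean()
    perm = f_dom[torch.randperm(f_dom.size(0))]
    log_mean_exp = (torch.logsumexp(stat_net(f_cat, perm), dim=0)
                    - math.log(f_cat.size(0)))
    return (joint - log_mean_exp).squeeze()

# The critic maximizes this bound; the encoder minimizes it so that the
# categorical features carry no information about the domain.
mi = mi_lower_bound(StatisticsNet(16, 16),
                    torch.randn(32, 16), torch.randn(32, 16))
```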
50
Wu F, Zhuang X. CF Distance: A New Domain Discrepancy Metric and Application to Explicit Domain Adaptation for Cross-Modality Cardiac Image Segmentation. IEEE Trans Med Imaging 2020; 39:4274-4285. [PMID: 32784131 DOI: 10.1109/tmi.2020.3016144] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Domain adaptation has great value in unpaired cross-modality image segmentation, where training images with gold-standard segmentations are not available from the target image domain. The aim is to reduce the distribution discrepancy between the source and target domains; hence, an effective measurement of this discrepancy is critical. In this work, we propose a new metric based on characteristic functions of distributions. This metric, referred to as the CF distance, enables explicit domain adaptation, in contrast to the implicit approaches that minimize domain discrepancy via adversarial training. Based on the CF distance, we propose an unsupervised domain adaptation framework for cross-modality cardiac segmentation, consisting of image reconstruction and prior distribution matching. We validated the method on two tasks, i.e., CT-MR cross-modality segmentation and multi-sequence cardiac MR segmentation. Results show that the proposed explicit metric is effective for domain adaptation, and the segmentation method delivers promising and superior performance compared with other state-of-the-art techniques. The data and source code of this work have been released via https://zmiclab.github.io/projects.html.
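The characteristic function of a distribution is phi(t) = E[exp(i t.x)], so a discrepancy between two feature distributions can be measured by the gap between their empirical characteristic functions evaluated at sampled frequencies. The sketch below is an illustrative, differentiable estimator along these lines, not the paper's exact formulation: the frequency weighting is assumed to be Gaussian sampling, and `sigma` is an assumed hyperparameter controlling the frequency scale.

```python
import torch

def cf_distance(x, y, n_freq=256, sigma=1.0):
    """Empirical characteristic-function distance between feature batches
    x: (n, d) and y: (m, d); differentiable, so usable as a training loss."""
    t = sigma * torch.randn(n_freq, x.size(1), device=x.device)  # frequencies
    xt, yt = x @ t.T, y @ t.T  # t.x for every sample/frequency pair
    # Real and imaginary parts of phi(t) = E[exp(i t.x)] = E[cos] + i E[sin].
    phi_x = torch.stack([xt.cos().mean(0), xt.sin().mean(0)])
    phi_y = torch.stack([yt.cos().mean(0), yt.sin().mean(0)])
    return ((phi_x - phi_y) ** 2).sum(0).mean()  # mean squared CF gap

# Near zero for same-distribution batches, larger under domain shift.
d = cf_distance(torch.randn(256, 32), 2.0 + torch.randn(256, 32))
```

Because the estimate is built from means of cosines and sines, gradients flow back to the feature extractor, which is what makes an explicit (non-adversarial) adaptation loss of this kind practical.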