1. Huang Y, Wu Z, Xu X, Zhang M, Wang S, Liu Q. Partition-based k-space synthesis for multi-contrast parallel imaging. Magn Reson Imaging 2025;117:110297. PMID: 39647517. DOI: 10.1016/j.mri.2024.110297.
Abstract
PURPOSE Multi-contrast magnetic resonance imaging is an essential medical imaging technique, but it requires longer acquisition times and is prone to motion artifacts. In particular, the acquisition of a T2-weighted image is prolonged by its long repetition time (TR), whereas a T1-weighted image has a shorter TR. Exploiting complementary information across T1- and T2-weighted images is therefore a way to decrease the overall imaging time. Previous T1-assisted T2 reconstruction methods have mostly operated in the image domain using whole-image fusion approaches, which suffer from high computational complexity and limited flexibility. To address this issue, we propose a novel multi-contrast imaging method called partition-based k-space synthesis (PKS), which achieves better reconstruction quality of T2-weighted images through feature fusion. METHODS Concretely, we first decompose the fully sampled T1 k-space data and the under-sampled T2 k-space data into two sub-datasets each. Two new composite objects are then constructed by combining the T1 and T2 sub-data, and these composites are treated as whole k-space data to reconstruct the T2-weighted image. RESULTS Experimental results showed that the developed PKS scheme achieves comparable or better results than traditional k-space parallel imaging (SAKE) that processes each contrast independently. Our method also showed good adaptability and robustness under different assisting contrasts and T1-to-T2 ratios, achieving efficient target-modality reconstruction under various conditions with excellent performance in restoring image quality and preserving details. CONCLUSIONS This work proposed a PKS multi-contrast method to assist target-modality image reconstruction. We conducted extensive experiments on different contrast combinations, diverse T1-to-T2 ratios, and different sampling masks to demonstrate the generalization and robustness of the proposed model.
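The partition-and-combine idea described in the methods can be pictured with a small NumPy sketch. The even/odd phase-encode split, the array names and the composite construction below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def partition(kspace):
    """Split k-space into two sub-datasets along the phase-encode axis (assumed even/odd split)."""
    return kspace[0::2, :], kspace[1::2, :]

def combine(sub_a, sub_b):
    """Interleave two sub-datasets back into a full k-space grid."""
    full = np.zeros((sub_a.shape[0] + sub_b.shape[0], sub_a.shape[1]), dtype=complex)
    full[0::2, :] = sub_a
    full[1::2, :] = sub_b
    return full

# Random stand-ins for fully sampled T1 k-space and under-sampled T2 k-space.
t1_kspace = np.random.randn(256, 256) + 1j * np.random.randn(256, 256)
t2_kspace = np.random.randn(256, 256) + 1j * np.random.randn(256, 256)

t1_a, t1_b = partition(t1_kspace)
t2_a, t2_b = partition(t2_kspace)

# Two new composite objects mixing T1 and T2 sub-data; each is then treated as
# "whole" k-space data by the downstream parallel-imaging reconstruction.
composite_1 = combine(t1_a, t2_b)
composite_2 = combine(t2_a, t1_b)
```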
Affiliation(s)
- Yuxia Huang
- Department of Electronic Information Engineering, Nanchang University, Nanchang 330031, China
- Zhonghui Wu
- Department of Electronic Information Engineering, Nanchang University, Nanchang 330031, China
- Xiaoling Xu
- Department of Electronic Information Engineering, Nanchang University, Nanchang 330031, China
- Minghui Zhang
- Department of Electronic Information Engineering, Nanchang University, Nanchang 330031, China
- Shanshan Wang
- Paul C. Lauterbur Research Center for Biomedical Imaging, SIAT, Chinese Academy of Sciences, Shenzhen 518055, China.
- Qiegen Liu
- Department of Electronic Information Engineering, Nanchang University, Nanchang 330031, China.
2. Patil SS, Rajak R, Ramteke M, Rathore AS. MMIT-DDPM - Multilateral medical image translation with class and structure supervised diffusion-based model. Comput Biol Med 2025;185:109501. PMID: 39626456. DOI: 10.1016/j.compbiomed.2024.109501.
Abstract
Unified translation of medical images from one-to-many distinct modalities is desirable in healthcare settings. A ubiquitous approach for bilateral medical scan translation is one-to-one mapping with GANs. However, its efficacy in encapsulating diversity in a pool of medical scans and performing one-to-many translation is questionable. In contrast, the Denoising Diffusion Probabilistic Model (DDPM) exhibits exceptional ability in image generation due to its scalability and ability to capture the distribution of the whole training data. Therefore, we propose a novel conditioning mechanism for the deterministic translation of medical scans to any target modality from a source modality with a DDPM model. This model denoises the target modality under the guidance of a source-modality structure encoder and a source-to-target class conditioner. Consequently, this mechanism serves as prior information for sampling the desired target modality during inference. Training and testing were carried out on the T1-weighted, T2-weighted, and Fluid Attenuated Inversion Recovery (FLAIR) sequences of the BraTS 2021 dataset. The proposed model is capable of unified multi-lateral translation among six combinations of T1ce, T2, and FLAIR sequences of brain MRI, eliminating the need for multiple bilateral translation models. We have analyzed the performance of our architecture against state-of-the-art convolution- and transformer-based GANs. The diffusion model efficiently covers the distribution of multiple modalities while producing better image quality of the translated sequences, as evidenced by the average improvement of 8.06% in Multi-Scale Structural Similarity (MSSIM) and 2.52 in Fréchet Inception Distance (FID) metrics compared with the CNN- and transformer-based GAN architectures.
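A minimal sketch of the kind of conditional reverse-diffusion sampling described above, assuming a denoiser eps_model(x_t, t, structure, class_id) that receives the source-modality structure encoding and the source-to-target class label as conditions; the schedule and names are illustrative, not the MMIT-DDPM code.

```python
import torch

@torch.no_grad()
def conditional_ddpm_sample(eps_model, structure, class_id, shape, T=1000, device="cpu"):
    """Ancestral DDPM sampling conditioned on a structure map and a translation class label."""
    betas = torch.linspace(1e-4, 0.02, T, device=device)   # linear noise schedule (assumed)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    x = torch.randn(shape, device=device)                  # start from pure noise
    for t in reversed(range(T)):
        t_batch = torch.full((shape[0],), t, device=device)
        eps = eps_model(x, t_batch, structure, class_id)   # predicted noise under guidance
        coef = (1 - alphas[t]) / torch.sqrt(1 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise             # x_{t-1}
    return x
```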
Affiliation(s)
- Rishav Rajak
- Department of Chemical Engineering, IIT Delhi, India
- Manojkumar Ramteke
- Department of Chemical Engineering, IIT Delhi, India; Yardi School of Artificial Intelligence, IIT Delhi, India
- Anurag S Rathore
- Department of Chemical Engineering, IIT Delhi, India; Yardi School of Artificial Intelligence, IIT Delhi, India.
3. Zhang J, Zeng X. M2OCNN: Many-to-One Collaboration Neural Networks for simultaneously multi-modal medical image synthesis and fusion. Comput Methods Programs Biomed 2025;261:108612. PMID: 39908634. DOI: 10.1016/j.cmpb.2025.108612.
Abstract
BACKGROUND AND OBJECTIVE Acquiring comprehensive information from multi-modal medical images remains a challenge in clinical diagnostics and treatment, due to complex inter-modal dependencies and missing modalities. While cross-modal medical image synthesis (CMIS) and multi-modal medical image fusion (MMIF) address certain issues, existing methods typically treat these as separate tasks, lacking a unified framework that can generate both synthesized and fused images in the presence of missing modalities. METHODS In this paper, we propose the Many-to-One Collaboration Neural Network (M2OCNN), a unified model designed to simultaneously address CMIS and MMIF. Unlike traditional approaches, M2OCNN treats fusion as a specific form of synthesis and provides a comprehensive solution even when modalities are missing. The network consists of three modules: the Parallel Untangling Hybrid Network, Comprehensive Feature Router, and Series Omni-modal Hybrid Network. Additionally, we introduce a mixed-resolution attention mechanism and two transformer variants, Coarsormer and ReCoarsormer, to suppress high-frequency interference and enhance model performance. M2OCNN outperformed state-of-the-art methods on three multi-modal medical imaging datasets, achieving an average PSNR improvement of 2.4 dB in synthesis tasks and producing high-quality fusion images despite missing modalities. The source code is available at https://github.com/zjno108/M2OCNN. CONCLUSION M2OCNN offers a novel solution by unifying CMIS and MMIF tasks in a single framework, enabling the generation of both synthesized and fused images from a single modality. This approach sets a new direction for research in multi-modal medical imaging, with implications for improving clinical diagnosis and treatment.
Affiliation(s)
- Jian Zhang
- Chongqing Key Laboratory of Image Cognition, College of Computer Science and Technology, Chongqing University of Posts and Telecommunication, Chongqing, 400065, China.
- Xianhua Zeng
- Chongqing Key Laboratory of Image Cognition, College of Computer Science and Technology, Chongqing University of Posts and Telecommunication, Chongqing, 400065, China.
4. Moya-Sáez E, de Luis-García R, Nunez-Gonzalez L, Alberola-López C, Hernández-Tamames JA. Brain tumor enhancement prediction from pre-contrast conventional weighted images using synthetic multiparametric mapping and generative artificial intelligence. Quant Imaging Med Surg 2025;15:42-54. PMID: 39839033. PMCID: PMC11744120. DOI: 10.21037/qims-24-721.
Abstract
Background Gadolinium-based contrast agents (GBCAs) are usually employed for glioma diagnosis. However, GBCAs raise safety concerns, lead to patient discomfort and increase costs. Parametric maps offer a potential solution by enabling quantification of subtle tissue changes without GBCAs, but they are not commonly used in clinical practice due to the need for specifically targeted sequences. This work proposes to predict post-contrast T1-weighted enhancement without GBCAs from pre-contrast conventional weighted images through synthetic parametric maps computed with generative artificial intelligence (deep learning). Methods In this retrospective study, three datasets were employed: (I) a proprietary dataset with 15 glioma patients (hereafter, GLIOMA dataset); (II) relaxometry maps from 5 healthy volunteers; and (III) UPenn-GBM, a public dataset with 493 glioblastoma patients. A deep learning method for synthesizing parametric maps from only two conventional weighted images is proposed. Particularly, we synthesize longitudinal relaxation time (T1), transversal relaxation time (T2), and proton density (PD) maps. The deep learning method is trained in a supervised manner with the GLIOMA dataset, which comprises weighted images and parametric maps obtained with magnetic resonance image compilation (MAGiC). Thus, MAGiC maps were used as references for training. For testing, a leave-one-out scheme is followed. Finally, the synthesized maps are employed to predict T1-weighted enhancement without GBCAs. Our results are compared with those obtained by MAGiC; specifically, both the maps obtained with MAGiC and the synthesized maps are used to distinguish between healthy and abnormal tissue (ABN) and, particularly, tissues with and without T1-weighted enhancement. The generalization capability of the method was also tested on two additional datasets (healthy volunteers and UPenn-GBM). Results Parametric maps synthesized with deep learning obtained similar performance compared to MAGiC for discriminating normal from abnormal tissue (sensitivities: 88.37% vs. 89.35%) and tissue with and without T1-weighted enhancement (sensitivities: 93.26% vs. 87.29%) on the GLIOMA dataset. These values were comparable to those obtained on UPenn-GBM (sensitivities of 91.23% and 81.04% for each classification). Conclusions Our results suggest the feasibility of predicting T1-weighted-enhanced tissues from pre-contrast conventional weighted images using deep learning for the synthesis of parametric maps.
Affiliation(s)
- Elisa Moya-Sáez
- Image Processing Lab, University of Valladolid, Valladolid, Spain
- Laura Nunez-Gonzalez
- Radiology and Nuclear Medicine Department, Erasmus MC, Rotterdam, The Netherlands
- Juan Antonio Hernández-Tamames
- Radiology and Nuclear Medicine Department, Erasmus MC, Rotterdam, The Netherlands
- Imaging Physics Department, TU Delft, Delft, The Netherlands
5. Chen S, Zhang R, Liang H, Qian Y, Zhou X. Coupling of state space modules and attention mechanisms: An input-aware multi-contrast MRI synthesis method. Med Phys 2024. PMID: 39714363. DOI: 10.1002/mp.17598.
Abstract
BACKGROUND Medical imaging plays a pivotal role in the real-time monitoring of patients during the diagnostic and therapeutic processes. However, in clinical scenarios, the acquisition of multi-modal imaging protocols is often impeded by a number of factors, including time and economic costs, the cooperation willingness of patients, imaging quality, and even safety concerns. PURPOSE We proposed a learning-based medical image synthesis method to simplify the acquisition of multi-contrast MRI. METHODS We redesigned the basic structure of the Mamba block and explored different integration patterns between Mamba layers and Transformer layers to make it more suitable for medical image synthesis tasks. Experiments were conducted on the IXI (a total of 575 samples, training set: 450 samples; validation set: 25 samples; test set: 100 samples) and BRATS (a total of 494 samples, training set: 350 samples; validation set: 44 samples; test set: 100 samples) datasets to assess the synthesis performance of our proposed method in comparison to some state-of-the-art models on the task of multi-contrast MRI synthesis. RESULTS Our proposed model outperformed other state-of-the-art models in some multi-contrast MRI synthesis tasks. In the synthesis task from T1 to PD, our proposed method achieved a peak signal-to-noise ratio (PSNR) of 33.70 dB (95% CI, 33.61, 33.79) and a structural similarity index (SSIM) of 0.966 (95% CI, 0.964, 0.968). In the synthesis task from T2 to PD, the model achieved a PSNR of 33.90 dB (95% CI, 33.82, 33.98) and an SSIM of 0.971 (95% CI, 0.969, 0.973). In the synthesis task from FLAIR to T2, the model achieved a PSNR of 30.43 dB (95% CI, 30.29, 30.57) and an SSIM of 0.938 (95% CI, 0.935, 0.941). CONCLUSIONS Our proposed method could effectively model not only the high-dimensional, nonlinear mapping relationships between the magnetic signals of the hydrogen nucleus in tissues and the proton density signals in tissues, but also the recovery process of suppressed liquid signals in FLAIR. The model proposed in our work employed distinct mechanisms in the synthesis of images belonging to normal and lesion samples, which demonstrated that our model had a profound comprehension of the input data. We also proved that in a hierarchical network, only the deeper self-attention layers were responsible for directing more attention to lesion areas.
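The PSNR/SSIM figures with 95% confidence intervals quoted above can be reproduced in spirit with scikit-image metrics and a simple bootstrap over test slices. This sketch assumes 2D magnitude images scaled to [0, 1] and is not the authors' evaluation code.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_with_ci(preds, targets, n_boot=1000, seed=0):
    """Per-slice PSNR/SSIM plus bootstrap 95% confidence intervals of the mean."""
    psnr = [peak_signal_noise_ratio(t, p, data_range=1.0) for p, t in zip(preds, targets)]
    ssim = [structural_similarity(t, p, data_range=1.0) for p, t in zip(preds, targets)]
    rng = np.random.default_rng(seed)

    def ci(values):
        values = np.asarray(values)
        boots = [values[rng.integers(0, len(values), len(values))].mean() for _ in range(n_boot)]
        return values.mean(), np.percentile(boots, 2.5), np.percentile(boots, 97.5)

    return {"PSNR": ci(psnr), "SSIM": ci(ssim)}
```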
Affiliation(s)
- Shuai Chen
- Jiangsu Key Laboratory for Biomaterials and Devices, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Ruoyu Zhang
- Jiangsu Key Laboratory for Biomaterials and Devices, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Huazheng Liang
- Monash Suzhou Research Institute, Suzhou, Jiangsu Province, China
- Shanghai Key Laboratory of Anesthesiology and Brain Functional Modulation, Clinical Research Center for Anesthesiology and Perioperative Medicine, Shanghai Fourth People's Hospital, School of Medicine, Translational Research Institute of Brain and Brain-Like Intelligence, Shanghai Fourth People's Hospital, School of Medicine, Tongji University, Shanghai, China
- Yunzhu Qian
- Department of Stomatology, The Fourth Affiliated Hospital of Soochow University, Suzhou Dushu Lake Hospital, Medical Center of Soochow University, Suzhou, Jiangsu Province, China
- Xuefeng Zhou
- Jiangsu Key Laboratory for Biomaterials and Devices, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
6. Rudroff T, Klén R, Rainio O, Tuulari J. The Untapped Potential of Dimension Reduction in Neuroimaging: Artificial Intelligence-Driven Multimodal Analysis of Long COVID Fatigue. Brain Sci 2024;14:1209. PMID: 39766408. PMCID: PMC11674449. DOI: 10.3390/brainsci14121209.
Abstract
This perspective paper explores the untapped potential of artificial intelligence (AI), particularly machine learning-based dimension reduction techniques in multimodal neuroimaging analysis of Long COVID fatigue. The complexity and high dimensionality of neuroimaging data from modalities such as positron emission tomography (PET) and magnetic resonance imaging (MRI) pose significant analytical challenges. Deep neural networks and other machine learning approaches offer powerful tools for managing this complexity and extracting meaningful patterns. The paper discusses current challenges in neuroimaging data analysis, reviews state-of-the-art AI approaches for dimension reduction and multimodal integration, and examines their potential applications in Long COVID research. Key areas of focus include the development of AI-based biomarkers, AI-informed treatment strategies, and personalized medicine approaches. The authors argue that AI-driven multimodal neuroimaging analysis represents a paradigm shift in studying complex brain disorders like Long COVID. While acknowledging technical and ethical challenges, the paper emphasizes the potential of these advanced techniques to uncover new insights into the condition, which might lead to improved diagnostic and therapeutic strategies for those affected by Long COVID fatigue. The broader implications for understanding and treating other complex neurological and psychiatric conditions are also discussed.
Affiliation(s)
- Thorsten Rudroff
- Turku PET Centre, University of Turku, Turku University Hospital, 20520 Turku, Finland; (R.K.); (O.R.); (J.T.)
7. Diniz E, Santini T, Helmet K, Aizenstein HJ, Ibrahim TS. Cross-modality image translation of 3 Tesla Magnetic Resonance Imaging to 7 Tesla using Generative Adversarial Networks. medRxiv [Preprint] 2024:2024.10.16.24315609. PMID: 39484249. PMCID: PMC11527090. DOI: 10.1101/2024.10.16.24315609.
Abstract
The rapid advancements in magnetic resonance imaging (MRI) technology have precipitated a new paradigm wherein cross-modality data translation across diverse imaging platforms, field strengths, and different sites is increasingly challenging. This issue is particularly accentuated when transitioning from 3 Tesla (3T) to 7 Tesla (7T) MRI systems. This study proposes a novel solution to these challenges using generative adversarial networks (GANs)-specifically, the CycleGAN architecture-to create synthetic 7T images from 3T data. Employing a dataset of 1112 and 490 unpaired 3T and 7T MR images, respectively, we trained a 2-dimensional (2D) CycleGAN model, evaluating its performance on a paired dataset of 22 participants scanned at 3T and 7T. Independent testing on 22 distinct participants affirmed the model's proficiency in accurately predicting various tissue types, encompassing cerebral spinal fluid, gray matter, and white matter. Our approach provides a reliable and efficient methodology for synthesizing 7T images, achieving a median Dice of 6.82%, 7.63%, and 4.85% for Cerebral Spinal Fluid (CSF), Gray Matter (GM), and White Matter (WM), respectively, in the testing dataset, thereby significantly aiding in harmonizing heterogeneous datasets. Furthermore, it delineates the potential of GANs in amplifying the contrast-to-noise ratio (CNR) from 3T, potentially enhancing the diagnostic capability of the images. While acknowledging the risk of model overfitting, our research underscores a promising progression towards harnessing the benefits of 7T MR systems in research investigations while preserving compatibility with existent 3T MR data. This work was previously presented at the ISMRM 2021 conference (Diniz, Helmet, Santini, Aizenstein, & Ibrahim, 2021).
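The per-tissue agreement reported above is typically quantified with the Dice coefficient between tissue masks segmented from the real and synthetic 7T images; a generic sketch follows, assuming binary NumPy masks rather than the authors' pipeline.

```python
import numpy as np

def dice_coefficient(mask_a, mask_b, eps=1e-8):
    """Dice overlap between two binary segmentation masks (e.g., CSF, GM, or WM)."""
    mask_a = mask_a.astype(bool)
    mask_b = mask_b.astype(bool)
    intersection = np.logical_and(mask_a, mask_b).sum()
    return (2.0 * intersection + eps) / (mask_a.sum() + mask_b.sum() + eps)

# Example: compare a GM mask from the real 7T scan with one from the synthetic image.
# dice_gm = dice_coefficient(gm_mask_real7t, gm_mask_synthetic7t)
```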
Affiliation(s)
- Eduardo Diniz
- Department of Electrical and Computer Engineering, University of Pittsburgh, Pennsylvania, United States
- Tales Santini
- Department of Bioengineering, University of Pittsburgh, Pennsylvania, United States
- Karim Helmet
- Department of Bioengineering, University of Pittsburgh, Pennsylvania, United States
- Department of Psychiatry, University of Pittsburgh, Pennsylvania, United States
- Howard J. Aizenstein
- Department of Bioengineering, University of Pittsburgh, Pennsylvania, United States
- Department of Psychiatry, University of Pittsburgh, Pennsylvania, United States
- Tamer S. Ibrahim
- Department of Bioengineering, University of Pittsburgh, Pennsylvania, United States
8. Zhang J, Cui Z, Jiang C, Guo S, Gao F, Shen D. Hierarchical Organ-Aware Total-Body Standard-Dose PET Reconstruction From Low-Dose PET and CT Images. IEEE Trans Neural Netw Learn Syst 2024;35:13258-13270. PMID: 37159324. DOI: 10.1109/tnnls.2023.3266551.
Abstract
Positron emission tomography (PET) is an important functional imaging technology in early disease diagnosis. Generally, the gamma ray emitted by standard-dose tracer inevitably increases the exposure risk to patients. To reduce dosage, a lower dose tracer is often used and injected into patients. However, this often leads to low-quality PET images. In this article, we propose a learning-based method to reconstruct total-body standard-dose PET (SPET) images from low-dose PET (LPET) images and corresponding total-body computed tomography (CT) images. Different from previous works focusing only on a certain part of human body, our framework can hierarchically reconstruct total-body SPET images, considering varying shapes and intensity distributions of different body parts. Specifically, we first use one global total-body network to coarsely reconstruct total-body SPET images. Then, four local networks are designed to finely reconstruct head-neck, thorax, abdomen-pelvic, and leg parts of human body. Moreover, to enhance each local network learning for the respective local body part, we design an organ-aware network with a residual organ-aware dynamic convolution (RO-DC) module by dynamically adapting organ masks as additional inputs. Extensive experiments on 65 samples collected from uEXPLORER PET/CT system demonstrate that our hierarchical framework can consistently improve the performance of all body parts, especially for total-body PET images with PSNR of 30.6 dB, outperforming the state-of-the-art methods in SPET image reconstruction.
9. Gourdeau D, Duchesne S, Archambault L. An hetero-modal deep learning framework for medical image synthesis applied to contrast and non-contrast MRI. Biomed Phys Eng Express 2024;10:065015. PMID: 39178886. DOI: 10.1088/2057-1976/ad72f9.
Abstract
Some pathologies such as cancer and dementia require multiple imaging modalities to fully diagnose and assess the extent of the disease. Magnetic resonance imaging offers this kind of polyvalence, but examinations take time and can require contrast agent injection. The flexible synthesis of these imaging sequences based on the available ones for a given patient could help reduce scan times or circumvent the need for contrast agent injection. In this work, we propose a deep learning architecture that can perform the synthesis of all missing imaging sequences from any subset of available images. The network is trained adversarially, with the generator consisting of parallel 3D U-Net encoders and decoders that optimally combines their multi-resolution representations with a fusion operation learned by an attention network trained conjointly with the generator network. We compare our synthesis performance with 3D networks using other types of fusion and a comparable number of trainable parameters, such as the mean/variance fusion. In all synthesis scenarios except one, the synthesis performance of the network using attention-guided fusion was better than the other fusion schemes. We also inspect the encoded representations and the attention network outputs to gain insights into the synthesis process, and uncover desirable behaviors such as prioritization of specific modalities, flexible construction of the representation when important modalities are missing, and modalities being selected in regions where they carry sequence-specific information. This work suggests that a better construction of the latent representation space in hetero-modal networks can be achieved by using an attention network.
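A minimal sketch of attention-guided fusion of per-modality feature maps as described above, assuming each available modality has already been encoded into a feature tensor; the module name and shapes are illustrative, not the authors' network.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Fuse per-modality feature maps with spatially varying attention weights."""

    def __init__(self, channels):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)  # one scalar score per location

    def forward(self, features):
        # features: list of (B, C, H, W) tensors, one per available modality.
        stacked = torch.stack(features, dim=1)                          # (B, M, C, H, W)
        scores = torch.stack([self.score(f) for f in features], dim=1)  # (B, M, 1, H, W)
        weights = torch.softmax(scores, dim=1)                          # attention over modalities
        return (weights * stacked).sum(dim=1)                           # (B, C, H, W) fused map

# fused = AttentionFusion(64)([t1_features, t1ce_features])  # works for any subset of modalities
```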
Affiliation(s)
- Daniel Gourdeau
- CERVO Brain Research Center, Québec, Québec, Canada
- Physics Department, Université Laval, Québec, Québec, Canada
- Simon Duchesne
- CERVO Brain Research Center, Québec, Québec, Canada
- Department of Radiology and Nuclear Medicine, Université Laval, Québec, Québec, Canada
10. Sinha A, Kawahara J, Pakzad A, Abhishek K, Ruthven M, Ghorbel E, Kacem A, Aouada D, Hamarneh G. DermSynth3D: Synthesis of in-the-wild annotated dermatology images. Med Image Anal 2024;95:103145. PMID: 38615432. DOI: 10.1016/j.media.2024.103145.
Abstract
In recent years, deep learning (DL) has shown great potential in the field of dermatological image analysis. However, existing datasets in this domain have significant limitations, including a small number of image samples, limited disease conditions, insufficient annotations, and non-standardized image acquisitions. To address these shortcomings, we propose a novel framework called DermSynth3D. DermSynth3D blends skin disease patterns onto 3D textured meshes of human subjects using a differentiable renderer and generates 2D images from various camera viewpoints under chosen lighting conditions in diverse background scenes. Our method adheres to top-down rules that constrain the blending and rendering process to create 2D images with skin conditions that mimic in-the-wild acquisitions, ensuring more meaningful results. The framework generates photo-realistic 2D dermatological images and the corresponding dense annotations for semantic segmentation of the skin, skin conditions, body parts, bounding boxes around lesions, depth maps, and other 3D scene parameters, such as camera position and lighting conditions. DermSynth3D allows for the creation of custom datasets for various dermatology tasks. We demonstrate the effectiveness of data generated using DermSynth3D by training DL models on synthetic data and evaluating them on various dermatology tasks using real 2D dermatological images. We make our code publicly available at https://github.com/sfu-mial/DermSynth3D.
Affiliation(s)
- Ashish Sinha
- Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
- Jeremy Kawahara
- Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
- Arezou Pakzad
- Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
- Kumar Abhishek
- Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
- Matthieu Ruthven
- Computer Vision, Imaging & Machine Intelligence Research Group, Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, L-1855, Luxembourg
- Enjie Ghorbel
- Computer Vision, Imaging & Machine Intelligence Research Group, Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, L-1855, Luxembourg; Cristal Laboratory, National School of Computer Sciences, University of Manouba, 2010, Tunisia
- Anis Kacem
- Computer Vision, Imaging & Machine Intelligence Research Group, Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, L-1855, Luxembourg
- Djamila Aouada
- Computer Vision, Imaging & Machine Intelligence Research Group, Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, L-1855, Luxembourg
- Ghassan Hamarneh
- Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada.
11. Huang L, Zhou J, Jiao J, Zhou S, Chang C, Wang Y, Guo Y. Standardization of ultrasound images across various centers: M2O-DiffGAN bridging the gaps among unpaired multi-domain ultrasound images. Med Image Anal 2024;95:103187. PMID: 38705056. DOI: 10.1016/j.media.2024.103187.
Abstract
The domain shift problem is commonplace in ultrasound image analysis due to differences in imaging settings and across diverse medical centers, which lead to poor generalizability of deep learning-based methods. Multi-Source Domain Transformation (MSDT) provides a promising way to tackle the performance degeneration caused by the domain shift, and is more practical and challenging compared to conventional single-source transformation tasks. An effective unsupervised domain combination strategy is highly required to handle multiple domains without annotations. Fidelity and quality of generated images are also important to ensure the accuracy of computer-aided diagnosis. However, existing MSDT approaches underperform in the above two areas. In this paper, an efficient domain transformation model named M2O-DiffGAN is introduced to achieve a unified mapping from multiple unlabeled source domains to the target domain. A cycle-consistent "many-to-one" adversarial learning architecture is introduced to model various unlabeled domains jointly. A conditional adversarial diffusion process is employed to generate images with high fidelity, combining an adversarial projector to capture reverse transition probabilities over large step sizes for accelerating sampling. Considering the limited perceptual information of ultrasound images, an ultrasound-specific content loss helps to capture more perceptual features for synthesizing high-quality ultrasound images. Extensive comparisons on six clinical datasets covering the thyroid, carotid, and breast demonstrate the superiority of M2O-DiffGAN in bridging domain gaps and enlarging the generalization of downstream analysis methods compared to state-of-the-art algorithms. It improves the mean MI, Bhattacharyya Coefficient, Dice and IoU assessments by 0.390, 0.120, 0.245 and 0.250, presenting promising clinical applications.
Affiliation(s)
- Lihong Huang
- Department of Electronic Engineering, School of Information Science and Technology, Fudan University, Shanghai, China
- Jin Zhou
- Fudan University Shanghai Cancer Center, Shanghai, China
- Jing Jiao
- Department of Electronic Engineering, School of Information Science and Technology, Fudan University, Shanghai, China
- Shichong Zhou
- Fudan University Shanghai Cancer Center, Shanghai, China
- Cai Chang
- Fudan University Shanghai Cancer Center, Shanghai, China
- Yuanyuan Wang
- Department of Electronic Engineering, School of Information Science and Technology, Fudan University, Shanghai, China; Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Shanghai, China.
- Yi Guo
- Department of Electronic Engineering, School of Information Science and Technology, Fudan University, Shanghai, China; Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Shanghai, China.
12. Meng X, Sun K, Xu J, He X, Shen D. Multi-Modal Modality-Masked Diffusion Network for Brain MRI Synthesis With Random Modality Missing. IEEE Trans Med Imaging 2024;43:2587-2598. PMID: 38393846. DOI: 10.1109/tmi.2024.3368664.
Abstract
Synthesis of unavailable imaging modalities from available ones can generate modality-specific complementary information and enable multi-modality-based medical image diagnosis or treatment. Existing generative methods for medical image synthesis are usually based on cross-modal translation between acquired and missing modalities. These methods are usually dedicated to a specific missing modality and perform synthesis in one shot, so they cannot deal flexibly with a varying number of missing modalities or construct the mapping across modalities effectively. To address the above issues, in this paper, we propose a unified Multi-modal Modality-masked Diffusion Network (M2DN), tackling multi-modal synthesis from the perspective of "progressive whole-modality inpainting" instead of "cross-modal translation". Specifically, our M2DN considers the missing modalities as random noise and takes all the modalities as a unity in each reverse diffusion step. The proposed joint synthesis scheme performs synthesis for the missing modalities and self-reconstruction for the available ones, which not only enables synthesis for arbitrary missing scenarios, but also facilitates the construction of a common latent space and enhances the model representation ability. Besides, we introduce a modality-mask scheme to encode the availability status of each incoming modality explicitly in a binary mask, which is adopted as a condition for the diffusion model to further enhance the synthesis performance of our M2DN for arbitrary missing scenarios. We carry out experiments on two public brain MRI datasets for synthesis and downstream segmentation tasks. Experimental results demonstrate that our M2DN outperforms the state-of-the-art models significantly and shows great generalizability for arbitrary missing modalities.
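The modality-mask idea can be illustrated by how inputs might be assembled for one reverse-diffusion step: available channels keep their data, missing channels are replaced with noise, and a binary availability mask is appended as the condition. This is a schematic reading of the description above, not the released M2DN code.

```python
import torch

def build_masked_input(modalities, available, noise_scale=1.0):
    """Assemble the network input for modality-masked diffusion.

    modalities: (B, M, H, W) tensor stacking all modality channels.
    available:  (M,) boolean tensor, True where the modality was acquired.
    """
    b, m, h, w = modalities.shape
    mask = available.view(1, m, 1, 1).float().expand(b, m, h, w)
    noise = torch.randn_like(modalities) * noise_scale
    # Missing modalities start as pure noise; available ones keep their data.
    x = mask * modalities + (1.0 - mask) * noise
    # The binary availability mask is concatenated as an explicit condition.
    return torch.cat([x, mask], dim=1)  # (B, 2M, H, W)
```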
13. Li L, Yu J, Li Y, Wei J, Fan R, Wu D, Ye Y. Multi-sequence generative adversarial network: better generation for enhanced magnetic resonance imaging images. Front Comput Neurosci 2024;18:1365238. PMID: 38841427. PMCID: PMC11151883. DOI: 10.3389/fncom.2024.1365238.
Abstract
Introduction MRI is one of the commonly used diagnostic methods in clinical practice, especially for brain diseases. There are many sequences in MRI, but T1CE images can only be obtained by using contrast agents. Many patients (such as cancer patients) must undergo alignment of multiple MRI sequences for diagnosis, especially the contrast-enhanced magnetic resonance sequence. However, some patients, such as pregnant women and children, find it difficult to use contrast agents to obtain enhanced sequences, and contrast agents have many adverse reactions, which can pose a significant risk. With the continuous development of deep learning, the emergence of generative adversarial networks makes it possible to extract features from one type of image to generate another type of image. Methods We propose a generative adversarial network model with multimodal inputs and end-to-end decoding based on the pix2pix model. We used four evaluation metrics, NMSE, RMSE, SSIM, and PSNR, to assess the effectiveness of our generative model against pix2pix. Results Through statistical analysis, we compared our proposed new model with pix2pix and found significant differences between the two. Our model outperformed pix2pix, with higher SSIM and PSNR and lower NMSE and RMSE. We also found that using T1W and T2W images as input performed better than other combinations, providing new ideas for subsequent work on generating magnetic resonance enhanced sequence images. By using our model, it is possible to generate magnetic resonance enhanced sequence images based on magnetic resonance non-enhanced sequence images. Discussion This has significant implications as it can greatly reduce the use of contrast agents to protect populations such as pregnant women and children who are contraindicated for contrast agents. Additionally, contrast agents are relatively expensive, and this generation method may bring about substantial economic benefits.
Affiliation(s)
- Leizi Li
- South China Normal University-Panyu Central Hospital Joint Laboratory of Basic and Translational Medical Research, Guangzhou Panyu Central Hospital, Guangzhou, China
- Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring and Guangdong Provincial Engineering Technology Research Center for Drug and Food Biological Resources Processing and Comprehensive Utilization, School of Life Sciences, South China Normal University, Guangzhou, China
- Jingchun Yu
- Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring and Guangdong Provincial Engineering Technology Research Center for Drug and Food Biological Resources Processing and Comprehensive Utilization, School of Life Sciences, South China Normal University, Guangzhou, China
- Yijin Li
- Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring and Guangdong Provincial Engineering Technology Research Center for Drug and Food Biological Resources Processing and Comprehensive Utilization, School of Life Sciences, South China Normal University, Guangzhou, China
- Jinbo Wei
- South China Normal University-Panyu Central Hospital Joint Laboratory of Basic and Translational Medical Research, Guangzhou Panyu Central Hospital, Guangzhou, China
- Ruifang Fan
- Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring and Guangdong Provincial Engineering Technology Research Center for Drug and Food Biological Resources Processing and Comprehensive Utilization, School of Life Sciences, South China Normal University, Guangzhou, China
- Dieen Wu
- South China Normal University-Panyu Central Hospital Joint Laboratory of Basic and Translational Medical Research, Guangzhou Panyu Central Hospital, Guangzhou, China
- Yufeng Ye
- South China Normal University-Panyu Central Hospital Joint Laboratory of Basic and Translational Medical Research, Guangzhou Panyu Central Hospital, Guangzhou, China
- Medical Imaging Institute of Panyu, Guangzhou, China
14. Dai X, Ma N, Du L, Wang X, Ju Z, Jie C, Gong H, Ge R, Yu W, Qu B. Application of MR images in radiotherapy planning for brain tumor based on deep learning. Int J Neurosci 2024:1-11. PMID: 38712669. DOI: 10.1080/00207454.2024.2352784.
Abstract
PURPOSE To explore the role of MR images in radiotherapy planning and the accuracy of dose calculation using deep learning methods. METHODS 131 brain tumor patients undergoing radiotherapy with previous MR and CT images were recruited for this study. A new MR image series derived from the aligned MR was first registered to the CT images using MIM software and then resampled. A deep learning method (U-NET) was used to establish an MRI-to-CT conversion model, for which 105 patient images were used as the training set and 26 patient images were used as the tuning set. Data from an additional 8 patients were collected as the test set, and the accuracy of the model was evaluated from a dosimetric standpoint. RESULTS Comparing the synthetic CT images with the original CT images, the difference in the dosimetric parameters D98, D95, D2 and Dmean of the PTV in the 8 patients was less than 0.5%. The gamma passing rates of the PTV and the whole-body volume were: 1%/1 mm: 93.96%±6.75%, 2%/2 mm: 99.87%±0.30%, 3%/3 mm: 100.00%±0.00%; and 1%/1 mm: 99.14%±0.80%, 2%/2 mm: 99.92%±0.08%, 3%/3 mm: 99.99%±0.01%. CONCLUSION MR images can be used for delineation and treatment-efficacy evaluation as well as for dose calculation. Converting MR images to CT images with deep learning is a viable approach that can be further used in dose calculation.
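The PTV dose metrics compared above (D98, D95, D2, Dmean) can be read off a dose array restricted to the PTV mask; Dx is the dose received by at least x% of the volume, i.e. the (100 - x)th percentile. A generic NumPy sketch follows, not the treatment-planning-system calculation.

```python
import numpy as np

def ptv_dose_metrics(dose, ptv_mask):
    """D98, D95, D2 and Dmean inside the PTV (dose in Gy, ptv_mask boolean)."""
    d = dose[ptv_mask]
    return {
        "D98": np.percentile(d, 2),   # dose covering 98% of the PTV (near-minimum)
        "D95": np.percentile(d, 5),
        "D2": np.percentile(d, 98),   # near-maximum dose
        "Dmean": d.mean(),
    }

# Relative difference between plans computed on synthetic CT vs. original CT:
# diff = {k: 100 * (sct_metrics[k] - ct_metrics[k]) / ct_metrics[k] for k in ct_metrics}
```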
Affiliation(s)
- Xiangkun Dai
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
- Na Ma
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
- School of Biological Science and Medical Engineering, Beihang University, Beijing, China
- Lehui Du
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
- Zhongjian Ju
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
- Chuanbin Jie
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
- Hanshun Gong
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
- Ruigang Ge
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
- Wei Yu
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
- Baolin Qu
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
15. Yunde A, Maki S, Furuya T, Okimatsu S, Inoue T, Miura M, Shiratani Y, Nagashima Y, Maruyama J, Shiga Y, Inage K, Eguchi Y, Orita S, Ohtori S. Conversion of T2-Weighted Magnetic Resonance Images of Cervical Spine Trauma to Short T1 Inversion Recovery (STIR) Images by Generative Adversarial Network. Cureus 2024;16:e60381. PMID: 38883049. PMCID: PMC11178942. DOI: 10.7759/cureus.60381.
Abstract
INTRODUCTION The short T1 inversion recovery (STIR) sequence is advantageous for visualizing ligamentous injuries, but the STIR sequence may be missing in some cases. The purpose of this study was to generate synthetic STIR images from MRI T2-weighted images (T2WI) of patients with cervical spine trauma using a generative adversarial network (GAN). METHODS A total of 969 pairs of T2WI and STIR images were extracted from 79 patients with cervical spine trauma. The synthetic model was trained 100 times, and the performance of the model was evaluated with five-fold cross-validation. RESULTS As for quantitative validation, the structural similarity score was 0.519±0.1 and the peak signal-to-noise ratio score was 19.37±1.9 dB. As for qualitative validation, the incorporation of synthetic STIR images generated by a GAN alongside T2WI substantially enhances sensitivity in the detection of interspinous ligament injuries, outperforming assessments reliant solely on T2WI. CONCLUSION The GAN model can generate synthetic STIR images from T2-weighted images of cervical spine trauma using image-to-image conversion techniques. The use of a combination of synthetic STIR images generated by a GAN and T2WI improves sensitivity in detecting interspinous ligament injuries compared to assessments that use only T2WI.
Affiliation(s)
- Atsushi Yunde
- Department of Orthopaedic Surgery, Chiba University, Graduate School of Medicine, Chiba, JPN
- Satoshi Maki
- Department of Orthopaedic Surgery, Chiba University, Graduate School of Medicine, Chiba, JPN
- Takeo Furuya
- Department of Orthopaedic Surgery, Chiba University, Graduate School of Medicine, Chiba, JPN
- Sho Okimatsu
- Department of Orthopaedic Surgery, Chiba University, Graduate School of Medicine, Chiba, JPN
- Takaki Inoue
- Department of Orthopaedic Surgery, Chiba University, Graduate School of Medicine, Chiba, JPN
- Masataka Miura
- Department of Orthopaedic Surgery, Chiba University, Graduate School of Medicine, Chiba, JPN
- Yuki Shiratani
- Department of Orthopaedic Surgery, Chiba University, Graduate School of Medicine, Chiba, JPN
- Yuki Nagashima
- Department of Orthopaedic Surgery, Chiba University, Graduate School of Medicine, Chiba, JPN
- Juntaro Maruyama
- Department of Orthopaedic Surgery, Chiba University, Graduate School of Medicine, Chiba, JPN
- Yasuhiro Shiga
- Department of Orthopaedic Surgery, Chiba University, Graduate School of Medicine, Chiba, JPN
- Kazuhide Inage
- Department of Orthopaedic Surgery, Chiba University, Graduate School of Medicine, Chiba, JPN
- Yawara Eguchi
- Department of Orthopaedic Surgery, Chiba University, Graduate School of Medicine, Chiba, JPN
- Sumihisa Orita
- Department of Orthopaedic Surgery, Chiba University, Graduate School of Medicine, Chiba, JPN
- Seiji Ohtori
- Department of Orthopaedic Surgery, Chiba University, Graduate School of Medicine, Chiba, JPN
16. He Q, Summerfield N, Dong M, Glide-Hurst C. Modality-agnostic learning for medical image segmentation using multi-modality self-distillation. Proc IEEE Int Symp Biomed Imaging 2024;2024:10.1109/isbi56570.2024.10635881. PMID: 39735423. PMCID: PMC11673955. DOI: 10.1109/isbi56570.2024.10635881.
Abstract
In medical image segmentation, although multi-modality training is possible, clinical translation is challenged by the limited availability of all image types for a given patient. Different from typical segmentation models, modality-agnostic (MAG) learning trains a single model based on all available modalities but remains input-agnostic, allowing a single model to produce accurate segmentation given any modality combination. In this paper, we propose a novel framework, MAG learning through Multi-modality Self-distillation (MAG-MS), for medical image segmentation. MAG-MS distills knowledge from the fusion of multiple modalities and applies it to enhance representation learning for individual modalities. This makes it an adaptable and efficient solution for handling limited modalities during testing scenarios. Our extensive experiments on benchmark datasets demonstrate its superior segmentation accuracy, MAG robustness, and efficiency compared with the current state-of-the-art methods.
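One way to realize the self-distillation described above is to match a single-modality student's predictions to the multi-modality fused teacher's soft outputs. The sketch below uses a temperature-scaled KL term alongside the supervised loss; it is an assumption about the general recipe, not the MAG-MS implementation.

```python
import torch
import torch.nn.functional as F

def self_distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Segmentation loss = supervised cross-entropy + KL to the fused multi-modality teacher.

    student_logits: (B, C, H, W) from a single-modality (or subset) forward pass.
    teacher_logits: (B, C, H, W) from the full multi-modality fusion (treated as fixed).
    labels:         (B, H, W) integer ground-truth segmentation.
    """
    ce = F.cross_entropy(student_logits, labels)
    t = temperature
    kl = F.kl_div(
        F.log_softmax(student_logits / t, dim=1),
        F.softmax(teacher_logits.detach() / t, dim=1),
        reduction="batchmean",
    ) * (t * t)
    return alpha * ce + (1.0 - alpha) * kl
```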
Affiliation(s)
- Qisheng He
- Department of Computer Science, Wayne State University, Detroit, MI, USA
- Nicholas Summerfield
- Department of Human Oncology, University of Wisconsin-Madison, Madison, WI, USA
- Department of Medical Physics, University of Wisconsin-Madison, Madison, WI, USA
- Ming Dong
- Department of Computer Science, Wayne State University, Detroit, MI, USA
- Carri Glide-Hurst
- Department of Human Oncology, University of Wisconsin-Madison, Madison, WI, USA
- Department of Medical Physics, University of Wisconsin-Madison, Madison, WI, USA
17. Valabregue R, Girka F, Pron A, Rousseau F, Auzias G. Comprehensive analysis of synthetic learning applied to neonatal brain MRI segmentation. Hum Brain Mapp 2024;45:e26674. PMID: 38651625. PMCID: PMC11036377. DOI: 10.1002/hbm.26674.
Abstract
Brain segmentation from neonatal MRI images is a very challenging task due to large changes in the shape of cerebral structures and variations in signal intensities reflecting the gestational process. In this context, there is a clear need for segmentation techniques that are robust to variations in image contrast and to the spatial configuration of anatomical structures. In this work, we evaluate the potential of synthetic learning, a contrast-independent model trained using synthetic images generated from the ground truth labels of very few subjects. We base our experiments on the dataset released by the developmental Human Connectome Project, for which high-quality images are available for more than 700 babies aged between 26 and 45 weeks postconception. First, we confirm the impressive performance of a standard UNet trained on a few volumes, but also confirm that such models learn intensity-related features specific to the training domain. We then confirm the robustness of the synthetic learning approach to variations in image contrast. However, we observe a clear influence of the age of the baby on the predictions. We improve the performance of this model by enriching the synthetic training set with realistic motion artifacts and over-segmentation of the white matter. Based on extensive visual assessment, we argue that the better performance of the model trained on real T2w data may be due to systematic errors in the ground truth. We propose an original experiment allowing us to show that learning from real data will reproduce any systematic bias affecting the training set, while synthetic models can avoid this limitation. Overall, our experiments confirm that synthetic learning is an effective solution for segmenting neonatal brain MRI. Our adapted synthetic learning approach combines key features that will be instrumental for large multisite studies and clinical applications.
Affiliation(s)
- R. Valabregue
- CENIR, Institut du Cerveau (ICM) - Paris Brain Institute, Inserm U 1127, CNRS UMR 7225, Sorbonne Université, Paris, France
- F. Girka
- CENIR, Institut du Cerveau (ICM) - Paris Brain Institute, Inserm U 1127, CNRS UMR 7225, Sorbonne Université, Paris, France
- A. Pron
- Aix‐Marseille Université, CNRS, Institut de Neurosciences de la Timone, UMR 7289, Marseille, France
- F. Rousseau
- IMT Atlantique, LaTIM INSERM U1101, Brest, France
- G. Auzias
- Aix‐Marseille Université, CNRS, Institut de Neurosciences de la Timone, UMR 7289, Marseille, France
18. Carass A, Greenman D, Dewey BE, Calabresi PA, Prince JL, Pham DL. Image harmonization improves consistency of intra-rater delineations of MS lesions in heterogeneous MRI. Neuroimage Rep 2024;4:100195. PMID: 38370461. PMCID: PMC10871705. DOI: 10.1016/j.ynirp.2024.100195.
Abstract
Clinical magnetic resonance images (MRIs) lack a standard intensity scale due to differences in scanner hardware and the pulse sequences used to acquire the images. When MRIs are used for quantification, as in the evaluation of white matter lesions (WMLs) in multiple sclerosis, this lack of intensity standardization becomes a critical problem affecting both the staging and tracking of the disease and its treatment. This paper presents a study of harmonization on WML segmentation consistency, which is evaluated using an object detection classification scheme that incorporates manual delineations from both the original and harmonized MRIs. A cohort of ten people scanned on two different imaging platforms was studied. An expert rater, blinded to the image source, manually delineated WMLs on images from both scanners before and after harmonization. It was found that there is closer agreement in both global and per-lesion WML volume and spatial distribution after harmonization, demonstrating the importance of image harmonization prior to the creation of manual delineations. These results could lead to better truth models in both the development and evaluation of automated lesion segmentation algorithms.
Affiliation(s)
- Aaron Carass
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Danielle Greenman
- Center for Neuroscience and Regenerative Medicine, The Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD 20817, USA
- Blake E. Dewey
- Department of Neurology, Johns Hopkins School of Medicine, Baltimore, MD 21287, USA
- Peter A. Calabresi
- Department of Neurology, Johns Hopkins School of Medicine, Baltimore, MD 21287, USA
- Jerry L. Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Dzung L. Pham
- Department of Radiology, Uniformed Services University of the Health Sciences, Bethesda, MD 20814, USA
19. Dayarathna S, Islam KT, Uribe S, Yang G, Hayat M, Chen Z. Deep learning based synthesis of MRI, CT and PET: Review and analysis. Med Image Anal 2024;92:103046. PMID: 38052145. DOI: 10.1016/j.media.2023.103046.
Abstract
Medical image synthesis represents a critical area of research in clinical decision-making, aiming to overcome the challenges associated with acquiring multiple image modalities for an accurate clinical workflow. This approach proves beneficial in estimating an image of a desired modality from a given source modality among the most common medical imaging contrasts, such as Computed Tomography (CT), Magnetic Resonance Imaging (MRI), and Positron Emission Tomography (PET). However, translating between two image modalities presents difficulties due to the complex and non-linear domain mappings. Deep learning-based generative modelling has exhibited superior performance in synthetic image contrast applications compared to conventional image synthesis methods. This survey comprehensively reviews deep learning-based medical imaging translation from 2018 to 2023 on pseudo-CT, synthetic MR, and synthetic PET. We provide an overview of synthetic contrasts in medical imaging and the most frequently employed deep learning networks for medical image synthesis. Additionally, we conduct a detailed analysis of each synthesis method, focusing on their diverse model designs based on input domains and network architectures. We also analyse novel network architectures, ranging from conventional CNNs to the recent Transformer and Diffusion models. This analysis includes comparing loss functions, available datasets and anatomical regions, and image quality assessments and performance in other downstream tasks. Finally, we discuss the challenges and identify solutions within the literature, suggesting possible future directions. We hope that the insights offered in this survey paper will serve as a valuable roadmap for researchers in the field of medical image synthesis.
Affiliation(s)
- Sanuwani Dayarathna
- Department of Data Science and AI, Faculty of Information Technology, Monash University, Clayton VIC 3800, Australia.
- Sergio Uribe
- Department of Medical Imaging and Radiation Sciences, Faculty of Medicine, Monash University, Clayton VIC 3800, Australia
- Guang Yang
- Bioengineering Department and Imperial-X, Imperial College London, W12 7SL, United Kingdom
- Munawar Hayat
- Department of Data Science and AI, Faculty of Information Technology, Monash University, Clayton VIC 3800, Australia
- Zhaolin Chen
- Department of Data Science and AI, Faculty of Information Technology, Monash University, Clayton VIC 3800, Australia; Monash Biomedical Imaging, Clayton VIC 3800, Australia
| |
Collapse
|
20
|
Wang Z, Fang M, Zhang J, Tang L, Zhong L, Li H, Cao R, Zhao X, Liu S, Zhang R, Xie X, Mai H, Qiu S, Tian J, Dong D. Radiomics and Deep Learning in Nasopharyngeal Carcinoma: A Review. IEEE Rev Biomed Eng 2024; 17:118-135. [PMID: 37097799 DOI: 10.1109/rbme.2023.3269776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/26/2023]
Abstract
Nasopharyngeal carcinoma is a common head and neck malignancy with distinct clinical management compared to other types of cancer. Precision risk stratification and tailored therapeutic interventions are crucial to improving the survival outcomes. Artificial intelligence, including radiomics and deep learning, has exhibited considerable efficacy in various clinical tasks for nasopharyngeal carcinoma. These techniques leverage medical images and other clinical data to optimize clinical workflow and ultimately benefit patients. In this review, we provide an overview of the technical aspects and basic workflow of radiomics and deep learning in medical image analysis. We then conduct a detailed review of their applications to seven typical tasks in the clinical diagnosis and treatment of nasopharyngeal carcinoma, covering various aspects of image synthesis, lesion segmentation, diagnosis, and prognosis. The innovation and application effects of cutting-edge research are summarized. Recognizing the heterogeneity of the research field and the existing gap between research and clinical translation, potential avenues for improvement are discussed. We propose that these issues can be gradually addressed by establishing standardized large datasets, exploring the biological characteristics of features, and technological upgrades.
Collapse
|
21
|
Ozbey M, Dalmaz O, Dar SUH, Bedel HA, Ozturk S, Gungor A, Cukur T. Unsupervised Medical Image Translation With Adversarial Diffusion Models. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:3524-3539. [PMID: 37379177 DOI: 10.1109/tmi.2023.3290149] [Citation(s) in RCA: 54] [Impact Index Per Article: 27.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/30/2023]
Abstract
Imputation of missing images via source-to-target modality translation can improve diversity in medical imaging protocols. A pervasive approach for synthesizing target images involves one-shot mapping through generative adversarial networks (GAN). Yet, GAN models that implicitly characterize the image distribution can suffer from limited sample fidelity. Here, we propose a novel method based on adversarial diffusion modeling, SynDiff, for improved performance in medical image translation. To capture a direct correlate of the image distribution, SynDiff leverages a conditional diffusion process that progressively maps noise and source images onto the target image. For fast and accurate image sampling during inference, large diffusion steps are taken with adversarial projections in the reverse diffusion direction. To enable training on unpaired datasets, a cycle-consistent architecture is devised with coupled diffusive and non-diffusive modules that bilaterally translate between two modalities. Extensive assessments are reported on the utility of SynDiff against competing GAN and diffusion models in multi-contrast MRI and MRI-CT translation. Our demonstrations indicate that SynDiff offers quantitatively and qualitatively superior performance against competing baselines.
Collapse
|
22
|
Yang H, Sun J, Xu Z. Learning Unified Hyper-Network for Multi-Modal MR Image Synthesis and Tumor Segmentation With Missing Modalities. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:3678-3689. [PMID: 37540616 DOI: 10.1109/tmi.2023.3301934] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/06/2023]
Abstract
Accurate segmentation of brain tumors is of critical importance in clinical assessment and treatment planning, which requires multiple MR modalities providing complementary information. However, due to practical limits, one or more modalities may be missing in real scenarios. To tackle this problem, existing methods need to train multiple networks or a unified but fixed network for various possible missing modality cases, which leads to high computational burdens or sub-optimal performance. In this paper, we propose a unified and adaptive multi-modal MR image synthesis method, and further apply it to tumor segmentation with missing modalities. Based on the decomposition of multi-modal MR images into common and modality-specific features, we design a shared hyper-encoder for embedding each available modality into the feature space, a graph-attention-based fusion block to aggregate the features of available modalities to the fused features, and a shared hyper-decoder for image reconstruction. We also propose an adversarial common feature constraint to enforce the fused features to be in a common space. As for missing modality segmentation, we first conduct the feature-level and image-level completion using our synthesis method and then segment the tumors based on the completed MR images together with the extracted common features. Moreover, we design a hypernet-based modulation module to adaptively utilize the real and synthetic modalities. Experimental results suggest that our method can not only synthesize reasonable multi-modal MR images, but also achieve state-of-the-art performance on brain tumor segmentation with missing modalities.
Collapse
|
23
|
Zhou J, Guo H, Chen H. [Deep learning method for magnetic resonance imaging fluid-attenuated inversion recovery image synthesis]. SHENG WU YI XUE GONG CHENG XUE ZA ZHI = JOURNAL OF BIOMEDICAL ENGINEERING = SHENGWU YIXUE GONGCHENGXUE ZAZHI 2023; 40:903-911. [PMID: 37879919 PMCID: PMC10600433 DOI: 10.7507/1001-5515.202302012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Subscribe] [Scholar Register] [Received: 02/08/2023] [Revised: 08/19/2023] [Indexed: 10/27/2023]
Abstract
Magnetic resonance imaging (MRI) can obtain multi-modal images with different contrasts, which provide rich information for clinical diagnosis. However, some contrasts are not scanned, or the quality of the acquired images cannot meet diagnostic requirements, because of limited patient cooperation or scanning conditions. Image synthesis techniques have become a way to compensate for such missing or deficient images. In recent years, deep learning has been widely used in the field of MRI synthesis. In this paper, a synthesis network based on multi-modal fusion is proposed: it first uses a feature encoder to encode the features of multiple unimodal images separately, then fuses the features of the different modalities through a feature fusion module, and finally generates the target-modality image. The similarity between the target image and the predicted image is improved by introducing a dynamic weighted combined loss function based on the spatial domain and the k-space domain. Experimental validation and quantitative comparison show that the proposed multi-modal fusion deep learning network can effectively synthesize high-quality MRI fluid-attenuated inversion recovery (FLAIR) images. In summary, the proposed method can reduce the patient's MRI scanning time and address the clinical problem of FLAIR images that are missing or whose quality does not meet diagnostic requirements.
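A combined spatial/k-space loss of the kind described above compares the prediction and the target both voxel-wise and after a Fourier transform; the minimal PyTorch-style sketch below assumes both terms are L1 distances with user-supplied scalar weights (the paper's dynamic weighting scheme is not reproduced here).

```python
import torch

def combined_spatial_kspace_loss(pred: torch.Tensor,
                                 target: torch.Tensor,
                                 w_spatial: float = 0.5,
                                 w_kspace: float = 0.5) -> torch.Tensor:
    """Weighted sum of an image-domain L1 loss and a k-space L1 loss.

    pred/target: real-valued image batches of shape (B, C, H, W).
    """
    # Spatial-domain term: voxel-wise L1 difference.
    loss_spatial = torch.mean(torch.abs(pred - target))
    # k-space term: L1 difference between the 2D Fourier transforms.
    pred_k = torch.fft.fft2(pred)
    target_k = torch.fft.fft2(target)
    loss_kspace = torch.mean(torch.abs(pred_k - target_k))
    return w_spatial * loss_spatial + w_kspace * loss_kspace

# Example with random tensors standing in for predicted and real FLAIR images.
pred = torch.rand(2, 1, 128, 128, requires_grad=True)
target = torch.rand(2, 1, 128, 128)
loss = combined_spatial_kspace_loss(pred, target)
loss.backward()
```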
Collapse
Affiliation(s)
- Jianing Zhou
- School of Electrical Engineering, Shenyang University of Technology, Shenyang 110870, P. R. China
| | - Hongyu Guo
- School of Electrical Engineering, Shenyang University of Technology, Shenyang 110870, P. R. China
- Neusoft Medical System Co. Ltd, Shenyang 110167, P. R. China
| | - Hong Chen
- School of Electrical Engineering, Shenyang University of Technology, Shenyang 110870, P. R. China
| |
Collapse
|
24
|
Dorent R, Haouchine N, Kogl F, Joutard S, Juvekar P, Torio E, Golby A, Ourselin S, Frisken S, Vercauteren T, Kapur T, Wells WM. Unified Brain MR-Ultrasound Synthesis using Multi-Modal Hierarchical Representations. MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION : MICCAI ... INTERNATIONAL CONFERENCE ON MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION 2023; 2023:448-458. [PMID: 38655383 PMCID: PMC7615858 DOI: 10.1007/978-3-031-43999-5_43] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 04/26/2024]
Abstract
We introduce MHVAE, a deep hierarchical variational autoencoder (VAE) that synthesizes missing images from various modalities. Extending multi-modal VAEs with a hierarchical latent structure, we introduce a probabilistic formulation for fusing multi-modal images in a common latent representation while having the flexibility to handle incomplete image sets as input. Moreover, adversarial learning is employed to generate sharper images. Extensive experiments are performed on the challenging problem of joint intra-operative ultrasound (iUS) and Magnetic Resonance (MR) synthesis. Our model outperformed multi-modal VAEs, conditional GANs, and the current state-of-the-art unified method (ResViT) for synthesizing missing images, demonstrating the advantage of using a hierarchical latent representation and a principled probabilistic fusion operation. Our code is publicly available.
Collapse
Affiliation(s)
- Reuben Dorent
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
| | - Nazim Haouchine
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
| | - Fryderyk Kogl
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
| | | | - Parikshit Juvekar
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
| | - Erickson Torio
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
| | - Alexandra Golby
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
| | | | - Sarah Frisken
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
| | | | - Tina Kapur
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
| | - William M Wells
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- Massachusetts Institute of Technology, Cambridge, MA, USA
| |
Collapse
|
25
|
Konwer A, Hu X, Bae J, Xu X, Chen C, Prasanna P. Enhancing Modality-Agnostic Representations via Meta-learning for Brain Tumor Segmentation. PROCEEDINGS. IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION 2023; 2023:21358-21368. [PMID: 38737337 PMCID: PMC11087061 DOI: 10.1109/iccv51070.2023.01958] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/14/2024]
Abstract
In medical vision, different imaging modalities provide complementary information. However, in practice, not all modalities may be available during inference or even training. Previous approaches, e.g., knowledge distillation or image synthesis, often assume the availability of full modalities for all subjects during training; this is unrealistic and impractical due to the variability in data collection across sites. We propose a novel approach to learn enhanced modality-agnostic representations by employing a meta-learning strategy in training, even when only limited full modality samples are available. Meta-learning enhances partial modality representations to full modality representations by meta-training on partial modality data and meta-testing on limited full modality samples. Additionally, we co-supervise this feature enrichment by introducing an auxiliary adversarial learning branch. More specifically, a missing modality detector is used as a discriminator to mimic the full modality setting. Our segmentation framework significantly outperforms state-of-the-art brain tumor segmentation techniques in missing modality scenarios.
Collapse
Affiliation(s)
- Aishik Konwer
- Department of Computer Science, Stony Brook University
| | - Xiaoling Hu
- Department of Computer Science, Stony Brook University
| | - Joseph Bae
- Department of Biomedical Informatics, Stony Brook University
| | - Xuan Xu
- Department of Computer Science, Stony Brook University
| | - Chao Chen
- Department of Biomedical Informatics, Stony Brook University
| | | |
Collapse
|
26
|
Genc O, Morrison MA, Villanueva-Meyer J, Burns B, Hess CP, Banerjee S, Lupo JM. DeepSWI: Using Deep Learning to Enhance Susceptibility Contrast on T2*-Weighted MRI. J Magn Reson Imaging 2023; 58:1200-1210. [PMID: 36733222 PMCID: PMC10443940 DOI: 10.1002/jmri.28622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2022] [Revised: 01/19/2023] [Accepted: 01/20/2023] [Indexed: 02/04/2023] Open
Abstract
BACKGROUND Although susceptibility-weighted imaging (SWI) is the gold standard for visualizing cerebral microbleeds (CMBs) in the brain, the required phase data are not always available clinically. Having a postprocessing tool for generating SWI contrast from T2*-weighted magnitude images is therefore advantageous. PURPOSE To create synthetic SWI images from clinical T2*-weighted magnitude images using deep learning and evaluate the resulting images in terms of similarity to conventional SWI images and ability to detect radiation-associated CMBs. STUDY TYPE Retrospective. POPULATION A total of 145 adults (87 males/58 females; 43.9 years old) with radiation-associated CMBs were used to train (16,093 patches/121 patients), validate (484 patches/4 patients), and test (2420 patches/20 patients) our networks. FIELD STRENGTH/SEQUENCE 3D T2*-weighted, gradient-echo acquired at 3 T. ASSESSMENT Structural similarity index (SSIM), peak signal-to-noise-ratio (PSNR), normalized mean-squared-error (nMSE), CMB counts, and line profiles were compared among magnitude, original SWI, and synthetic SWI images. Three blinded raters (J.E.V.M., M.A.M., B.B. with 8, 6, and 4 years of experience, respectively) independently rated and classified test-set images. STATISTICAL TESTS Kruskal-Wallis and Wilcoxon signed-rank tests were used to compare SSIM, PSNR, nMSE, and CMB counts among magnitude, original SWI, and predicted synthetic SWI images. Intraclass correlation assessed interrater variability. P values <0.005 were considered statistically significant. RESULTS SSIM values of the predicted vs. original SWI (0.972, 0.995, 0.9864) were statistically significantly higher than those of the magnitude vs. original SWI (0.970, 0.994, 0.9861) for whole brain, vascular structures, and brain tissue regions, respectively; 67% (19/28) of CMBs detected on original SWI images were also detected on the predicted SWI, whereas only 10 (36%) were detected on magnitude images. Overall image quality was similar between the synthetic and original SWI images, with fewer artifacts on the former. CONCLUSIONS This study demonstrated that deep learning can increase the susceptibility contrast present in neurovasculature and CMBs on T2*-weighted magnitude images, without residual susceptibility-induced artifacts. This may be useful for more accurately estimating CMB burden from magnitude images alone. EVIDENCE LEVEL 3. TECHNICAL EFFICACY Stage 2.
Collapse
Affiliation(s)
- Ozan Genc
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, CA
- Boğaziçi University, Istanbul, Turkey
| | - Melanie A. Morrison
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, CA
| | - Javier Villanueva-Meyer
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, CA
- Department of Neurological Surgery, University of California, San Francisco, CA
| | | | - Christopher P. Hess
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, CA
- Department of Neurology, University of California, San Francisco, CA
| | | | - Janine M. Lupo
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, CA
- UCSF/UC Berkeley Graduate Group of Bioengineering, University of California, Berkeley and San Francisco, CA
| |
Collapse
|
27
|
Liu J, Pasumarthi S, Duffy B, Gong E, Datta K, Zaharchuk G. One Model to Synthesize Them All: Multi-Contrast Multi-Scale Transformer for Missing Data Imputation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:2577-2591. [PMID: 37030684 PMCID: PMC10543020 DOI: 10.1109/tmi.2023.3261707] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
Multi-contrast magnetic resonance imaging (MRI) is widely used in clinical practice as each contrast provides complementary information. However, the availability of each imaging contrast may vary amongst patients, which poses challenges to radiologists and automated image analysis algorithms. A general approach for tackling this problem is missing data imputation, which aims to synthesize the missing contrasts from existing ones. While several convolutional neural networks (CNN) based algorithms have been proposed, they suffer from the fundamental limitations of CNN models, such as the requirement for fixed numbers of input and output channels, the inability to capture long-range dependencies, and the lack of interpretability. In this work, we formulate missing data imputation as a sequence-to-sequence learning problem and propose a multi-contrast multi-scale Transformer (MMT), which can take any subset of input contrasts and synthesize those that are missing. MMT consists of a multi-scale Transformer encoder that builds hierarchical representations of inputs combined with a multi-scale Transformer decoder that generates the outputs in a coarse-to-fine fashion. The proposed multi-contrast Swin Transformer blocks can efficiently capture intra- and inter-contrast dependencies for accurate image synthesis. Moreover, MMT is inherently interpretable as it allows us to understand the importance of each input contrast in different regions by analyzing the in-built attention maps of Transformer blocks in the decoder. Extensive experiments on two large-scale multi-contrast MRI datasets demonstrate that MMT outperforms the state-of-the-art methods quantitatively and qualitatively.
Collapse
|
28
|
Mori S, Hirai R, Sakata Y, Tachibana Y, Koto M, Ishikawa H. Deep neural network-based synthetic image digital fluoroscopy using digitally reconstructed tomography. Phys Eng Sci Med 2023; 46:1227-1237. [PMID: 37349631 DOI: 10.1007/s13246-023-01290-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/06/2023] [Accepted: 06/16/2023] [Indexed: 06/24/2023]
Abstract
We developed a deep neural network (DNN) to generate X-ray flat panel detector (FPD) images from digitally reconstructed radiographic (DRR) images. FPD and treatment planning CT images were acquired from patients with prostate and head and neck (H&N) malignancies. The DNN parameters were optimized for FPD image synthesis. The synthetic FPD images' features were evaluated to compare to the corresponding ground-truth FPD images using mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM). The image quality of the synthetic FPD image was also compared with that of the DRR image to understand the performance of our DNN. For the prostate cases, the MAE of the synthetic FPD image was improved (= 0.12 ± 0.02) from that of the input DRR image (= 0.35 ± 0.08). The synthetic FPD image showed higher PSNRs (= 16.81 ± 1.54 dB) than those of the DRR image (= 8.74 ± 1.56 dB), while SSIMs for both images (= 0.69) were almost the same. All metrics for the synthetic FPD images of the H&N cases were improved (MAE 0.08 ± 0.03, PSNR 19.40 ± 2.83 dB, and SSIM 0.80 ± 0.04) compared to those for the DRR image (MAE 0.48 ± 0.11, PSNR 5.74 ± 1.63 dB, and SSIM 0.52 ± 0.09). Our DNN successfully generated FPD images from DRR images. This technique would be useful to increase throughput when images from two different modalities are compared by visual inspection.
Collapse
Affiliation(s)
- Shinichiro Mori
- National Institutes for Quantum Science and Technology, Quantum Life and Medical Science Directorate, Institute for Quantum Medical Science, Inage-ku, Chiba, 263-8555, Japan.
| | - Ryusuke Hirai
- Corporate Research and Development Center, Toshiba Corporation, Kanagawa, 212-8582, Japan
| | - Yukinobu Sakata
- Corporate Research and Development Center, Toshiba Corporation, Kanagawa, 212-8582, Japan
| | - Yasuhiko Tachibana
- National Institutes for Quantum Science and Technology, Quantum Life and Medical Science Directorate, Institute for Quantum Medical Science, Inage-ku, Chiba, 263-8555, Japan
| | - Masashi Koto
- QST hospital, National Institutes for Quantum Science and Technology, Inage-ku, Chiba, 263-8555, Japan
| | - Hitoshi Ishikawa
- QST hospital, National Institutes for Quantum Science and Technology, Inage-ku, Chiba, 263-8555, Japan
| |
Collapse
|
29
|
Liu X, Prince JL, Xing F, Zhuo J, Reese T, Stone M, El Fakhri G, Woo J. Attentive continuous generative self-training for unsupervised domain adaptive medical image translation. Med Image Anal 2023; 88:102851. [PMID: 37329854 PMCID: PMC10527936 DOI: 10.1016/j.media.2023.102851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2022] [Revised: 03/28/2023] [Accepted: 05/23/2023] [Indexed: 06/19/2023]
Abstract
Self-training is an important class of unsupervised domain adaptation (UDA) approaches that are used to mitigate the problem of domain shift, when applying knowledge learned from a labeled source domain to unlabeled and heterogeneous target domains. While self-training-based UDA has shown considerable promise on discriminative tasks, including classification and segmentation, through reliable pseudo-label filtering based on the maximum softmax probability, there is a paucity of prior work on self-training-based UDA for generative tasks, including image modality translation. To fill this gap, in this work, we seek to develop a generative self-training (GST) framework for domain adaptive image translation with continuous value prediction and regression objectives. Specifically, we quantify both aleatoric and epistemic uncertainties within our GST using variational Bayes learning to measure the reliability of synthesized data. We also introduce a self-attention scheme that de-emphasizes the background region to prevent it from dominating the training process. The adaptation is then carried out by an alternating optimization scheme with target domain supervision that focuses attention on the regions with reliable pseudo-labels. We evaluated our framework on two cross-scanner/center, inter-subject translation tasks, including tagged-to-cine magnetic resonance (MR) image translation and T1-weighted MR-to-fractional anisotropy translation. Extensive validations with unpaired target domain data showed that our GST yielded superior synthesis performance in comparison to adversarial training UDA methods.
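One common way to obtain a per-voxel aleatoric uncertainty for a continuous (regression-style) synthesis target is to have the network predict a mean and a log-variance and train with a Gaussian negative log-likelihood; the sketch below shows only that generic idea and is not the authors' variational-Bayes formulation (tensor shapes and the two-channel head are illustrative assumptions).

```python
import torch

def gaussian_nll(mean: torch.Tensor,
                 log_var: torch.Tensor,
                 target: torch.Tensor) -> torch.Tensor:
    """Per-voxel Gaussian negative log-likelihood.

    The predicted variance acts as a per-voxel aleatoric uncertainty:
    voxels the network is unsure about are down-weighted in the loss.
    """
    precision = torch.exp(-log_var)
    return torch.mean(0.5 * (precision * (target - mean) ** 2 + log_var))

# Example: a synthesis head that outputs two channels (mean, log-variance).
output = torch.randn(2, 2, 64, 64, requires_grad=True)  # stand-in network output
mean, log_var = output[:, :1], output[:, 1:]
target = torch.randn(2, 1, 64, 64)                       # stand-in target contrast
loss = gaussian_nll(mean, log_var, target)
loss.backward()
```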
Collapse
Affiliation(s)
- Xiaofeng Liu
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114, USA.
| | - Jerry L Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Fangxu Xing
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114, USA
| | - Jiachen Zhuo
- Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA
| | - Timothy Reese
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Maureen Stone
- Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA
| | - Georges El Fakhri
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114, USA
| | - Jonghye Woo
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114, USA
| |
Collapse
|
30
|
Kazerouni A, Aghdam EK, Heidari M, Azad R, Fayyaz M, Hacihaliloglu I, Merhof D. Diffusion models in medical imaging: A comprehensive survey. Med Image Anal 2023; 88:102846. [PMID: 37295311 DOI: 10.1016/j.media.2023.102846] [Citation(s) in RCA: 81] [Impact Index Per Article: 40.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2022] [Revised: 05/12/2023] [Accepted: 05/16/2023] [Indexed: 06/12/2023]
Abstract
Denoising diffusion models, a class of generative models, have garnered immense interest lately in various deep-learning problems. A diffusion probabilistic model defines a forward diffusion stage where the input data is gradually perturbed over several steps by adding Gaussian noise and then learns to reverse the diffusion process to retrieve the desired noise-free data from noisy data samples. Diffusion models are widely appreciated for their strong mode coverage and quality of the generated samples in spite of their known computational burdens. Capitalizing on the advances in computer vision, the field of medical imaging has also observed a growing interest in diffusion models. With the aim of helping the researcher navigate this profusion, this survey intends to provide a comprehensive overview of diffusion models in the discipline of medical imaging. Specifically, we start with an introduction to the solid theoretical foundation and fundamental concepts behind diffusion models and the three generic diffusion modeling frameworks, namely, diffusion probabilistic models, noise-conditioned score networks, and stochastic differential equations. Then, we provide a systematic taxonomy of diffusion models in the medical domain and propose a multi-perspective categorization based on their application, imaging modality, organ of interest, and algorithms. To this end, we cover extensive applications of diffusion models in the medical domain, including image-to-image translation, reconstruction, registration, classification, segmentation, denoising, 2/3D generation, anomaly detection, and other medically-related challenges. Furthermore, we emphasize the practical use case of some selected approaches, and then we discuss the limitations of the diffusion models in the medical domain and propose several directions to fulfill the demands of this field. Finally, we gather the overviewed studies with their available open-source implementations at our GitHub.1 We aim to update the relevant latest papers within it regularly.
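The forward diffusion stage summarised above has a closed form: with a variance schedule beta_t and alpha_bar_t the cumulative product of (1 - beta_s), a noisy sample can be drawn in one shot as x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, I). A minimal sketch of that sampling step (the linear schedule values are illustrative defaults, not tied to any specific paper in this list):

```python
import torch

def make_alpha_bar(num_steps: int = 1000,
                   beta_start: float = 1e-4,
                   beta_end: float = 2e-2) -> torch.Tensor:
    """Cumulative product of (1 - beta_t) for a linear variance schedule."""
    betas = torch.linspace(beta_start, beta_end, num_steps)
    return torch.cumprod(1.0 - betas, dim=0)

def q_sample(x0: torch.Tensor, t: int, alpha_bar: torch.Tensor) -> torch.Tensor:
    """Draw x_t ~ q(x_t | x_0) in one shot using the closed form."""
    noise = torch.randn_like(x0)
    return alpha_bar[t].sqrt() * x0 + (1.0 - alpha_bar[t]).sqrt() * noise

# Example: progressively perturb a clean image; a denoising network is then
# trained to reverse this process from noisy samples.
alpha_bar = make_alpha_bar()
x0 = torch.rand(1, 1, 64, 64)          # stand-in for a clean medical image
x_noisy = q_sample(x0, t=500, alpha_bar=alpha_bar)
```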
Collapse
Affiliation(s)
- Amirhossein Kazerouni
- School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
| | | | - Moein Heidari
- School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
| | - Reza Azad
- Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Aachen, Germany
| | | | - Ilker Hacihaliloglu
- Department of Radiology, University of British Columbia, Vancouver, Canada; Department of Medicine, University of British Columbia, Vancouver, Canada
| | - Dorit Merhof
- Faculty of Informatics and Data Science, University of Regensburg, Regensburg, Germany; Fraunhofer Institute for Digital Medicine MEVIS, Bremen, Germany.
| |
Collapse
|
31
|
Haase R, Pinetz T, Kobler E, Paech D, Effland A, Radbruch A, Deike-Hofmann K. Artificial Contrast: Deep Learning for Reducing Gadolinium-Based Contrast Agents in Neuroradiology. Invest Radiol 2023; 58:539-547. [PMID: 36822654 DOI: 10.1097/rli.0000000000000963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/25/2023]
Abstract
ABSTRACT Deep learning approaches are playing an ever-increasing role throughout diagnostic medicine, especially in neuroradiology, to solve a wide range of problems such as segmentation, synthesis of missing sequences, and image quality improvement. Of particular interest is their application in the reduction of gadolinium-based contrast agents, the administration of which has been under cautious reevaluation in recent years because of concerns about gadolinium deposition and its unclear long-term consequences. A growing number of studies are investigating the reduction (low-dose approach) or even complete substitution (zero-dose approach) of gadolinium-based contrast agents in diverse patient populations using a variety of deep learning methods. This work aims to highlight selected research and discusses the advantages and limitations of recent deep learning approaches, the challenges of assessing its output, and the progress toward clinical applicability distinguishing between the low-dose and zero-dose approach.
Collapse
Affiliation(s)
| | - Thomas Pinetz
- Institute of Applied Mathematics, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
| | - Erich Kobler
- From the Department of Neuroradiology, University Medical Center Bonn, Rheinische Friedrich-Wilhelms-Universität Bonn
| | | | - Alexander Effland
- Institute of Applied Mathematics, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
| | | | | |
Collapse
|
32
|
Jiao C, Ling D, Bian S, Vassantachart A, Cheng K, Mehta S, Lock D, Zhu Z, Feng M, Thomas H, Scholey JE, Sheng K, Fan Z, Yang W. Contrast-Enhanced Liver Magnetic Resonance Image Synthesis Using Gradient Regularized Multi-Modal Multi-Discrimination Sparse Attention Fusion GAN. Cancers (Basel) 2023; 15:3544. [PMID: 37509207 PMCID: PMC10377331 DOI: 10.3390/cancers15143544] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2023] [Revised: 07/03/2023] [Accepted: 07/05/2023] [Indexed: 07/30/2023] Open
Abstract
PURPOSE To provide abdominal contrast-enhanced MR image synthesis, we developed a gradient regularized multi-modal multi-discrimination sparse attention fusion generative adversarial network (GRMM-GAN) to avoid repeated contrast injections to patients and facilitate adaptive monitoring. METHODS With IRB approval, 165 abdominal MR studies from 61 liver cancer patients were retrospectively solicited from our institutional database. Each study included T2, T1 pre-contrast (T1pre), and T1 contrast-enhanced (T1ce) images. The GRMM-GAN synthesis pipeline consists of a sparse attention fusion network, an image gradient regularizer (GR), and a generative adversarial network with multi-discrimination. The studies were randomly divided into 115 for training, 20 for validation, and 30 for testing. The two pre-contrast MR modalities, T2 and T1pre images, were adopted as inputs in the training phase. The T1ce image at the portal venous phase was used as an output. The synthesized T1ce images were compared with the ground truth T1ce images. The evaluation metrics include peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and mean squared error (MSE). A Turing test and experts' contours evaluated the image synthesis quality. RESULTS The proposed GRMM-GAN model achieved a PSNR of 28.56, an SSIM of 0.869, and an MSE of 83.27. The proposed model showed statistically significant improvements in all metrics tested with p-values < 0.05 over the state-of-the-art model comparisons. The average Turing test score was 52.33%, which is close to random guessing, supporting the model's effectiveness for clinical application. In the tumor-specific region analysis, the average tumor contrast-to-noise ratio (CNR) of the synthesized MR images was not statistically significantly different from that of the real MR images. The average DICE from real vs. synthetic images was 0.90 compared to the inter-operator DICE of 0.91. CONCLUSION We demonstrated the function of a novel multi-modal MR image synthesis neural network GRMM-GAN for T1ce MR synthesis based on pre-contrast T1 and T2 MR images. GRMM-GAN shows promise for avoiding repeated contrast injections during radiation therapy treatment.
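PSNR, SSIM, and MSE as reported above are standard full-reference metrics; a minimal sketch of how they are typically computed for a synthesized versus ground-truth slice follows (using scikit-image; the random arrays are stand-ins, not study data).

```python
import numpy as np
from skimage.metrics import (mean_squared_error,
                             peak_signal_noise_ratio,
                             structural_similarity)

# Stand-ins for a ground-truth T1ce slice and its synthesized counterpart,
# both scaled to [0, 1].
rng = np.random.default_rng(0)
truth = rng.random((256, 256)).astype(np.float32)
synth = np.clip(truth + 0.05 * rng.standard_normal((256, 256)).astype(np.float32), 0, 1)

mse = mean_squared_error(truth, synth)
psnr = peak_signal_noise_ratio(truth, synth, data_range=1.0)
ssim = structural_similarity(truth, synth, data_range=1.0)
print(f"MSE={mse:.4f}  PSNR={psnr:.2f} dB  SSIM={ssim:.3f}")
```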
Collapse
Affiliation(s)
- Changzhe Jiao
- Department of Radiation Oncology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA (A.V.); (S.M.)
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143, USA
| | - Diane Ling
- Department of Radiation Oncology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA (A.V.); (S.M.)
| | - Shelly Bian
- Department of Radiation Oncology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA (A.V.); (S.M.)
| | - April Vassantachart
- Department of Radiation Oncology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA (A.V.); (S.M.)
| | - Karen Cheng
- Department of Radiation Oncology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA (A.V.); (S.M.)
| | - Shahil Mehta
- Department of Radiation Oncology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA (A.V.); (S.M.)
| | - Derrick Lock
- Department of Radiation Oncology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA (A.V.); (S.M.)
| | - Zhenyu Zhu
- Guangzhou Institute of Technology, Xidian University, Guangzhou 510555, China;
| | - Mary Feng
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143, USA
| | - Horatio Thomas
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143, USA
| | - Jessica E. Scholey
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143, USA
| | - Ke Sheng
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143, USA
| | - Zhaoyang Fan
- Department of Radiology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA
| | - Wensha Yang
- Department of Radiation Oncology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA (A.V.); (S.M.)
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143, USA
| |
Collapse
|
33
|
Zhu J, Chen X, Liu Y, Yang B, Wei R, Qin S, Yang Z, Hu Z, Dai J, Men K. Improving accelerated 3D imaging in MRI-guided radiotherapy for prostate cancer using a deep learning method. Radiat Oncol 2023; 18:108. [PMID: 37393282 DOI: 10.1186/s13014-023-02306-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2023] [Accepted: 06/21/2023] [Indexed: 07/03/2023] Open
Abstract
PURPOSE This study aimed to improve image quality for high-speed MR imaging using a deep learning method for online adaptive radiotherapy in prostate cancer. We then evaluated its benefits for image registration. METHODS Sixty pairs of 1.5 T MR images acquired with an MR-linac were enrolled. The data included low-speed, high-quality (LSHQ) and high-speed, low-quality (HSLQ) MR images. We proposed a CycleGAN, which is based on the data augmentation technique, to learn the mapping between the HSLQ and LSHQ images and then generate synthetic LSHQ (synLSHQ) images from the HSLQ images. Five-fold cross-validation was employed to test the CycleGAN model. The normalized mean absolute error (nMAE), peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and edge keeping index (EKI) were calculated to determine image quality. The Jacobian determinant value (JDV), Dice similarity coefficient (DSC), and mean distance to agreement (MDA) were used to analyze deformable registration. RESULTS Compared with the LSHQ, the proposed synLSHQ achieved comparable image quality and reduced imaging time by ~66%. Compared with the HSLQ, the synLSHQ had better image quality, with improvements of 57%, 3.4%, 26.9%, and 3.6% for nMAE, SSIM, PSNR, and EKI, respectively. Furthermore, the synLSHQ enhanced registration accuracy with a superior mean JDV (6%) and preferable DSC and MDA values compared with the HSLQ. CONCLUSION The proposed method can generate high-quality images from high-speed scanning sequences. As a result, it shows potential to shorten the scan time while ensuring the accuracy of radiotherapy.
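A CycleGAN mapping of this kind is trained with adversarial terms plus a cycle-consistency term that forces HSLQ -> synLSHQ -> HSLQ (and the reverse path) to return the original image; the sketch below shows only that cycle-consistency part, with tiny placeholder generators rather than the authors' architecture.

```python
import torch
import torch.nn as nn

# Placeholder generators: gen_G maps high-speed/low-quality (HSLQ) images to
# synthetic low-speed/high-quality (synLSHQ) images, gen_F maps back.
gen_G = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 1, 3, padding=1))
gen_F = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 1, 3, padding=1))
l1 = nn.L1Loss()

def cycle_consistency_loss(hslq: torch.Tensor, lshq: torch.Tensor,
                           lam: float = 10.0) -> torch.Tensor:
    """L1 penalty for mapping an image to the other domain and back again."""
    forward_cycle = l1(gen_F(gen_G(hslq)), hslq)   # HSLQ -> synLSHQ -> HSLQ
    backward_cycle = l1(gen_G(gen_F(lshq)), lshq)  # LSHQ -> synHSLQ -> LSHQ
    return lam * (forward_cycle + backward_cycle)

# Example with random tensors standing in for unpaired training batches; in
# practice this term is added to the adversarial losses of both domains.
hslq = torch.rand(2, 1, 64, 64)
lshq = torch.rand(2, 1, 64, 64)
loss = cycle_consistency_loss(hslq, lshq)
loss.backward()
```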
Collapse
Affiliation(s)
- Ji Zhu
- National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
| | - Xinyuan Chen
- National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
| | - Yuxiang Liu
- National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- School of Physics and Technology, Wuhan University, Wuhan, 430072, China
| | - Bining Yang
- National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
| | - Ran Wei
- National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
| | - Shirui Qin
- National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
| | - Zhuanbo Yang
- National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
| | - Zhihui Hu
- National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
| | - Jianrong Dai
- National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
| | - Kuo Men
- National Cancer Center, National Clinical Research Center for Cancer, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China.
| |
Collapse
|
34
|
Li Z, Wang Y, Zhu Y, Xu J, Wei J, Xie J, Zhang J. Modality-based attention and dual-stream multiple instance convolutional neural network for predicting microvascular invasion of hepatocellular carcinoma. Front Oncol 2023; 13:1195110. [PMID: 37434971 PMCID: PMC10331018 DOI: 10.3389/fonc.2023.1195110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2023] [Accepted: 05/30/2023] [Indexed: 07/13/2023] Open
Abstract
Background and purpose The presence of microvascular invasion (MVI) is a crucial indicator of postoperative recurrence in patients with hepatocellular carcinoma (HCC). Detecting MVI before surgery can improve personalized surgical planning and enhance patient survival. However, existing automatic diagnosis methods for MVI have certain limitations. Some methods only analyze information from a single slice and overlook the context of the entire lesion, while others require high computational resources to process the entire tumor with a three-dimensional (3D) convolutional neural network (CNN), which can be challenging to train. To address these limitations, this paper proposes a modality-based attention and dual-stream multiple instance learning (MIL) CNN. Materials and methods In this retrospective study, 283 patients with histologically confirmed HCC who underwent surgical resection between April 2017 and September 2019 were included. Five magnetic resonance (MR) modalities, including T2-weighted, arterial phase, venous phase, delay phase, and apparent diffusion coefficient images, were acquired for each patient. First, each two-dimensional (2D) slice of the HCC magnetic resonance image (MRI) was converted into an instance embedding. Second, a modality attention module was designed to emulate the decision-making process of doctors and help the model focus on the important MRI sequences. Third, the instance embeddings of the 3D scans were aggregated into a bag embedding by a dual-stream MIL aggregator, in which the critical slices were given greater consideration. The dataset was split into a training set and a testing set in a 4:1 ratio, and model performance was evaluated using five-fold cross-validation. Results Using the proposed method, the prediction of MVI achieved an accuracy of 76.43% and an AUC of 74.22%, significantly surpassing the performance of the baseline methods. Conclusion Our modality-based attention and dual-stream MIL CNN can achieve outstanding results for MVI prediction.
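MIL aggregation of this kind turns per-slice instance embeddings into a single bag (patient-level) embedding while weighting critical slices more heavily; the sketch below shows a standard attention-based MIL pooling layer in that spirit, with dimensions and module names as illustrative assumptions rather than the paper's exact dual-stream aggregator.

```python
import torch
import torch.nn as nn

class AttentionMILPooling(nn.Module):
    """Aggregate N instance embeddings into one bag embedding.

    Each slice embedding receives a learned attention weight, so critical
    slices contribute more to the bag-level (patient-level) representation.
    """

    def __init__(self, embed_dim: int = 256, hidden_dim: int = 128):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, instances: torch.Tensor) -> torch.Tensor:
        # instances: (num_slices, embed_dim) for one patient.
        scores = self.attention(instances)        # (num_slices, 1)
        weights = torch.softmax(scores, dim=0)    # weights sum to 1 over slices
        return (weights * instances).sum(dim=0)   # (embed_dim,)

# Example: 24 slice embeddings -> one bag embedding -> binary MVI logit.
pool = AttentionMILPooling()
classifier = nn.Linear(256, 1)
slice_embeddings = torch.randn(24, 256)
mvi_logit = classifier(pool(slice_embeddings))
```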
Collapse
Affiliation(s)
- Zhi Li
- School of Medicine, Shanghai University, Shanghai, China
- Shanghai Universal Medical Imaging Diagnostic Center, Shanghai University, Shanghai, China
| | - Yutao Wang
- The First Affiliated Hospital of Ningbo University, Ningbo, China
| | - Yuzhao Zhu
- Shanghai Universal Medical Imaging Diagnostic Center, Shanghai University, Shanghai, China
| | - Jiafeng Xu
- Shanghai Universal Medical Imaging Diagnostic Center, Shanghai University, Shanghai, China
| | - Jinzhu Wei
- School of Medicine, Shanghai University, Shanghai, China
| | - Jiang Xie
- School of Computer Engineering and Science, Shanghai University, Shanghai, China
| | - Jian Zhang
- Shanghai Universal Medical Imaging Diagnostic Center, Shanghai University, Shanghai, China
| |
Collapse
|
35
|
Jin D, Zheng H, Yuan H. Exploring the Possibility of Measuring Vertebrae Bone Structure Metrics Using MDCT Images: An Unpaired Image-to-Image Translation Method. Bioengineering (Basel) 2023; 10:716. [PMID: 37370647 DOI: 10.3390/bioengineering10060716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2023] [Revised: 06/05/2023] [Accepted: 06/08/2023] [Indexed: 06/29/2023] Open
Abstract
Bone structure metrics are vital for the evaluation of vertebral bone strength. However, the gold standard for measuring bone structure metrics, micro-Computed Tomography (micro-CT), cannot be used in vivo, which hinders the early diagnosis of fragility fractures. This paper used an unpaired image-to-image translation method to capture the mapping between clinical multidetector computed tomography (MDCT) and micro-CT images and then generated micro-CT-like images to measure bone structure metrics. MDCT and micro-CT images were scanned from 75 human lumbar spine specimens and formed training and testing sets. The generator in the model focused on learning both the structure and detailed pattern of bone trabeculae and generating micro-CT-like images, and the discriminator determined whether the generated images were micro-CT images or not. Based on similarity metrics (i.e., SSIM and FID) and bone structure metrics (i.e., bone volume fraction, trabecular separation and trabecular thickness), a set of comparisons were performed. The results show that the proposed method can perform better in terms of both similarity metrics and bone structure metrics and the improvement is statistically significant. In particular, we compared the proposed method with the paired image-to-image method and analyzed the pros and cons of the method used.
Collapse
Affiliation(s)
- Dan Jin
- Department of Radiology, Peking University Third Hospital, Beijing 100191, China
| | - Han Zheng
- School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China
| | - Huishu Yuan
- Department of Radiology, Peking University Third Hospital, Beijing 100191, China
| |
Collapse
|
36
|
Liu X, Prince JL, Xing F, Zhuo J, Reese T, Stone M, El Fakhri G, Woo J. Attentive Continuous Generative Self-training for Unsupervised Domain Adaptive Medical Image Translation. ARXIV 2023:arXiv:2305.14589v1. [PMID: 37292465 PMCID: PMC10246114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Self-training is an important class of unsupervised domain adaptation (UDA) approaches that are used to mitigate the problem of domain shift, when applying knowledge learned from a labeled source domain to unlabeled and heterogeneous target domains. While self-training-based UDA has shown considerable promise on discriminative tasks, including classification and segmentation, through reliable pseudo-label filtering based on the maximum softmax probability, there is a paucity of prior work on self-training-based UDA for generative tasks, including image modality translation. To fill this gap, in this work, we seek to develop a generative self-training (GST) framework for domain adaptive image translation with continuous value prediction and regression objectives. Specifically, we quantify both aleatoric and epistemic uncertainties within our GST using variational Bayes learning to measure the reliability of synthesized data. We also introduce a self-attention scheme that de-emphasizes the background region to prevent it from dominating the training process. The adaptation is then carried out by an alternating optimization scheme with target domain supervision that focuses attention on the regions with reliable pseudo-labels. We evaluated our framework on two cross-scanner/center, inter-subject translation tasks, including tagged-to-cine magnetic resonance (MR) image translation and T1-weighted MR-to-fractional anisotropy translation. Extensive validations with unpaired target domain data showed that our GST yielded superior synthesis performance in comparison to adversarial training UDA methods.
Collapse
Affiliation(s)
- Xiaofeng Liu
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114
| | - Jerry L Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Fangxu Xing
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114
| | - Jiachen Zhuo
- Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA
| | - Timothy Reese
- Athinoula A. Martinos Center for Biomedical Imaging, Dept. of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Maureen Stone
- Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA
| | - Georges El Fakhri
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114
| | - Jonghye Woo
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114
| |
Collapse
|
37
|
Billot B, Greve DN, Puonti O, Thielscher A, Van Leemput K, Fischl B, Dalca AV, Iglesias JE. SynthSeg: Segmentation of brain MRI scans of any contrast and resolution without retraining. Med Image Anal 2023; 86:102789. [PMID: 36857946 PMCID: PMC10154424 DOI: 10.1016/j.media.2023.102789] [Citation(s) in RCA: 115] [Impact Index Per Article: 57.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2022] [Revised: 01/20/2023] [Accepted: 02/22/2023] [Indexed: 03/03/2023]
Abstract
Despite advances in data augmentation and transfer learning, convolutional neural networks (CNNs) difficultly generalise to unseen domains. When segmenting brain scans, CNNs are highly sensitive to changes in resolution and contrast: even within the same MRI modality, performance can decrease across datasets. Here we introduce SynthSeg, the first segmentation CNN robust against changes in contrast and resolution. SynthSeg is trained with synthetic data sampled from a generative model conditioned on segmentations. Crucially, we adopt a domain randomisation strategy where we fully randomise the contrast and resolution of the synthetic training data. Consequently, SynthSeg can segment real scans from a wide range of target domains without retraining or fine-tuning, which enables straightforward analysis of huge amounts of heterogeneous clinical data. Because SynthSeg only requires segmentations to be trained (no images), it can learn from labels obtained by automated methods on diverse populations (e.g., ageing and diseased), thus achieving robustness to a wide range of morphological variability. We demonstrate SynthSeg on 5,000 scans of six modalities (including CT) and ten resolutions, where it exhibits unparallelled generalisation compared with supervised CNNs, state-of-the-art domain adaptation, and Bayesian segmentation. Finally, we demonstrate the generalisability of SynthSeg by applying it to cardiac MRI and CT scans.
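The generative model described above produces training images directly from label maps by randomising contrast; the core idea can be illustrated by giving every label a Gaussian intensity distribution with randomly drawn mean and standard deviation. This is a simplified sketch of domain randomisation, not the full SynthSeg generator, which also randomises resolution, bias fields, and artefacts.

```python
import numpy as np

def synth_image_from_labels(label_map: np.ndarray, num_labels: int,
                            rng: np.random.Generator) -> np.ndarray:
    """Render a synthetic image from a segmentation by giving every label a
    random Gaussian intensity distribution, so contrast differs on every draw."""
    image = np.zeros(label_map.shape, dtype=np.float32)
    for label in range(num_labels):
        mean = rng.uniform(0.0, 1.0)    # random mean intensity for this label
        std = rng.uniform(0.01, 0.1)    # random within-label variability
        mask = label_map == label
        image[mask] = rng.normal(mean, std, size=mask.sum())
    return image

# Example: the same label map yields a different-looking training image on
# each call, which is what makes the downstream segmentation CNN robust to
# changes in contrast.
rng = np.random.default_rng(0)
labels = rng.integers(0, 4, size=(96, 96, 96))   # stand-in segmentation
sample_1 = synth_image_from_labels(labels, num_labels=4, rng=rng)
sample_2 = synth_image_from_labels(labels, num_labels=4, rng=rng)
```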
Collapse
Affiliation(s)
- Benjamin Billot
- Centre for Medical Image Computing, University College London, UK.
| | - Douglas N Greve
- Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, USA
| | - Oula Puonti
- Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital, Denmark
| | - Axel Thielscher
- Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital, Denmark; Department of Health Technology, Technical University of, Denmark
| | - Koen Van Leemput
- Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, USA; Department of Health Technology, Technical University of, Denmark
| | - Bruce Fischl
- Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, USA; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, USA; Program in Health Sciences and Technology, Massachusetts Institute of Technology, USA
| | - Adrian V Dalca
- Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, USA; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, USA
| | - Juan Eugenio Iglesias
- Centre for Medical Image Computing, University College London, UK; Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, USA; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, USA
| |
Collapse
|
38
|
Touati R, Kadoury S. A least square generative network based on invariant contrastive feature pair learning for multimodal MR image synthesis. Int J Comput Assist Radiol Surg 2023:10.1007/s11548-023-02916-z. [PMID: 37103727 DOI: 10.1007/s11548-023-02916-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2023] [Accepted: 04/12/2023] [Indexed: 04/28/2023]
Abstract
PURPOSE During MR-guided neurosurgical procedures, several factors may limit the acquisition of additional MR sequences, which are needed by neurosurgeons to adjust surgical plans or ensure complete tumor resection. Automatically synthesized MR contrasts generated from other available heterogeneous MR sequences could alleviate timing constraints. METHODS We propose a new multimodal MR synthesis approach leveraging a combination of MR modalities presenting glioblastomas to generate an additional modality. The proposed learning approach relies on a least square GAN (LSGAN) using an unsupervised contrastive learning strategy. We incorporate a contrastive encoder, which extracts an invariant contrastive representation from augmented pairs of the generated and real target MR contrasts. This contrastive representation describes a pair of features for each input channel, allowing to regularize the generator to be invariant to the high-frequency orientations. Moreover, when training the generator, we impose on the LSGAN loss another term reformulated as the combination of a reconstruction and a novel perception loss based on a pair of features. RESULTS When compared to other multimodal MR synthesis approaches evaluated on the BraTS'18 brain dataset, the model yields the highest Dice score with [Formula: see text] and achieves the lowest variability information of [Formula: see text], with a probability rand index score of [Formula: see text] and a global consistency error of [Formula: see text]. CONCLUSION The proposed model allows to generate reliable MR contrasts with enhanced tumors on the synthesized image using a brain tumor dataset (BraTS'18). In future work, we will perform a clinical evaluation of residual tumor segmentations during MR-guided neurosurgeries, where limited MR contrasts will be acquired during the procedure.
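The least square GAN (LSGAN) objective referenced above replaces the usual cross-entropy adversarial loss with squared errors against real/fake target values; a minimal sketch of the discriminator and generator terms follows, using the common 0/1 target convention and stand-in discriminator scores rather than the paper's networks.

```python
import torch
import torch.nn as nn

mse = nn.MSELoss()

def lsgan_discriminator_loss(d_real: torch.Tensor, d_fake: torch.Tensor) -> torch.Tensor:
    """Push discriminator outputs to 1 for real images and 0 for synthesized ones."""
    return 0.5 * (mse(d_real, torch.ones_like(d_real)) +
                  mse(d_fake, torch.zeros_like(d_fake)))

def lsgan_generator_loss(d_fake: torch.Tensor) -> torch.Tensor:
    """Push the discriminator's score on synthesized images towards 1."""
    return 0.5 * mse(d_fake, torch.ones_like(d_fake))

# Example with stand-in discriminator scores for a batch of patches:
# d_real = D(real target contrast), d_fake = D(G(source contrasts)).
d_real = torch.rand(8, 1)
d_fake = torch.rand(8, 1)
print(lsgan_discriminator_loss(d_real, d_fake).item())
print(lsgan_generator_loss(d_fake).item())
```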
Collapse
Affiliation(s)
- Redha Touati
- Polytechnique Montréal, Montreal, QC, H3T 1J4, Canada.
| | - Samuel Kadoury
- Polytechnique Montréal, Montreal, QC, H3T 1J4, Canada
- CHUM, Université de Montréal, Montreal, H2X 0A9, Canada
| |
Collapse
|
39
|
Xia Y, Ravikumar N, Lassila T, Frangi AF. Virtual high-resolution MR angiography from non-angiographic multi-contrast MRIs: synthetic vascular model populations for in-silico trials. Med Image Anal 2023; 87:102814. [PMID: 37196537 DOI: 10.1016/j.media.2023.102814] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2022] [Revised: 04/04/2023] [Accepted: 04/08/2023] [Indexed: 05/19/2023]
Abstract
Despite success on multi-contrast MR image synthesis, generating specific modalities remains challenging. Those include Magnetic Resonance Angiography (MRA) that highlights details of vascular anatomy using specialised imaging sequences for emphasising inflow effect. This work proposes an end-to-end generative adversarial network that can synthesise anatomically plausible, high-resolution 3D MRA images using commonly acquired multi-contrast MR images (e.g. T1/T2/PD-weighted MR images) for the same subject whilst preserving the continuity of vascular anatomy. A reliable technique for MRA synthesis would unleash the research potential of very few population databases with imaging modalities (such as MRA) that enable quantitative characterisation of whole-brain vasculature. Our work is motivated by the need to generate digital twins and virtual patients of cerebrovascular anatomy for in-silico studies and/or in-silico trials. We propose a dedicated generator and discriminator that leverage the shared and complementary features of multi-source images. We design a composite loss function for emphasising vascular properties by minimising the statistical difference between the feature representations of the target images and the synthesised outputs in both 3D volumetric and 2D projection domains. Experimental results show that the proposed method can synthesise high-quality MRA images and outperform the state-of-the-art generative models both qualitatively and quantitatively. The importance assessment reveals that T2 and PD-weighted images are better predictors of MRA images than T1; and PD-weighted images contribute to better visibility of small vessel branches towards the peripheral regions. In addition, the proposed approach can generalise to unseen data acquired at different imaging centres with different scanners, whilst synthesising MRAs and vascular geometries that maintain vessel continuity. The results show the potential for use of the proposed approach to generating digital twin cohorts of cerebrovascular anatomy at scale from structural MR images typically acquired in population imaging initiatives.
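One simple way to realise a 2D projection-domain term of the kind described above is to compare maximum-intensity projections (MIPs) of the synthesized and target MRA volumes, since MIPs emphasise bright, continuous vessels; the MIP operator here is an illustrative assumption, not the paper's exact composite loss.

```python
import torch

def mip_projection_loss(pred_vol: torch.Tensor, target_vol: torch.Tensor) -> torch.Tensor:
    """L1 distance between axial maximum-intensity projections of two volumes.

    pred_vol/target_vol: (B, C, D, H, W) MRA volumes; the MIP collapses the
    slice dimension so bright, continuous vessels dominate the comparison.
    """
    pred_mip = pred_vol.amax(dim=2)
    target_mip = target_vol.amax(dim=2)
    return torch.mean(torch.abs(pred_mip - target_mip))

# Example: combine a 3D volumetric L1 term with the projection-domain term,
# mirroring the idea of a composite volumetric + projection objective.
pred = torch.rand(1, 1, 32, 96, 96, requires_grad=True)
target = torch.rand(1, 1, 32, 96, 96)
loss = torch.mean(torch.abs(pred - target)) + mip_projection_loss(pred, target)
loss.backward()
```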
Collapse
Affiliation(s)
- Yan Xia
- Centre for Computational Imaging and Simulation Technologies in Biomedicine (CISTIB), School of Computing, University of Leeds, Leeds, UK.
| | - Nishant Ravikumar
- Centre for Computational Imaging and Simulation Technologies in Biomedicine (CISTIB), School of Computing, University of Leeds, Leeds, UK
| | - Toni Lassila
- Centre for Computational Imaging and Simulation Technologies in Biomedicine (CISTIB), School of Computing, University of Leeds, Leeds, UK
| | - Alejandro F Frangi
- Centre for Computational Imaging and Simulation Technologies in Biomedicine (CISTIB), School of Computing, University of Leeds, Leeds, UK; Leeds Institute for Cardiovascular and Metabolic Medicine (LICAMM), School of Medicine, University of Leeds, Leeds, UK; Medical Imaging Research Center (MIRC), Cardiovascular Science and Electronic Engineering Departments, KU Leuven, Leuven, Belgium; Alan Turing Institute, London, UK
| |
Collapse
|
40
|
Zhou T, Cheng Q, Lu H, Li Q, Zhang X, Qiu S. Deep learning methods for medical image fusion: A review. Comput Biol Med 2023; 160:106959. [PMID: 37141652 DOI: 10.1016/j.compbiomed.2023.106959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2022] [Revised: 04/12/2023] [Accepted: 04/17/2023] [Indexed: 05/06/2023]
Abstract
Image fusion methods based on deep learning have become a research hotspot in computer vision in recent years. This paper reviews these methods from five aspects. First, the principles and advantages of deep learning-based image fusion methods are expounded. Second, the methods are summarized as end-to-end and non-end-to-end approaches: according to the task deep learning performs in the feature-processing stage, non-end-to-end methods are divided into deep learning for decision mapping and deep learning for feature extraction, while end-to-end methods are divided, according to network type, into fusion methods based on Convolutional Neural Networks, Generative Adversarial Networks, and Encoder-Decoder Networks. Third, the application of deep learning-based image fusion in the medical imaging field is summarized in terms of methods and datasets. Fourth, evaluation metrics commonly used in medical image fusion are organized into 14 categories. Fifth, the main challenges facing medical image fusion are discussed with respect to datasets and fusion methods, and future development directions are outlined. This paper systematically summarizes deep learning-based image fusion methods and provides useful guidance for the in-depth study of multimodal medical images.
Collapse
Affiliation(s)
- Tao Zhou
- School of Computer Science and Engineering, North Minzu University, Yinchuan, 750021, China; Key Laboratory of Image and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan, 750021, China
| | - QianRu Cheng
- School of Computer Science and Engineering, North Minzu University, Yinchuan, 750021, China; Key Laboratory of Image and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan, 750021, China.
| | - HuiLing Lu
- School of Science, Ningxia Medical University, Yinchuan, 750004, China.
| | - Qi Li
- School of Computer Science and Engineering, North Minzu University, Yinchuan, 750021, China; Key Laboratory of Image and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan, 750021, China
| | - XiangXiang Zhang
- School of Computer Science and Engineering, North Minzu University, Yinchuan, 750021, China; Key Laboratory of Image and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan, 750021, China
| | - Shi Qiu
- Key Laboratory of Spectral Imaging Technology CAS, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an, 710119, China
| |
Collapse
|
41
|
Feature generation and multi-sequence fusion based deep convolutional network for breast tumor diagnosis with missing MR sequences. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022]
|
42
|
Ouyang C, Chen C, Li S, Li Z, Qin C, Bai W, Rueckert D. Causality-Inspired Single-Source Domain Generalization for Medical Image Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:1095-1106. [PMID: 36417741 DOI: 10.1109/tmi.2022.3224067] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Deep learning models usually suffer from the domain shift issue, where models trained on one source domain do not generalize well to other unseen domains. In this work, we investigate the single-source domain generalization problem: training a deep network that is robust to unseen domains, under the condition that training data are only available from one source domain, which is common in medical imaging applications. We tackle this problem in the context of cross-domain medical image segmentation. In this scenario, domain shifts are mainly caused by different acquisition processes. We propose a simple causality-inspired data augmentation approach to expose a segmentation model to synthesized domain-shifted training examples. Specifically, 1) to make the deep model robust to discrepancies in image intensities and textures, we employ a family of randomly-weighted shallow networks. They augment training images using diverse appearance transformations. 2) Further we show that spurious correlations among objects in an image are detrimental to domain robustness. These correlations might be taken by the network as domain-specific clues for making predictions, and they may break on unseen domains. We remove these spurious correlations via causal intervention. This is achieved by resampling the appearances of potentially correlated objects independently. The proposed approach is validated on three cross-domain segmentation scenarios: cross-modality (CT-MRI) abdominal image segmentation, cross-sequence (bSSFP-LGE) cardiac MRI segmentation, and cross-site prostate MRI segmentation. The proposed approach yields consistent performance gains compared with competitive methods when tested on unseen domains.
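The appearance-transformation idea can be illustrated with a shallow convolutional network whose weights are re-randomised at every call, so that intensities and textures change while shapes are preserved; the layer widths, 1x1 kernels, and blending weight below are illustrative assumptions rather than the paper's exact augmentation.

    import torch
    import torch.nn as nn

    def random_appearance_transform(img, hidden=8, n_layers=3, alpha=0.5):
        # img: (batch, channels, H, W) image tensor.
        layers, in_ch = [], img.shape[1]
        for i in range(n_layers):
            out_ch = img.shape[1] if i == n_layers - 1 else hidden
            layers += [nn.Conv2d(in_ch, out_ch, kernel_size=1), nn.LeakyReLU(0.2)]
            in_ch = out_ch
        net = nn.Sequential(*layers)   # freshly randomised weights every call
        with torch.no_grad():
            aug = net(img)
            aug = (aug - aug.mean()) / (aug.std() + 1e-6)  # keep intensities bounded
        return alpha * aug + (1 - alpha) * img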
Collapse
|
43
|
Liu Z, Wolfe S, Yu Z, Laforest R, Mhlanga JC, Fraum TJ, Itani M, Dehdashti F, Siegel BA, Jha AK. Observer-study-based approaches to quantitatively evaluate the realism of synthetic medical images. Phys Med Biol 2023; 68:10.1088/1361-6560/acc0ce. [PMID: 36863028 PMCID: PMC10411234 DOI: 10.1088/1361-6560/acc0ce] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2022] [Accepted: 03/02/2023] [Indexed: 03/04/2023]
Abstract
Objective. Synthetic images generated by simulation studies have a well-recognized role in developing and evaluating imaging systems and methods. However, for clinically relevant development and evaluation, the synthetic images must be clinically realistic and, ideally, have the same distribution as that of clinical images. Thus, mechanisms that can quantitatively evaluate this clinical realism and, ideally, the similarity in distributions of the real and synthetic images, are much needed. Approach. We investigated two observer-study-based approaches to quantitatively evaluate the clinical realism of synthetic images. In the first approach, we presented a theoretical formalism for the use of an ideal-observer study to quantitatively evaluate the similarity in distributions between the real and synthetic images. This theoretical formalism provides a direct relationship between the area under the receiver operating characteristic curve (AUC) for an ideal observer and the distributions of real and synthetic images. The second approach is based on the use of expert-human-observer studies to quantitatively evaluate the realism of synthetic images. In this approach, we developed web-based software to conduct two-alternative forced-choice (2-AFC) experiments with expert human observers. The usability of this software was evaluated by conducting a system usability scale (SUS) survey with seven expert human readers and five observer-study designers. Further, we demonstrated the application of this software to evaluate a stochastic and physics-based image-synthesis technique for oncologic positron emission tomography (PET). In this evaluation, the 2-AFC study with our software was performed by six expert human readers, who were highly experienced in reading PET scans, with years of expertise ranging from 7 to 40 years (median: 12 years, average: 20.4 years). Main results. In the ideal-observer-study-based approach, we theoretically demonstrated that the AUC for an ideal observer can be expressed, to an excellent approximation, by the Bhattacharyya distance between the distributions of the real and synthetic images. This relationship shows that a decrease in the ideal-observer AUC indicates a decrease in the distance between the two image distributions. Moreover, a lower bound of ideal-observer AUC = 0.5 implies that the distributions of synthetic and real images exactly match. For the expert-human-observer-study-based approach, our software for performing the 2-AFC experiments is available at https://apps.mir.wustl.edu/twoafc. Results from the SUS survey demonstrate that the web application is very user friendly and accessible. As a secondary finding, evaluation of a stochastic and physics-based PET image-synthesis technique using our software showed that expert human readers had limited ability to distinguish the real images from the synthetic images. Significance. This work addresses the important need for mechanisms to quantitatively evaluate the clinical realism of synthetic images. The mathematical treatment in this paper shows that quantifying the similarity in the distribution of real and synthetic images is theoretically possible by using an ideal-observer-study-based approach. Our developed software provides a platform for designing and performing 2-AFC experiments with human observers in a highly accessible, efficient, and secure manner. Additionally, our results on the evaluation of the stochastic and physics-based image-synthesis technique motivate the application of this technique to develop and evaluate a wide array of PET imaging methods.
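For context, the Bhattacharyya distance referred to above is conventionally defined, for the real-image distribution p_r and the synthetic-image distribution p_s, as

    D_B(p_r, p_s) = -\ln \int \sqrt{p_r(\mathbf{x})\, p_s(\mathbf{x})}\, d\mathbf{x},

which is zero exactly when the two distributions coincide; this is consistent with the abstract's statement that an ideal-observer AUC of 0.5 corresponds to matching real and synthetic image distributions. The precise approximation relating the AUC to this distance is given in the paper itself and is not reproduced here.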
Collapse
Affiliation(s)
- Ziping Liu
- Department of Biomedical Engineering, Washington University, St. Louis, MO 63130, United States of America
| | - Scott Wolfe
- Mallinckrodt Institute of Radiology, Washington University School of Medicine, St. Louis, MO 63110, United States of America
| | - Zitong Yu
- Department of Biomedical Engineering, Washington University, St. Louis, MO 63130, United States of America
| | - Richard Laforest
- Mallinckrodt Institute of Radiology, Washington University School of Medicine, St. Louis, MO 63110, United States of America
- Alvin J. Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO 63110, United States of America
| | - Joyce C Mhlanga
- Mallinckrodt Institute of Radiology, Washington University School of Medicine, St. Louis, MO 63110, United States of America
| | - Tyler J Fraum
- Mallinckrodt Institute of Radiology, Washington University School of Medicine, St. Louis, MO 63110, United States of America
- Alvin J. Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO 63110, United States of America
| | - Malak Itani
- Mallinckrodt Institute of Radiology, Washington University School of Medicine, St. Louis, MO 63110, United States of America
| | - Farrokh Dehdashti
- Mallinckrodt Institute of Radiology, Washington University School of Medicine, St. Louis, MO 63110, United States of America
- Alvin J. Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO 63110, United States of America
| | - Barry A Siegel
- Mallinckrodt Institute of Radiology, Washington University School of Medicine, St. Louis, MO 63110, United States of America
- Alvin J. Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO 63110, United States of America
| | - Abhinav K Jha
- Department of Biomedical Engineering, Washington University, St. Louis, MO 63130, United States of America
- Mallinckrodt Institute of Radiology, Washington University School of Medicine, St. Louis, MO 63110, United States of America
- Alvin J. Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO 63110, United States of America
| |
Collapse
|
44
|
Zhou T, Ruan S, Hu H. A literature survey of MR-based brain tumor segmentation with missing modalities. Comput Med Imaging Graph 2023; 104:102167. [PMID: 36584536 DOI: 10.1016/j.compmedimag.2022.102167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2022] [Revised: 11/01/2022] [Accepted: 12/22/2022] [Indexed: 12/28/2022]
Abstract
Multimodal MR brain tumor segmentation is one of the most actively studied problems in medical image processing. However, acquiring the complete set of MR modalities is not always possible in clinical practice, due to acquisition protocols, image corruption, scanner availability, scanning cost or allergies to certain contrast materials. The missing information can hinder brain tumor diagnosis, monitoring, treatment planning and prognosis. Thus, it is highly desirable to develop brain tumor segmentation methods that address the missing-modality problem. Based on recent advancements, in this review we provide a detailed analysis of the missing-modality issue in MR-based brain tumor segmentation. First, we briefly introduce the biomedical background concerning brain tumors, MR imaging techniques, and the current challenges in brain tumor segmentation. Then, we provide a taxonomy of state-of-the-art methods with five categories: image synthesis-based, latent feature space-based, multi-source correlation-based, knowledge distillation-based, and domain adaptation-based methods. In addition, the principles, architectures, benefits and limitations of each method are elaborated. Following that, the corresponding datasets and widely used evaluation metrics are described. Finally, we analyze the current challenges and provide a prospect for future development trends. This review aims to provide readers with a thorough knowledge of the recent contributions in the field of brain tumor segmentation with missing modalities and to suggest potential future directions.
Collapse
Affiliation(s)
- Tongxue Zhou
- School of Information Science and Technology, Hangzhou Normal University, Hangzhou 311121, China
| | - Su Ruan
- Université de Rouen Normandie, LITIS - QuantIF, Rouen 76183, France
| | - Haigen Hu
- College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China; Key Laboratory of Visual Media Intelligent Processing Technology of Zhejiang Province, Hangzhou 310023, China.
| |
Collapse
|
45
|
Iglesias JE, Billot B, Balbastre Y, Magdamo C, Arnold SE, Das S, Edlow BL, Alexander DC, Golland P, Fischl B. SynthSR: A public AI tool to turn heterogeneous clinical brain scans into high-resolution T1-weighted images for 3D morphometry. SCIENCE ADVANCES 2023; 9:eadd3607. [PMID: 36724222 PMCID: PMC9891693 DOI: 10.1126/sciadv.add3607] [Citation(s) in RCA: 41] [Impact Index Per Article: 20.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/07/2022] [Accepted: 01/04/2023] [Indexed: 05/10/2023]
Abstract
Every year, millions of brain magnetic resonance imaging (MRI) scans are acquired in hospitals across the world. These have the potential to revolutionize our understanding of many neurological diseases, but their morphometric analysis has not yet been possible due to their anisotropic resolution. We present an artificial intelligence technique, "SynthSR," that takes clinical brain MRI scans with any MR contrast (T1, T2, etc.), orientation (axial/coronal/sagittal), and resolution and turns them into high-resolution T1 scans that are usable by virtually all existing human neuroimaging tools. We present results on segmentation, registration, and atlasing of >10,000 scans of controls and patients with brain tumors, strokes, and Alzheimer's disease. SynthSR yields morphometric results that are very highly correlated with what one would have obtained with high-resolution T1 scans. SynthSR allows sample sizes that have the potential to overcome the power limitations of prospective research studies and shed new light on the healthy and diseased human brain.
Collapse
Affiliation(s)
- Juan E. Iglesias
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Centre for Medical Image Computing, Department of Computer Science, University College London, London, UK
- Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Benjamin Billot
- Centre for Medical Image Computing, Department of Computer Science, University College London, London, UK
| | - Yaël Balbastre
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Colin Magdamo
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Steven E. Arnold
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Sudeshna Das
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Brian L. Edlow
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Center for Neurotechnology and Neurorecovery, Massachusetts General Hospital, Boston, MA, USA
| | - Daniel C. Alexander
- Centre for Medical Image Computing, Department of Computer Science, University College London, London, UK
| | - Polina Golland
- Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Bruce Fischl
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, MA, USA
| |
Collapse
|
46
|
Use of semi-synthetic data for catheter segmentation improvement. Comput Med Imaging Graph 2023; 106:102188. [PMID: 36867896 DOI: 10.1016/j.compmedimag.2023.102188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2022] [Revised: 01/15/2023] [Accepted: 01/16/2023] [Indexed: 02/05/2023]
Abstract
In the era of data-driven machine learning algorithms, data is the new oil. For optimal results, datasets should be large, heterogeneous and, crucially, correctly labeled. However, data collection and labeling are time-consuming and labor-intensive processes. In the field of segmenting medical devices present during minimally invasive surgery, this leads to a lack of informative data. Motivated by this drawback, we developed an algorithm that generates semi-synthetic images based on real ones. The concept of this algorithm is to place a randomly shaped catheter in an empty heart cavity, where the shape of the catheter is generated by the forward kinematics of continuum robots. Having implemented the proposed algorithm, we generated new images of heart cavities with various artificial catheters. We compared the results of deep neural networks trained purely on real datasets with those of networks trained on both real and semi-synthetic datasets, highlighting that semi-synthetic data improve catheter segmentation accuracy. A modified U-Net trained on the combined datasets performed the segmentation with a Dice similarity coefficient of 92.6 ± 2.2%, while the same model trained only on real images achieved a Dice similarity coefficient of 86.5 ± 3.6%. Therefore, using semi-synthetic data reduces the spread in accuracy, improves model generalization, reduces labeling subjectivity, shortens the labeling routine, increases the number of samples, and improves dataset heterogeneity.
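The Dice similarity coefficient reported above is the standard overlap measure between a predicted and a reference binary mask; a minimal reference computation (with an assumed small smoothing constant to avoid division by zero) is:

    import numpy as np

    def dice_coefficient(pred, target, eps=1e-7):
        # pred, target: binary masks (1 = catheter, 0 = background).
        pred = pred.astype(bool)
        target = target.astype(bool)
        intersection = np.logical_and(pred, target).sum()
        return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

    a = np.array([[0, 1], [1, 1]])
    print(dice_coefficient(a, a))   # identical masks -> 1.0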
Collapse
|
47
|
Chen C, Raymond C, Speier W, Jin X, Cloughesy TF, Enzmann D, Ellingson BM, Arnold CW. Synthesizing MR Image Contrast Enhancement Using 3D High-Resolution ConvNets. IEEE Trans Biomed Eng 2023; 70:401-412. [PMID: 35853075 PMCID: PMC9928432 DOI: 10.1109/tbme.2022.3192309] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
OBJECTIVE Gadolinium-based contrast agents (GBCAs) have been widely used to better visualize disease in brain magnetic resonance imaging (MRI). However, gadolinium deposition within the brain and body has raised safety concerns about the use of GBCAs. Therefore, the development of novel approaches that can decrease or even eliminate GBCA exposure while providing similar contrast information would be of significant clinical use. METHODS In this work, we present a deep learning-based approach for contrast-enhanced T1 synthesis in brain tumor patients. A 3D high-resolution fully convolutional network (FCN), which maintains high-resolution information throughout processing and aggregates multi-scale information in parallel, is designed to map pre-contrast MRI sequences to contrast-enhanced MRI sequences. Specifically, three pre-contrast MRI sequences, T1, T2 and the apparent diffusion coefficient map (ADC), are used as inputs and the post-contrast T1 sequences are used as the target output. To alleviate the data imbalance between normal tissues and tumor regions, we introduce a local loss that increases the contribution of the tumor regions, which leads to better enhancement results on tumors. RESULTS Extensive quantitative and visual assessments are performed, with our proposed model achieving a PSNR of 28.24 dB in the brain and 21.2 dB in tumor regions. CONCLUSION AND SIGNIFICANCE Our results suggest the potential of substituting GBCAs with synthetic contrast images generated via deep learning.
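The notion of a "local" loss that boosts the tumor contribution can be sketched as a mask-weighted term added to a global voxel-wise loss; the L1 form and the weight lambda_local below are illustrative assumptions, not the paper's exact formulation.

    import torch.nn.functional as F

    def synthesis_loss(pred, target, tumor_mask, lambda_local=5.0):
        # pred, target: synthesized and real post-contrast T1 volumes;
        # tumor_mask: binary mask of the tumor region (same shape).
        global_term = F.l1_loss(pred, target)                           # whole-brain term
        local_term = F.l1_loss(pred * tumor_mask, target * tumor_mask)  # tumor-only term
        return global_term + lambda_local * local_term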
Collapse
|
48
|
Moya-Sáez E, de Luis-García R, Alberola-López C. Toward deep learning replacement of gadolinium in neuro-oncology: A review of contrast-enhanced synthetic MRI. FRONTIERS IN NEUROIMAGING 2023; 2:1055463. [PMID: 37554645 PMCID: PMC10406200 DOI: 10.3389/fnimg.2023.1055463] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/27/2022] [Accepted: 01/04/2023] [Indexed: 08/10/2023]
Abstract
Gadolinium-based contrast agents (GBCAs) have become a crucial part of MRI acquisitions in neuro-oncology for the detection, characterization and monitoring of brain tumors. However, contrast-enhanced (CE) acquisitions not only raise safety concerns but also cause patient discomfort, require more skilled personnel, and increase cost. Recently, several deep learning works have been proposed that intend to reduce, or even eliminate, the need for GBCAs. This study reviews published works on the synthesis of CE images from low-dose and/or native (non-CE) counterparts. The data, type of neural network, and number of input modalities for each method are summarized, as well as the evaluation methods. Based on this analysis, we discuss the main issues that these methods need to overcome in order to become suitable for clinical use. We also hypothesize some future trends that research on this topic may follow.
Collapse
Affiliation(s)
- Elisa Moya-Sáez
- Laboratorio de Procesado de Imagen, ETSI Telecomunicación, Universidad de Valladolid, Valladolid, Spain
| | | | | |
Collapse
|
49
|
Yurt M, Dalmaz O, Dar S, Ozbey M, Tinaz B, Oguz K, Cukur T. Semi-Supervised Learning of MRI Synthesis Without Fully-Sampled Ground Truths. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:3895-3906. [PMID: 35969576 DOI: 10.1109/tmi.2022.3199155] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Learning-based translation between MRI contrasts involves supervised deep models trained using high-quality source- and target-contrast images derived from fully-sampled acquisitions, which might be difficult to collect under limitations on scan cost or time. To facilitate curation of training sets, here we introduce the first semi-supervised model for MRI contrast translation (ssGAN) that can be trained directly using undersampled k-space data. To enable semi-supervised learning on undersampled data, ssGAN introduces novel multi-coil losses in the image, k-space, and adversarial domains. The multi-coil losses are selectively enforced on acquired k-space samples, unlike traditional losses in single-coil synthesis models. Comprehensive experiments on retrospectively undersampled multi-contrast brain MRI datasets are provided. Our results demonstrate that ssGAN yields performance on par with a supervised model, while outperforming single-coil models trained on coil-combined magnitude images. It also outperforms cascaded reconstruction-synthesis models in which a supervised synthesis model is trained following self-supervised reconstruction of undersampled data. Thus, ssGAN holds great promise to improve the feasibility of learning-based multi-contrast MRI synthesis.
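The idea of enforcing a k-space loss only on acquired samples can be sketched as follows; the FFT convention, coil-projection step, and L1 penalty are illustrative assumptions rather than ssGAN's exact multi-coil losses.

    import torch

    def acquired_kspace_loss(pred_img, ref_kspace, sampling_mask, coil_maps):
        # pred_img: (B, H, W) complex synthesized image;
        # coil_maps: (B, C, H, W) coil sensitivities;
        # ref_kspace: (B, C, H, W) measured multi-coil k-space;
        # sampling_mask: (B, 1, H, W), 1 at acquired k-space locations.
        coil_imgs = pred_img.unsqueeze(1) * coil_maps           # project onto coils
        pred_kspace = torch.fft.fft2(coil_imgs, norm="ortho")   # per-coil forward FFT
        diff = (pred_kspace - ref_kspace) * sampling_mask       # acquired samples only
        return diff.abs().mean()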
Collapse
|
50
|
Dalmaz O, Yurt M, Cukur T. ResViT: Residual Vision Transformers for Multimodal Medical Image Synthesis. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:2598-2614. [PMID: 35436184 DOI: 10.1109/tmi.2022.3167808] [Citation(s) in RCA: 103] [Impact Index Per Article: 34.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Generative adversarial models with convolutional neural network (CNN) backbones have recently been established as state-of-the-art in numerous medical image synthesis tasks. However, CNNs are designed to perform local processing with compact filters, and this inductive bias compromises learning of contextual features. Here, we propose a novel generative adversarial approach for medical image synthesis, ResViT, that leverages the contextual sensitivity of vision transformers along with the precision of convolution operators and realism of adversarial learning. ResViT's generator employs a central bottleneck comprising novel aggregated residual transformer (ART) blocks that synergistically combine residual convolutional and transformer modules. Residual connections in ART blocks promote diversity in captured representations, while a channel compression module distills task-relevant information. A weight sharing strategy is introduced among ART blocks to mitigate computational burden. A unified implementation is introduced to avoid the need to rebuild separate synthesis models for varying source-target modality configurations. Comprehensive demonstrations are performed for synthesizing missing sequences in multi-contrast MRI, and CT images from MRI. Our results indicate superiority of ResViT against competing CNN- and transformer-based methods in terms of qualitative observations and quantitative metrics.
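A hybrid residual block that routes features through both a convolutional path and a self-attention path, in the spirit of the aggregated residual transformer blocks described above, could be sketched as follows; the layer sizes, the token reshaping, and the fusion by channel compression are illustrative assumptions rather than ResViT's exact architecture.

    import torch
    import torch.nn as nn

    class HybridConvTransformerBlock(nn.Module):
        def __init__(self, channels=64, heads=4):
            super().__init__()
            self.conv_path = nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
                nn.Conv2d(channels, channels, 3, padding=1))
            self.attn_path = nn.TransformerEncoderLayer(
                d_model=channels, nhead=heads, batch_first=True)
            self.compress = nn.Conv2d(2 * channels, channels, 1)  # channel compression

        def forward(self, x):                       # x: (B, C, H, W)
            b, c, h, w = x.shape
            conv_out = self.conv_path(x)            # local, convolutional features
            tokens = x.flatten(2).transpose(1, 2)   # (B, H*W, C) token sequence
            attn_out = self.attn_path(tokens).transpose(1, 2).reshape(b, c, h, w)
            fused = self.compress(torch.cat([conv_out, attn_out], dim=1))
            return x + fused                        # residual connection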
Collapse
|