1
Vukovic D, Ruvinov I, Antico M, Steffens M, Fontanarosa D. Automatic GAN-based MRI volume synthesis from US volumes: a proof of concept investigation. Sci Rep 2023;13:21716. PMID: 38066019; PMCID: PMC10709581; DOI: 10.1038/s41598-023-48595-3.
Abstract
A baseline image, acquired through magnetic resonance imaging (MRI) or computed tomography (CT), is usually captured as a reference before medical procedures such as thoracentesis. In these procedures, ultrasound (US) imaging is often employed to guide needle placement during thoracentesis or to provide image guidance in minimally invasive spine surgery (MISS) within the thoracic region. Following the procedure, a post-procedure image is acquired to monitor and evaluate the patient's progress. Currently, no real-time guidance and tracking capability allows a surgeon to perform a procedure with the familiarity of the reference imaging modality. In this work, we propose real-time volumetric indirect registration using a deep learning approach in which the fusion of multiple imaging modalities allows surgical procedures to be guided and tracked with US while the resulting changes are displayed in a clinically familiar reference imaging modality (MRI). The deep learning method employs a series of generative adversarial networks (GANs), specifically CycleGAN, to perform unsupervised image-to-image translation. This process produces spatially aligned US and MRI volumes corresponding to their respective input volumes (MRI and US) of the thoracic spine. This preliminary proof-of-concept study focuses on the T9 vertebra. A clinical expert performed anatomical validation of randomly selected real and generated volumes of the T9 thoracic vertebra, scoring each volume 0 (conclusive anatomical structures present) or 1 (inconclusive anatomical structures present) to check whether the volumes were anatomically accurate. The Dice and Overlap metrics show how accurate the shape of T9 is compared with real volumes and how consistent it is across generated volumes. The average Dice, Overlap, and Accuracy in clearly labelling all anatomical structures of the T9 vertebra are approximately 80% across the board.
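
To make the cycle-consistency idea behind CycleGAN concrete, here is a minimal PyTorch sketch of the objective; the generator modules and names (G_us2mr, G_mr2us) are illustrative placeholders, not the paper's actual 3D architectures.

```python
import torch
import torch.nn as nn

# Placeholder generators; the paper's real 3D networks are far larger.
G_us2mr = nn.Conv3d(1, 1, kernel_size=3, padding=1)  # US -> MRI
G_mr2us = nn.Conv3d(1, 1, kernel_size=3, padding=1)  # MRI -> US

l1 = nn.L1Loss()

def cycle_consistency_loss(us_vol, mr_vol, lam=10.0):
    """Translating to the other domain and back should reproduce the input."""
    rec_us = G_mr2us(G_us2mr(us_vol))  # US -> fake MRI -> reconstructed US
    rec_mr = G_us2mr(G_mr2us(mr_vol))  # MRI -> fake US -> reconstructed MRI
    return lam * (l1(rec_us, us_vol) + l1(rec_mr, mr_vol))

# Toy 5D volumes: (batch, channel, depth, height, width).
us = torch.randn(1, 1, 16, 64, 64)
mr = torch.randn(1, 1, 16, 64, 64)
print(cycle_consistency_loss(us, mr).item())
```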
Affiliation(s)
- Damjan Vukovic: School of Clinical Sciences, Queensland University of Technology, Gardens Point Campus, 2 George St, Brisbane, QLD, 4000, Australia; Centre for Biomedical Technologies (CBT), Queensland University of Technology, Brisbane, QLD, 4000, Australia
- Igor Ruvinov: School of Clinical Sciences, Queensland University of Technology, Gardens Point Campus, 2 George St, Brisbane, QLD, 4000, Australia
- Maria Antico: CSIRO Health and Biosecurity, The Australian eHealth Research Centre, Herston, QLD, 4029, Australia
- Marian Steffens: School of Clinical Sciences, Queensland University of Technology, Gardens Point Campus, 2 George St, Brisbane, QLD, 4000, Australia
- Davide Fontanarosa: School of Clinical Sciences, Queensland University of Technology, Gardens Point Campus, 2 George St, Brisbane, QLD, 4000, Australia; Centre for Biomedical Technologies (CBT), Queensland University of Technology, Brisbane, QLD, 4000, Australia

2
Oh JH, Kim HG, Lee KM. Developing and evaluating deep learning algorithms for object detection: key points for achieving superior model performance. Korean J Radiol 2023;24:698-714. PMID: 37404112; DOI: 10.3348/kjr.2022.0765.
Abstract
In recent years, artificial intelligence, especially object detection-based deep learning in computer vision, has made significant advancements, driven by the growth of computing power and the widespread use of graphics processing units. Object detection-based deep learning techniques have been applied in various fields, including the medical imaging domain, where remarkable achievements in disease detection have been reported. However, applying deep learning does not always guarantee satisfactory performance, and researchers have relied on trial and error to identify the factors contributing to performance degradation and to enhance their models. Moreover, because of the black-box problem, the intermediate processes of a deep learning network cannot be comprehended by humans; as a result, identifying problems in a deep learning model that exhibits poor performance can be challenging. This article highlights potential issues that may cause performance degradation at each deep learning step in the medical imaging domain and discusses factors that must be considered to improve the performance of deep learning models. Researchers beginning deep learning research can reduce the amount of trial and error required by understanding the issues discussed in this study.
Affiliation(s)
- Jang-Hoon Oh: Department of Radiology, Kyung Hee University Hospital, Kyung Hee University College of Medicine, Seoul, Korea
- Hyug-Gi Kim: Department of Radiology, Kyung Hee University Hospital, Kyung Hee University College of Medicine, Seoul, Korea
- Kyung Mi Lee: Department of Radiology, Kyung Hee University Hospital, Kyung Hee University College of Medicine, Seoul, Korea

3
Iglesias JE, Billot B, Balbastre Y, Magdamo C, Arnold SE, Das S, Edlow BL, Alexander DC, Golland P, Fischl B. SynthSR: a public AI tool to turn heterogeneous clinical brain scans into high-resolution T1-weighted images for 3D morphometry. Sci Adv 2023;9:eadd3607. PMID: 36724222; PMCID: PMC9891693; DOI: 10.1126/sciadv.add3607.
Abstract
Every year, millions of brain magnetic resonance imaging (MRI) scans are acquired in hospitals across the world. These have the potential to revolutionize our understanding of many neurological diseases, but their morphometric analysis has not yet been possible due to their anisotropic resolution. We present an artificial intelligence technique, "SynthSR," that takes clinical brain MRI scans with any MR contrast (T1, T2, etc.), orientation (axial/coronal/sagittal), and resolution and turns them into high-resolution T1 scans that are usable by virtually all existing human neuroimaging tools. We present results on segmentation, registration, and atlasing of >10,000 scans of controls and patients with brain tumors, strokes, and Alzheimer's disease. SynthSR yields morphometric results that are very highly correlated with what one would have obtained with high-resolution T1 scans. SynthSR allows sample sizes that have the potential to overcome the power limitations of prospective research studies and shed new light on the healthy and diseased human brain.
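
Because SynthSR accepts scans of any resolution, the core preprocessing idea is resampling an anisotropic clinical volume onto the isotropic 1 mm grid that the network targets. The PyTorch sketch below illustrates only that generic step; it is not the released SynthSR pipeline or its command-line interface.

```python
import torch
import torch.nn.functional as F

def to_isotropic(vol, spacing_mm, target_mm=1.0):
    """Resample a (D, H, W) volume with voxel size `spacing_mm` onto an
    isotropic grid of `target_mm` spacing via trilinear interpolation."""
    d, h, w = vol.shape
    out_shape = (
        int(round(d * spacing_mm[0] / target_mm)),
        int(round(h * spacing_mm[1] / target_mm)),
        int(round(w * spacing_mm[2] / target_mm)),
    )
    v = vol[None, None]  # add batch and channel axes for interpolate
    return F.interpolate(v, size=out_shape, mode="trilinear",
                         align_corners=False)[0, 0]

# Toy thick-slice axial scan: 5 mm slices, 1 mm in-plane resolution.
scan = torch.randn(32, 256, 256)
iso = to_isotropic(scan, spacing_mm=(5.0, 1.0, 1.0))
print(iso.shape)  # ~(160, 256, 256); a trained CNN would then synthesize
                  # the 1 mm T1-weighted volume from this grid.
```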
Affiliation(s)
- Juan E. Iglesias: Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA; Centre for Medical Image Computing, Department of Computer Science, University College London, London, UK; Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, MA, USA
- Benjamin Billot: Centre for Medical Image Computing, Department of Computer Science, University College London, London, UK
- Yaël Balbastre: Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Colin Magdamo: Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Steven E. Arnold: Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Sudeshna Das: Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Brian L. Edlow: Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA; Center for Neurotechnology and Neurorecovery, Massachusetts General Hospital, Boston, MA, USA
- Daniel C. Alexander: Centre for Medical Image Computing, Department of Computer Science, University College London, London, UK
- Polina Golland: Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, MA, USA
- Bruce Fischl: Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA; Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, MA, USA

4
AGGN: attention-based glioma grading network with multi-scale feature extraction and multi-modal information fusion. Comput Biol Med 2023;152:106457. PMID: 36571937; DOI: 10.1016/j.compbiomed.2022.106457.
Abstract
In this paper, a novel attention-based glioma grading network (AGGN) for magnetic resonance imaging (MRI) is proposed. By applying a dual-domain attention mechanism, both channel and spatial information are considered when assigning weights, which helps highlight the key modalities and locations in the feature maps. Multi-branch convolution and pooling operations are applied in a multi-scale feature extraction module to separately obtain shallow and deep features from each modality, and a multi-modal information fusion module merges low-level detailed and high-level semantic features, promoting synergistic interaction among the different modalities. The proposed AGGN is comprehensively evaluated through extensive experiments, and the results demonstrate its effectiveness and superiority over other advanced models, along with high generalization ability and strong robustness. In addition, even without manually labeled tumor masks, AGGN achieves performance comparable to other state-of-the-art algorithms, alleviating the excessive reliance on supervised information in the end-to-end learning paradigm.
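
The dual-domain (channel plus spatial) attention described above resembles CBAM-style gating; the sketch below illustrates that general pattern under assumed layer shapes and is not the AGGN authors' exact module.

```python
import torch
import torch.nn as nn

class DualDomainAttention(nn.Module):
    """Channel attention followed by spatial attention (CBAM-like sketch)."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels),
        )
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):  # x: (N, C, H, W)
        # Channel weights from globally average- and max-pooled descriptors.
        ca = torch.sigmoid(self.channel_mlp(x.mean(dim=(2, 3)))
                           + self.channel_mlp(x.amax(dim=(2, 3))))
        x = x * ca[:, :, None, None]
        # Spatial weights from per-location channel statistics.
        stats = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], 1)
        return x * torch.sigmoid(self.spatial_conv(stats))

feat = torch.randn(2, 32, 64, 64)
print(DualDomainAttention(32)(feat).shape)  # torch.Size([2, 32, 64, 64])
```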

5
Yan S, Wang C, Chen W, Lyu J. Swin transformer-based GAN for multi-modal medical image translation. Front Oncol 2022;12:942511. PMID: 36003791; PMCID: PMC9395186; DOI: 10.3389/fonc.2022.942511.
Abstract
Medical image-to-image translation is considered a new direction with many potential applications in the medical field. The field is dominated by two models: the supervised Pix2Pix and the unsupervised cycle-consistency generative adversarial network (GAN). However, existing methods still have two shortcomings: 1) Pix2Pix requires paired and pixel-aligned images, which are difficult to acquire, while the optimal output of the cycle-consistency model may not be unique; 2) both are deficient in capturing global features and modeling long-distance interactions, which are critical for regions with complex anatomical structures. We propose a Swin Transformer-based GAN for multi-modal medical image translation, named MMTrans. Specifically, MMTrans consists of a generator, a registration network, and a discriminator. The Swin Transformer-based generator produces images with the same content as the source modality images and style information similar to the target modality images. The encoder of the registration network, also based on the Swin Transformer, predicts deformable vector fields. The convolution-based discriminator determines whether the target modality images come from the generator or are real images. Extensive experiments on a public dataset and clinical datasets showed that our network outperformed other advanced medical image translation methods on both aligned and unpaired datasets and has great potential for clinical application.
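
The registration network's role, predicting a dense deformable vector field that warps one image toward another, can be illustrated briefly; the warp function and normalized-coordinate convention below are generic assumptions, not MMTrans's implementation.

```python
import torch
import torch.nn.functional as F

def warp(image, flow):
    """Warp `image` (N, C, H, W) with a displacement field `flow`
    (N, 2, H, W) given in normalized [-1, 1] coordinates."""
    n, _, h, w = image.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
    )
    base = torch.stack([xs, ys], dim=-1)[None].expand(n, -1, -1, -1)
    grid = base + flow.permute(0, 2, 3, 1)  # identity grid + displacement
    return F.grid_sample(image, grid, mode="bilinear", align_corners=True)

img = torch.randn(1, 1, 64, 64)
zero_flow = torch.zeros(1, 2, 64, 64)  # zero displacement = identity warp
print(torch.allclose(warp(img, zero_flow), img, atol=1e-5))  # True
```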
Affiliation(s)
- Shouang Yan: School of Computer and Control Engineering, Yantai University, Yantai, China
- Chengyan Wang: Human Phenome Institute, Fudan University, Shanghai, China
- Jun Lyu: School of Computer and Control Engineering, Yantai University, Yantai, China
- Correspondence: Jun Lyu

6
Liu X, Sanchez P, Thermos S, O'Neil AQ, Tsaftaris SA. Learning disentangled representations in the imaging domain. Med Image Anal 2022;80:102516. PMID: 35751992; DOI: 10.1016/j.media.2022.102516.
Abstract
Disentangled representation learning has been proposed as an approach to learning general representations even in the absence of, or with limited, supervision. A good general representation can be fine-tuned for new target tasks using modest amounts of data, or used directly in unseen domains achieving remarkable performance in the corresponding task. This alleviation of the data and annotation requirements offers tantalising prospects for applications in computer vision and healthcare. In this tutorial paper, we motivate the need for disentangled representations, revisit key concepts, and describe practical building blocks and criteria for learning such representations. We survey applications in medical imaging emphasising choices made in exemplar key works, and then discuss links to computer vision applications. We conclude by presenting limitations, challenges, and opportunities.
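
As a concrete toy of the building blocks the tutorial covers, the sketch below encodes an image into separate "content" and "style" codes and recombines them through a shared decoder, which permits swapping styles between images; all layer choices are illustrative placeholders.

```python
import torch
import torch.nn as nn

class DisentangledAE(nn.Module):
    """Toy content/style autoencoder on 28x28 single-channel images."""
    def __init__(self, dim=64):
        super().__init__()
        self.content_enc = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, dim))
        self.style_enc = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, dim))
        self.dec = nn.Sequential(nn.Linear(2 * dim, 28 * 28),
                                 nn.Unflatten(1, (1, 28, 28)))

    def forward(self, x, style_source=None):
        c = self.content_enc(x)  # factor 1: what is in the image
        s = self.style_enc(style_source if style_source is not None else x)
        return self.dec(torch.cat([c, s], dim=1))

model = DisentangledAE()
a, b = torch.randn(4, 1, 28, 28), torch.randn(4, 1, 28, 28)
recon = model(a)       # reconstruct a from its own content and style
swapped = model(a, b)  # content of a rendered with the style of b
print(recon.shape, swapped.shape)
```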
Affiliation(s)
- Xiao Liu: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK
- Pedro Sanchez: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK
- Spyridon Thermos: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK
- Alison Q O'Neil: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK; Canon Medical Research Europe, Edinburgh EH6 5NP, UK
- Sotirios A Tsaftaris: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK; The Alan Turing Institute, London NW1 2DB, UK

7
Zhan B, Xiao J, Cao C, Peng X, Zu C, Zhou J, Wang Y. Multi-constraint generative adversarial network for dose prediction in radiotherapy. Med Image Anal 2021;77:102339. PMID: 34990905; DOI: 10.1016/j.media.2021.102339.
Abstract
Radiation therapy (RT) is regarded as the primary treatment for cancer in the clinic, aiming to deliver an accurate dose to the planning target volume (PTV) while protecting the surrounding organs at risk (OARs). To improve the effectiveness of treatment planning, deep learning methods are widely adopted to predict dose distribution maps for clinical treatment planning. In this paper, we present a novel multi-constraint dose prediction model based on a generative adversarial network, named Mc-GAN, to automatically predict the dose distribution map from computed tomography (CT) images and the masks of the PTV and OARs. Specifically, the generator is an embedded UNet-like structure with dilated convolution to capture both global and local information. During feature extraction, a dual attention module (DAM) is embedded to force the generator to pay more attention to internal semantic relevance. To improve prediction accuracy, two additional losses, the locality-constrained loss (LCL) and the self-supervised perceptual loss (SPL), are introduced besides the conventional global pixel-level loss and adversarial loss. Concretely, the LCL focuses on predictions in locally important areas, while the SPL prevents possible distortion of the predicted dose maps at the feature level. Evaluated on two in-house datasets, our proposed Mc-GAN outperforms other state-of-the-art methods on almost all PTV and OAR criteria.
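
How such extra constraints combine with the usual terms can be sketched briefly; the masked-L1 form of the LCL, the frozen feature extractor standing in for the self-supervised encoder, and the weights below are assumptions for illustration, not the paper's exact losses.

```python
import torch
import torch.nn as nn

l1 = nn.L1Loss()

# Frozen stand-in for the self-supervised perceptual encoder.
feat_net = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(8, 8, 3, padding=1))
for p in feat_net.parameters():
    p.requires_grad_(False)

def mc_losses(pred_dose, true_dose, roi_mask, w_lcl=1.0, w_spl=0.1):
    """Global pixel loss + locality-constrained loss + perceptual loss."""
    global_l1 = l1(pred_dose, true_dose)
    lcl = l1(pred_dose * roi_mask, true_dose * roi_mask)  # PTV/OAR voxels only
    spl = l1(feat_net(pred_dose), feat_net(true_dose))    # feature-level term
    return global_l1 + w_lcl * lcl + w_spl * spl

pred = torch.randn(2, 1, 64, 64)
true = torch.randn(2, 1, 64, 64)
mask = (torch.rand(2, 1, 64, 64) > 0.7).float()  # toy PTV/OAR mask
print(mc_losses(pred, true, mask).item())
```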
Affiliation(s)
- Bo Zhan: School of Computer Science, Sichuan University, China
- Jianghong Xiao: Department of Radiation Oncology, Cancer Center West China Hospital, Sichuan University, China
- Chongyang Cao: School of Computer Science, Sichuan University, China
- Xingchen Peng: Department of Biotherapy, Cancer Center West China Hospital, Sichuan University, China
- Chen Zu: Department of Risk Controlling Research, JD.com, China
- Jiliu Zhou: School of Computer Science, Sichuan University, China; School of Computer Science, Chengdu University of Information Technology, China
- Yan Wang: School of Computer Science, Sichuan University, China

8
Luo Y, Nie D, Zhan B, Li Z, Wu X, Zhou J, Wang Y, Shen D. Edge-preserving MRI image synthesis via adversarial network with iterative multi-scale fusion. Neurocomputing 2021. DOI: 10.1016/j.neucom.2021.04.060.

9
Zhan B, Li D, Wu X, Zhou J, Wang Y. Multi-modal MRI image synthesis via GAN with multi-scale gate mergence. IEEE J Biomed Health Inform 2021;26:17-26. PMID: 34125692; DOI: 10.1109/jbhi.2021.3088866.
Abstract
Multi-modal magnetic resonance imaging (MRI) plays a critical role in clinical diagnosis and treatment. Each MRI modality presents its own specific anatomical features, which complement the other modalities and provide rich diagnostic information. However, due to time and cost constraints, some image sequences may be lost or corrupted, posing an obstacle to accurate diagnosis. Although current multi-modal image synthesis approaches can alleviate these issues to some extent, they still fall short of fusing modalities effectively. In light of this, we propose a multi-scale gate mergence based generative adversarial network, namely MGM-GAN, to synthesize one MRI modality from others. Notably, multiple down-sampling branches, one per input modality, specifically extract each modality's unique features. In contrast to generic multi-modal fusion by averaging or maximizing operations, we introduce a gate mergence (GM) mechanism that automatically learns the weights of different modalities across locations, enhancing task-related information while suppressing irrelevant information. The feature maps of all input modalities at each down-sampling level, i.e., at multiple scales, are integrated via the GM module. In addition, the adversarial loss, the pixel-wise loss, and a gradient difference loss (GDL) are applied to train the network to produce the desired modality accurately. Extensive experiments demonstrate that the proposed method outperforms state-of-the-art multi-modal image synthesis methods.
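
A gate-mergence style fusion, learning per-location weights for each modality's feature map rather than averaging or maximizing, can be sketched as follows; the 1x1-convolution gating and softmax normalization are assumptions for illustration, not the authors' exact design.

```python
import torch
import torch.nn as nn

class GateMergence(nn.Module):
    """Fuse per-modality feature maps with learned per-location weights."""
    def __init__(self, channels, n_modalities):
        super().__init__()
        self.gate = nn.Conv2d(channels * n_modalities, n_modalities,
                              kernel_size=1)

    def forward(self, feats):                # list of (N, C, H, W) maps
        stacked = torch.stack(feats, dim=1)  # (N, M, C, H, W)
        weights = torch.softmax(self.gate(torch.cat(feats, dim=1)), dim=1)
        return (stacked * weights.unsqueeze(2)).sum(dim=1)  # (N, C, H, W)

t1 = torch.randn(2, 16, 32, 32)  # e.g. features from a T1-weighted branch
t2 = torch.randn(2, 16, 32, 32)  # e.g. features from a T2-weighted branch
fused = GateMergence(16, 2)([t1, t2])
print(fused.shape)  # torch.Size([2, 16, 32, 32])
```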