1
Chen Z, Ren H, Li Q, Li X. Motion correction and super-resolution for multi-slice cardiac magnetic resonance imaging via an end-to-end deep learning approach. Comput Med Imaging Graph 2024;115:102389. [PMID: 38692199; PMCID: PMC11144076; DOI: 10.1016/j.compmedimag.2024.102389]
Abstract
Accurate reconstruction of a high-resolution 3D volume of the heart is critical for comprehensive cardiac assessment. However, cardiac magnetic resonance (CMR) data are usually acquired as a stack of 2D short-axis (SAX) slices, which suffer from inter-slice misalignment due to cardiac motion and from data sparsity due to large gaps between SAX slices. We therefore propose an end-to-end deep learning (DL) model that addresses these two challenges simultaneously, with a dedicated model component for each. The objective is to reconstruct a high-resolution 3D volume of the heart (VHR) from the acquired CMR SAX slices (VLR). We define the transformation from VLR to VHR as a sequential process of motion correction and super-resolution, and our DL model accordingly incorporates two distinct components. The first performs motion correction by predicting displacement vectors that re-position each SAX slice accurately. The second takes the motion-corrected SAX slices and performs super-resolution to fill the data gaps. The two components operate sequentially, and the entire model is trained end-to-end. Our model significantly reduced inter-slice misalignment from 3.33±0.74 mm to 1.36±0.63 mm and generated accurate high-resolution 3D volumes with Dice of 0.974±0.010 for the left ventricle (LV) and 0.938±0.017 for the myocardium on a simulation dataset. When compared to LAX contours in a real-world dataset, the model achieved Dice of 0.945±0.023 for the LV and 0.786±0.060 for the myocardium. On both datasets, the model with dedicated components for motion correction and super-resolution significantly outperforms a model without such design considerations. The code for our model is available at https://github.com/zhennongchen/CMR_MC_SR_End2End.
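The sequential design (motion correction, then super-resolution) can be illustrated with a toy NumPy stand-in; the integer-pixel shifts and linear interpolation below are illustrative placeholders for the learned displacement prediction and the SR network, not the authors' implementation:

```python
import numpy as np

def motion_correct(slices, displacements):
    """Re-position each short-axis slice by an in-plane displacement
    (integer-pixel shifts here; the paper predicts these with a network)."""
    return np.stack([np.roll(np.roll(s, dy, axis=0), dx, axis=1)
                     for s, (dy, dx) in zip(slices, displacements)])

def super_resolve(volume, factor):
    """Fill inter-slice gaps by linear interpolation along the slice axis
    (a stand-in for the learned super-resolution component)."""
    n, h, w = volume.shape
    z_old = np.arange(n)
    z_new = np.linspace(0, n - 1, (n - 1) * factor + 1)
    out = np.empty((len(z_new), h, w))
    for i in range(h):
        for j in range(w):
            out[:, i, j] = np.interp(z_new, z_old, volume[:, i, j])
    return out

# sequential pipeline: motion correction first, then super-resolution
lr = np.random.rand(5, 8, 8)
shifts = [(0, 0), (1, -1), (0, 2), (-1, 0), (2, 1)]
hr = super_resolve(motion_correct(lr, shifts), factor=4)  # 5 slices -> 17
```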
Affiliation(s)
- Zhennong Chen
- Center for Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, USA
- Hui Ren
- Center for Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, USA
- Quanzheng Li
- Center for Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, USA
- Xiang Li
- Center for Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, USA
2
Ramamoorthy P, Ramakantha Reddy BR, Askar SS, Abouhawwash M. Histopathology-based breast cancer prediction using deep learning methods for healthcare applications. Front Oncol 2024;14:1300997. [PMID: 38894870; PMCID: PMC11184215; DOI: 10.3389/fonc.2024.1300997]
Abstract
Breast cancer (BC) is the leading cause of female cancer mortality and a major threat to women's health. Deep learning methods have recently been used extensively in many medical domains, especially in detection and classification applications. Studying histological images for the automatic diagnosis of BC is important for patients and their prognosis. Owing to the complexity and variety of histology images, manual examination can be difficult and error-prone, and thus requires experienced pathologists. Therefore, the publicly accessible BreakHis and invasive ductal carcinoma (IDC) datasets are used in this study to analyze histopathological images of BC. First, the images gathered from BreakHis and IDC are pre-processed using super-resolution generative adversarial networks (SRGANs), which create high-resolution images from low-quality ones, to provide useful inputs for the prediction stage. The SRGAN concept combines components of conventional generative adversarial network (GAN) loss functions with efficient sub-pixel nets. Next, the high-quality images are passed to a data augmentation stage, where new data points are created through small adjustments to the dataset: rotation, random cropping, mirroring, and color-shifting. Patch-based feature extraction using Inception V3 and ResNet-50 (PFE-INC-RES) is then employed to extract features from the augmented images. Finally, the extracted features are processed by a transductive long short-term memory (TLSTM) to improve classification accuracy by reducing false positives.
The suggested PFE-INC-RES is evaluated against existing methods on the BreakHis dataset, achieving accuracy of 99.84%, specificity of 99.71%, sensitivity of 99.78%, and F1-score of 99.80%; on the IDC dataset it achieves F1-score of 99.08%, accuracy of 99.79%, specificity of 98.97%, and sensitivity of 99.17%.
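For reference, the reported accuracy, specificity, sensitivity, and F1-score all derive from a confusion matrix; a minimal sketch with made-up counts:

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics (reported above as percentages)."""
    sensitivity = tp / (tp + fn)                    # a.k.a. recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "specificity": specificity,
            "sensitivity": sensitivity, "f1": f1}

# illustrative counts only, not the study's data
m = classification_metrics(tp=95, fp=3, tn=97, fn=5)
```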
Affiliation(s)
- Prabhu Ramamoorthy
- Department of Electronics and Communication Engineering, Gnanamani College of Technology, Namakkal, India
- S. S. Askar
- Department of Statistics and Operations Research, College of Science, King Saud University, Riyadh, Saudi Arabia
- Mohamed Abouhawwash
- Department of Mathematics, Faculty of Science, Mansoura University, Mansoura, Egypt
3
Wang L, Zhang W, Chen W, He Z, Jia Y, Du J. Cross-Modality Reference and Feature Mutual-Projection for 3D Brain MRI Image Super-Resolution. J Imaging Inform Med 2024. [PMID: 38829472; DOI: 10.1007/s10278-024-01139-1]
Abstract
High-resolution (HR) magnetic resonance imaging (MRI) can reveal rich anatomical structure for clinical diagnosis. However, due to hardware and signal-to-noise-ratio limitations, MRI images are often acquired at low resolution (LR), which is not conducive to diagnosing and analyzing clinical diseases. Recently, deep learning super-resolution (SR) methods have demonstrated great potential in enhancing the resolution of MRI images; however, most do not take the cross-modality and internal priors of MR images seriously, which hinders SR performance. In this paper, we propose a cross-modality reference and feature mutual-projection (CRFM) method to enhance the spatial resolution of brain MRI images. Specifically, we feed the gradients of HR MRI images from a referenced imaging modality into the SR network to transfer true, clear textures to the LR feature maps. Meanwhile, we design a plug-in feature mutual-projection (FMP) method to capture the cross-scale dependency and cross-modality similarity details of MRI images. Finally, we fuse all feature maps with parallel attention to produce and adaptively refine the HR features. Extensive experiments on MRI images in both the image domain and k-space show that our CRFM method outperforms existing state-of-the-art MRI SR methods.
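The gradient-guidance idea (feeding edge/texture maps of a reference-modality image into the SR network) can be sketched with a finite-difference gradient magnitude; this is a generic stand-in, not the CRFM implementation:

```python
import numpy as np

def gradient_map(img):
    """Finite-difference gradient magnitude: a simple edge/texture map
    that could serve as a reference-modality prior."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gy, gx)

ref = np.zeros((8, 8))
ref[:, 4:] = 1.0          # a vertical edge in the reference image
g = gradient_map(ref)     # strongest response at the edge columns
```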
Affiliation(s)
- Lulu Wang
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology and Yunnan Key Laboratory of Computer Technologies Application, Kunming, 650500, China
- Wanqi Zhang
- College of Computer Science, Chongqing University, Chongqing, 400044, China
- Wei Chen
- College of Computer Science, Chongqing University, Chongqing, 400044, China
- Zhongshi He
- College of Computer Science, Chongqing University, Chongqing, 400044, China
- Yuanyuan Jia
- Medical Data Science Academy and College of Medical Informatics, Chongqing Medical University, Chongqing, 400016, China
- Jinglong Du
- Medical Data Science Academy and College of Medical Informatics, Chongqing Medical University, Chongqing, 400016, China
4
Yuan T, Yang J, Chi J, Yu T, Liu F. A cross-domain complex convolution neural network for undersampled magnetic resonance image reconstruction. Magn Reson Imaging 2024;108:86-97. [PMID: 38331053; DOI: 10.1016/j.mri.2024.02.004]
Abstract
To introduce a new cross-domain complex convolution neural network for accurate MR image reconstruction from undersampled k-space data. Most reconstruction methods apply neural networks, or cascades of neural networks, in the image domain and/or the k-space domain. However, these methods face several challenges: 1) applying neural networks directly in the k-space domain is suboptimal for feature extraction; 2) classic image-domain networks have difficulty fully extracting texture features; and 3) existing cross-domain methods still struggle to extract and fuse features from both the image and k-space domains simultaneously. In this work, we propose a novel deep-learning-based 2D single-coil complex-valued MR reconstruction network termed TEID-Net. TEID-Net integrates three modules: 1) TE-Net, an image-domain sub-network that enhances contrast in the input features via a Texture Enhancement Module; 2) ID-Net, an intermediate-domain sub-network tailored to operate in the image-Fourier space, designed to reduce aliasing artifacts by leveraging the superior incoherence property of the decoupled one-dimensional signals; and 3) the full TEID-Net, a cross-domain reconstruction network in which ID-Nets and TE-Nets are combined and cascaded to further boost reconstruction quality. Extensive experiments were conducted on the fastMRI and Calgary-Campinas datasets. The results demonstrate the effectiveness of TEID-Net in mitigating undersampling-induced artifacts and producing high-quality reconstructions, outperforming several state-of-the-art methods while using fewer network parameters. The cross-domain TEID-Net excels at restoring tissue structures and intricate texture details, and is particularly well suited to regular Cartesian undersampling scenarios.
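The "image-Fourier" intermediate domain that ID-Net operates in can be illustrated by taking a 1-D FFT along a single axis, which decouples the data into one-dimensional signals; this is a sketch of the domain itself, not of the TEID-Net code:

```python
import numpy as np

# 1-D FFT along one axis only: each row becomes a decoupled 1-D spectrum,
# giving a hybrid space that is Fourier along rows and image along columns.
img = np.random.rand(4, 6)
hybrid = np.fft.fft(img, axis=1)
recon = np.fft.ifft(hybrid, axis=1).real   # the transform is exactly invertible
```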
Affiliation(s)
- Tengfei Yuan
- College of Electronics and Information, Qingdao University, Qingdao, Shandong, China
- Jie Yang
- College of Mechanical and Electrical Engineering, Qingdao University, Qingdao, Shandong, China
- Jieru Chi
- College of Electronics and Information, Qingdao University, Qingdao, Shandong, China
- Teng Yu
- College of Electronics and Information, Qingdao University, Qingdao, Shandong, China
- Feng Liu
- School of Electrical Engineering and Computer Science, University of Queensland, Brisbane, Australia
5
Kim J, Li Y, Shin BS. 3D-DGGAN: A Data-Guided Generative Adversarial Network for High Fidelity in Medical Image Generation. IEEE J Biomed Health Inform 2024;28:2904-2915. [PMID: 38416610; DOI: 10.1109/jbhi.2024.3367375]
Abstract
Three-dimensional images are frequently used in medical imaging research for classification, segmentation, and detection. However, the limited availability of 3D images hinders research progress due to network training difficulties. Generative methods have been proposed to create medical images using AI techniques. Nevertheless, 2D approaches have difficulty dealing with 3D anatomical structures, which can result in discontinuities between slices. To mitigate these discontinuities, several 3D generative networks have been proposed. However, the scarcity of available 3D images makes training these networks with limited samples inadequate for producing high-fidelity 3D images. We propose a data-guided generative adversarial network to provide high fidelity in 3D image generation. The generator creates fake images with noise using reference code obtained by extracting features from real images. The generator also creates decoded images using reference code without noise. These decoded images are compared to the real images to evaluate fidelity in the reference code. This generation process can create high-fidelity 3D images from only a small amount of real training data. Additionally, our method employs three types of discriminator: volume (evaluates all the slices), slab (evaluates a set of consecutive slices), and slice (evaluates randomly selected slices). The proposed discriminator enhances fidelity by differentiating between real and fake images based on detailed characteristics. Results from our method are compared with existing methods by using quantitative analysis such as Fréchet inception distance and maximum mean discrepancy. The results demonstrate that our method produces more realistic 3D images than existing methods.
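The three discriminator views (volume, slab, slice) amount to different ways of sampling a 3D array; a minimal sketch with assumed shapes:

```python
import numpy as np

rng = np.random.default_rng(0)
vol = rng.random((16, 32, 32))   # fake 3D image: depth x height x width

def slab_view(v, start, thickness):
    """A set of consecutive slices, as seen by the slab discriminator."""
    return v[start:start + thickness]

def slice_view(v, rng, n):
    """Randomly selected slices, as seen by the slice discriminator."""
    return v[rng.choice(v.shape[0], size=n, replace=False)]

volume_in = vol                     # volume discriminator: all slices
slab_in = slab_view(vol, 4, 5)      # slab discriminator: 5 consecutive slices
slice_in = slice_view(vol, rng, 3)  # slice discriminator: 3 random slices
```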
6
Kim J, Li Y, Shin BS. Volumetric Imitation Generative Adversarial Networks for Anatomical Human Body Modeling. Bioengineering (Basel) 2024;11:163. [PMID: 38391649; PMCID: PMC10886047; DOI: 10.3390/bioengineering11020163]
Abstract
Volumetric representation is a technique used to express 3D objects in various fields, such as medical applications. However, the tomography images used to reconstruct volumetric data see only limited use because they contain personal information. Existing GAN-based medical image generation techniques can produce virtual tomographic images for volume reconstruction while preserving patient privacy. Nevertheless, these images often do not account for vertical correlations between adjacent slices, leading to erroneous 3D reconstructions. Furthermore, while volume generation techniques have been introduced, they often focus on surface modeling, making it challenging to represent internal anatomical features accurately. This paper proposes the volumetric imitation GAN (VI-GAN), which imitates a human anatomical model to generate volumetric data. The primary goal of this model is to capture the attributes and 3D structure of the human anatomical model, including its external shape, internal slices, and the relationship between vertical slices. The proposed network consists of a generator, for feature extraction and up-sampling, based on a 3D U-Net and ResNet structure with a 3D-convolution-based local feature fusion block (LFFB), and a discriminator that uses 3D convolution to evaluate the authenticity of the generated volume against the ground truth. VI-GAN also devises a reconstruction loss, comprising feature and similarity losses, to make the generated volumetric data converge to the human anatomical model. In this experiment, CT data from 234 people were used to assess the reliability of the results. Measured by volume evaluation metrics, VI-GAN generated volumes that represent the human anatomical model more realistically than existing volume generation methods.
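A reconstruction loss combining a similarity term with a feature term, as the abstract describes, might look like the following sketch; the L1/L2 choices and the 0.5 weight are assumptions, not the paper's definition:

```python
import numpy as np

def similarity_loss(fake, real):
    """Voxel-wise L1 term (assumed form)."""
    return float(np.mean(np.abs(fake - real)))

def feature_loss(fake_feat, real_feat):
    """L2 distance between extracted feature maps (assumed form)."""
    return float(np.mean((fake_feat - real_feat) ** 2))

def reconstruction_loss(fake, real, fake_feat, real_feat, w=0.5):
    # weighted sum of the two terms; the weight w is an assumption
    return similarity_loss(fake, real) + w * feature_loss(fake_feat, real_feat)

vol = np.random.rand(4, 8, 8)
loss = reconstruction_loss(vol, vol, vol, vol)  # identical inputs -> 0.0
```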
Affiliation(s)
- Jion Kim
- Department of Electrical and Computer Engineering, Inha University, Incheon 22212, Republic of Korea
- Yan Li
- Department of Electrical and Computer Engineering, Inha University, Incheon 22212, Republic of Korea
- Byeong-Seok Shin
- Department of Electrical and Computer Engineering, Inha University, Incheon 22212, Republic of Korea
7
Hossain MB, Shinde RK, Oh S, Kwon KC, Kim N. A Systematic Review and Identification of the Challenges of Deep Learning Techniques for Undersampled Magnetic Resonance Image Reconstruction. Sensors (Basel) 2024;24:753. [PMID: 38339469; PMCID: PMC10856856; DOI: 10.3390/s24030753]
Abstract
Deep learning (DL) in magnetic resonance imaging (MRI) shows excellent performance in image reconstruction from undersampled k-space data. Artifact-free, high-quality MRI reconstruction is essential for accurate diagnosis, clinical decision-making, patient safety, efficient workflows, and the validity of research studies and clinical trials. Recently, DL has demonstrated several advantages over conventional MRI reconstruction methods, which rely on manual feature engineering to capture complex patterns and are usually computationally demanding due to their iterative nature. In contrast, DL methods use neural networks with hundreds of thousands of parameters that automatically learn relevant features and representations directly from the data. Nevertheless, DL-based techniques for MRI reconstruction have limitations, such as the need for large labeled datasets, the risk of overfitting, and the complexity of model training. Researchers are striving to develop DL models that are more efficient, more adaptable, and more informative for medical practitioners. We provide a comprehensive overview of current developments and clinical uses, focusing on state-of-the-art DL architectures and tools for MRI reconstruction. This study has three objectives. First, we describe how DL designs have evolved over time and discuss cutting-edge strategies, including their advantages and disadvantages; data pre- and post-processing approaches are assessed using publicly available MRI datasets and source code. Second, we give an extensive overview of ongoing research on transformers and deep convolutional neural networks for rapid MRI reconstruction.
Third, we discuss network training strategies, including supervised, unsupervised, transfer, and federated learning, for rapid and efficient MRI reconstruction. This article thus provides significant resources for future improvement of MRI data pre-processing and fast image reconstruction.
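The baseline that DL reconstruction methods improve on is the zero-filled inverse FFT of undersampled k-space; a minimal Cartesian-undersampling sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
img = rng.random((8, 8))

kspace = np.fft.fft2(img)
mask = np.zeros((8, 8))
mask[::2, :] = 1                                # keep every other k-space row (R = 2)
zero_filled = np.fft.ifft2(kspace * mask).real  # aliased baseline reconstruction
```

Regular Cartesian undersampling like this produces coherent aliasing, which is exactly the artifact the reviewed networks are trained to remove.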
Affiliation(s)
- Md. Biddut Hossain
- School of Information and Communication Engineering, Chungbuk National University, Cheongju-si 28644, Chungcheongbuk-do, Republic of Korea
- Rupali Kiran Shinde
- School of Information and Communication Engineering, Chungbuk National University, Cheongju-si 28644, Chungcheongbuk-do, Republic of Korea
- Sukhoon Oh
- Research Equipment Operation Department, Korea Basic Science Institute, Cheongju-si 28119, Chungcheongbuk-do, Republic of Korea
- Ki-Chul Kwon
- School of Information and Communication Engineering, Chungbuk National University, Cheongju-si 28644, Chungcheongbuk-do, Republic of Korea
- Nam Kim
- School of Information and Communication Engineering, Chungbuk National University, Cheongju-si 28644, Chungcheongbuk-do, Republic of Korea
8
Shao L, Chen B, Zhang Z, Zhang Z, Chen X. Artificial intelligence generated content (AIGC) in medicine: A narrative review. Math Biosci Eng 2024;21:1672-1711. [PMID: 38303483; DOI: 10.3934/mbe.2024073]
Abstract
Recently, artificial intelligence generated content (AIGC) has been receiving increased attention and is growing exponentially. AIGC is produced by generative artificial intelligence (AI) models from the intentional information extracted from human-provided instructions, and it can quickly and automatically generate large amounts of high-quality content. Medicine currently faces a shortage of medical resources and the complexity of medical procedures, problems that AIGC's characteristics can help alleviate. As a result, the application of AIGC in medicine has gained increased attention in recent years, and this paper provides a comprehensive review of the recent state of studies involving AIGC in medicine. First, we present an overview of AIGC. Then, based on recent studies, we review the application of AIGC in medicine from two aspects: medical image processing and medical text generation. The underlying generative AI models, tasks, target organs, datasets, and contributions of the studies are summarized. Finally, we discuss the limitations and challenges faced by AIGC and propose possible solutions based on relevant studies. We hope this review helps readers understand the potential of AIGC in medicine and obtain innovative ideas in this field.
Affiliation(s)
- Liangjing Shao
- Academy for Engineering & Technology, Fudan University, Shanghai 200433, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
- Benshuang Chen
- Academy for Engineering & Technology, Fudan University, Shanghai 200433, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
- Ziqun Zhang
- Information Office, Fudan University, Shanghai 200032, China
- Zhen Zhang
- Baoshan Branch of Ren Ji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200444, China
- Xinrong Chen
- Academy for Engineering & Technology, Fudan University, Shanghai 200433, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
9
Yang C, Bian T, Yang J, Hou J, Cao Y, Han Z, Zhao X, Wen W, Zhu X. Plane-wave medical image reconstruction based on dynamic Criss-Cross attention and multi-scale convolution. Technol Health Care 2024;32:299-312. [PMID: 38759058; PMCID: PMC11191515; DOI: 10.3233/thc-248026]
Abstract
BACKGROUND: Plane-wave imaging is widely employed in medical imaging due to its ultra-fast imaging speed, but image quality is compromised, and existing techniques for enhancing image quality tend to sacrifice the imaging frame rate.
OBJECTIVE: The study aims to reconstruct high-quality plane-wave images while maintaining the imaging frame rate.
METHODS: The proposed method uses a U-Net-based generator incorporating a multi-scale convolution module in the encoder to extract information at different levels. Additionally, a Dynamic Criss-Cross Attention (DCCA) mechanism in the decoder extracts both local and global features of plane-wave images while avoiding interference from irrelevant regions.
RESULTS: In the reconstruction of point targets, the experimental images achieved a reduction in Full Width at Half Maximum (FWHM) of 0.0499 mm compared to the Coherent Plane-Wave Compounding (CPWC) method using 75-beam plane waves. For cyst targets, the simulated image achieved a 3.78% improvement in Contrast Ratio (CR) over CPWC.
CONCLUSIONS: The proposed model effectively addresses the issue of unclear lesion sites in plane-wave images.
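FWHM, the point-target resolution metric reported above, can be computed from a 1-D intensity profile; a sketch with linear interpolation at the half-maximum crossings, checked against the analytic Gaussian value 2*sqrt(2 ln 2)*sigma:

```python
import numpy as np

def fwhm(profile, dx=1.0):
    """Full Width at Half Maximum of a single-peaked 1-D profile,
    with linear interpolation at the half-maximum crossings."""
    half = profile.max() / 2.0
    above = np.where(profile >= half)[0]
    left, right = above[0], above[-1]
    def cross(a, b):
        # position where the profile crosses `half` between samples a and b
        return a + (half - profile[a]) / (profile[b] - profile[a]) * (b - a)
    l = cross(left, left - 1) if left > 0 else float(left)
    r = cross(right, right + 1) if right < len(profile) - 1 else float(right)
    return (r - l) * dx

x = np.linspace(-5, 5, 201)
g = np.exp(-x**2 / 2)          # Gaussian point-spread profile, sigma = 1
w = fwhm(g, dx=x[1] - x[0])    # analytic value: 2*sqrt(2*ln 2) ~ 2.3548
```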
Affiliation(s)
- Cuiyun Yang
- College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, Shandong, China
- Taicheng Bian
- College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, Shandong, China
- Jin Yang
- College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, Shandong, China
- Junyi Hou
- College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, Shandong, China
- Yiliang Cao
- College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, Shandong, China
- Zhihui Han
- Department of Biomedical Engineering, School of Instrument Science and Optoelectronics Engineering, Hefei University of Technology, Hefei, Anhui, China
- Xiaoyan Zhao
- College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, Shandong, China
- Weijun Wen
- College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, Shandong, China
- Xijun Zhu
- College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, Shandong, China
10
Lin Z, Lei C, Yang L. Modern Image-Guided Surgery: A Narrative Review of Medical Image Processing and Visualization. Sensors (Basel) 2023;23:9872. [PMID: 38139718; PMCID: PMC10748263; DOI: 10.3390/s23249872]
Abstract
Medical image analysis forms the basis of image-guided surgery (IGS) and many of its fundamental tasks. Driven by the growing number of medical imaging modalities, the medical imaging research community has developed methods and achieved breakthroughs in functionality. However, given the overwhelming pool of information in the literature, it has become increasingly challenging for researchers to extract context-relevant information for specific applications, especially when many widely used methods exist in a variety of versions optimized for their respective application domains. Further equipped with sophisticated three-dimensional (3D) medical image visualization and digital-reality technology, medical experts could improve their performance in IGS several-fold. The goal of this narrative review is to organize the key components of IGS, in the aspects of medical image processing and visualization, with new perspectives and insights. The literature search was conducted using mainstream academic search engines with a combination of keywords relevant to the field up until mid-2022. This survey systematically summarizes basic, mainstream, and state-of-the-art medical image processing methods, as well as how visualization technologies such as augmented/mixed/virtual reality (AR/MR/VR) enhance performance in IGS. Further, we hope this survey sheds some light on the future of IGS in the face of challenges and opportunities for the research directions of medical image processing and visualization.
Affiliation(s)
- Zhefan Lin
- School of Mechanical Engineering, Zhejiang University, Hangzhou 310030, China
- ZJU-UIUC Institute, International Campus, Zhejiang University, Haining 314400, China
- Chen Lei
- ZJU-UIUC Institute, International Campus, Zhejiang University, Haining 314400, China
- Liangjing Yang
- School of Mechanical Engineering, Zhejiang University, Hangzhou 310030, China
- ZJU-UIUC Institute, International Campus, Zhejiang University, Haining 314400, China
11
Wang W, Shen H, Chen J, Xing F. MHAN: Multi-Stage Hybrid Attention Network for MRI reconstruction and super-resolution. Comput Biol Med 2023;163:107181. [PMID: 37352637; DOI: 10.1016/j.compbiomed.2023.107181]
Abstract
High-quality magnetic resonance imaging (MRI) affords clear body tissue structure for reliable diagnosis. However, there is a fundamental trade-off between acquisition speed and image quality, and image reconstruction and super-resolution are crucial techniques for addressing it. In the field of MR image restoration, most researchers focus on only one of these aspects, namely reconstruction or super-resolution. In this paper, we propose an efficient model called Multi-Stage Hybrid Attention Network (MHAN) that performs the multi-task of recovering high-resolution (HR) MR images from low-resolution, under-sampled measurements. Our model is highlighted by three major modules: (i) an Amplified Spatial Attention Block (ASAB) capable of enhancing differences in spatial information, (ii) a Self-Attention Block with a Data-Consistency Layer (DC-SAB), which improves the accuracy of the extracted feature information, and (iii) an Adaptive Local Residual Attention Block (ALRAB) that focuses on both spatial and channel information. MHAN employs an encoder-decoder architecture to deeply extract contextual information and a pipeline to provide spatial accuracy. Compared with the recent multi-task model T2Net, MHAN improves PSNR by 2.759 dB and SSIM by 0.026 at scaling factor 2× and acceleration factor 4× on the T2 modality.
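PSNR, the metric used in the comparison above, has a one-line definition; a minimal sketch:

```python
import numpy as np

def psnr(ref, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((ref - test) ** 2)
    return 10 * np.log10(data_range ** 2 / mse)

a = np.zeros((8, 8))
b = np.full((8, 8), 0.1)   # uniform error of 0.1 -> MSE 0.01 -> 20 dB
val = psnr(a, b)
```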
Affiliation(s)
- Wanliang Wang
- College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China
- Haoxin Shen
- College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China
- Jiacheng Chen
- College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China
- Fangsen Xing
- College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China
12
Lin J, Miao Q, Surawech C, Raman SS, Zhao K, Wu HH, Sung K. High-Resolution 3D MRI With Deep Generative Networks via Novel Slice-Profile Transformation Super-Resolution. IEEE Access 2023;11:95022-95036. [PMID: 37711392; PMCID: PMC10501177; DOI: 10.1109/access.2023.3307577]
Abstract
High-resolution magnetic resonance imaging (MRI) sequences, such as 3D turbo/fast spin-echo (TSE/FSE) imaging, are clinically desirable but suffer from blurring related to long scan times when reformatted into preferred orientations. Multi-slice two-dimensional (2D) TSE imaging is commonly used instead because of its high in-plane resolution, but it is limited clinically by poor through-plane resolution due to elongated voxels and by staircase artifacts that prevent multi-planar reformations. Therefore, multiple 2D TSE scans are acquired in various orthogonal imaging planes, increasing the overall MRI scan time. In this study, we propose a novel slice-profile transformation super-resolution (SPTSR) framework with deep generative learning for through-plane super-resolution (SR) of multi-slice 2D TSE imaging. The deep generative networks were trained on low-resolution input synthesized via slice-profile downsampling (SP-DS), and the trained networks inferred on slice-profile convolved (SP-conv) test input for 5.5× through-plane SR. The network output was further slice-profile deconvolved (SP-deconv) to achieve isotropic super-resolution. Compared to the SMORE SR method and to networks trained with conventional downsampling, our SPTSR framework demonstrated the best overall image quality across 50 test cases, as evaluated by two abdominal radiologists, and quantitative analysis cross-validated the expert reader study. 3D simulation experiments confirmed the quantitative improvement of SPTSR and the effectiveness of the SP-deconv step relative to 3D ground truths. Ablation studies examined the individual contributions of SP-DS and SP-conv, the network structure, the training dataset size, and different slice profiles.
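Slice-profile downsampling amounts to blurring along the through-plane axis with a slice profile and then subsampling; a toy sketch, in which the triangular profile and the factor are illustrative rather than the paper's measured profile:

```python
import numpy as np

def sp_downsample(volume, profile, factor):
    """Blur along the through-plane (first) axis with a slice profile,
    then subsample: a toy stand-in for a slice-profile downsampling step."""
    blurred = np.apply_along_axis(
        lambda s: np.convolve(s, profile, mode="same"), 0, volume)
    return blurred[::factor]

hr = np.random.rand(20, 4, 4)
profile = np.array([0.25, 0.5, 0.25])      # assumed triangular slice profile
lr = sp_downsample(hr, profile, factor=5)  # 20 slices -> 4
```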
Affiliation(s)
- Jiahao Lin
- Department of Radiological Sciences, University of California at Los Angeles, Los Angeles, CA 90095, USA
- Department of Electrical and Computer Engineering, University of California at Los Angeles, Los Angeles, CA 90095, USA
- Qi Miao
- Department of Radiological Sciences, University of California at Los Angeles, Los Angeles, CA 90095, USA
- Department of Radiology, The First Affiliated Hospital of China Medical University, Shenyang, Liaoning 110001, China
- Chuthaporn Surawech
- Department of Radiological Sciences, University of California at Los Angeles, Los Angeles, CA 90095, USA
- Department of Radiology, Faculty of Medicine, Chulalongkorn University, Bangkok 10330, Thailand
- Division of Diagnostic Radiology, Department of Radiology, King Chulalongkorn Memorial Hospital, Bangkok 10330, Thailand
- Steven S Raman
- Department of Radiological Sciences, University of California at Los Angeles, Los Angeles, CA 90095, USA
- Kai Zhao
- Department of Radiological Sciences, University of California at Los Angeles, Los Angeles, CA 90095, USA
- Holden H Wu
- Department of Radiological Sciences, University of California at Los Angeles, Los Angeles, CA 90095, USA
- Kyunghyun Sung
- Department of Radiological Sciences, University of California at Los Angeles, Los Angeles, CA 90095, USA
|
13
|
Brémond-Martin C, Simon-Chane C, Clouchoux C, Histace A. Brain organoid data synthesis and evaluation. Front Neurosci 2023; 17:1220172. [PMID: 37650105 PMCID: PMC10465177 DOI: 10.3389/fnins.2023.1220172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2023] [Accepted: 07/24/2023] [Indexed: 09/01/2023] Open
Abstract
Introduction: Datasets containing only a few images are common in the biomedical field. This poses a global challenge for the development of robust deep-learning analysis tools, which require a large number of images. Generative adversarial networks (GANs) are an increasingly used solution for expanding small datasets, particularly in the biomedical domain. However, the validation of synthetic images by metrics is still controversial, and psychovisual evaluations are time consuming. Methods: We augment a small brain organoid bright-field database of 40 images using several GAN optimizations. We compare these synthetic images to the original dataset using similarity metrics, and we perform a psychovisual evaluation of the 240 generated images. Eight biological experts labeled the full dataset (280 images) as synthetic or natural using custom-built software. We calculate the error rate per loss optimization as well as the hesitation time. We then compare these results to those provided by the similarity metrics. We test the psychovalidated images in the training step of a segmentation task. Results and discussion: The generated images are considered as natural as the original dataset, with no increase in the experts' hesitation time. Experts are particularly misled by the perceptual and Wasserstein loss optimizations, which also produce the images that the metrics rate as highest quality and most similar to the original dataset. We do not observe a strong correlation, but there are links between some metrics and the psychovisual decision depending on the kind of generation. Particular blur metric combinations could perhaps replace the psychovisual evaluation. Segmentation tasks that use the most psychovalidated images are the most accurate.
Affiliation(s)
- Clara Brémond-Martin
- ETIS Laboratory UMR 8051 (CY Cergy Paris Université, ENSEA, CNRS), Cergy, France
- Witsee, Neoxia, Paris, France
- Camille Simon-Chane
- ETIS Laboratory UMR 8051 (CY Cergy Paris Université, ENSEA, CNRS), Cergy, France
- Aymeric Histace
- ETIS Laboratory UMR 8051 (CY Cergy Paris Université, ENSEA, CNRS), Cergy, France
|
14
|
Behara K, Bhero E, Agee JT. Skin Lesion Synthesis and Classification Using an Improved DCGAN Classifier. Diagnostics (Basel) 2023; 13:2635. [PMID: 37627894 PMCID: PMC10453872 DOI: 10.3390/diagnostics13162635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2023] [Revised: 08/06/2023] [Accepted: 08/07/2023] [Indexed: 08/27/2023] Open
Abstract
The prognosis for patients with skin cancer improves with regular screening and checkups. Unfortunately, many people with skin cancer are not diagnosed until the disease has advanced beyond the point of effective therapy. Early detection is critical, and automated diagnostic technologies such as dermoscopy, an imaging technique that detects skin lesions early in the disease, are a driving factor. The lack of annotated data and class-imbalanced datasets makes using automated diagnostic methods challenging for skin lesion classification. In recent years, deep learning models have performed well in medical diagnosis, but such models require a substantial amount of annotated data for training. Applying a data augmentation method based on generative adversarial networks (GANs) to skin lesion classification is a plausible solution, generating synthetic images to address the data shortage. This article proposes a skin lesion synthesis and classification model based on an improved deep convolutional generative adversarial network (DCGAN). The proposed system generates realistic images using several convolutional neural networks, making training easier. Scaling, normalization, sharpening, color transformation, and median filters enhance image details during training. The proposed model uses generator and discriminator networks, global average pooling with a 2 × 2 fractional stride, backpropagation with a constant learning rate of 0.01 instead of 0.0002, and the most effective hyperparameters to efficiently generate high-quality synthetic skin lesion images. For classification, the final layer of the discriminator serves as a classifier predicting the target class. This study addresses a binary classification task, predicting two classes (benign and malignant) in the ISIC2017 dataset, and reports accuracy, recall, precision, and F1-score; the balanced accuracy score (BAS) measures classifier accuracy on imbalanced datasets.
The DCGAN classifier demonstrated superior performance, with a notable accuracy of 99.38% and 99% for recall, precision, F1-score, and BAS, outperforming state-of-the-art deep learning models. These results show that the DCGAN classifier can generate high-quality skin lesion images and accurately classify them, making it a promising tool for deep-learning-based medical image analysis.
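Since the abstract leans on the balanced accuracy score (BAS) for its imbalanced-data evaluation, a small sketch of how balanced accuracy differs from plain accuracy may help; this is the standard definition, not code from the paper.

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Balanced accuracy: the mean recall over classes, so each class
    contributes equally regardless of how many samples it has."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = []
    for c in np.unique(y_true):
        mask = y_true == c
        recalls.append(np.mean(y_pred[mask] == c))
    return float(np.mean(recalls))

# 90:10 imbalance: predicting the majority class everywhere gives
# 90% plain accuracy but only 50% balanced accuracy.
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 100
print(balanced_accuracy(y_true, y_pred))  # 0.5
```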
Affiliation(s)
- Kavita Behara
- Department of Electrical Engineering, Mangosuthu University of Technology, Durban 4031, South Africa
- Ernest Bhero
- Discipline of Electrical, Electronic and Computer Engineering, University of KwaZulu-Natal, Durban 4041, South Africa
- John Terhile Agee
- Discipline of Electrical, Electronic and Computer Engineering, University of KwaZulu-Natal, Durban 4041, South Africa
|
15
|
Wu Z, Chen X, Xie S, Shen J, Zeng Y. Super-resolution of brain MRI images based on denoising diffusion probabilistic model. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/03/2023]
|
16
|
Ko K, Lee B, Hong J, Kim D, Ko H. MRIFlow: Magnetic resonance image super-resolution based on normalizing flow and frequency prior. J Magn Reson 2023; 352:107477. [PMID: 37263100 DOI: 10.1016/j.jmr.2023.107477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/16/2022] [Revised: 05/01/2023] [Accepted: 05/03/2023] [Indexed: 06/03/2023]
Abstract
Super-resolution (SR) is a computer vision task that involves recovering high-resolution (HR) images from low-resolution (LR) ones. While SR is applied across disciplines, it is particularly important in the medical field, which requires accurate diagnosis. L1- and L2-loss-based SR methods produce high peak signal-to-noise ratio and structural similarity index values but lack perceptual quality, because such methods are trained toward the average of the plausible HR predictions. In addition, SR is an ill-posed problem because a single LR image can map to many HR images. This matters because poorly generated HR images can lead to misdiagnosis. In this paper, we propose MRIFlow, a novel method based on normalizing flow that transforms LR magnetic resonance (MR) images into HR MR images. MRIFlow contains frequency affine injectors to incorporate frequency information. The frequency affine injector receives the output of a pre-trained LR encoder as input and obtains frequency information from a wavelet transform based on ScatterNet; with this design, the inverse operation remains possible. MRIFlow has two versions based on the type of ScatterNet employed. In this paper, MRIFlow is compared with normalizing-flow-based SR methods on various MR image datasets, including the IXI, NYU fastMRI, and LGG datasets, and is demonstrated to produce better quantitative and qualitative results.
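The exact invertibility that normalizing flows such as MRIFlow rely on comes from coupling-style layers whose forward map can be undone in closed form. The paper's frequency affine injector additionally conditions on wavelet features from a pre-trained encoder; the sketch below shows only the generic affine-coupling mechanism, with toy stand-ins for the learned conditioning networks.

```python
import numpy as np

def affine_coupling_forward(x, scale_fn, shift_fn):
    """One affine coupling layer: split features, transform the second
    half conditioned on the first. Exactly invertible by construction."""
    x1, x2 = np.split(x, 2, axis=-1)
    y2 = x2 * np.exp(scale_fn(x1)) + shift_fn(x1)
    return np.concatenate([x1, y2], axis=-1)

def affine_coupling_inverse(y, scale_fn, shift_fn):
    """Closed-form inverse: undo the shift, then the scale."""
    y1, y2 = np.split(y, 2, axis=-1)
    x2 = (y2 - shift_fn(y1)) * np.exp(-scale_fn(y1))
    return np.concatenate([y1, x2], axis=-1)

# toy conditioning networks (stand-ins for learned ones)
scale_fn = lambda h: np.tanh(h)   # bounded log-scale for stability
shift_fn = lambda h: 0.5 * h

x = np.random.randn(4, 8)
y = affine_coupling_forward(x, scale_fn, shift_fn)
x_rec = affine_coupling_inverse(y, scale_fn, shift_fn)
print(np.allclose(x, x_rec))  # True
```

Because every layer is invertible and has a tractable Jacobian, a stack of such layers can map LR-conditioned latents to HR images and back, which is what makes flow-based SR trainable by exact likelihood.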
Affiliation(s)
- Kyungdeuk Ko
- School of Electrical Engineering, Korea University, Seoul 02841, South Korea
- Bokyeung Lee
- School of Electrical Engineering, Korea University, Seoul 02841, South Korea
- Jonghwan Hong
- School of Electrical Engineering, Korea University, Seoul 02841, South Korea
- Donghyeon Kim
- School of Electrical Engineering, Korea University, Seoul 02841, South Korea
- Hanseok Ko
- School of Electrical Engineering, Korea University, Seoul 02841, South Korea
|
17
|
Shin M, Peng Z, Kim HJ, Yoo SS, Yoon K. Multivariable-incorporating super-resolution residual network for transcranial focused ultrasound simulation. Comput Methods Programs Biomed 2023; 237:107591. [PMID: 37182263 DOI: 10.1016/j.cmpb.2023.107591] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/17/2023] [Revised: 05/02/2023] [Accepted: 05/06/2023] [Indexed: 05/16/2023]
Abstract
BACKGROUND AND OBJECTIVE Transcranial focused ultrasound (tFUS) has emerged as a new non-invasive brain stimulation (NIBS) modality, with an exquisite ability to reach deep brain areas at high spatial resolution. Accurate placement of the acoustic focus on a target brain region is crucial during tFUS treatment; however, the distortion of acoustic wave propagation through the intact skull poses challenges. High-resolution numerical simulation allows monitoring of the acoustic pressure field in the cranium but demands extensive computation. In this study, we adopt a super-resolution residual network technique based on deep convolution to enhance the prediction quality of the FUS acoustic pressure field in the targeted brain regions. METHODS The training dataset was acquired by numerical simulations performed at low (1.0 mm) and high (0.5 mm) resolutions on three ex vivo human calvariae. Five different super-resolution (SR) network models were trained using a multivariable 3D dataset incorporating the acoustic pressure field, the wave velocity, and localized skull computed tomography (CT) images. RESULTS An accuracy of 80.87±4.50% in predicting the focal volume was achieved, with a substantial 86.91% reduction in computational cost compared to the conventional high-resolution numerical simulation. The results suggest that the method can greatly reduce simulation time without sacrificing accuracy, and that accuracy improves further with additional inputs. CONCLUSIONS We developed multivariable-incorporating SR neural networks for transcranial focused ultrasound simulation. Our super-resolution technique may contribute to the safety and efficacy of tFUS-mediated NIBS by providing on-site feedback on the intracranial pressure field to the operator.
Affiliation(s)
- Minwoo Shin
- School of Mathematics and Computing (Computational Science and Engineering), Seoul 03722, Republic of Korea
- Zhuogang Peng
- Department of Aerospace and Mechanical Engineering, University of Notre Dame, Notre Dame 46556, IN, USA
- Hyo-Jin Kim
- School of Mathematics and Computing (Computational Science and Engineering), Seoul 03722, Republic of Korea
- Seung-Schik Yoo
- Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston 02115, MA, USA
- Kyungho Yoon
- School of Mathematics and Computing (Computational Science and Engineering), Seoul 03722, Republic of Korea
|
18
|
Hossain MB, Kwon KC, Shinde RK, Imtiaz SM, Kim N. A Hybrid Residual Attention Convolutional Neural Network for Compressed Sensing Magnetic Resonance Image Reconstruction. Diagnostics (Basel) 2023; 13:1306. [PMID: 37046524 PMCID: PMC10093476 DOI: 10.3390/diagnostics13071306] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2023] [Revised: 03/20/2023] [Accepted: 03/29/2023] [Indexed: 04/03/2023] Open
Abstract
We propose a dual-domain deep learning technique for accelerating compressed sensing magnetic resonance image reconstruction. An advanced convolutional neural network with residual connectivity and an attention mechanism was developed for the frequency and image domains. First, the sensor-domain subnetwork estimates the unmeasured frequencies of k-space to reduce aliasing artifacts. Second, the image-domain subnetwork performs a pixel-wise operation to remove blur and noise artifacts. The skip connections efficiently concatenate the feature maps to alleviate the vanishing-gradient problem. An attention gate in each decoder layer enhances network generalizability and speeds up image reconstruction by eliminating irrelevant activations. The proposed technique reconstructs real-valued clinical images from sparsely sampled k-spaces that closely match the reference images. The performance of this approach was compared with state-of-the-art direct-mapping, single-domain, and multi-domain methods. With acceleration factors (AFs) of 4 and 5, our method improved the mean peak signal-to-noise ratio (PSNR) by 8.67 and 9.23, respectively, compared with the single-domain Unet model; similarly, our approach increased the average PSNR by 3.72 and 4.61, respectively, compared with the multi-domain W-net. Remarkably, using an AF of 6, it enhanced the PSNR by 9.87 ± 1.55 and 6.60 ± 0.38 compared with Unet and W-net, respectively.
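The PSNR gains quoted above follow the standard definition of peak signal-to-noise ratio, which can be sketched as follows (an illustrative helper, not the authors' code):

```python
import numpy as np

def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10*log10(peak^2 / MSE)."""
    mse = np.mean((reference.astype(float) - test.astype(float)) ** 2)
    if mse == 0:
        return np.inf  # identical images
    return 10 * np.log10(data_range ** 2 / mse)

ref = np.zeros((8, 8))
noisy = ref + 0.1                 # uniform error of 0.1 -> MSE = 0.01
print(round(psnr(ref, noisy), 6))  # 20.0
```

A "PSNR improvement of 9 dB" thus corresponds to roughly an 8-fold reduction in root-mean-square reconstruction error.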
|
19
|
Chen Z, Pawar K, Ekanayake M, Pain C, Zhong S, Egan GF. Deep Learning for Image Enhancement and Correction in Magnetic Resonance Imaging-State-of-the-Art and Challenges. J Digit Imaging 2023; 36:204-230. [PMID: 36323914 PMCID: PMC9984670 DOI: 10.1007/s10278-022-00721-9] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2022] [Revised: 09/09/2022] [Accepted: 10/17/2022] [Indexed: 11/06/2022] Open
Abstract
Magnetic resonance imaging (MRI) provides excellent soft-tissue contrast for clinical diagnoses and research which underpin many recent breakthroughs in medicine and biology. The post-processing of reconstructed MR images is often automated for incorporation into MRI scanners by the manufacturers and increasingly plays a critical role in the final image quality for clinical reporting and interpretation. For image enhancement and correction, the post-processing steps include noise reduction, image artefact correction, and image resolution improvements. With the recent success of deep learning in many research fields, there is great potential to apply deep learning for MR image enhancement, and recent publications have demonstrated promising results. Motivated by the rapidly growing literature in this area, in this review paper, we provide a comprehensive overview of deep learning-based methods for post-processing MR images to enhance image quality and correct image artefacts. We aim to provide researchers in MRI or other research fields, including computer vision and image processing, a literature survey of deep learning approaches for MR image enhancement. We discuss the current limitations of the application of artificial intelligence in MRI and highlight possible directions for future developments. In the era of deep learning, we highlight the importance of a critical appraisal of the explanatory information provided and the generalizability of deep learning algorithms in medical imaging.
Affiliation(s)
- Zhaolin Chen
- Monash Biomedical Imaging, Monash University, Melbourne, VIC, 3168, Australia
- Department of Data Science and AI, Monash University, Melbourne, VIC, Australia
- Kamlesh Pawar
- Monash Biomedical Imaging, Monash University, Melbourne, VIC, 3168, Australia
- Mevan Ekanayake
- Monash Biomedical Imaging, Monash University, Melbourne, VIC, 3168, Australia
- Department of Electrical and Computer Systems Engineering, Monash University, Melbourne, VIC, Australia
- Cameron Pain
- Monash Biomedical Imaging, Monash University, Melbourne, VIC, 3168, Australia
- Department of Electrical and Computer Systems Engineering, Monash University, Melbourne, VIC, Australia
- Shenjun Zhong
- Monash Biomedical Imaging, Monash University, Melbourne, VIC, 3168, Australia
- National Imaging Facility, Brisbane, QLD, Australia
- Gary F Egan
- Monash Biomedical Imaging, Monash University, Melbourne, VIC, 3168, Australia
- Turner Institute for Brain and Mental Health, Monash University, Melbourne, VIC, Australia
|
20
|
Basty N, Thanaj M, Cule M, Sorokin EP, Liu Y, Thomas EL, Bell JD, Whitcher B. Artifact-free fat-water separation in Dixon MRI using deep learning. J Big Data 2023; 10:4. [PMID: 36686622 PMCID: PMC9835035 DOI: 10.1186/s40537-022-00677-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 03/04/2022] [Accepted: 12/25/2022] [Indexed: 06/17/2023]
Abstract
Chemical-shift encoded MRI (CSE-MRI) is a widely used technique for the study of body composition and metabolic disorders, where derived fat and water signals enable the quantification of adipose tissue and muscle. The UK Biobank is acquiring whole-body Dixon MRI (a specific implementation of CSE-MRI) for over 100,000 participants. Current processing methods associated with large whole-body volumes are time intensive and prone to artifacts during fat-water separation performed by the scanner, making quantitative analysis challenging. The most common artifacts are fat-water swaps, where the labels are inverted at the voxel level. It is common for researchers to discard swapped data (generally around 10%), which is wasteful and may lead to unintended biases. Given the large number of whole-body Dixon MRI acquisitions in the UK Biobank, thousands of swaps are expected to be present in the fat and water volumes from image reconstruction performed on the scanner. If they go undetected, errors will propagate into processes such as organ segmentation, and dilute the results in population-based analyses. There is a clear need for a robust method to accurately separate fat and water volumes in big data collections like the UK Biobank. We formulate fat-water separation as a style transfer problem, where swap-free fat and water volumes are predicted from the acquired Dixon MRI data using a conditional generative adversarial network, and introduce a new loss function for the generator model. Our method is able to predict highly accurate fat and water volumes free from artifacts in the UK Biobank. We show that our model separates fat and water volumes using either single input (in-phase only) or dual input (in-phase and opposed-phase) data, with the latter producing superior results. 
Our proposed method enables faster and more accurate downstream analysis of body composition from Dixon MRI in population studies by eliminating the need for visual inspection or for discarding data due to fat-water swaps. Supplementary information: the online version contains supplementary material available at 10.1186/s40537-022-00677-1.
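For context on why fat-water swaps occur, the classic two-point Dixon separation combines in-phase (IP = W + F) and opposed-phase (OP = W - F) images; a phase error that flips the sign of OP in a region exchanges the two labels. A minimal numpy illustration of this textbook formulation (not the UK Biobank scanner pipeline):

```python
import numpy as np

def two_point_dixon(in_phase, opposed_phase):
    """Standard two-point Dixon separation: IP = W + F, OP = W - F."""
    water = (in_phase + opposed_phase) / 2.0
    fat = (in_phase - opposed_phase) / 2.0
    return water, fat

# toy voxels: one water-dominant, one fat-dominant
w_true = np.array([0.9, 0.2])
f_true = np.array([0.1, 0.8])
ip, op = w_true + f_true, w_true - f_true

water, fat = two_point_dixon(ip, op)
print(water, fat)

# a phase error that negates OP swaps the labels voxel-wise:
water_sw, fat_sw = two_point_dixon(ip, -op)
print(water_sw, fat_sw)
```

The swap leaves IP unchanged, which is why single-input (in-phase only) correction is harder than the dual-input case reported above.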
Affiliation(s)
- Nicolas Basty
- Research Centre for Optimal Health, University of Westminster, London, UK
- Marjola Thanaj
- Research Centre for Optimal Health, University of Westminster, London, UK
- Yi Liu
- Calico Life Sciences LLC, South San Francisco, USA
- E. Louise Thomas
- Research Centre for Optimal Health, University of Westminster, London, UK
- Jimmy D. Bell
- Research Centre for Optimal Health, University of Westminster, London, UK
- Brandon Whitcher
- Research Centre for Optimal Health, University of Westminster, London, UK
|
21
|
Xu Y, Dai S, Song H, Du L, Chen Y. Multi-modal brain MRI images enhancement based on framelet and local weights super-resolution. Math Biosci Eng 2023; 20:4258-4273. [PMID: 36899626 DOI: 10.3934/mbe.2023199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/18/2023]
Abstract
Magnetic resonance (MR) image enhancement technology can reconstruct a high-resolution image from a low-resolution image, which is of great significance for clinical application and scientific research. T1 weighting and T2 weighting are two common magnetic resonance imaging modes, each with its own advantages, but the imaging time of T2 is much longer than that of T1. Related studies have shown that the two modes exhibit very similar anatomical structures in brain images, which can be exploited to enhance the resolution of low-resolution T2 images using the edge information of high-resolution T1 images that can be rapidly acquired, thereby shortening the imaging time needed for T2 images. To overcome the inflexibility of traditional methods that use fixed weights for interpolation and the inaccuracy of using a gradient threshold to determine edge regions, we propose a new model that builds on previous studies of multi-contrast MR image enhancement. Our model uses framelet decomposition to finely separate the edge structure of the T2 brain image and uses local regression weights calculated from the T1 image to construct a global interpolation matrix, so that the model not only guides edge reconstruction more accurately where the weights are shared but also performs collaborative global optimization of the remaining pixels and their interpolation weights. Experimental results on a set of simulated MR data and two sets of real MR images show that the enhanced images obtained by the proposed method are superior to those of the compared methods in terms of visual sharpness and qualitative indicators.
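The abstract does not spell out how its local regression weights are built. One common way to derive per-pixel linear weights from a guide image (here, the T1 contrast) is the guided-filter formulation sketched below; this is an assumption for illustration, using a naive box filter, and is not the authors' exact construction.

```python
import numpy as np

def local_regression_weights(guide, target, radius=2, eps=1e-6):
    """Per-pixel linear coefficients (a, b) so that, within each local
    window, target ~= a * guide + b (guided-filter-style regression)."""
    def box_mean(img):
        # naive O(n^2 * w^2) windowed mean; fine for small demos
        out = np.zeros_like(img, dtype=float)
        h, w = img.shape
        for i in range(h):
            for j in range(w):
                i0, i1 = max(i - radius, 0), min(i + radius + 1, h)
                j0, j1 = max(j - radius, 0), min(j + radius + 1, w)
                out[i, j] = img[i0:i1, j0:j1].mean()
        return out

    mg, mt = box_mean(guide), box_mean(target)
    cov = box_mean(guide * target) - mg * mt   # local covariance
    var = box_mean(guide * guide) - mg * mg    # local variance of guide
    a = cov / (var + eps)                      # edge-aware slope
    b = mt - a * mg                            # local offset
    return a, b

rng = np.random.default_rng(0)
g = rng.random((16, 16))
t = 2.0 * g + 0.3          # target is an exact linear map of the guide
a, b = local_regression_weights(g, t)
print(np.allclose(a * g + b, t, atol=1e-2))  # True
```

Where the guide has strong edges, `var` is large and `a` stays close to the true local slope, which is why such weights transfer T1 edge structure into the T2 interpolation.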
Affiliation(s)
- Yingying Xu
- School of Electronics and Information Engineering, Taizhou University, Taizhou 318000, China
- Songsong Dai
- School of Electronics and Information Engineering, Taizhou University, Taizhou 318000, China
- Haifeng Song
- School of Electronics and Information Engineering, Taizhou University, Taizhou 318000, China
- Lei Du
- School of Electronics and Information Engineering, Taizhou University, Taizhou 318000, China
- Ying Chen
- School of Electronics and Information Engineering, Taizhou University, Taizhou 318000, China
|
22
|
Hossain MB, Kwon KC, Imtiaz SM, Nam OS, Jeon SH, Kim N. De-Aliasing and Accelerated Sparse Magnetic Resonance Image Reconstruction Using Fully Dense CNN with Attention Gates. Bioengineering (Basel) 2022; 10:22. [PMID: 36671594 PMCID: PMC9854709 DOI: 10.3390/bioengineering10010022] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 11/25/2022] [Revised: 12/19/2022] [Accepted: 12/19/2022] [Indexed: 12/24/2022]
Abstract
When sparsely sampled data are used to accelerate magnetic resonance imaging (MRI), conventional reconstruction approaches produce significant artifacts that obscure the content of the image. To remove aliasing artifacts, we propose an advanced convolutional neural network (CNN) called the fully dense attention CNN (FDA-CNN). We update the Unet model with full dense connectivity and an attention mechanism for MRI reconstruction. The main benefit of FDA-CNN is that an attention gate in each decoder layer improves learning by focusing on the relevant image features and provides better network generalization by reducing irrelevant activations. Moreover, the densely interconnected convolutional layers reuse feature maps and prevent the vanishing-gradient problem. Additionally, we implement a new, efficient under-sampling pattern in the phase direction that takes low and high frequencies from k-space both randomly and non-randomly. The performance of FDA-CNN was evaluated quantitatively and qualitatively with three different sub-sampling masks and datasets. Compared with five current deep-learning-based and two compressed-sensing MRI reconstruction techniques, the proposed method performed better, reconstructing smoother and brighter images. Furthermore, FDA-CNN improved the mean PSNR by 2 dB, SSIM by 0.35, and VIFP by 0.37 compared with Unet for an acceleration factor of 5.
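The hybrid random/non-random phase-direction under-sampling described above can be sketched as a 1D mask over phase-encode lines that keeps the low-frequency center deterministically and draws the outer lines at random. The parameters and function name here are illustrative, not the paper's exact pattern.

```python
import numpy as np

def undersampling_mask(n_lines, accel, center_frac=0.08, seed=0):
    """1D phase-encode mask: keep all center (low-frequency) lines
    non-randomly, then add random outer (high-frequency) lines until
    roughly 1/accel of all lines are sampled."""
    rng = np.random.default_rng(seed)
    mask = np.zeros(n_lines, dtype=bool)
    n_center = max(int(round(center_frac * n_lines)), 1)
    start = n_lines // 2 - n_center // 2
    mask[start:start + n_center] = True              # deterministic center
    n_total = int(round(n_lines / accel))
    remaining = np.flatnonzero(~mask)
    extra = rng.choice(remaining, size=max(n_total - n_center, 0),
                       replace=False)
    mask[extra] = True                               # random outer lines
    return mask

m = undersampling_mask(256, accel=5)
print(m.sum(), round(256 / m.sum(), 1))  # 51 5.0
```

Keeping the center intact preserves overall image contrast, while the random outer lines turn coherent aliasing into noise-like artifacts that a CNN can suppress.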
Affiliation(s)
- Md. Biddut Hossain
- School of Information and Communication Engineering, Chungbuk National University, Cheongju-si 28644, Chungcheongbuk-do, Republic of Korea
- Ki-Chul Kwon
- School of Information and Communication Engineering, Chungbuk National University, Cheongju-si 28644, Chungcheongbuk-do, Republic of Korea
- Shariar Md Imtiaz
- School of Information and Communication Engineering, Chungbuk National University, Cheongju-si 28644, Chungcheongbuk-do, Republic of Korea
- Oh-Seung Nam
- School of Information and Communication Engineering, Chungbuk National University, Cheongju-si 28644, Chungcheongbuk-do, Republic of Korea
- Seok-Hee Jeon
- Department of Electronics Engineering, Incheon National University, 119 Academy-ro, Yeonsu-gu, Incheon 22012, Gyeonggi-do, Republic of Korea
- Nam Kim
- School of Information and Communication Engineering, Chungbuk National University, Cheongju-si 28644, Chungcheongbuk-do, Republic of Korea
- Correspondence: ; Tel.: +82-043-261-2482
|
23
|
A Rolling Bearing Fault Diagnosis Based on Conditional Depth Convolution Countermeasure Generation Networks under Small Samples. Sensors (Basel) 2022; 22:5658. [PMID: 35957215 PMCID: PMC9370996 DOI: 10.3390/s22155658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/24/2022] [Revised: 07/25/2022] [Accepted: 07/26/2022] [Indexed: 11/17/2022]
Abstract
Aiming at the low fault-diagnosis accuracy caused by insufficient samples and unbalanced sample distributions in bearing fault diagnosis, this paper proposes a fault-diagnosis method for rolling bearings based on conditional deep convolutional generative adversarial networks (C-DCGAN) for efficient data augmentation. First, the concept of conditional constraints is used to guide and improve the sample-generation process of the original generative adversarial network, and specific constraints are added to the data-generation model to perform a balanced expansion of multi-category fault data for small-sample datasets. Second, to address the training instability, vanishing gradients, and exploding gradients observed on the imbalanced sample set, the structure of the generative network is optimized with custom skip connections and spectral normalization, while the Wasserstein distance with a penalty term replaces the cross-entropy function as the loss function of the generative adversarial network; this improves the generative network's ability to extract stable features and the behavior of the training process, so that simulated sample data deviating only slightly from the real data distribution can be generated. Finally, the complete fault dataset (the original data, with sufficient fault categories and sample numbers, mixed with the generated data) is input into a one-dimensional convolutional neural network for rolling-bearing fault diagnosis. The experimental results show that the proposed method can improve the fault-classification performance for rolling bearings by generating balanced and sufficient sample data.
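The abstract does not spell out its penalized Wasserstein loss; in the standard WGAN-GP form the critic (discriminator) loss reads:

$$
\mathcal{L}_D \;=\; \mathbb{E}_{\tilde{x}\sim P_g}\big[D(\tilde{x})\big] \;-\; \mathbb{E}_{x\sim P_r}\big[D(x)\big] \;+\; \lambda\,\mathbb{E}_{\hat{x}\sim P_{\hat{x}}}\Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big]
$$

where $P_r$ and $P_g$ are the real and generated distributions, $\hat{x}$ is sampled uniformly along straight lines between real and generated pairs, and $\lambda$ weights the gradient penalty that enforces the 1-Lipschitz constraint. Whether the paper uses exactly this interpolation scheme is an assumption here; the abstract states only "Wasserstein distance with penalty term".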
|
24
|
|
25
|
Ma X, Wang Z, Hu S, Kan S. Multi-Focus Image Fusion Based on Multi-Scale Generative Adversarial Network. Entropy (Basel) 2022; 24:582. [PMID: 35626467 PMCID: PMC9140435 DOI: 10.3390/e24050582] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/11/2022] [Revised: 04/15/2022] [Accepted: 04/19/2022] [Indexed: 02/05/2023]
Abstract
Methods based on convolutional neural networks have demonstrated powerful information-integration ability in image fusion. However, most existing neural-network-based methods are applied to only part of the fusion process. In this paper, an end-to-end multi-focus image fusion method based on a multi-scale generative adversarial network (MsGAN) is proposed that makes full use of image features by combining multi-scale decomposition with a convolutional neural network. Extensive qualitative and quantitative experiments on the synthetic and Lytro datasets demonstrate the effectiveness and superiority of the proposed MsGAN compared to state-of-the-art multi-focus image fusion methods.
Affiliation(s)
- Xiaole Ma
- School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
- Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing 100044, China
- Zhihai Wang
- School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
- Shaohai Hu
- School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
- Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing 100044, China
- Shichao Kan
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
|
26
|
Ueki W, Nishii T, Umehara K, Ota J, Higuchi S, Ohta Y, Nagai Y, Murakawa K, Ishida T, Fukuda T. Generative adversarial network-based post-processed image super-resolution technology for accelerating brain MRI: comparison with compressed sensing. Acta Radiol 2022; 64:336-345. [PMID: 35118883 DOI: 10.1177/02841851221076330] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
Abstract
BACKGROUND It is unclear whether deep-learning-based super-resolution technology (SR) or compressed sensing technology (CS) better accelerates magnetic resonance imaging (MRI). PURPOSE To compare SR-accelerated images with CS images regarding image similarity to reference 2D and 3D gradient-echo sequence (GRE) brain MRI. MATERIAL AND METHODS We prospectively acquired 1.3× and 2.0× faster 2D and 3D GRE images of 20 volunteers, reducing acquisition time from the reference by reducing the matrix size or increasing the CS factor. For SR, we trained a generative adversarial network (GAN) to upscale the low-resolution images to the reference images, with twofold cross-validation. We compared the structural similarity (SSIM) index of the accelerated images to the reference image. The rate at which a radiologist failed to discriminate the faster image from the reference image was used as a subjective image similarity (ISM) index. RESULTS SR demonstrated significantly higher SSIM than CS (SSIM=0.9993-0.999 vs. 0.9947-0.9986; P < 0.001). In 2D GRE, the SR image was harder to discriminate from the reference image than the CS image (ISM index 40% vs. 17.5% at 1.3×, P = 0.039; 17.5% vs. 2.5% at 2.0×, P = 0.034). In 3D GRE, CS showed a significantly higher ISM index than SR (22.5% vs. 2.5%; P = 0.011) for 2.0× faster images. However, the ISM index did not differ significantly between 2.0× CS and 1.3× SR (22.5% vs. 27.5%; P = 0.62), which have comparable time costs. CONCLUSION GAN-based SR outperformed CS in image similarity for 2D GRE MRI acceleration, whereas CS was more advantageous than SR for 3D GRE.
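The SSIM index used to compare accelerated and reference images is computed from the two images' means, variances, and covariance. A minimal single-window sketch in NumPy (the standard SSIM formulation uses a sliding Gaussian window; this global variant is illustrative only):

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM: means, variances, covariance over the whole image."""
    c1 = (0.01 * data_range) ** 2  # stabilizing constants from the SSIM formula
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

rng = np.random.default_rng(0)
ref = rng.random((64, 64))                          # stand-in reference image
accel = ref + 0.01 * rng.standard_normal((64, 64))  # slightly degraded image
print(round(global_ssim(ref, ref), 4))              # identical images -> 1.0
print(global_ssim(ref, accel) < 1.0)                # degradation lowers SSIM -> True
```

An SSIM near 1 for the degraded image mirrors the very high values reported in the study, which is why a subjective index was used alongside it.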
Affiliation(s)
- Wataru Ueki
- Department of Radiology, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan
- Tatsuya Nishii
- Department of Radiology, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan
- Kensuke Umehara
- Medical Informatics Section, QST Hospital, National Institutes for Quantum Science and Technology, Chiba, Japan
- Applied MRI Research, Department of Molecular Imaging and Theranostics, Institute for Quantum Medical Science, National Institutes for Quantum Science and Technology, Chiba, Japan
- Department of Medical Physics and Engineering, Graduate School of Medicine, Osaka University, Suita, Osaka, Japan
- Junko Ota
- Medical Informatics Section, QST Hospital, National Institutes for Quantum Science and Technology, Chiba, Japan
- Applied MRI Research, Department of Molecular Imaging and Theranostics, Institute for Quantum Medical Science, National Institutes for Quantum Science and Technology, Chiba, Japan
- Department of Medical Physics and Engineering, Graduate School of Medicine, Osaka University, Suita, Osaka, Japan
- Satoshi Higuchi
- Department of Radiology, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan
- Yasutoshi Ohta
- Department of Radiology, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan
- Yasuhiro Nagai
- Department of Radiology, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan
- Keizo Murakawa
- Department of Radiology, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan
- Takayuki Ishida
- Department of Medical Physics and Engineering, Graduate School of Medicine, Osaka University, Suita, Osaka, Japan
- Tetsuya Fukuda
- Department of Radiology, National Cerebral and Cardiovascular Center, Suita, Osaka, Japan
|
27
|
Huang J, Ding W, Lv J, Yang J, Dong H, Del Ser J, Xia J, Ren T, Wong ST, Yang G. Edge-enhanced dual discriminator generative adversarial network for fast MRI with parallel imaging using multi-view information. APPL INTELL 2022; 52:14693-14710. [PMID: 36199853 PMCID: PMC9526695 DOI: 10.1007/s10489-021-03092-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 12/09/2021] [Indexed: 12/24/2022]
Abstract
In clinical medicine, magnetic resonance imaging (MRI) is one of the most important tools for diagnosis, triage, prognosis, and treatment planning. However, MRI suffers from an inherently slow data acquisition process because data are collected sequentially in k-space. In recent years, most MRI reconstruction methods proposed in the literature have focused on holistic image reconstruction rather than enhancing edge information. This work departs from that general trend by concentrating on the enhancement of edge information. Specifically, we introduce a novel parallel-imaging-coupled dual-discriminator generative adversarial network (PIDD-GAN) for fast multi-channel MRI reconstruction that incorporates multi-view information. The dual-discriminator design aims to improve the edge information in MRI reconstruction: one discriminator is used for holistic image reconstruction, whereas the other is responsible for enhancing edge information. An improved U-Net with local and global residual learning is proposed for the generator, with frequency channel attention blocks (FCA Blocks) embedded to incorporate attention mechanisms. A content loss is introduced to train the generator for better reconstruction quality. We performed comprehensive experiments on the Calgary-Campinas public brain MR dataset and compared our method with state-of-the-art MRI reconstruction methods; ablation studies of residual learning were conducted on the MICCAI13 dataset to validate the proposed modules. Results show that PIDD-GAN provides high-quality reconstructed MR images with well-preserved edge information. Single-image reconstruction takes under 5 ms, meeting the demand for fast processing.
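The second discriminator in such a dual-discriminator design judges edge maps rather than raw images. A sketch of extracting an edge map with the Sobel operator (an illustrative choice of edge extractor; PIDD-GAN's actual edge branch may differ):

```python
import numpy as np

def sobel_edges(img):
    """Gradient-magnitude edge map via 3x3 Sobel filters (valid convolution)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = (patch * kx).sum()   # horizontal gradient
            gy[i, j] = (patch * ky).sum()   # vertical gradient
    return np.hypot(gx, gy)

img = np.zeros((16, 16))
img[:, 8:] = 1.0                 # vertical step edge in a toy "MR image"
edges = sobel_edges(img)
print(edges.max())               # strongest response on the edge columns -> 4.0
```

Feeding `edges` (instead of `img`) to a second discriminator is the general mechanism by which an edge-focused adversarial loss can be imposed.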
Affiliation(s)
- Jiahao Huang
- College of Information Science and Technology, Zhejiang Shuren University, 310015 Hangzhou, China
- National Heart and Lung Institute, Imperial College London, London, UK
- Weiping Ding
- School of Information Science and Technology, Nantong University, 226019 Nantong, China
- Jun Lv
- School of Computer and Control Engineering, Yantai University, 264005 Yantai, China
- Jingwen Yang
- Department of Prosthodontics, Peking University School and Hospital of Stomatology, Beijing, China
- Hao Dong
- Center on Frontiers of Computing Studies, Peking University, Beijing, China
- Javier Del Ser
- TECNALIA, Basque Research and Technology Alliance (BRTA), 48160 Derio, Spain
- University of the Basque Country (UPV/EHU), 48013 Bilbao, Spain
- Jun Xia
- Department of Radiology, Shenzhen Second People’s Hospital, The First Affiliated Hospital of Shenzhen University Health Science Center, Shenzhen, China
- Tiaojuan Ren
- College of Information Science and Technology, Zhejiang Shuren University, 310015 Hangzhou, China
- Stephen T. Wong
- Systems Medicine and Bioengineering Department, Departments of Radiology and Pathology, Houston Methodist Cancer Center, Houston Methodist Hospital, Weill Cornell Medicine, 77030 Houston, TX, USA
- Guang Yang
- National Heart and Lung Institute, Imperial College London, London, UK
- Cardiovascular Research Centre, Royal Brompton Hospital, London, UK
|
28
|
Ahmad B, Sun J, You Q, Palade V, Mao Z. Brain Tumor Classification Using a Combination of Variational Autoencoders and Generative Adversarial Networks. Biomedicines 2022; 10:biomedicines10020223. [PMID: 35203433 PMCID: PMC8869455 DOI: 10.3390/biomedicines10020223] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2021] [Revised: 12/23/2021] [Accepted: 01/03/2022] [Indexed: 11/16/2022] Open
Abstract
Brain tumors are a pernicious cancer with one of the lowest five-year survival rates. Neurologists often use magnetic resonance imaging (MRI) to diagnose the type of brain tumor, and automated computer-assisted tools can speed up the diagnostic process and reduce the burden on health care systems. Recent advances in deep learning for medical imaging have shown remarkable results, especially in the automatic and rapid diagnosis of various cancers. However, deep learning models need large amounts of data (images) to achieve good results, and large public datasets are rare in medicine. This paper proposes a framework based on unsupervised deep generative neural networks to address this limitation. We combine two generative models in the proposed framework: variational autoencoders (VAEs) and generative adversarial networks (GANs). After initially training the encoder–decoder network on the available MR images, we swap it so that its output is a noise vector carrying information about the image manifold; the cascaded generative adversarial network then samples its input from this informative noise vector instead of random Gaussian noise. This helps the GAN avoid mode collapse and generate realistic-looking brain tumor magnetic resonance images. The artificially generated images can mitigate the limitation of small medical datasets and help deep learning models perform acceptably. We used ResNet50 as a classifier and augmented the real images with the artificially generated brain tumor images during classifier training. Compared with several existing studies and state-of-the-art machine learning models, our proposed methodology achieved noticeably better results: average classification accuracy improved from 72.63% to 96.25%. For the most severe class of brain tumor, glioma, we achieved recall, specificity, precision, and F1-score of 0.769, 0.837, 0.833, and 0.80, respectively. The proposed generative model framework could be used to generate medical images in any domain, including PET (positron emission tomography) and MRI scans of various parts of the body, and the results suggest it could be a useful clinical tool for medical experts.
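The core idea of drawing the GAN's input from an image-informed latent distribution rather than from a standard normal can be sketched loosely as follows (NumPy toy: the random-projection "encoder" and the Gaussian fit to the latents are illustrative stand-ins, not the paper's trained VAE or its swapped encoder-decoder network):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "encoder": a fixed linear map from flattened images to latents.
# In the paper this role is played by a trained VAE encoder-decoder.
W = rng.standard_normal((64, 8))

def encode(images):
    return images.reshape(len(images), -1) @ W / 8.0

real = rng.random((200, 8, 8))      # toy "training images"
latents = encode(real)              # latent vectors carrying image information

# Fit a Gaussian to the encoded latents and sample GAN inputs from it,
# instead of drawing from an uninformed standard normal N(0, I).
mu, sigma = latents.mean(axis=0), latents.std(axis=0)
informative_noise = mu + sigma * rng.standard_normal((16, 8))
print(informative_noise.shape)      # a batch of 16 informed noise vectors -> (16, 8)
```

The informed samples concentrate where real images live in latent space, which is the intuition behind the reported reduction in mode collapse.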
Affiliation(s)
- Bilal Ahmad
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China; (B.A.); (Q.Y.); (Z.M.)
- Jun Sun
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China; (B.A.); (Q.Y.); (Z.M.)
- Correspondence:
- Qi You
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China; (B.A.); (Q.Y.); (Z.M.)
- Vasile Palade
- Centre for Computational Science and Mathematical Modelling, Coventry University, Coventry CV1 5FB, UK;
- Zhongjie Mao
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China; (B.A.); (Q.Y.); (Z.M.)
|
29
|
Wu X, Li C, Zeng X, Wei H, Deng HW, Zhang J, Xu M. CryoETGAN: Cryo-Electron Tomography Image Synthesis via Unpaired Image Translation. Front Physiol 2022; 13:760404. [PMID: 35370760 PMCID: PMC8970048 DOI: 10.3389/fphys.2022.760404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2021] [Accepted: 01/17/2022] [Indexed: 12/02/2022] Open
Abstract
Cryo-electron tomography (Cryo-ET) has been regarded as a revolution in structural biology, capable of revealing molecular sociology: its unprecedented quality enables visualization of cellular organelles and macromolecular complexes at nanometer resolution in native conformations. Motivated by developments in nanotechnology and machine learning, machine learning approaches to Cryo-ET image analysis, such as classification, detection, and averaging, have attracted broad interest. Yet deep-learning-based methods for biomedical imaging typically require large labeled datasets for good results, which is a great challenge given the expense of obtaining and labeling training data. To deal with this problem, we propose CryoETGAN, a generative model that simulates Cryo-ET images efficiently and reliably. This cycle-consistent Wasserstein generative adversarial network (GAN) generates images with an appearance similar to the original experimental data. Quantitative and visual grading results on the generated images show that our method outperforms previous state-of-the-art simulation methods. Moreover, CryoETGAN trains stably and is capable of generating plausibly diverse image samples.
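The cycle-consistency constraint behind such a GAN penalizes the round trip through both generators. A toy sketch with invertible linear maps standing in for the two generators (illustrative only; CryoETGAN's generators are deep networks):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the two generators G_AB and G_BA of an unpaired
# image-translation model; exact linear inverses keep the sketch tiny.
A = rng.standard_normal((4, 4))
g_ab = lambda x: x @ A
g_ba = lambda y: y @ np.linalg.inv(A)

x = rng.standard_normal((8, 4))          # a batch from domain A
cycle = g_ba(g_ab(x))                    # A -> B -> A round trip
cycle_loss = np.abs(cycle - x).mean()    # L1 cycle-consistency loss
print(cycle_loss < 1e-6)                 # perfect inversion -> True
```

In training, this L1 round-trip penalty is added to the adversarial (here, Wasserstein) losses so that translation preserves content even without paired examples.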
Affiliation(s)
- Xindi Wu
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, United States
- Chengkun Li
- École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Xiangrui Zeng
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, United States
- Haocheng Wei
- Department of Electrical & Computer Engineering, University of Toronto, Toronto, ON, Canada
- Hong-Wen Deng
- Center for Biomedical Informatics & Genomics, Tulane University, New Orleans, LA, United States
- Jing Zhang
- Department of Computer Science, University of California, Irvine, Irvine, CA, United States
- Min Xu
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, United States
|
30
|
Improving Skin Cancer Classification Using Heavy-Tailed Student T-Distribution in Generative Adversarial Networks (TED-GAN). Diagnostics (Basel) 2021; 11:diagnostics11112147. [PMID: 34829494 PMCID: PMC8621489 DOI: 10.3390/diagnostics11112147] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2021] [Revised: 11/03/2021] [Accepted: 11/09/2021] [Indexed: 11/16/2022] Open
Abstract
Deep learning has gained immense attention from researchers in medicine, especially in medical imaging; the main bottleneck is the lack of the sufficiently large medical datasets that deep learning models need to perform well. This paper proposes a new framework consisting of one variational autoencoder (VAE), two generative adversarial networks, and one auxiliary classifier to artificially generate realistic-looking skin lesion images and improve classification performance. We first train the encoder-decoder network to obtain a latent noise vector carrying information about the image manifold, and let the generative adversarial network sample its input from this informative noise vector to generate skin lesion images. The informative noise allows the GAN to avoid mode collapse and converge faster. To improve the diversity of the generated images, we use another GAN with an auxiliary classifier that samples its noise vector from a heavy-tailed Student t-distribution instead of a Gaussian distribution. The proposed framework is named TED-GAN, with T from the t-distribution and ED from the encoder-decoder network that is part of the solution. The framework could be applied in a broad range of areas in medical imaging; we used it here to generate skin lesion images and obtained improved performance on the skin lesion classification task, with average accuracy rising from 66% to 92.5%. The results show that TED-GAN benefits the classification task through the diverse range of images it generates, owing to the heavy-tailed t-distribution.
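The effect of swapping Gaussian noise for a heavy-tailed Student t-distribution shows up directly in the tail mass of the samples. A small NumPy sketch (the degrees-of-freedom value is an illustrative choice, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

gauss = rng.standard_normal(n)        # standard Gaussian noise
heavy = rng.standard_t(df=3, size=n)  # heavy-tailed Student t noise (df=3)

# Fraction of samples beyond 4 standard units: the t-distribution puts
# far more mass in the tails, giving the GAN more varied noise inputs.
tail_g = np.mean(np.abs(gauss) > 4)
tail_t = np.mean(np.abs(heavy) > 4)
print(tail_t > tail_g)                # heavy tails -> True
```

Those rare, large noise draws are what push the generator toward image-space regions a Gaussian-fed GAN would seldom visit, which is the paper's stated route to greater sample diversity.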
|