1
Huang Y, Gomaa A, Höfler D, Schubert P, Gaipl U, Frey B, Fietkau R, Bert C, Putz F. Principles of artificial intelligence in radiooncology. Strahlenther Onkol 2024. PMID: 39105746. DOI: 10.1007/s00066-024-02272-0.
Abstract
PURPOSE In the rapidly expanding field of artificial intelligence (AI) there is a wealth of literature detailing the myriad applications of AI, particularly in the realm of deep learning. However, a review that elucidates the technical principles of deep learning as relevant to radiation oncology in an easily understandable manner is still notably lacking. This paper aims to fill this gap by providing a comprehensive guide to the principles of deep learning that is specifically tailored toward radiation oncology.
METHODS In light of the extensive variety of AI methodologies, this review selectively concentrates on the specific domain of deep learning. It emphasizes the principal categories of deep learning models and delineates the methodologies for training these models effectively.
RESULTS This review initially delineates the distinctions between AI and deep learning as well as between supervised and unsupervised learning. Subsequently, it elucidates the fundamental principles of major deep learning models, encompassing multilayer perceptrons (MLPs), convolutional neural networks (CNNs), recurrent neural networks (RNNs), transformers, generative adversarial networks (GANs), diffusion-based generative models, and reinforcement learning. For each category, it presents representative networks alongside their specific applications in radiation oncology. Moreover, the review outlines critical factors essential for training deep learning models, such as data preprocessing, loss functions, optimizers, and other pivotal training parameters including learning rate and batch size.
CONCLUSION This review provides a comprehensive overview of deep learning principles tailored toward radiation oncology. It aims to enhance the understanding of AI-based research and software applications, thereby bridging the gap between complex technological concepts and clinical practice in radiation oncology.
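The training factors the review enumerates (loss function, optimizer, learning rate, batch size) can be made concrete with a toy sketch. Everything here (a one-layer linear stand-in for an MLP, MSE loss, plain SGD, the data sizes) is an illustrative assumption, not code from the paper.

```python
import numpy as np

# Toy illustration of the pivotal training factors: loss function (MSE),
# optimizer (plain SGD), learning rate, and batch size. The one-layer
# linear "network" and all sizes are illustrative assumptions.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 4))           # 256 samples, 4 input features
w_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ w_true                          # noiseless linear target

w = np.zeros(4)
learning_rate, batch_size = 0.1, 32     # two pivotal training parameters
for epoch in range(50):
    for i in range(0, len(X), batch_size):
        xb, yb = X[i:i + batch_size], y[i:i + batch_size]
        grad = 2 * xb.T @ (xb @ w - yb) / len(xb)  # gradient of MSE loss
        w -= learning_rate * grad                   # SGD update

mse = float(np.mean((X @ w - y) ** 2))
```

Lowering the learning rate or enlarging the batch changes how quickly `mse` falls, which is the trade-off the review describes.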
Affiliation(s)
- Yixing Huang
- Department of Radiation Oncology, Universitätsklinikum Erlangen, Friedrich-Alexander Universität Erlangen-Nürnberg, 91054, Erlangen, Germany
- Comprehensive Cancer Center Erlangen-EMN (CCC ER-EMN), 91054, Erlangen, Germany
- Ahmed Gomaa
- Department of Radiation Oncology, Universitätsklinikum Erlangen, Friedrich-Alexander Universität Erlangen-Nürnberg, 91054, Erlangen, Germany
- Comprehensive Cancer Center Erlangen-EMN (CCC ER-EMN), 91054, Erlangen, Germany
- Daniel Höfler
- Department of Radiation Oncology, Universitätsklinikum Erlangen, Friedrich-Alexander Universität Erlangen-Nürnberg, 91054, Erlangen, Germany
- Comprehensive Cancer Center Erlangen-EMN (CCC ER-EMN), 91054, Erlangen, Germany
- Philipp Schubert
- Department of Radiation Oncology, Universitätsklinikum Erlangen, Friedrich-Alexander Universität Erlangen-Nürnberg, 91054, Erlangen, Germany
- Comprehensive Cancer Center Erlangen-EMN (CCC ER-EMN), 91054, Erlangen, Germany
- Udo Gaipl
- Comprehensive Cancer Center Erlangen-EMN (CCC ER-EMN), 91054, Erlangen, Germany
- Translational Radiobiology, Department of Radiation Oncology, Universitätsklinikum Erlangen, Friedrich-Alexander Universität Erlangen-Nürnberg, 91054, Erlangen, Germany
- Benjamin Frey
- Comprehensive Cancer Center Erlangen-EMN (CCC ER-EMN), 91054, Erlangen, Germany
- Translational Radiobiology, Department of Radiation Oncology, Universitätsklinikum Erlangen, Friedrich-Alexander Universität Erlangen-Nürnberg, 91054, Erlangen, Germany
- Rainer Fietkau
- Department of Radiation Oncology, Universitätsklinikum Erlangen, Friedrich-Alexander Universität Erlangen-Nürnberg, 91054, Erlangen, Germany
- Comprehensive Cancer Center Erlangen-EMN (CCC ER-EMN), 91054, Erlangen, Germany
- Christoph Bert
- Department of Radiation Oncology, Universitätsklinikum Erlangen, Friedrich-Alexander Universität Erlangen-Nürnberg, 91054, Erlangen, Germany
- Comprehensive Cancer Center Erlangen-EMN (CCC ER-EMN), 91054, Erlangen, Germany
- Florian Putz
- Department of Radiation Oncology, Universitätsklinikum Erlangen, Friedrich-Alexander Universität Erlangen-Nürnberg, 91054, Erlangen, Germany
- Comprehensive Cancer Center Erlangen-EMN (CCC ER-EMN), 91054, Erlangen, Germany
2
Wang Z, Yang Y, Chen Y, Yuan T, Sermesant M, Delingette H, Wu O. Mutual Information Guided Diffusion for Zero-Shot Cross-Modality Medical Image Translation. IEEE Trans Med Imaging 2024;43:2825-2838. PMID: 38551825. DOI: 10.1109/tmi.2024.3382043.
Abstract
Cross-modality data translation has attracted great interest in medical image computing, and deep generative models have improved performance on the related challenges. Nevertheless, zero-shot cross-modality image translation with high fidelity remains an open problem. To bridge this gap, we propose a novel unsupervised zero-shot learning method, the Mutual Information guided Diffusion model (MIDiffusion), which learns to translate an unseen source image to the target modality by leveraging the inherent statistical consistency of mutual information between different modalities. To overcome the prohibitively high-dimensional mutual information calculation, we propose a differentiable local-wise mutual information layer for conditioning the iterative denoising process. This layer captures shared cross-modality features in the statistical domain, offering diffusion guidance without relying on direct mappings between the source and target domains. As a result, our method can adapt to changing source domains without retraining, making it highly practical when sufficient labeled source-domain data are not available. We demonstrate the superior performance of MIDiffusion in zero-shot cross-modality translation tasks through empirical comparisons with other generative models, including adversarial- and diffusion-based models. Finally, we showcase a real-world application of MIDiffusion in 3D zero-shot cross-modality image segmentation.
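The mutual-information idea behind this guidance can be illustrated with a plain histogram estimator: two images of the same anatomy in different modalities share far more mutual information than two unrelated images. This is a simplified, non-differentiable sketch, not the paper's local-wise MI layer, and the array shapes and intensity mapping are assumptions.

```python
import numpy as np

def mutual_information(a, b, bins=16):
    """Histogram estimate of mutual information between two equally
    shaped images (in nats). A non-differentiable stand-in for the
    paper's local-wise MI layer, for illustration only."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)        # marginal over rows
    py = pxy.sum(axis=0, keepdims=True)        # marginal over columns
    nz = pxy > 0                               # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(1)
t1 = rng.random((64, 64))
t2 = np.exp(t1)               # monotone remapping: a stand-in "other modality"
noise = rng.random((64, 64))  # statistically unrelated image

mi_related = mutual_information(t1, t2)
mi_unrelated = mutual_information(t1, noise)
```

The gap between `mi_related` and `mi_unrelated` is the statistical consistency the diffusion process is steered toward.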
3
Wang S, Wu R, Jia S, Diakite A, Li C, Liu Q, Zheng H, Ying L. Knowledge-driven deep learning for fast MR imaging: Undersampled MR image reconstruction from supervised to un-supervised learning. Magn Reson Med 2024;92:496-518. PMID: 38624162. DOI: 10.1002/mrm.30105.
Abstract
Deep learning (DL) has emerged as a leading approach to accelerating MRI. It employs deep neural networks to extract knowledge from available datasets and then applies the trained networks to reconstruct accurate images from limited measurements. Unlike natural image restoration, MRI involves physics-based imaging processes, unique data properties, and diverse imaging tasks, and this domain knowledge needs to be integrated with data-driven approaches. Our review introduces the significant challenges faced by such knowledge-driven DL approaches in the context of fast MRI, along with several notable solutions covering both network design and different imaging application scenarios. We also trace the traits and trends of these techniques, which have shifted from supervised learning to semi-supervised learning and, finally, to unsupervised learning methods. In addition, we survey MR vendors' choices of DL reconstruction and discuss open questions and future directions that are critical for reliable imaging systems.
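A canonical form of the physics-based domain knowledge mentioned here is the k-space data-consistency step that knowledge-driven reconstruction networks interleave with learned denoising: wherever k-space was actually sampled, the measured values overwrite the network's estimate. The sketch below uses a synthetic image and random undersampling mask as assumptions.

```python
import numpy as np

def data_consistency(x_net, kspace_measured, mask):
    """Enforce agreement with measured k-space samples: keep the
    measurements where sampled, the network estimate elsewhere."""
    k_net = np.fft.fft2(x_net)
    k_dc = np.where(mask, kspace_measured, k_net)
    return np.fft.ifft2(k_dc).real

rng = np.random.default_rng(2)
x_true = rng.random((32, 32))              # "ground-truth" image (assumption)
mask = rng.random((32, 32)) < 0.4          # 40% of k-space acquired
k_meas = np.fft.fft2(x_true) * mask        # undersampled measurements

x_net = np.zeros((32, 32))                 # a deliberately poor estimate
x_dc = data_consistency(x_net, k_meas, mask)
```

Even with a useless network output, the data-consistency step pulls the reconstruction strictly closer to the true image, which is why it is a standard building block in unrolled reconstruction networks.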
Affiliation(s)
- Shanshan Wang
- Paul C Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Ruoyou Wu
- Paul C Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Sen Jia
- Paul C Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Alou Diakite
- Paul C Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
- Cheng Li
- Paul C Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Qiegen Liu
- Department of Electronic Information Engineering, Nanchang University, Nanchang, China
- Hairong Zheng
- Paul C Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Leslie Ying
- Department of Biomedical Engineering and Department of Electrical Engineering, The State University of New York, Buffalo, New York, USA
4
Chen X, Qiu RLJ, Peng J, Shelton JW, Chang CW, Yang X, Kesarwala AH. CBCT-based synthetic CT image generation using a diffusion model for CBCT-guided lung radiotherapy. Med Phys 2024. PMID: 39088750. DOI: 10.1002/mp.17328.
Abstract
BACKGROUND Although cone beam computed tomography (CBCT) has lower resolution than planning CT (pCT), its lower dose, higher high-contrast resolution, and shorter scanning time support its widespread use in clinical applications, especially in ensuring accurate patient positioning during image-guided radiation therapy (IGRT).
PURPOSE While CBCT is critical to IGRT, CBCT image quality can be compromised by severe streak and scattering artifacts, and tumor movement secondary to respiratory motion further decreases CBCT resolution. To improve CBCT image quality, we propose a Lung Diffusion Model (L-DM) framework.
METHODS The proposed algorithm is a conditional diffusion model trained on pCT and deformed CBCT (dCBCT) image pairs to synthesize lung CT images from dCBCT images and benefit CBCT-based radiotherapy, with the dCBCT images used as the constraint for the L-DM. The image quality and Hounsfield unit (HU) values of the synthetic CT (sCT) images generated by the proposed L-DM were compared to those of three mainstream generative models.
RESULTS We verified our model on both an institutional lung cancer dataset and a public dataset. The L-DM showed significant improvement in four metrics: mean absolute error (MAE), peak signal-to-noise ratio (PSNR), normalized cross-correlation (NCC), and structural similarity index measure (SSIM). On our institutional dataset, the L-DM decreased the MAE from 101.47 to 37.87 HU and increased the PSNR from 24.97 to 29.89 dB, the NCC from 0.81 to 0.97, and the SSIM from 0.80 to 0.93. On the public dataset, it decreased the MAE from 173.65 to 58.95 HU while increasing the PSNR, NCC, and SSIM from 13.07 to 24.05 dB, 0.68 to 0.94, and 0.41 to 0.88, respectively.
CONCLUSIONS The proposed L-DM significantly improved sCT image quality compared to the pre-correction CBCT and three mainstream generative models. It can benefit CBCT-based IGRT and other potential clinical applications, as it increases HU accuracy and decreases artifacts relative to the input CBCT images.
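Three of the four metrics reported in this entry can be written out directly; SSIM is omitted since it requires windowed statistics. The toy HU images and the chosen data range below are assumptions, purely to show how MAE (in HU), PSNR (in dB), and NCC are computed.

```python
import numpy as np

def mae(a, b):
    """Mean absolute error; in HU when inputs are CT images."""
    return float(np.mean(np.abs(a - b)))

def psnr(a, b, data_range=2000.0):
    """Peak signal-to-noise ratio in dB over an assumed HU range."""
    mse = np.mean((a - b) ** 2)
    return float(10 * np.log10(data_range ** 2 / mse))

def ncc(a, b):
    """Normalized cross-correlation of mean-centered images."""
    a0, b0 = a - a.mean(), b - b.mean()
    return float(np.sum(a0 * b0) / (np.linalg.norm(a0) * np.linalg.norm(b0)))

rng = np.random.default_rng(3)
ct = rng.uniform(-1000, 1000, size=(64, 64))   # stand-in "planning CT" in HU
sct = ct + rng.normal(0, 30, size=(64, 64))    # synthetic CT with HU error
```

With a 30 HU error level, `mae` lands in the tens of HU and `ncc` stays close to 1, the same regimes as the figures quoted above.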
Affiliation(s)
- Xiaoqian Chen
- Department of Radiation Oncology, Winship Cancer Institute, Emory University School of Medicine, Atlanta, Georgia, USA
- Richard L J Qiu
- Department of Radiation Oncology, Winship Cancer Institute, Emory University School of Medicine, Atlanta, Georgia, USA
- Junbo Peng
- Department of Radiation Oncology, Winship Cancer Institute, Emory University School of Medicine, Atlanta, Georgia, USA
- Joseph W Shelton
- Department of Radiation Oncology, Winship Cancer Institute, Emory University School of Medicine, Atlanta, Georgia, USA
- Chih-Wei Chang
- Department of Radiation Oncology, Winship Cancer Institute, Emory University School of Medicine, Atlanta, Georgia, USA
- Xiaofeng Yang
- Department of Radiation Oncology, Winship Cancer Institute, Emory University School of Medicine, Atlanta, Georgia, USA
- Aparna H Kesarwala
- Department of Radiation Oncology, Winship Cancer Institute, Emory University School of Medicine, Atlanta, Georgia, USA
5
Touati R, Trung Le W, Kadoury S. Multi-planar dual adversarial network based on dynamic 3D features for MRI-CT head and neck image synthesis. Phys Med Biol 2024;69:155012. PMID: 38981593. DOI: 10.1088/1361-6560/ad611a.
Abstract
Objective. Head and neck radiotherapy planning requires electron densities from different tissues for dose calculation. Dose calculation from imaging modalities such as MRI remains an unsolved problem, since this imaging modality does not provide information about electron density. Approach. We propose a generative adversarial network (GAN) approach that synthesizes CT (sCT) images from T1-weighted MRI acquisitions in head and neck cancer patients. Our contribution is to exploit new features relevant to multimodal image synthesis and thereby improve the quality of the generated CT images. More precisely, we propose a dual-branch generator based on the U-Net architecture and an augmented multi-planar branch. The augmented branch learns specific 3D dynamic features, which describe the dynamic image shape variations and are extracted from different viewpoints of the volumetric input MRI. The architecture of the proposed model relies on an end-to-end convolutional U-Net embedding network. Results. The proposed model achieves a mean absolute error (MAE) of 18.76 ± 5.167 in the target Hounsfield unit (HU) space on sagittal head and neck acquisitions, with a mean structural similarity (MSSIM) of 0.95 ± 0.09 and a Fréchet inception distance (FID) of 145.60 ± 8.38. The model yields a MAE of 26.83 ± 8.27 when generating specific primary tumor regions on axial patient acquisitions, with a Dice score of 0.73 ± 0.06 and a FID of 122.58 ± 7.55. The improvement of our model over other state-of-the-art GAN approaches is 3.8% on a tumor test set. On both sagittal and axial acquisitions, the model yields the best peak signal-to-noise ratios of 27.89 ± 2.22 and 26.08 ± 2.95 when synthesizing CT from MRI input. Significance. The proposed model synthesizes both sagittal and axial CT tumor images used for radiotherapy treatment planning in head and neck cancer cases. The performance analysis across different imaging metrics and evaluation strategies demonstrates the effectiveness of our dual CT synthesis model in producing high-quality sCT images compared to other state-of-the-art approaches. Our model could improve clinical tumor analysis, although further clinical validation remains to be explored.
Affiliation(s)
- Redha Touati
- MedICAL Laboratory, Polytechnique Montreal, Montreal, QC, Canada
- William Trung Le
- MedICAL Laboratory, Polytechnique Montreal, Montreal, QC, Canada
- Samuel Kadoury
- MedICAL Laboratory, Polytechnique Montreal, Montreal, QC, Canada
- CHUM Research Center, Montreal, QC, Canada
6
Chaudhary MFA, Gerard SE, Christensen GE, Cooper CB, Schroeder JD, Hoffman EA, Reinhardt JM. LungViT: Ensembling Cascade of Texture Sensitive Hierarchical Vision Transformers for Cross-Volume Chest CT Image-to-Image Translation. IEEE Trans Med Imaging 2024;43:2448-2465. PMID: 38373126. PMCID: PMC11227912. DOI: 10.1109/tmi.2024.3367321.
Abstract
Chest computed tomography (CT) at inspiration is often complemented by an expiratory CT to identify peripheral airways disease, and co-registered inspiratory-expiratory volumes can be used to derive various markers of lung function. Expiratory CT scans, however, may not be acquired due to dose or scan-time considerations, or may be inadequate due to motion or insufficient exhale, leading to a missed opportunity to evaluate underlying small airways disease. Here, we propose LungViT, a generative adversarial learning approach using hierarchical vision transformers for translating inspiratory CT intensities into corresponding expiratory CT intensities. LungViT addresses several limitations of traditional generative models, including slicewise discontinuities, limited size of generated volumes, and the inability to model texture transfer at the volumetric level. We propose a shifted-window hierarchical vision transformer architecture with squeeze-and-excitation decoder blocks for modeling dependencies between features, together with a multiview texture similarity distance metric for texture and style transfer in 3D. To incorporate global information into the training process and refine the output of our model, we use ensemble cascading. LungViT is able to generate large 3D volumes of size 320 × 320 × 320. We train and validate our model using a diverse cohort of 1500 subjects with varying disease severity and, to assess generalizability beyond development-set biases, evaluate it on an out-of-distribution external validation set of 200 subjects. Clinical validation on internal and external testing sets shows that the synthetic volumes can be reliably adopted for deriving clinical endpoints of chronic obstructive pulmonary disease.
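Texture and style losses of this kind are commonly built on Gram matrices of feature maps, which summarize channel co-activation statistics independent of spatial layout. The single-view sketch below is an illustrative stand-in for the paper's multiview texture similarity metric, with all shapes and features assumed.

```python
import numpy as np

def gram(features):
    """Gram matrix of a (channels, H, W) feature map: channel
    co-activation statistics, the usual basis of texture losses."""
    c = features.reshape(features.shape[0], -1)   # channels x pixels
    return c @ c.T / c.shape[1]

def texture_distance(f_a, f_b):
    """Frobenius distance between Gram matrices of two feature maps."""
    return float(np.linalg.norm(gram(f_a) - gram(f_b)))

rng = np.random.default_rng(4)
f = rng.random((8, 16, 16))                       # 8 assumed feature channels
d_same = texture_distance(f, f)                   # identical texture
d_diff = texture_distance(f, rng.random((8, 16, 16)))
```

A multiview variant would evaluate such a distance over feature maps extracted from several anatomical planes and aggregate them, which is the spirit of the metric named above.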
7
Meng X, Sun K, Xu J, He X, Shen D. Multi-Modal Modality-Masked Diffusion Network for Brain MRI Synthesis With Random Modality Missing. IEEE Trans Med Imaging 2024;43:2587-2598. PMID: 38393846. DOI: 10.1109/tmi.2024.3368664.
Abstract
Synthesis of unavailable imaging modalities from available ones can generate modality-specific complementary information and enable multi-modality-based medical image diagnosis and treatment. Existing generative methods for medical image synthesis are usually based on cross-modal translation between acquired and missing modalities. These methods are typically dedicated to a specific missing modality and perform synthesis in one shot, so they can neither flexibly handle a varying number of missing modalities nor effectively construct the mapping across modalities. To address these issues, we propose a unified Multi-modal Modality-masked Diffusion Network (M2DN), tackling multi-modal synthesis from the perspective of "progressive whole-modality inpainting" instead of "cross-modal translation". Specifically, M2DN treats the missing modalities as random noise and takes all modalities as a unity in each reverse diffusion step. The proposed joint synthesis scheme performs synthesis for the missing modalities and self-reconstruction for the available ones, which not only enables synthesis under arbitrary missing scenarios but also facilitates the construction of a common latent space and enhances the model's representation ability. Besides, we introduce a modality-mask scheme that explicitly encodes the availability status of each incoming modality in a binary mask, which is adopted as the condition for the diffusion model to further enhance synthesis performance under arbitrary missing scenarios. We carry out experiments on two public brain MRI datasets for synthesis and downstream segmentation tasks. Experimental results demonstrate that M2DN significantly outperforms state-of-the-art models and generalizes well to arbitrary missing modalities.
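The modality-mask idea can be sketched in a few lines: all modalities are stacked, missing channels are replaced by noise, and a binary availability mask is kept aside as the condition. The two-modality setup, the variable names, and the shapes are assumptions, not the paper's implementation.

```python
import numpy as np

# Sketch of the modality-mask scheme: missing modalities become random
# noise in the stacked input, and a binary mask records availability.
rng = np.random.default_rng(5)
t1 = rng.random((32, 32))     # an acquired modality (assumed shape)
flair = None                  # this modality is missing

modalities = [t1, flair]
mask = np.array([m is not None for m in modalities])       # binary condition
stack = np.stack([m if m is not None else rng.normal(size=(32, 32))
                  for m in modalities])                    # noise fills gaps

# A diffusion model would now denoise `stack` conditioned on `mask`,
# synthesizing the missing channel and self-reconstructing the other.
```

Because the mask, not the architecture, encodes which channels are missing, the same network handles any missing-modality pattern, which is the flexibility the abstract emphasizes.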
8
Rousta F, Esteki A, Shalbaf A, Sadeghi A, Moghadam PK, Voshagh A. Application of artificial intelligence in pancreas endoscopic ultrasound imaging: a systematic review. Comput Methods Programs Biomed 2024;250:108205. PMID: 38703435. DOI: 10.1016/j.cmpb.2024.108205.
Abstract
The pancreas is a vital organ of the digestive system with significant health implications. Given the high mortality rate of pancreatic malignancies, it is imperative to evaluate and identify malignant pancreatic lesions promptly. Endoscopic ultrasound (EUS) is a non-invasive, precise technique for detecting pancreatic disorders, but it is highly operator-dependent. Artificial intelligence (AI), including traditional machine learning (ML) and deep learning (DL) techniques, can play a pivotal role in enhancing the performance of EUS regardless of operator, performing critical functions in the detection, classification, and segmentation of medical images. AI-assisted systems have improved the accuracy and productivity of pancreatic analysis, including the detection of diverse pancreatic disorders (e.g., pancreatitis, masses, and cysts) as well as landmarks and parenchyma. This systematic review examines the rapidly developing domain of AI-assisted systems in EUS of the pancreas, presenting a thorough study of the current research status and developments in this area. The paper explores the significant challenges of AI-assisted pancreas EUS imaging, highlights the potential of AI techniques in addressing these challenges, and suggests the scope for future research on AI-assisted EUS systems.
Affiliation(s)
- Fatemeh Rousta
- Department of Biomedical Engineering and Physics, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Ali Esteki
- Department of Biomedical Engineering and Physics, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Ahmad Shalbaf
- Department of Biomedical Engineering and Physics, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Amir Sadeghi
- Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Pardis Ketabi Moghadam
- Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Ardalan Voshagh
- Faculty of Electrical Engineering, Shahid Beheshti University, Tehran, Iran
9
Hu Y, Zhou H, Cao N, Li C, Hu C. Synthetic CT generation based on CBCT using improved vision transformer CycleGAN. Sci Rep 2024;14:11455. PMID: 38769329. PMCID: PMC11106312. DOI: 10.1038/s41598-024-61492-7.
Abstract
Cone-beam computed tomography (CBCT) is a crucial component of adaptive radiation therapy; however, it frequently suffers from artifacts and noise, significantly constraining its clinical utility. While CycleGAN is a widely employed method for CT image synthesis, it captures global features inadequately. To tackle these challenges, we introduce a refined unsupervised learning model, improved vision transformer CycleGAN (IViT-CycleGAN). First, we integrate a U-Net framework that builds upon ViT. Next, we augment the feed-forward neural network by incorporating deep convolutional networks. Finally, we stabilize model training by introducing a gradient penalty and adding a further loss term to the generator loss. Experiments demonstrate from multiple perspectives that the synthetic CT (sCT) generated by our model has significant advantages over other unsupervised learning models, validating the clinical applicability and robustness of our approach. In future clinical practice, our model could assist clinical practitioners in formulating precise radiotherapy plans.
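The cycle-consistency term that CycleGAN-style models such as this one rely on can be sketched with placeholder generators: translating CBCT to CT and back should return the input, and the L1 deviation from that is penalized. The two intensity maps below are purely illustrative assumptions, not the ViT generators.

```python
import numpy as np

def g_cbct_to_ct(x):
    """Placeholder forward generator (illustrative affine map)."""
    return 1.1 * x - 0.05

def g_ct_to_cbct(x):
    """Placeholder reverse generator, the exact inverse of the above."""
    return (x + 0.05) / 1.1

def cycle_loss(x):
    """L1 cycle-consistency: CBCT -> CT -> CBCT should recover x."""
    return float(np.mean(np.abs(g_ct_to_cbct(g_cbct_to_ct(x)) - x)))

rng = np.random.default_rng(6)
cbct = rng.random((32, 32))
loss = cycle_loss(cbct)   # near zero: the toy generators invert each other
```

In training, this term is combined with the adversarial losses (and, in IViT-CycleGAN, the gradient penalty and extra generator term) so that unpaired CBCT and CT volumes still constrain each other.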
Affiliation(s)
- Yuxin Hu
- School of Computer and Software, Hohai University, Nanjing, 211100, China
- Han Zhou
- School of Electronic Science and Engineering, Nanjing University, Nanjing, 210046, China
- Department of Radiation Oncology, The Fourth Affiliated Hospital of Nanjing Medical University, Nanjing, 210013, China
- Ning Cao
- School of Computer and Software, Hohai University, Nanjing, 211100, China
- Can Li
- Engineering Research Center of TCM Intelligence Health Service, School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing, 210023, China
- Can Hu
- School of Computer and Software, Hohai University, Nanjing, 211100, China
10
Chen C, Chen Y, Li X, Ning H, Xiao R. Linear semantic transformation for semi-supervised medical image segmentation. Comput Biol Med 2024;173:108331. PMID: 38522252. DOI: 10.1016/j.compbiomed.2024.108331.
Abstract
Medical image segmentation is a research focus and a foundation for developing intelligent medical systems. Deep learning has become the standard approach for medical image segmentation and has succeeded significantly, advancing disease diagnosis, reconstruction, and surgical planning. However, semantic learning is often inefficient owing to the lack of supervision of feature maps, so high-quality segmentation models usually rely on numerous, accurate data annotations, and learning robust semantic representations in latent spaces remains a challenge. In this paper, we propose a novel semi-supervised learning framework that learns vital attributes of medical images, constructing generalized representations from diverse semantics to realize medical image segmentation. We first build a self-supervised learning part that achieves context recovery by reconstructing the space and intensity of medical images, which provides semantic representation for feature maps. Subsequently, we combine the semantic-rich feature maps and apply a simple linear semantic transformation to convert them into image segmentations. The proposed framework was tested on five medical segmentation datasets. Quantitative assessments show that our method attains the highest scores on the IXI (73.78%), ScaF (47.50%), COVID-19-Seg (50.72%), PC-Seg (65.06%), and Brain-MR (72.63%) datasets. Finally, we compared our method with the latest semi-supervised learning methods and obtained DSC values of 77.15% and 75.22%, respectively, ranking first on two representative datasets. The experimental results show that the proposed linear semantic transformation applies effectively to medical image segmentation and is simple and easy to use for pursuing robust segmentation in semi-supervised learning. Our code is available at: https://github.com/QingYunA/Linear-Semantic-Transformation-for-Semi-Supervised-Medical-Image-Segmentation.
Affiliation(s)
- Cheng Chen
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
- Yunqing Chen
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
- Xiaoheng Li
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
- Huansheng Ning
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China
- Ruoxiu Xiao
- School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 100083, China; Shunde Innovation School, University of Science and Technology Beijing, Foshan, 100024, China
11
Dalmaz O, Mirza MU, Elmas G, Ozbey M, Dar SUH, Ceyani E, Oguz KK, Avestimehr S, Çukur T. One model to unite them all: Personalized federated learning of multi-contrast MRI synthesis. Med Image Anal 2024;94:103121. PMID: 38402791. DOI: 10.1016/j.media.2024.103121.
Abstract
Curation of large, diverse MRI datasets via multi-institutional collaborations can help improve learning of generalizable synthesis models that reliably translate source- onto target-contrast images. To facilitate collaborations, federated learning (FL) adopts decentralized model training while mitigating privacy concerns by avoiding sharing of imaging data. However, conventional FL methods can be impaired by the inherent heterogeneity in the data distribution, with domain shifts evident within and across imaging sites. Here we introduce the first personalized FL method for MRI Synthesis (pFLSynth) that improves reliability against data heterogeneity via model specialization to individual sites and synthesis tasks (i.e., source-target contrasts). To do this, pFLSynth leverages an adversarial model equipped with novel personalization blocks that control the statistics of generated feature maps across the spatial/channel dimensions, given latent variables specific to sites and tasks. To further promote communication efficiency and site specialization, partial network aggregation is employed over later generator stages while earlier generator stages and the discriminator are trained locally. As such, pFLSynth enables multi-task training of multi-site synthesis models with high generalization performance across sites and tasks. Comprehensive experiments demonstrate the superior performance and reliability of pFLSynth in MRI synthesis against prior federated methods.
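The partial network aggregation described here can be sketched as averaging only a shared subset of parameters across sites while the rest stays local. The parameter names, the two-site setup, and the dict-based model representation below are assumptions for illustration, not the pFLSynth implementation.

```python
import numpy as np

def partial_aggregate(site_models, shared_keys):
    """Average only the shared parameters (e.g. later generator stages)
    across sites; all other parameters remain site-specific."""
    averaged = {k: np.mean([m[k] for m in site_models], axis=0)
                for k in shared_keys}
    return [{**m, **averaged} for m in site_models]

# Two toy sites: "early" stays local, "late" is federated.
site_a = {"early": np.array([1.0]), "late": np.array([0.0])}
site_b = {"early": np.array([5.0]), "late": np.array([2.0])}
site_a, site_b = partial_aggregate([site_a, site_b], shared_keys=["late"])
```

After one round, both sites share the averaged "late" weights while their "early" weights are untouched, mirroring the split between globally aggregated later generator stages and locally trained earlier stages and discriminators.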
Affiliation(s)
- Onat Dalmaz
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey
- Muhammad U Mirza
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey
- Gokberk Elmas
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey
- Muzaffer Ozbey
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey
- Salman U H Dar
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey
- Emir Ceyani
- Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA 90089, USA
- Kader K Oguz
- Department of Radiology, University of California, Davis Medical Center, Sacramento, CA 95817, USA
- Salman Avestimehr
- Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA 90089, USA
- Tolga Çukur
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey; Neuroscience Program, Bilkent University, Ankara 06800, Turkey
Collapse
|
12
|
Fan M, Cao X, Lü F, Xie S, Yu Z, Chen Y, Lü Z, Li L. Generative adversarial network-based synthesis of contrast-enhanced MR images from precontrast images for predicting histological characteristics in breast cancer. Phys Med Biol 2024;69:095002. [PMID: 38537294] [DOI: 10.1088/1361-6560/ad3889]
Abstract
Objective. Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is a sensitive tool for assessing breast cancer by analyzing tumor blood flow, but it requires gadolinium-based contrast agents, which carry risks such as brain retention and astrocyte migration. Contrast-free MRI is thus preferable for patients with renal impairment or who are pregnant. This study aimed to investigate the feasibility of generating contrast-enhanced MR images from precontrast images and to evaluate the potential use of synthetic images in diagnosing breast cancer. Approach. This retrospective study included 322 women with invasive breast cancer who underwent preoperative DCE-MRI. A generative adversarial network (GAN)-based postcontrast image synthesis (GANPIS) model with perceptual loss was proposed to generate contrast-enhanced MR images from precontrast images. The quality of the synthesized images was evaluated using the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). The diagnostic performance of the generated images was assessed using a convolutional neural network to predict Ki-67 expression, the luminal A subtype, and histological grade, with the area under the receiver operating characteristic curve (AUC). The patients were divided into training (n = 200), validation (n = 60), and testing (n = 62) sets. Main results. Quantitative analysis revealed strong agreement between the generated and real postcontrast images in the test set, with PSNR and SSIM values of 36.210 ± 2.670 and 0.988 ± 0.006, respectively. The generated postcontrast images achieved AUCs of 0.918 ± 0.018, 0.842 ± 0.028, and 0.815 ± 0.019 for predicting the Ki-67 expression level, histological grade, and luminal A subtype, respectively. These results represent a significant improvement over the use of precontrast images alone, which achieved AUCs of 0.764 ± 0.031, 0.741 ± 0.035, and 0.797 ± 0.021, respectively. Significance. This study proposed a GAN-based MR image synthesis method for breast cancer that generates postcontrast images from precontrast images, allowing contrast-free images to simulate kinetic features for improved diagnosis.
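Several synthesis studies in this list report PSNR (together with SSIM) as an image-quality metric. For reference, PSNR reduces to a few lines of code; this is a minimal sketch, with the `psnr` helper name and the `data_range` default being illustrative choices rather than anything taken from the cited papers.

```python
import numpy as np

def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images;
    higher values mean the test image is closer to the reference."""
    diff = np.asarray(reference, dtype=np.float64) - np.asarray(test, dtype=np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

# A uniform error of 0.01 on a [0, 1]-range image gives MSE = 1e-4,
# i.e. approximately 40 dB.
a = np.zeros((8, 8))
print(psnr(a, a + 0.01))  # ≈ 40.0
```

Library implementations (e.g. `skimage.metrics.peak_signal_noise_ratio`) follow the same formula; the `data_range` argument matters because medical images are often not normalized to [0, 1].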
Affiliation(s)
- Ming Fan: Institute of Intelligent Biomedicine, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, People's Republic of China
- Xuan Cao: Institute of Intelligent Biomedicine, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, People's Republic of China
- Fuqing Lü: Institute of Intelligent Biomedicine, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, People's Republic of China
- Sangma Xie: Institute of Intelligent Biomedicine, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, People's Republic of China
- Zhou Yu: Institute of Intelligent Biomedicine, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, People's Republic of China
- Yuanlin Chen: Institute of Intelligent Biomedicine, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, People's Republic of China
- Zhong Lü: Affiliated Dongyang Hospital of Wenzhou Medical University, People's Republic of China
- Lihua Li: Institute of Intelligent Biomedicine, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, People's Republic of China

13
Jiang M, Wang S, Song Z, Song L, Wang Y, Zhu C, Zheng Q. Cross2SynNet: cross-device cross-modal synthesis of routine brain MRI sequences from CT with brain lesion. MAGMA 2024;37:241-256. [PMID: 38315352] [DOI: 10.1007/s10334-023-01145-4]
Abstract
OBJECTIVES CT and MR are often both needed to determine the location and extent of brain lesions and thereby improve diagnosis. However, patients with acute brain disease often cannot complete an MRI examination within a short time. The aim of this study was to devise a cross-device, cross-modal medical image synthesis (MIS) method, Cross2SynNet, for synthesizing the routine brain MRI sequences T1WI, T2WI, FLAIR, and DWI from CT in patients with stroke and brain tumors. MATERIALS AND METHODS For this retrospective study, participants covered four diseases: cerebral ischemic stroke (CIS cohort), cerebral hemorrhage (CH cohort), meningioma (M cohort), and glioma (G cohort). The MIS model Cross2SynNet was built on the basic architecture of a conditional generative adversarial network (CGAN), in which a fully convolutional Transformer (FCT) module was adopted in the generator to capture the short- and long-range dependencies between healthy and pathological tissues, and an edge loss function was used to minimize the difference in gradient magnitude between the synthetic image and the ground truth. Three metrics, mean square error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM), were used for evaluation. RESULTS A total of 230 participants (mean patient age, 59.77 years ± 13.63 [standard deviation]; 163 men [71%] and 67 women [29%]) were included: the CIS cohort (95 participants, Dec 2019 to Feb 2022), CH cohort (69 participants, Jan 2020 to Dec 2021), M cohort (40 participants, Sep 2018 to Dec 2021), and G cohort (26 participants, Sep 2019 to Dec 2021). Cross2SynNet achieved average values of MSE = 0.008, PSNR = 21.728, and SSIM = 0.758 when synthesizing MRI from CT, outperforming CycleGAN, pix2pix, RegGAN, Pix2PixHD, and ResViT. Cross2SynNet could synthesize the brain lesion on pseudo-DWI even when the CT image did not exhibit a clear signal in acute ischemic stroke patients. CONCLUSIONS Cross2SynNet achieved routine brain MRI synthesis of T1WI, T2WI, FLAIR, and DWI from CT with promising performance in the presence of brain lesions from stroke and brain tumor.
Affiliation(s)
- Minbo Jiang: School of Computer and Control Engineering, Yantai University, No 30, Qingquan Road, Laishan District, Yantai, 264005, Shandong, China
- Shuai Wang: Department of Radiology, Binzhou Medical University Hospital, Binzhou, 256603, China
- Zhiwei Song: School of Computer and Control Engineering, Yantai University, No 30, Qingquan Road, Laishan District, Yantai, 264005, Shandong, China
- Limei Song: School of Medical Imaging, Weifang Medical University, Weifang, 261000, China
- Yi Wang: School of Computer and Control Engineering, Yantai University, No 30, Qingquan Road, Laishan District, Yantai, 264005, Shandong, China
- Chuanzhen Zhu: School of Computer and Control Engineering, Yantai University, No 30, Qingquan Road, Laishan District, Yantai, 264005, Shandong, China
- Qiang Zheng: School of Computer and Control Engineering, Yantai University, No 30, Qingquan Road, Laishan District, Yantai, 264005, Shandong, China

14
Li Y, Shao HC, Liang X, Chen L, Li R, Jiang S, Wang J, Zhang Y. Zero-Shot Medical Image Translation via Frequency-Guided Diffusion Models. IEEE Trans Med Imaging 2024;43:980-993. [PMID: 37851552] [PMCID: PMC11000254] [DOI: 10.1109/tmi.2023.3325703]
Abstract
Recently, the diffusion model has emerged as a superior generative model that can produce high-quality, realistic images. However, for medical image translation, existing diffusion models are deficient in accurately retaining structural information, since the structural details of source-domain images are lost during the forward diffusion process and cannot be fully recovered through the learned reverse diffusion, while the integrity of anatomical structures is extremely important in medical images. For instance, errors in image translation may distort, shift, or even remove structures and tumors, leading to incorrect diagnoses and inadequate treatment. Training and conditioning diffusion models on paired source and target images with matching anatomy can help. However, such paired data are very difficult and costly to obtain, and may also reduce the robustness of the developed model to out-of-distribution test data. We propose a frequency-guided diffusion model (FGDM) that employs frequency-domain filters to guide the diffusion model toward structure-preserving image translation. By design, FGDM allows zero-shot learning: it can be trained solely on data from the target domain and used directly for source-to-target domain translation without any exposure to source-domain data during training. We evaluated it on three cone-beam CT (CBCT)-to-CT translation tasks for different anatomical sites and a cross-institutional MR imaging translation task. FGDM outperformed state-of-the-art methods (GAN-based, VAE-based, and diffusion-based) on the metrics of Fréchet inception distance (FID), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM), showing its significant advantages in zero-shot medical image translation.
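The frequency-domain guidance this abstract relies on can be illustrated with an ordinary low-pass filter: keeping only the low-frequency band of an image's 2D spectrum retains coarse structure while discarding fine detail. This is a generic sketch of such a filter, not the FGDM implementation; the function name and `keep_fraction` parameter are illustrative assumptions.

```python
import numpy as np

def low_pass(image, keep_fraction=0.1):
    """Keep only the central (low-frequency) band of the centered 2D
    spectrum and discard the rest, then transform back to image space."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))  # DC moved to center
    h, w = spectrum.shape
    mask = np.zeros((h, w))
    ch, cw = int(h * keep_fraction), int(w * keep_fraction)
    mask[h // 2 - ch:h // 2 + ch + 1, w // 2 - cw:w // 2 + cw + 1] = 1.0
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * mask)))

# A constant image is pure DC (zero frequency), so it passes through intact.
flat = np.ones((8, 8))
filtered = low_pass(flat)
```

The same masking idea, applied during the diffusion process rather than as a one-off filter, is what lets frequency guidance preserve anatomy while the model is free to alter high-frequency appearance.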
15
Kaleta J, Dall'Alba D, Płotka S, Korzeniowski P. Minimal data requirement for realistic endoscopic image generation with Stable Diffusion. Int J Comput Assist Radiol Surg 2024;19:531-539. [PMID: 37934401] [PMCID: PMC10881618] [DOI: 10.1007/s11548-023-03030-w]
Abstract
PURPOSE Computer-assisted surgical systems provide support information to the surgeon, which can improve the execution and overall outcome of the procedure. These systems are based on deep learning models that are trained on complex and challenging-to-annotate data. Generating synthetic data can overcome these limitations, but it is necessary to reduce the domain gap between real and synthetic data. METHODS We propose a method for image-to-image translation based on a Stable Diffusion model, which generates realistic images starting from synthetic data. Compared to previous works, the proposed method is better suited for clinical application, as it requires a much smaller amount of input data and allows finer control over the generation of details by introducing different variants of supporting control networks. RESULTS The proposed method is applied in the context of laparoscopic cholecystectomy, using synthetic and real data from public datasets. It achieves a mean Intersection over Union of 69.76%, significantly improving on the baseline results (69.76% vs. 42.21%). CONCLUSIONS The proposed method for translating synthetic images into images with realistic characteristics will enable the training of deep learning methods that can generalize optimally to real-world contexts, thereby improving computer-assisted intervention guidance systems.
Affiliation(s)
- Joanna Kaleta: Sano Centre for Computational Medicine, Krakow, Poland
- Diego Dall'Alba: Sano Centre for Computational Medicine, Krakow, Poland; Department of Computer Science, University of Verona, Verona, Italy
- Szymon Płotka: Sano Centre for Computational Medicine, Krakow, Poland; Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands; Department of Biomedical Engineering and Physics, Amsterdam University Medical Center, Amsterdam, The Netherlands

16
Wang S, Wu J, Chen M, Huang S, Huang Q. Balanced transformer: efficient classification of glioblastoma and primary central nervous system lymphoma. Phys Med Biol 2024;69:045032. [PMID: 38232389] [DOI: 10.1088/1361-6560/ad1f88]
Abstract
Objective. Primary central nervous system lymphoma (PCNSL) and glioblastoma (GBM) are malignant primary brain tumors with different biological characteristics, and their treatment strategies differ greatly. Accurately distinguishing PCNSL from GBM before surgery is therefore very important for guiding neurosurgery. At present, cerebrospinal fluid is commonly collected from patients to look for tumor markers, but this method not only causes secondary injury to patients, it can also delay treatment. Although diagnosis from radiological images is non-invasive, the morphological and texture features of the two tumors on magnetic resonance imaging (MRI) are quite similar, making them very difficult to distinguish by eye. To address the insufficient number of samples and the sample imbalance, we used data augmentation and balanced sample sampling. Conventional Transformer networks use patch segmentation to divide images into small patches, but the lack of communication between patches leads to unbalanced data layers. Approach. To address this problem, we propose a balanced patch embedding approach that extracts high-level semantic information by reducing the feature dimensionality while maintaining the geometric invariance of the features, balancing the interactions between the information and improving the representativeness of the data. To further address the imbalance problem, a balanced patch partition method is proposed that increases the receptive field by sampling the four corners of the sliding window and introducing a linear encoding component without increasing the computational effort, and a new balanced loss function is designed. Main results. Benefiting from this overall balanced design, the Balanced Transformer obtained an accuracy of 99.89%, sensitivity of 99.74%, specificity of 99.73%, and AUC of 99.19%, far higher than previous results (accuracy of 89.6-96.8%, sensitivity of 74.3-91.3%, specificity of 88.9-96.02%, and AUC of 87.8-94.9%). Significance. This approach can accurately distinguish PCNSL from GBM before surgery. Because GBM is a common type of malignant tumor, even a 1% improvement in accuracy benefits many patients and considerably reduces treatment times, providing doctors with a good basis for auxiliary diagnosis.
Affiliation(s)
- Shigang Wang: Department of Electronic Engineering, College of Communication Engineering, Jilin University, Changchun 130012, People's Republic of China
- Jinyang Wu: Department of Electronic Engineering, College of Communication Engineering, Jilin University, Changchun 130012, People's Republic of China
- Meimei Chen: Department of Electronic Engineering, College of Communication Engineering, Jilin University, Changchun 130012, People's Republic of China
- Sa Huang: Department of Radiology, the Second Hospital of Jilin University, Changchun 130012, People's Republic of China
- Qian Huang: Department of Radiology, the Second Hospital of Jilin University, Changchun 130012, People's Republic of China

17
Shao L, Chen B, Zhang Z, Zhang Z, Chen X. Artificial intelligence generated content (AIGC) in medicine: A narrative review. Math Biosci Eng 2024;21:1672-1711. [PMID: 38303483] [DOI: 10.3934/mbe.2024073]
Abstract
Recently, artificial intelligence generated content (AIGC) has been receiving increased attention and is growing exponentially. AIGC is generated by generative artificial intelligence (AI) models based on the intentional information extracted from human-provided instructions, and it can quickly and automatically produce large amounts of high-quality content. Medicine currently faces a shortage of medical resources and complex medical procedures, problems that AIGC's characteristics can help alleviate. As a result, the application of AIGC in medicine has gained increased attention in recent years. This paper therefore provides a comprehensive review of recent studies involving AIGC in medicine. First, we present an overview of AIGC. Then, based on recent studies, the application of AIGC in medicine is reviewed from two aspects: medical image processing and medical text generation. The basic generative AI models, tasks, target organs, datasets, and contributions of the studies are summarized. Finally, we discuss the limitations and challenges faced by AIGC and propose possible solutions based on relevant studies. We hope this review helps readers understand the potential of AIGC in medicine and obtain innovative ideas in this field.
Affiliation(s)
- Liangjing Shao: Academy for Engineering & Technology, Fudan University, Shanghai 200433, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
- Benshuang Chen: Academy for Engineering & Technology, Fudan University, Shanghai 200433, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
- Ziqun Zhang: Information Office, Fudan University, Shanghai 200032, China
- Zhen Zhang: Baoshan Branch of Ren Ji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200444, China
- Xinrong Chen: Academy for Engineering & Technology, Fudan University, Shanghai 200433, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China

18
Schaudt D, Späte C, von Schwerin R, Reichert M, von Schwerin M, Beer M, Kloth C. A Critical Assessment of Generative Models for Synthetic Data Augmentation on Limited Pneumonia X-ray Data. Bioengineering (Basel) 2023;10:1421. [PMID: 38136012] [PMCID: PMC10741143] [DOI: 10.3390/bioengineering10121421]
Abstract
In medical imaging, deep learning models serve as invaluable tools for expediting diagnoses and aiding specialized medical professionals in making clinical decisions. However, effectively training deep learning models typically necessitates substantial quantities of high-quality data, a resource often lacking in medical imaging scenarios. One way to overcome this deficiency is to generate such images artificially. In this comparative study, we therefore train five generative models to artificially increase the amount of available data in such a scenario. This synthetic-data approach is evaluated on a downstream classification task: predicting four causes of pneumonia as well as healthy cases on 1082 chest X-ray images. Quantitative and medical assessments show that a generative adversarial network (GAN)-based approach significantly outperforms more recent diffusion-based approaches on this limited dataset, with better image quality and pathological plausibility. By evaluating five different classification models and varying the amount of additional training data, we show that better image quality surprisingly does not translate into improved classification performance. Class-specific metrics such as precision, recall, and F1-score improve substantially with synthetic images, emphasizing the rebalancing effect for less frequent classes. However, overall performance does not improve for most models and configurations, except for a DreamBooth approach, which shows a +0.52 improvement in overall accuracy. The large variance of performance impact in this study suggests careful consideration when utilizing generative models for limited-data scenarios, especially given the unexpected negative correlation between image quality and downstream classification improvement.
Affiliation(s)
- Daniel Schaudt: Institute of Databases and Information Systems, Ulm University, James-Franck-Ring, 89081 Ulm, Germany
- Christian Späte: DASU Transferzentrum für Digitalisierung, Analytics und Data Science Ulm, Olgastraße 94, 89073 Ulm, Germany
- Reinhold von Schwerin: Department of Computer Science, Ulm University of Applied Science, Albert-Einstein-Allee 55, 89081 Ulm, Germany
- Manfred Reichert: Institute of Databases and Information Systems, Ulm University, James-Franck-Ring, 89081 Ulm, Germany
- Marianne von Schwerin: Department of Computer Science, Ulm University of Applied Science, Albert-Einstein-Allee 55, 89081 Ulm, Germany
- Meinrad Beer: Department of Radiology, University Hospital of Ulm, Albert-Einstein-Allee 23, 89081 Ulm, Germany
- Christopher Kloth: Department of Radiology, University Hospital of Ulm, Albert-Einstein-Allee 23, 89081 Ulm, Germany

19
Ansari MY, Qaraqe M, Righetti R, Serpedin E, Qaraqe K. Unveiling the future of breast cancer assessment: a critical review on generative adversarial networks in elastography ultrasound. Front Oncol 2023;13:1282536. [PMID: 38125949] [PMCID: PMC10731303] [DOI: 10.3389/fonc.2023.1282536]
Abstract
Elastography ultrasound provides elasticity information about tissues, which is crucial for understanding their density and texture and allows the diagnosis of medical conditions such as fibrosis and cancer. In the current medical imaging landscape, elastograms for B-mode ultrasound are restricted to well-equipped hospitals, making the modality unavailable for pocket ultrasound. To highlight recent progress in elastogram synthesis, this article critically reviews generative adversarial network (GAN) methodology for elastogram generation from B-mode ultrasound images. Along with a brief overview of cutting-edge medical image synthesis, the article highlights the contribution of the GAN framework in light of its impact and thoroughly analyzes the results to validate whether existing challenges have been effectively addressed. Specifically, this article highlights that GANs can successfully generate accurate, artifact-free elastograms for deep-seated breast tumors and improve diagnostic effectiveness for pocket ultrasound. Furthermore, the results of the GAN framework are thoroughly analyzed in terms of quantitative metrics, visual evaluations, and cancer diagnostic accuracy. Finally, essential unaddressed challenges at the intersection of elastography and GANs are presented, and several future directions for elastogram synthesis research are shared.
Affiliation(s)
- Mohammed Yusuf Ansari: Electrical and Computer Engineering, Texas A&M University, College Station, TX, United States; Electrical and Computer Engineering, Texas A&M University at Qatar, Doha, Qatar
- Marwa Qaraqe: Electrical and Computer Engineering, Texas A&M University at Qatar, Doha, Qatar; College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Raffaella Righetti: Electrical and Computer Engineering, Texas A&M University, College Station, TX, United States
- Erchin Serpedin: Electrical and Computer Engineering, Texas A&M University, College Station, TX, United States
- Khalid Qaraqe: Electrical and Computer Engineering, Texas A&M University at Qatar, Doha, Qatar

20
Dar SUH, Öztürk Ş, Özbey M, Oguz KK, Çukur T. Parallel-stream fusion of scan-specific and scan-general priors for learning deep MRI reconstruction in low-data regimes. Comput Biol Med 2023;167:107610. [PMID: 37883853] [DOI: 10.1016/j.compbiomed.2023.107610]
Abstract
Magnetic resonance imaging (MRI) is an essential diagnostic tool that suffers from prolonged scan times. Reconstruction methods can alleviate this limitation by recovering clinically usable images from accelerated acquisitions. In particular, learning-based methods promise performance leaps by employing deep neural networks as data-driven priors. A powerful approach uses scan-specific (SS) priors that leverage information regarding the underlying physical signal model for reconstruction. SS priors are learned on each individual test scan without the need for a training dataset, albeit they suffer from computationally burdensome inference with nonlinear networks. An alternative approach uses scan-general (SG) priors that instead leverage information regarding the latent features of MRI images for reconstruction. SG priors are frozen at test time for efficiency, albeit they require learning from a large training dataset. Here, we introduce a novel parallel-stream fusion model (PSFNet) that synergistically fuses SS and SG priors for performant MRI reconstruction in low-data regimes, while maintaining inference times competitive with SG methods. PSFNet implements its SG prior with a nonlinear network, yet it forms its SS prior with a linear network to maintain efficiency. A pervasive framework for combining multiple priors in MRI reconstruction is algorithmic unrolling, which uses serially alternated projections and thus suffers from error propagation in low-data regimes. To alleviate error propagation, PSFNet combines its SS and SG priors via a novel parallel-stream architecture with learnable fusion parameters. Demonstrations are performed on multi-coil brain MRI for varying amounts of training data. PSFNet outperforms SG methods in low-data regimes and surpasses SS methods given a few tens of training samples. On average across tasks, PSFNet achieves 3.1 dB higher PSNR, 2.8% higher SSIM, and 0.3× lower RMSE than baselines. Furthermore, in both supervised and unsupervised setups, PSFNet requires an order of magnitude fewer samples than SG methods and enables an order of magnitude faster inference than SS methods. Thus, the proposed model improves deep MRI reconstruction with elevated learning and computational efficiency.
Affiliation(s)
- Salman Ul Hassan Dar: Department of Internal Medicine III, Heidelberg University Hospital, 69120, Heidelberg, Germany; AI Health Innovation Cluster, Heidelberg, Germany
- Şaban Öztürk: Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; Department of Electrical-Electronics Engineering, Amasya University, Amasya 05100, Turkey
- Muzaffer Özbey: Department of Electrical and Computer Engineering, University of Illinois Urbana-Champaign, IL 61820, United States
- Kader Karli Oguz: Department of Radiology, University of California, Davis, CA 95616, United States; Department of Radiology, Hacettepe University, Ankara, Turkey
- Tolga Çukur: Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; Department of Radiology, Hacettepe University, Ankara, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey; Neuroscience Graduate Program, Bilkent University, Ankara 06800, Turkey

21
Graf R, Schmitt J, Schlaeger S, Möller HK, Sideri-Lampretsa V, Sekuboyina A, Krieg SM, Wiestler B, Menze B, Rueckert D, Kirschke JS. Denoising diffusion-based MRI to CT image translation enables automated spinal segmentation. Eur Radiol Exp 2023;7:70. [PMID: 37957426] [PMCID: PMC10643734] [DOI: 10.1186/s41747-023-00385-2]
Abstract
BACKGROUND Automated segmentation of spinal magnetic resonance imaging (MRI) plays a vital role both scientifically and clinically. However, accurately delineating posterior spine structures is challenging. METHODS This retrospective study, approved by the ethical committee, involved translating T1-weighted and T2-weighted images into computed tomography (CT) images in a total of 263 pairs of CT/MR series. Landmark-based registration was performed to align image pairs. We compared two-dimensional (2D) paired methods (Pix2Pix; denoising diffusion implicit models, DDIM, in image mode; DDIM in noise mode) and unpaired methods (SynDiff; contrastive unpaired translation) for image-to-image translation, using peak signal-to-noise ratio as the quality measure. A publicly available segmentation network segmented the synthesized CT datasets, and Dice similarity coefficients (DSC) were evaluated on in-house test sets and the "MRSpineSeg Challenge" volumes. The 2D findings were extended to three-dimensional (3D) Pix2Pix and DDIM. RESULTS The 2D paired methods and SynDiff exhibited similar translation performance and DSC on paired data. DDIM image mode achieved the highest image quality. SynDiff, Pix2Pix, and DDIM image mode demonstrated similar DSC (0.77). For craniocaudal axis rotations, at least two landmarks per vertebra were required for registration. The 3D translation outperformed the 2D approach, resulting in improved DSC (0.80) and anatomically accurate segmentations with higher spatial resolution than that of the original MRI series. CONCLUSIONS Registration with two landmarks per vertebra enabled paired image-to-image translation from MRI to CT and outperformed all unpaired approaches. The 3D techniques provided anatomically correct segmentations, avoiding underprediction of small structures like the spinous process. RELEVANCE STATEMENT This study addresses the unresolved issue of translating spinal MRI to CT, making CT-based tools usable for MRI data.
It generates whole-spine segmentation, previously unavailable for MRI, a prerequisite for biomechanical modeling and feature extraction in clinical applications. KEY POINTS
• Unpaired image translation is less effective at converting spine MRI to CT.
• Paired translation requires registration with at least two landmarks per vertebra.
• Paired image-to-image translation enables segmentation transfer to other domains.
• 3D translation enables super-resolution from MRI to CT.
• 3D translation prevents underprediction of small structures.
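The Dice similarity coefficient (DSC) reported in this study measures the volume overlap between a predicted and a reference segmentation mask. A minimal sketch of the metric, not taken from the study's code (the function name and NumPy implementation are illustrative):

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks:
    2 * |pred AND target| / (|pred| + |target|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * intersection / denom
```

A DSC of 0.77 (2D) versus 0.80 (3D), as reported above, thus reflects a modest but consistent gain in overlap with the reference segmentation.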
Affiliation(s)
- Robert Graf
- Department of Diagnostic and Interventional Neuroradiology, School of Medicine, Technical University of Munich, Munich, Germany
- Joachim Schmitt
- Department of Diagnostic and Interventional Neuroradiology, School of Medicine, Technical University of Munich, Munich, Germany
- Sarah Schlaeger
- Department of Diagnostic and Interventional Neuroradiology, School of Medicine, Technical University of Munich, Munich, Germany
- Hendrik Kristian Möller
- Department of Diagnostic and Interventional Neuroradiology, School of Medicine, Technical University of Munich, Munich, Germany
- Vasiliki Sideri-Lampretsa
- Institut für KI und Informatik in der Medizin, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Anjany Sekuboyina
- Department of Diagnostic and Interventional Neuroradiology, School of Medicine, Technical University of Munich, Munich, Germany
- Department of Quantitative Biomedicine, University of Zurich, Zurich, Switzerland
- Sandro Manuel Krieg
- Department of Neurosurgery, Klinikum rechts der Isar, School of Medicine, Technical University of Munich, Munich, Germany
- Benedikt Wiestler
- Department of Diagnostic and Interventional Neuroradiology, School of Medicine, Technical University of Munich, Munich, Germany
- Bjoern Menze
- Department of Quantitative Biomedicine, University of Zurich, Zurich, Switzerland
- Daniel Rueckert
- Institut für KI und Informatik in der Medizin, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Visual Information Processing, Imperial College London, London, UK
- Jan Stefan Kirschke
- Department of Diagnostic and Interventional Neuroradiology, School of Medicine, Technical University of Munich, Munich, Germany
22
Hung ALY, Zhao K, Zheng H, Yan R, Raman SS, Terzopoulos D, Sung K. Med-cDiff: Conditional Medical Image Generation with Diffusion Models. Bioengineering (Basel) 2023; 10:1258. [PMID: 38002382] [PMCID: PMC10669033] [DOI: 10.3390/bioengineering10111258]
Abstract
Conditional image generation plays a vital role in medical image analysis, as it is effective in tasks such as super-resolution, denoising, and inpainting, among others. Diffusion models have been shown to perform at a state-of-the-art level in natural image generation, but they have not been thoroughly studied in medical image generation under specific conditions. Moreover, existing medical image generation models have limitations that restrict their use across medical image generation tasks. In this paper, we introduce conditional Denoising Diffusion Probabilistic Models (cDDPMs) for medical image generation, which achieve state-of-the-art performance on several medical image generation tasks.
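As background to the cDDPM approach: a DDPM is trained to predict the noise injected by a closed-form forward diffusion process, and conditioning simply passes an extra input (e.g., the low-resolution or corrupted image) to the noise predictor. A minimal NumPy sketch of this training objective, under assumed hyperparameters (linear beta schedule, T = 1000) rather than the paper's actual configuration:

```python
import numpy as np

# Assumed linear noise schedule (a common DDPM default, not the paper's exact setting)
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)

def forward_diffuse(x0, t, noise):
    """Closed-form sample x_t ~ q(x_t | x_0) of the forward process."""
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

def cddpm_loss(model, x0, cond, t, noise):
    """Noise-prediction MSE; `cond` is the conditioning image passed to the model."""
    x_t = forward_diffuse(x0, t, noise)
    eps_hat = model(x_t, cond, t)
    return float(np.mean((eps_hat - noise) ** 2))
```

In practice `model` is a conditional U-Net; at inference, sampling starts from pure noise and is denoised step by step while the conditioning image is held fixed.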
Affiliation(s)
- Alex Ling Yu Hung
- Computer Science Department, University of California, Los Angeles, CA 90095, USA
- Department of Radiology, University of California, Los Angeles, CA 90095, USA
- Kai Zhao
- Department of Radiology, University of California, Los Angeles, CA 90095, USA
- Haoxin Zheng
- Computer Science Department, University of California, Los Angeles, CA 90095, USA
- Department of Radiology, University of California, Los Angeles, CA 90095, USA
- Ran Yan
- Department of Radiology, University of California, Los Angeles, CA 90095, USA
- Bioengineering Department, University of California, Los Angeles, CA 90095, USA
- Steven S. Raman
- Department of Radiology, University of California, Los Angeles, CA 90095, USA
- Demetri Terzopoulos
- Computer Science Department, University of California, Los Angeles, CA 90095, USA
- VoxelCloud, Inc., Los Angeles, CA 90024, USA
- Kyunghyun Sung
- Department of Radiology, University of California, Los Angeles, CA 90095, USA
23
Kim G, Baek J. Power-law spectrum-based objective function to train a generative adversarial network with transfer learning for the synthetic breast CT image. Phys Med Biol 2023; 68:205007. [PMID: 37722388] [DOI: 10.1088/1361-6560/acfadf]
Abstract
Objective. This paper proposes a new objective function to improve the quality of synthesized breast CT images generated by a GAN and compares GAN performance on transfer learning datasets from different image domains. Approach. The proposed objective function, named the beta loss function, is based on the fact that x-ray-based breast images follow a power-law spectrum. Accordingly, the exponent of the power-law spectrum (beta value) for breast CT images is approximately two. The beta loss function is defined as the L1 distance between the beta value of synthetic images and that of validation samples. To compare GAN performance for transfer learning datasets from different image domains, ImageNet and anatomical noise images are used as the transfer learning datasets. We employ StyleGAN2 as the backbone network and add the proposed beta loss function. A patient-derived breast CT dataset is used for training and validation; 7355 and 212 images are used for network training and validation, respectively. We use the beta value evaluation and the Fréchet inception distance (FID) score for quantitative evaluation. Main results. For qualitative assessment, we attempt to replicate the images from the validation dataset using the trained GAN. Our results show that the proposed beta loss function achieves a beta value more similar to that of real images and a lower FID score. Moreover, we observe that the GAN pretrained with anatomical noise images achieves better quality than the one pretrained with ImageNet, in terms of both beta value evaluation and FID score. Finally, the beta loss function with anatomical noise as the transfer learning dataset achieves the lowest FID score. Significance. Overall, the GAN using the proposed beta loss function with anatomical noise images as the transfer learning dataset provides the lowest FID score among all tested cases. Hence, this work has implications for developing GAN-based breast image synthesis methods for medical imaging applications.
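The beta loss described above penalizes the L1 distance between power-law exponents. A minimal sketch of the idea, estimating beta as the negative slope of the power spectrum in log-log space (the 1D fit and function names are illustrative assumptions; the paper works with 2D image spectra):

```python
import numpy as np

def estimate_beta(power, freqs):
    """Estimate the power-law exponent beta from a power spectrum,
    assuming P(f) ~ 1 / f^beta, via a linear fit in log-log space."""
    slope, _ = np.polyfit(np.log(freqs), np.log(power), 1)
    return -slope

def beta_loss(beta_synthetic, beta_real):
    """L1 distance between the exponents of synthetic and real images."""
    return abs(beta_synthetic - beta_real)
```

For breast CT, the abstract states beta is approximately 2, so the loss pulls the synthetic images' spectral falloff toward that target.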
Affiliation(s)
- Gihun Kim
- School of Integrated Technology, Yonsei University, Republic of Korea
- Jongduk Baek
- Department of Artificial Intelligence, Yonsei University, Republic of Korea
- Baruenex Imaging, Republic of Korea
24
Zhang J, Wei Z, Wu X, Shang Y, Tian J, Hui H. Magnetic particle imaging deblurring with dual contrastive learning and adversarial framework. Comput Biol Med 2023; 165:107461. [PMID: 37708716] [DOI: 10.1016/j.compbiomed.2023.107461]
Abstract
Magnetic particle imaging (MPI) is an emerging medical imaging technique with high sensitivity, high contrast, and excellent depth penetration. In MPI, x-space is a reconstruction method that transforms the measured voltages into particle concentrations. The reconstructed native image can be modeled as a convolution of the magnetic particle concentration with a point-spread function (PSF). The PSF is one of the important parameters in deconvolution. However, accurately measuring or modeling the hardware's PSF for use in deconvolution is challenging due to varying environments and magnetic particle relaxation. Inaccurate PSF estimation may lead to loss of the content structure of the MPI image, especially in low gradient fields. In this study, we developed a Dual Adversarial Network (DAN) with a patch-wise contrastive constraint to deblur MPI images. This method overcomes the limitation of unpaired data in data acquisition scenarios and removes blur around boundaries more effectively than common deconvolution methods. We evaluated the performance of the proposed DAN model on simulated and real data. Experimental results confirmed that our model performs favorably against the deconvolution method mainly used for deblurring MPI images, as well as against other GAN-based deep learning models.
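The x-space forward model referenced above (native image = particle concentration convolved with the PSF) can be sketched in one dimension. The Gaussian PSF here is an illustrative stand-in for the physics-derived PSF used in MPI, and the function names are not from the paper:

```python
import numpy as np

def gaussian_psf(size, sigma):
    """Illustrative 1D Gaussian point-spread function, normalized to unit sum."""
    x = np.arange(size) - size // 2
    psf = np.exp(-x**2 / (2.0 * sigma**2))
    return psf / psf.sum()

def native_image(concentration, psf):
    """x-space model: the native image is the concentration blurred by the PSF."""
    return np.convolve(concentration, psf, mode="same")
```

Deconvolution tries to invert this blur given an estimate of the PSF, which is precisely where the inaccurate-PSF problem the paper addresses arises.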
Affiliation(s)
- Jiaxin Zhang
- CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Beijing Key Laboratory of Molecular Imaging, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Zechen Wei
- CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Beijing Key Laboratory of Molecular Imaging, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Xiangjun Wu
- Key Laboratory of Big Data-Based Precision Medicine (Beihang University), Ministry of Industry and Information Technology, Beijing, China; School of Engineering Medicine & School of Biological Science and Medical Engineering, Beihang University, Beijing, China
- Yaxin Shang
- School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
- Jie Tian
- CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Beijing Key Laboratory of Molecular Imaging, Beijing, China; Key Laboratory of Big Data-Based Precision Medicine (Beihang University), Ministry of Industry and Information Technology, Beijing, China; School of Engineering Medicine & School of Biological Science and Medical Engineering, Beihang University, Beijing, China
- Hui Hui
- CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Beijing Key Laboratory of Molecular Imaging, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
25
Singh D, Monga A, de Moura HL, Zhang X, Zibetti MVW, Regatte RR. Emerging Trends in Fast MRI Using Deep-Learning Reconstruction on Undersampled k-Space Data: A Systematic Review. Bioengineering (Basel) 2023; 10:1012. [PMID: 37760114] [PMCID: PMC10525988] [DOI: 10.3390/bioengineering10091012]
Abstract
Magnetic Resonance Imaging (MRI) is an essential medical imaging modality that provides excellent soft-tissue contrast and high-resolution images of the human body, allowing us to understand detailed information on morphology, structural integrity, and physiologic processes. However, MRI exams usually require lengthy acquisition times. Methods such as parallel MRI and Compressive Sensing (CS) have significantly reduced MRI acquisition times by acquiring less data through undersampling k-space. The state of the art in fast MRI has recently been redefined by integrating Deep Learning (DL) models with these undersampling approaches. This Systematic Literature Review (SLR) comprehensively analyzes deep MRI reconstruction models, emphasizing the key elements of recently proposed methods and highlighting their strengths and weaknesses. The SLR involves searching and selecting relevant studies from various databases, including Web of Science and Scopus, followed by a rigorous screening and data extraction process following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. It focuses on various techniques, such as residual learning, image representation using encoders and decoders, data-consistency layers, unrolled networks, learned activations, attention modules, plug-and-play priors, diffusion models, and Bayesian methods. The SLR also discusses the use of loss functions and training with adversarial networks to enhance deep MRI reconstruction methods. Moreover, we explore various MRI reconstruction applications, including non-Cartesian reconstruction, super-resolution, dynamic MRI, joint learning of reconstruction with coil sensitivity and sampling, quantitative mapping, and MR fingerprinting. This paper also addresses research questions, provides insights for future directions, and emphasizes robust generalization and artifact handling.
Therefore, this SLR serves as a valuable resource for advancing fast MRI, guiding research and development efforts of MRI reconstruction for better image quality and faster data acquisition.
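A data-consistency layer, one of the techniques this review surveys, re-imposes the measured k-space samples on a network's intermediate reconstruction. A minimal "hard" variant in NumPy (the function name is illustrative; practical layers often use a soft, noise-weighted combination instead):

```python
import numpy as np

def data_consistency(x_recon, k_measured, mask):
    """Hard data consistency: keep the network's predicted k-space values
    where nothing was acquired, but replace them with the measured samples
    wherever the undersampling mask is True."""
    k_recon = np.fft.fft2(x_recon)
    k_dc = np.where(mask, k_measured, k_recon)
    return np.fft.ifft2(k_dc)
```

Unrolled networks typically alternate such a layer with learned denoising blocks, so the final image never contradicts the acquired data.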
Affiliation(s)
- Dilbag Singh
- Center of Biomedical Imaging, Department of Radiology, New York University Grossman School of Medicine, New York, NY 10016, USA
- Ravinder R. Regatte
- Center of Biomedical Imaging, Department of Radiology, New York University Grossman School of Medicine, New York, NY 10016, USA
26
Alamgeer M, Alruwais N, Alshahrani HM, Mohamed A, Assiri M. Dung Beetle Optimization with Deep Feature Fusion Model for Lung Cancer Detection and Classification. Cancers (Basel) 2023; 15:3982. [PMID: 37568800] [PMCID: PMC10417684] [DOI: 10.3390/cancers15153982]
Abstract
Lung cancer is the main cause of cancer deaths all over the world. An important reason for these deaths is late diagnosis and poor prognosis. With the rapid advancement of deep learning (DL), DL approaches can be effectively and widely applied to several real-world problems in healthcare systems, like medical image interpretation and disease analysis. Medical imaging can be vital in early-stage lung tumor analysis and in monitoring lung tumors during treatment. Many medical imaging modalities, like computed tomography (CT), chest X-ray (CXR), molecular imaging, magnetic resonance imaging (MRI), and positron emission tomography (PET), are widely used for lung cancer detection. This article presents a new dung beetle optimization modified deep feature fusion model for lung cancer detection and classification (DBOMDFF-LCC) technique. The presented DBOMDFF-LCC technique mainly depends upon feature fusion and hyperparameter tuning. To accomplish this, the DBOMDFF-LCC technique uses a feature fusion process comprising three DL models, namely residual network (ResNet), densely connected network (DenseNet), and Inception-ResNet-v2. Furthermore, the DBO approach is employed for optimum hyperparameter selection of the three DL approaches. For lung cancer detection, the DBOMDFF-LCC system utilizes a long short-term memory (LSTM) approach. The DBOMDFF-LCC technique is evaluated on a medical dataset using different evaluation metrics. The extensive comparative results demonstrated the superiority of the DBOMDFF-LCC technique for lung cancer classification.
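The feature-fusion step described above combines the features extracted by the three backbone networks before classification. A minimal late-fusion sketch (the function name and the specific dimensions below are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def fuse_features(feature_vectors):
    """Late fusion: flatten each backbone's feature vector and concatenate
    them into a single representation for the downstream classifier."""
    return np.concatenate([np.asarray(f).ravel() for f in feature_vectors])
```

In the paper's pipeline the fused representation then feeds an LSTM classifier, with DBO tuning the backbones' hyperparameters.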
Affiliation(s)
- Mohammad Alamgeer
- Department of Information Systems, College of Science & Art at Mahayil, King Khalid University, Abha 61421, Saudi Arabia
- Nuha Alruwais
- Department of Computer Science and Engineering, College of Applied Studies and Community Services, King Saud University, P.O. Box 22459, Riyadh 11495, Saudi Arabia
- Haya Mesfer Alshahrani
- Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
- Abdullah Mohamed
- Research Centre, Future University in Egypt, New Cairo 11845, Egypt
- Mohammed Assiri
- Department of Computer Science, College of Sciences and Humanities-Aflaj, Prince Sattam bin Abdulaziz University, Aflaj 16273, Saudi Arabia