1. Liang Z, Xue Z, Rajaraman S, Antani S. Automated quantification of SARS-CoV-2 pneumonia with large vision model knowledge adaptation. New Microbes New Infect 2024; 62:101457. PMID: 39253407; PMCID: PMC11381763; DOI: 10.1016/j.nmni.2024.101457.
Abstract
Background: Large vision models (LVMs) pretrained on large datasets have demonstrated an enormous capacity to understand visual patterns and capture semantic information from images. We propose a novel knowledge domain adaptation method that uses a pretrained LVM to build a low-cost artificial intelligence (AI) model for quantifying the severity of SARS-CoV-2 pneumonia from frontal chest X-ray (CXR) images.
Methods: Our method uses the pretrained LVM as the primary feature extractor and self-supervised contrastive learning for domain adaptation. An encoder producing a 2048-dimensional feature vector was first trained by self-supervised learning for knowledge domain adaptation; a multi-layer perceptron (MLP) was then trained for the final severity prediction. A dataset of 2599 CXR images was used for model training and evaluation.
Results: The model based on the pretrained vision transformer (ViT) and self-supervised learning achieved the best cross-validation performance, with a mean squared error (MSE) of 23.83 (95% CI 22.67-25.00) and a mean absolute error (MAE) of 3.64 (95% CI 3.54-3.73). Its predictions correlate with the reference scores with an R² of 0.81 (95% CI 0.79-0.82) and a Spearman ρ of 0.80 (95% CI 0.77-0.81), comparable to current state-of-the-art (SOTA) methods trained on much larger CXR datasets.
Conclusion: The proposed method achieves SOTA performance in quantifying the severity of SARS-CoV-2 pneumonia at a significantly lower cost, and can be extended to the detection or quantification of other infectious diseases to expedite the application of AI in medical research.
Affiliation(s)
- Zhaohui Liang
- Computational Health Research Branch, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Zhiyun Xue
- Computational Health Research Branch, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Sivaramakrishnan Rajaraman
- Computational Health Research Branch, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Sameer Antani
- Computational Health Research Branch, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
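Entry 1 describes self-supervised contrastive learning for adapting pretrained LVM features, but this listing does not specify the objective used. As a hypothetical illustration only (not the authors' implementation), a SimCLR-style NT-Xent loss over two augmented views of the same batch can be sketched in plain NumPy; the temperature and embedding size below are assumed placeholders:

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent (normalized temperature-scaled cross-entropy) loss.

    z1, z2: (N, D) embeddings of two augmented views of the same
    N images; row i of z1 and row i of z2 form the positive pair.
    """
    z = np.concatenate([z1, z2], axis=0)              # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalize rows
    sim = z @ z.T / temperature                       # scaled cosine similarities
    n = z1.shape[0]
    np.fill_diagonal(sim, -np.inf)                    # exclude self-similarity
    # index of the positive partner for each of the 2N rows: i <-> i + n
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()
```

With well-aligned views the positive similarities dominate and the loss drops; with unrelated views it stays near log(2N - 1), which is what drives the encoder toward augmentation-invariant features.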
2. Abidin ZU, Naqvi RA, Haider A, Kim HS, Jeong D, Lee SW. Recent deep learning-based brain tumor segmentation models using multi-modality magnetic resonance imaging: a prospective survey. Front Bioeng Biotechnol 2024; 12:1392807. PMID: 39104626; PMCID: PMC11298476; DOI: 10.3389/fbioe.2024.1392807.
Abstract
Radiologists face significant challenges when segmenting and characterizing brain tumors, information that is essential for treatment planning. Artificial intelligence (AI), and especially deep learning (DL), has emerged as a useful tool in healthcare, aiding radiologists in their diagnostic processes, helping them better understand tumor biology, and supporting personalized care for patients with brain tumors. The segmentation of brain tumors using multi-modal magnetic resonance imaging (MRI) has received considerable attention. In this survey, we first discuss the available MRI modalities and their properties. We then review the most recent DL-based models for brain tumor segmentation using multi-modal MRI, dividing them into three groups by architecture: models built on convolutional neural network (CNN) backbones, vision transformer-based models, and hybrid models that combine CNNs and transformers. In addition, we perform an in-depth statistical analysis of recent publications, frequently used datasets, and evaluation metrics for segmentation tasks. Finally, we identify open research challenges and suggest promising future directions for brain tumor segmentation to improve diagnostic accuracy and treatment outcomes, in line with public health goals of using health technologies for better healthcare delivery and population health management.
Affiliation(s)
- Zain Ul Abidin
- Department of Intelligent Mechatronics Engineering, Sejong University, Seoul, Republic of Korea
- Rizwan Ali Naqvi
- Department of Intelligent Mechatronics Engineering, Sejong University, Seoul, Republic of Korea
- Amir Haider
- Department of Intelligent Mechatronics Engineering, Sejong University, Seoul, Republic of Korea
- Hyung Seok Kim
- Department of Intelligent Mechatronics Engineering, Sejong University, Seoul, Republic of Korea
- Daesik Jeong
- College of Convergence Engineering, Sangmyung University, Seoul, Republic of Korea
- Seung Won Lee
- School of Medicine, Sungkyunkwan University, Suwon, Republic of Korea
3. Ellison J, Caliva F, Damasceno P, Luks TL, LaFontaine M, Cluceru J, Kemisetti A, Li Y, Molinaro AM, Pedoia V, Villanueva-Meyer JE, Lupo JM. Improving the Generalizability of Deep Learning for T2-Lesion Segmentation of Gliomas in the Post-Treatment Setting. Bioengineering (Basel) 2024; 11:497. PMID: 38790363; PMCID: PMC11117752; DOI: 10.3390/bioengineering11050497.
Abstract
Although fully automated volumetric approaches for monitoring brain tumor response have many advantages, most available deep learning models are optimized for highly curated, multi-contrast MRI from newly diagnosed gliomas, which is not representative of post-treatment cases in the clinic. Improving segmentation for treated patients is critical to accurately tracking changes in response to therapy. To improve post-treatment generalization of T2-lesion segmentation using only T2 FLAIR images as input, we investigated mixing data from newly diagnosed (n = 208) and treated (n = 221) gliomas in training, applying transfer learning (TL) from the pre- to the post-treatment imaging domain, and incorporating spatial regularization. These approaches were evaluated on 24 previously treated patients suspected of progression. Including 26% treated patients in training improved performance by 13.9%, while adding more treated and untreated patients resulted in minimal further change. Fine-tuning with treated gliomas improved sensitivity over data mixing by 2.5% (p < 0.05), and spatial regularization combined with TL further improved the 95th-percentile Hausdorff distance, Dice, and sensitivity (by 6.8%, 0.8%, and 2.2%, respectively; p < 0.05). While training with ≥60 treated patients yielded most of the performance gain, TL and spatial regularization further improved T2-lesion segmentation for treated gliomas using a single MR contrast and minimal processing, demonstrating clinical utility in response assessment.
Affiliation(s)
- Jacob Ellison
- Department of Radiology and Biomedical Imaging, UCSF, San Francisco, CA 94143, USA
- Center for Intelligent Imaging, UCSF, San Francisco, CA 94143, USA
- UCSF/UC Berkeley Graduate Program in Bioengineering, San Francisco, CA 94143, USA
- Francesco Caliva
- Department of Radiology and Biomedical Imaging, UCSF, San Francisco, CA 94143, USA
- Center for Intelligent Imaging, UCSF, San Francisco, CA 94143, USA
- Pablo Damasceno
- Department of Radiology and Biomedical Imaging, UCSF, San Francisco, CA 94143, USA
- Center for Intelligent Imaging, UCSF, San Francisco, CA 94143, USA
- Tracy L. Luks
- Department of Radiology and Biomedical Imaging, UCSF, San Francisco, CA 94143, USA
- Marisa LaFontaine
- Department of Radiology and Biomedical Imaging, UCSF, San Francisco, CA 94143, USA
- Julia Cluceru
- Department of Radiology and Biomedical Imaging, UCSF, San Francisco, CA 94143, USA
- Center for Intelligent Imaging, UCSF, San Francisco, CA 94143, USA
- Anil Kemisetti
- Department of Radiology and Biomedical Imaging, UCSF, San Francisco, CA 94143, USA
- Yan Li
- Department of Radiology and Biomedical Imaging, UCSF, San Francisco, CA 94143, USA
- Center for Intelligent Imaging, UCSF, San Francisco, CA 94143, USA
- Valentina Pedoia
- Department of Radiology and Biomedical Imaging, UCSF, San Francisco, CA 94143, USA
- Center for Intelligent Imaging, UCSF, San Francisco, CA 94143, USA
- UCSF/UC Berkeley Graduate Program in Bioengineering, San Francisco, CA 94143, USA
- Javier E. Villanueva-Meyer
- Department of Radiology and Biomedical Imaging, UCSF, San Francisco, CA 94143, USA
- Center for Intelligent Imaging, UCSF, San Francisco, CA 94143, USA
- Janine M. Lupo
- Department of Radiology and Biomedical Imaging, UCSF, San Francisco, CA 94143, USA
- Center for Intelligent Imaging, UCSF, San Francisco, CA 94143, USA
- UCSF/UC Berkeley Graduate Program in Bioengineering, San Francisco, CA 94143, USA
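Entry 3 reports its gains in terms of Dice, sensitivity, and the 95th-percentile Hausdorff distance (HD95). For reference, these segmentation metrics can be computed for binary masks in a few lines of NumPy; the brute-force HD95 below is an illustrative sketch suited to small masks, not the evaluation code used in the study:

```python
import numpy as np

def dice(pred, gt):
    """Dice similarity coefficient between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

def sensitivity(pred, gt):
    """Fraction of ground-truth lesion voxels recovered (recall)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    return np.logical_and(pred, gt).sum() / gt.sum()

def hd95(pred, gt):
    """95th-percentile symmetric Hausdorff distance between the
    nonzero coordinates of two masks (brute force: O(|A|*|B|))."""
    a = np.argwhere(pred)
    b = np.argwhere(gt)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return max(np.percentile(d.min(axis=1), 95),   # pred -> gt distances
               np.percentile(d.min(axis=0), 95))   # gt -> pred distances
```

Taking the 95th percentile instead of the maximum is what makes HD95 robust to a few outlier voxels, which matters for the noisy boundaries typical of post-treatment lesions.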
4. Khalighi S, Reddy K, Midya A, Pandav KB, Madabhushi A, Abedalthagafi M. Artificial intelligence in neuro-oncology: advances and challenges in brain tumor diagnosis, prognosis, and precision treatment. NPJ Precis Oncol 2024; 8:80. PMID: 38553633; PMCID: PMC10980741; DOI: 10.1038/s41698-024-00575-0.
Abstract
This review delves into the most recent advancements in applying artificial intelligence (AI) within neuro-oncology, with particular emphasis on gliomas, a class of brain tumors that represents a significant global health issue. AI has brought transformative innovations to brain tumor management, utilizing imaging, histopathological, and genomic tools for efficient detection, categorization, outcome prediction, and treatment planning. Across all facets of malignant brain tumor management (diagnosis, prognosis, and therapy), AI models can outperform human evaluations in terms of accuracy and specificity. Their ability to discern molecular features from imaging may reduce reliance on invasive diagnostics and accelerate the time to molecular diagnosis. The review covers AI techniques from classical machine learning to deep learning, highlighting current applications and challenges. Promising directions for future research include multimodal data integration, generative AI, large medical language models, precise tumor delineation and characterization, and addressing racial and gender disparities. Adaptive, personalized treatment strategies are also emphasized for optimizing clinical outcomes. Ethical, legal, and social implications are discussed, advocating for transparency and fairness in the integration of AI into neuro-oncology and providing a holistic understanding of its transformative impact on patient care.
Affiliation(s)
- Sirvan Khalighi
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Kartik Reddy
- Department of Radiology, Emory University, Atlanta, GA, USA
- Abhishek Midya
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Krunal Balvantbhai Pandav
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Anant Madabhushi
- Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Atlanta Veterans Administration Medical Center, Atlanta, GA, USA
- Malak Abedalthagafi
- Department of Pathology and Laboratory Medicine, Emory University, Atlanta, GA, USA
- The Cell and Molecular Biology Program, Winship Cancer Institute, Atlanta, GA, USA
5. Ghaderi S, Mohammadi S, Ghaderi K, Kiasat F, Mohammadi M. Marker-controlled watershed algorithm and fuzzy C-means clustering machine learning: automated segmentation of glioblastoma from MRI images in a case series. Ann Med Surg (Lond) 2024; 86:1460-1475. PMID: 38463066; PMCID: PMC10923355; DOI: 10.1097/ms9.0000000000001756.
Abstract
Introduction and importance: Automated segmentation of glioblastoma multiforme (GBM) from MRI images is crucial for accurate diagnosis and treatment planning. This paper presents a novel approach to automating the segmentation of GBM from MRI images using the marker-controlled watershed segmentation (MCWS) algorithm.
Case presentation and methods: The technique involves several image processing steps, including adaptive thresholding, morphological filtering, gradient magnitude calculation, and regional maxima identification. The MCWS algorithm efficiently segments images based on local intensity structures using the watershed transform, and fuzzy c-means (FCM) clustering improves segmentation accuracy. The presented approach achieved improved accuracy in detecting and segmenting GBM tumours from axial T2-weighted (T2-w) MRI images, as demonstrated by the mean performance metrics for GBM segmentation (sensitivity: 0.9905, specificity: 0.9483, accuracy: 0.9508, precision: 0.5481, F-measure: 0.7052, Jaccard: 0.9340).
Clinical discussion: The results of this study underline the importance of reliable and accurate image segmentation for effective diagnosis and treatment planning of GBM tumours.
Conclusion: The MCWS technique provides an effective and efficient approach to the segmentation of challenging medical images.
Affiliation(s)
- Sadegh Ghaderi
- Department of Neuroscience and Addiction Studies, School of Advanced Technologies in Medicine, Tehran University of Medical Sciences, Tehran
- Sana Mohammadi
- Department of Medical Sciences, School of Medicine, Iran University of Medical Sciences, Tehran
- Kayvan Ghaderi
- Department of Information Technology and Computer Engineering, Faculty of Engineering, University of Kurdistan, Sanandaj
- Fereshteh Kiasat
- Department of Information Technology and Computer Engineering, Faculty of Engineering, University of Kurdistan, Sanandaj
- Mahdi Mohammadi
- Department of Medical Physics and Biomedical Engineering, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
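Entry 5 pairs marker-controlled watershed with fuzzy c-means (FCM) clustering. The watershed step requires an image-processing library, but the standard FCM update equations are easy to sketch in NumPy; the cluster count, fuzzifier m = 2, and the toy bimodal intensity data below are assumptions for illustration, not the paper's settings:

```python
import numpy as np

def fuzzy_c_means(x, n_clusters=2, m=2.0, n_iter=100, seed=0):
    """Fuzzy c-means on a 1-D array of intensities.

    Returns (centers, u), where u[i, k] is the degree of membership
    of sample i in cluster k (each row of u sums to 1).
    """
    rng = np.random.default_rng(seed)
    u = rng.random((x.size, n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        w = u ** m                                         # fuzzified weights
        centers = (w * x[:, None]).sum(axis=0) / w.sum(axis=0)
        d = np.abs(x[:, None] - centers[None, :]) + 1e-12  # guard against /0
        # membership update: inverse-distance weighting, exponent 2/(m-1)
        inv = d ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
    return centers, u

# toy bimodal "intensity" data: dark background vs. bright lesion
x = np.concatenate([np.full(50, 0.1), np.full(50, 0.9)])
centers, u = fuzzy_c_means(x)
```

Unlike crisp k-means, the soft memberships in u let boundary voxels belong partially to both tissue classes, which is what the paper leverages to refine the watershed regions.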
6. Vrettos K, Koltsakis E, Zibis AH, Karantanas AH, Klontzas ME. Generative adversarial networks for spine imaging: A critical review of current applications. Eur J Radiol 2024; 171:111313. PMID: 38237518; DOI: 10.1016/j.ejrad.2024.111313.
Abstract
PURPOSE: In recent years, the field of medical imaging has witnessed remarkable advancements, with innovative technologies that have revolutionized the visualization and analysis of the human spine. Among these developments, generative adversarial networks (GANs) have emerged as a transformative tool, offering unprecedented possibilities for enhancing spinal imaging techniques and diagnostic outcomes. This review provides a comprehensive overview of the use of GANs in spinal imaging and emphasizes their potential to improve the diagnosis and treatment of spine-related disorders. A review dedicated to GANs in spine imaging can analyze the unique challenges, applications, and advancements of this specific domain, which broader reviews of GANs in general medical imaging may not fully address, and can offer insights into the tailored solutions and innovations that GANs bring to the field.
METHODS: An extensive literature search covering 2017 until July 2023 was conducted using the major search engines and identified studies that used GANs in spinal imaging.
RESULTS: The applications include generating fat-suppressed T2-weighted (fsT2W) images from T1- and T2-weighted sequences to reduce scan time; the generated images had significantly better image quality than true fsT2W images and could improve diagnostic accuracy for certain pathologies. GANs have also been used to generate virtual thin-slice images of intervertebral spaces, create digital twins of human vertebrae, and predict fracture response. Lastly, they can convert CT to MRI-like images, with the potential to produce near-MR images from CT without an MRI scan.
CONCLUSIONS: GANs have promising applications in personalized medicine, image augmentation, and improved diagnostic accuracy. However, limitations such as small databases and misalignment in CT-MRI pairs must be considered.
Affiliation(s)
- Konstantinos Vrettos
- Department of Radiology, School of Medicine, University of Crete, Voutes Campus, Heraklion, Greece
- Emmanouil Koltsakis
- Department of Radiology, Karolinska University Hospital, Solna, Stockholm, Sweden
- Aristeidis H Zibis
- Department of Anatomy, Medical School, University of Thessaly, Larissa, Greece
- Apostolos H Karantanas
- Department of Radiology, School of Medicine, University of Crete, Voutes Campus, Heraklion, Greece; Computational BioMedicine Laboratory, Institute of Computer Science, Foundation for Research and Technology (FORTH), Heraklion, Crete, Greece; Department of Medical Imaging, University Hospital of Heraklion, Heraklion, Crete, Greece
- Michail E Klontzas
- Department of Radiology, School of Medicine, University of Crete, Voutes Campus, Heraklion, Greece; Computational BioMedicine Laboratory, Institute of Computer Science, Foundation for Research and Technology (FORTH), Heraklion, Crete, Greece; Department of Medical Imaging, University Hospital of Heraklion, Heraklion, Crete, Greece
7. Kalantar R, Curcean S, Winfield JM, Lin G, Messiou C, Blackledge MD, Koh DM. Deep Learning Framework with Multi-Head Dilated Encoders for Enhanced Segmentation of Cervical Cancer on Multiparametric Magnetic Resonance Imaging. Diagnostics (Basel) 2023; 13:3381. PMID: 37958277; PMCID: PMC10647438; DOI: 10.3390/diagnostics13213381.
Abstract
T2-weighted magnetic resonance imaging (MRI) and diffusion-weighted imaging (DWI) are essential components of cervical cancer diagnosis. However, combining these channels for the training of deep learning models is challenging due to image misalignment. Here, we propose a novel multi-head framework that uses dilated convolutions and shared residual connections for the separate encoding of multiparametric MRI images. We employ a residual U-Net model as a baseline, and perform a series of architectural experiments to evaluate the tumor segmentation performance based on multiparametric input channels and different feature encoding configurations. All experiments were performed on a cohort of 207 patients with locally advanced cervical cancer. Our proposed multi-head model using separate dilated encoding for T2W MRI and combined b1000 DWI and apparent diffusion coefficient (ADC) maps achieved the best median Dice similarity coefficient (DSC) score, 0.823 (confidence interval (CI), 0.595-0.797), outperforming the conventional multi-channel model, DSC 0.788 (95% CI, 0.568-0.776), although the difference was not statistically significant (p > 0.05). We investigated channel sensitivity using 3D GRAD-CAM and channel dropout, and highlighted the critical importance of T2W and ADC channels for accurate tumor segmentation. However, our results showed that b1000 DWI had a minor impact on the overall segmentation performance. We demonstrated that the use of separate dilated feature extractors and independent contextual learning improved the model's ability to reduce the boundary effects and distortion of DWI, leading to improved segmentation performance. Our findings could have significant implications for the development of robust and generalizable models that can extend to other multi-modal segmentation applications.
Affiliation(s)
- Reza Kalantar
- Division of Radiotherapy and Imaging, The Institute of Cancer Research, London SW7 3RP, UK
- Department of Radiology, The Royal Marsden Hospital, London SW3 6JJ, UK
- Sebastian Curcean
- Department of Radiation Oncology, Iuliu Hatieganu University of Medicine and Pharmacy, 400347 Cluj-Napoca, Romania
- Jessica M. Winfield
- Division of Radiotherapy and Imaging, The Institute of Cancer Research, London SW7 3RP, UK
- Department of Radiology, The Royal Marsden Hospital, London SW3 6JJ, UK
- Gigin Lin
- Department of Medical Imaging and Intervention, Chang Gung Memorial Hospital at Linkou, Chang Gung University, Guishan, Taoyuan 333, Taiwan
- Christina Messiou
- Division of Radiotherapy and Imaging, The Institute of Cancer Research, London SW7 3RP, UK
- Department of Radiology, The Royal Marsden Hospital, London SW3 6JJ, UK
- Matthew D. Blackledge
- Division of Radiotherapy and Imaging, The Institute of Cancer Research, London SW7 3RP, UK
- Department of Radiology, The Royal Marsden Hospital, London SW3 6JJ, UK
- Dow-Mu Koh
- Division of Radiotherapy and Imaging, The Institute of Cancer Research, London SW7 3RP, UK
- Department of Radiology, The Royal Marsden Hospital, London SW3 6JJ, UK
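Entry 7's encoders rely on dilated convolutions to widen the receptive field without adding parameters. A minimal 1-D NumPy version shows the mechanism: a kernel of size k with dilation d covers d(k - 1) + 1 input samples while still using only k weights (the shapes and rates here are illustrative, not the paper's configuration):

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation=1):
    """'Valid' 1-D convolution (correlation) with a dilated kernel.

    A kernel of size k with dilation d spans d * (k - 1) + 1 input
    samples, so stacking increasing dilation rates grows the
    receptive field exponentially with depth.
    """
    k = len(kernel)
    span = dilation * (k - 1) + 1
    out = np.empty(len(x) - span + 1)
    for i in range(len(out)):
        # sample the input every `dilation` steps under the kernel
        out[i] = sum(kernel[j] * x[i + j * dilation] for j in range(k))
    return out
```

In the paper's multi-head design, each encoder head applies its own dilated feature extractor to one input channel group (e.g. T2W vs. DWI/ADC), so contextual learning stays independent per modality.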
8. Ma J, Yuan G, Guo C, Gang X, Zheng M. SW-UNet: a U-Net fusing sliding window transformer block with CNN for segmentation of lung nodules. Front Med (Lausanne) 2023; 10:1273441. PMID: 37841008; PMCID: PMC10569032; DOI: 10.3389/fmed.2023.1273441.
Abstract
Medical images are information carriers that visually reflect and record the anatomical structure of the human body, and they play an important role in clinical diagnosis, teaching, and research. Modern medicine has become increasingly inseparable from the intelligent processing of medical images. In recent years there have been growing efforts to apply deep learning to medical image segmentation tasks, and it is imperative to explore simple and efficient deep learning algorithms for this purpose. In this paper, we investigate the segmentation of lung nodule images. We address the problems of existing medical image segmentation algorithms by studying image fusion based on a hybrid channel-space attention mechanism and segmentation with a hybrid architecture of convolutional neural networks (CNNs) and Vision Transformers. To address the difficulty such algorithms have in capturing long-range feature dependencies, we propose SW-UNet, a medical image segmentation model built on a hybrid CNN and Vision Transformer (ViT) framework. The self-attention mechanism and sliding-window design of the Vision Transformer capture global feature associations and overcome the receptive field limitation that convolutional operations inherit from their inductive bias. At the same time, a widened self-attention vector streamlines the number of modules and compresses the model size to suit the small amounts of data typical in medical imaging, which would otherwise make the model prone to overfitting. Experiments on the LUNA16 lung nodule image dataset validate the algorithm and show that the proposed network achieves efficient medical image segmentation at a lightweight scale. In addition, to validate the transferability of the model, we performed additional validation on other tumor datasets with favorable results.
Our research addresses the crucial need for improved medical image segmentation algorithms. By introducing the SW-UNet model, which combines a CNN and a ViT, we successfully capture long-range feature dependencies and overcome the receptive field limitations of traditional convolutional operations. This approach not only improves the efficiency of medical image segmentation but also maintains scalability and adaptability to small medical datasets. The positive outcomes on various tumor datasets underscore the potential transferability and broad applicability of the proposed model in medical image analysis.
Affiliation(s)
- Jiajun Ma
- Shenhua Hollysys Information Technology Co., Ltd., Beijing, China
- Gang Yuan
- The First Affiliated Hospital of Dalian Medical University, Dalian, China
- Chenhua Guo
- School of Software, North University of China, Taiyuan, China
- Minting Zheng
- The First Affiliated Hospital of Dalian Medical University, Dalian, China
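Entry 8's SW-UNet hinges on a sliding-window Transformer block. The core idea, Swin-style self-attention restricted to local windows, can be sketched in NumPy as below; the window size and projection weights are placeholders, and the shifted-window masking of the full design is omitted:

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def window_attention(x, wq, wk, wv, window=4):
    """Self-attention restricted to non-overlapping windows.

    x: (L, D) token features, with L divisible by `window`.
    Each token attends only to the other tokens in its own window,
    which keeps the cost linear in sequence length instead of
    quadratic, at the price of a bounded attention range.
    """
    L, D = x.shape
    q, k, v = x @ wq, x @ wk, x @ wv
    # reshape to (num_windows, window, D) so attention is per window
    q = q.reshape(L // window, window, D)
    k = k.reshape(L // window, window, D)
    v = v.reshape(L // window, window, D)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(D)  # per-window logits
    return (softmax(scores) @ v).reshape(L, D)
```

Alternating windowed attention with convolutional stages, as SW-UNet does, restores cross-window context while keeping the parameter count small enough for limited medical datasets.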