1
Wu BZ, Hu LH, Cao SF, Tan J, Danzeng NZ, Fan JF, Zhang WB, Peng X. Deep learning-based multimodal CT/MRI image fusion and segmentation strategies for surgical planning of oral and maxillofacial tumors: A pilot study. Journal of Stomatology, Oral and Maxillofacial Surgery 2025:102324. [PMID: 40174752] [DOI: 10.1016/j.jormas.2025.102324]
Abstract
PURPOSE This pilot study aims to evaluate the feasibility and accuracy of deep learning-based multimodal computed tomography/magnetic resonance imaging (CT/MRI) fusion and segmentation strategies for the surgical planning of oral and maxillofacial tumors.
MATERIALS AND METHODS This study enrolled 30 patients with oral and maxillofacial tumors who visited our department between 2016 and 2022. All patients underwent enhanced CT and MRI scanning of the oral and maxillofacial region. Three fusion models (Elastix, ANTs, and NiftyReg) and three segmentation models (nnU-Net, 3D UX-Net, and U-Net) were combined to generate and train nine hybrid deep learning pipelines. The performance of each model was evaluated via the Fusion Index (FI), Dice similarity coefficient (Dice), 95th-percentile Hausdorff distance (HD95), mean surface distance (MSD), precision, and recall.
RESULTS All three image fusion models (Elastix, ANTs, and NiftyReg) demonstrated satisfactory accuracy, with Elastix performing best. Among the tested segmentation models, the highest accuracy for segmenting the maxilla and mandible was achieved by combining NiftyReg and nnU-Net. The highest overall accuracy of the nine hybrid models was observed with the Elastix and nnU-Net combination, which yielded a Dice coefficient of 0.89 for tumor segmentation.
CONCLUSION In this study, deep learning models capable of automatic multimodal CT/MRI image fusion and segmentation of oral and maxillofacial tumors were trained with a high degree of accuracy. The results demonstrate the feasibility of using deep learning-based image fusion and segmentation to establish a basis for virtual surgical planning.
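For readers who wish to reproduce the overlap and surface-distance metrics named above (Dice and HD95), a minimal NumPy/SciPy sketch follows. It is illustrative only, not the authors' implementation; pooling the two directional surface distances before taking the 95th percentile is one common convention (some toolkits take the maximum of the two directional percentiles instead).

```python
import numpy as np
from scipy.ndimage import binary_erosion
from scipy.spatial import cKDTree

def dice(pred, gt):
    """Dice = 2|A∩B| / (|A| + |B|) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    return 2.0 * np.logical_and(pred, gt).sum() / (pred.sum() + gt.sum())

def _surface(mask, spacing):
    """Physical coordinates (mm) of boundary voxels of a binary mask."""
    mask = mask.astype(bool)
    border = mask & ~binary_erosion(mask)
    return np.argwhere(border) * np.asarray(spacing)

def hd95(pred, gt, spacing=(1.0, 1.0, 1.0)):
    """95th-percentile symmetric Hausdorff distance between mask surfaces."""
    p, g = _surface(pred, spacing), _surface(gt, spacing)
    d_pg = cKDTree(g).query(p)[0]  # pred surface -> nearest gt surface point
    d_gp = cKDTree(p).query(g)[0]  # gt surface -> nearest pred surface point
    return np.percentile(np.concatenate([d_pg, d_gp]), 95)

# Toy example with two overlapping cubes
pred = np.zeros((32, 32, 32), bool); pred[8:20, 8:20, 8:20] = True
gt = np.zeros_like(pred); gt[10:22, 10:22, 10:22] = True
print(round(dice(pred, gt), 3), round(hd95(pred, gt), 2))
```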
Affiliation(s)
- Bin-Zhang Wu
- Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology & National Center for Stomatology & National Clinical Research Center for Oral Diseases & National Engineering Research Center of Oral Biomaterials and Digital Medical Devices & Beijing Key Laboratory of Digital Stomatology & NHC Key Laboratory of Digital Stomatology & NMPA Key Laboratory for Dental Materials, Beijing, PR China; First Clinical Division, Peking University School and Hospital of Stomatology & National Center for Stomatology & National Clinical Research Center for Oral Diseases & National Engineering Research Center of Oral Biomaterials and Digital Medical Devices & Beijing Key Laboratory of Digital Stomatology & NHC Key Laboratory of Digital Stomatology & NMPA Key Laboratory for Dental Materials, Beijing, PR China
- Lei-Hao Hu
- Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology & National Center for Stomatology & National Clinical Research Center for Oral Diseases & National Engineering Research Center of Oral Biomaterials and Digital Medical Devices & Beijing Key Laboratory of Digital Stomatology & NHC Key Laboratory of Digital Stomatology & NMPA Key Laboratory for Dental Materials, Beijing, PR China; Department of General Dentistry, Peking University School and Hospital of Stomatology & National Center for Stomatology & National Clinical Research Center for Oral Diseases & National Engineering Research Center of Oral Biomaterials and Digital Medical Devices & Beijing Key Laboratory of Digital Stomatology & NHC Key Laboratory of Digital Stomatology & NMPA Key Laboratory for Dental Materials, Beijing, PR China
- Si-Fan Cao
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, PR China
- Ji Tan
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, PR China
- Nian-Zha Danzeng
- Department of stomatology, People's Hospital of Tibet Autonomous Region, Tibet Autonomous Region, PR China
- Jing-Fan Fan
- Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, PR China.
- Wen-Bo Zhang
- Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology & National Center for Stomatology & National Clinical Research Center for Oral Diseases & National Engineering Research Center of Oral Biomaterials and Digital Medical Devices & Beijing Key Laboratory of Digital Stomatology & NHC Key Laboratory of Digital Stomatology & NMPA Key Laboratory for Dental Materials, Beijing, PR China.
- Xin Peng
- Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology & National Center for Stomatology & National Clinical Research Center for Oral Diseases & National Engineering Research Center of Oral Biomaterials and Digital Medical Devices & Beijing Key Laboratory of Digital Stomatology & NHC Key Laboratory of Digital Stomatology & NMPA Key Laboratory for Dental Materials, Beijing, PR China
2
R P, M JPP, J S N. Brain tumor segmentation using multi-scale attention U-Net with EfficientNetB4 encoder for enhanced MRI analysis. Sci Rep 2025; 15:9914. [PMID: 40121246] [PMCID: PMC11929897] [DOI: 10.1038/s41598-025-94267-9]
Abstract
Accurate brain tumor segmentation is critical for clinical diagnosis and treatment planning. This study proposes an advanced segmentation framework that combines a multi-scale attention U-Net with the EfficientNetB4 encoder to enhance segmentation performance. Unlike conventional U-Net-based architectures, the proposed model leverages EfficientNetB4's compound scaling to optimize feature extraction at multiple resolutions while maintaining low computational overhead. Additionally, the multi-scale attention mechanism (utilizing convolution kernels of multiple sizes) enhances feature representation by capturing tumor boundaries across different scales, addressing limitations of existing CNN-based segmentation methods. Our approach effectively suppresses irrelevant regions and enhances tumor localization through attention-enhanced skip connections and residual attention blocks. Extensive experiments were conducted on the publicly available Figshare brain tumor dataset, comparing different EfficientNet variants to determine the optimal architecture. EfficientNetB4 demonstrated superior performance, achieving an accuracy of 99.79%, a misclassification rate (MCR) of 0.21%, a Dice coefficient of 0.9339, and an Intersection over Union (IoU) of 0.8795, outperforming other variants in accuracy and computational efficiency. The training process was analyzed using key metrics, including the Dice coefficient, Dice loss, precision, recall, specificity, and IoU, showing stable convergence and generalization. Additionally, the proposed method was evaluated against state-of-the-art approaches, surpassing them in all critical metrics, including accuracy, IoU, Dice coefficient, precision, recall, specificity, and mean IoU. This study demonstrates the effectiveness of the proposed method for robust and efficient segmentation of brain tumors, positioning it as a valuable tool for clinical and research applications.
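As a rough illustration of the multi-scale attention idea described in this abstract, the sketch below builds an attention map from parallel convolutions at several kernel sizes and uses it to re-weight a skip-connection feature map. The kernel sizes (1, 3, 5), channel counts, and overall layout are assumptions for demonstration, not details taken from the paper.

```python
import torch
import torch.nn as nn

class MultiScaleAttention(nn.Module):
    """Build an attention map from parallel convolutions at several scales
    and use it to re-weight the skip-connection features."""
    def __init__(self, channels: int):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=k, padding=k // 2)
            for k in (1, 3, 5)
        ])
        self.fuse = nn.Sequential(
            nn.Conv2d(3 * channels, channels, kernel_size=1),
            nn.BatchNorm2d(channels),
            nn.Sigmoid(),            # attention weights in [0, 1]
        )

    def forward(self, skip: torch.Tensor) -> torch.Tensor:
        multi = torch.cat([b(skip) for b in self.branches], dim=1)
        attn = self.fuse(multi)
        return skip * attn           # suppress irrelevant regions

# Example: re-weight a 64-channel skip feature map
x = torch.randn(2, 64, 128, 128)
print(MultiScaleAttention(64)(x).shape)   # torch.Size([2, 64, 128, 128])
```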
Affiliation(s)
- Preetha R
- School of Electronics Engineering, Vellore Institute of Technology, Vellore, 632014, Tamilnadu, India
- Nisha J S
- School of Electronics Engineering, Vellore Institute of Technology, Vellore, 632014, Tamilnadu, India
3
Lin Q, Chen C, Li K, Cao W, Wang R, Fichera A, Han S, Zou X, Li T, Zou P, Wang H, Ye Z, Yuan Z. A deep-learning model to predict the completeness of cytoreductive surgery in colorectal cancer with peritoneal metastasis. European Journal of Surgical Oncology 2025; 51:109760. [PMID: 40174333] [DOI: 10.1016/j.ejso.2025.109760]
Abstract
BACKGROUND Colorectal cancer (CRC) with peritoneal metastasis (PM) is associated with poor prognosis. The Peritoneal Cancer Index (PCI) is used to evaluate the extent of PM and to select patients for cytoreductive surgery (CRS). However, the PCI score alone is not accurate enough to guide patient selection for CRS.
OBJECTIVE We developed a novel deep learning framework, decoupling feature alignment and fusion (DeAF), to aid the selection of PM patients and predict the surgical completeness of CRS.
METHODS 186 CRC patients with PM recruited from four tertiary hospitals were enrolled. In the training cohort, the DeAF model was trained with the SimSiam algorithm on contrast-enhanced CT images and then fused with clinicopathological parameters to increase performance. Accuracy, sensitivity, specificity, and the area under the ROC curve (AUC) were evaluated in the internal validation cohort and in three external cohorts.
RESULTS The DeAF model demonstrated robust accuracy in predicting the completeness of CRS, with an AUC of 0.900 (95% CI: 0.793-1.000) in the internal validation cohort. The model can guide the selection of suitable patients and predict the potential benefit of CRS. The high performance in predicting CRS completeness was validated in three external cohorts, with AUC values of 0.906 (95% CI: 0.812-1.000), 0.960 (95% CI: 0.885-1.000), and 0.933 (95% CI: 0.791-1.000), respectively.
CONCLUSION The novel DeAF framework can aid surgeons in selecting suitable PM patients for CRS and in predicting the completeness of CRS. The model can inform surgical decision-making and provide potential benefits for PM patients.
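The abstract does not disclose the DeAF architecture in detail; purely to illustrate the generic pattern of fusing a CT feature embedding with tabular clinicopathological parameters, here is a hypothetical PyTorch sketch in which the feature dimension, number of clinical variables, and layer sizes are all assumptions.

```python
import torch
import torch.nn as nn

class ImageClinicalFusion(nn.Module):
    """Concatenate a CT feature embedding with tabular clinical features
    and predict the probability of complete cytoreduction (illustrative only)."""
    def __init__(self, img_dim=512, clin_dim=6, hidden=128):
        super().__init__()
        self.clin_encoder = nn.Sequential(nn.Linear(clin_dim, 32), nn.ReLU())
        self.head = nn.Sequential(
            nn.Linear(img_dim + 32, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, img_feat, clin):
        z = torch.cat([img_feat, self.clin_encoder(clin)], dim=1)
        return torch.sigmoid(self.head(z))   # probability of complete CRS

img_feat = torch.randn(4, 512)   # e.g., pooled backbone features from contrast CT
clin = torch.randn(4, 6)         # e.g., PCI and other clinicopathological values
print(ImageClinicalFusion()(img_feat, clin).shape)  # torch.Size([4, 1])
```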
Affiliation(s)
- Qingfeng Lin
- Department of Colorectal Surgery and Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, The Sixth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, Guangdong Province, China
- Can Chen
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, China; College of Computers, Central South University, Changsha, China
- Kangshun Li
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, China
- Wuteng Cao
- Department of Radiology, The Sixth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, China
- Renjie Wang
- Department of Colorectal Surgery, Fudan University Shanghai Cancer Center, Shanghai, China
- Alessandro Fichera
- Colon and Rectal Surgery, Baylor University Medical Center, Dallas, TX, USA
- Shuai Han
- General Surgery Center, Department of Gastrointestinal Surgery, Zhujiang Hospital, Southern Medical University, Guangzhou, China
- Xiangjun Zou
- College of Intelligent Manufacturing and Modern Industry (School of Mechanical Engineering), Xinjiang University, Urumqi, China
- Tian Li
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong, China
- Peiru Zou
- Department of Colorectal Surgery and Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, The Sixth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, Guangdong Province, China
- Hui Wang
- Department of Colorectal Surgery and Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, The Sixth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, Guangdong Province, China.
- Zaisheng Ye
- Department of Gastrointestinal Surgical Oncology, Fujian Cancer Hospital and Fujian Medical University Cancer Hospital, Fuzhou, China.
- Zixu Yuan
- Department of Colorectal Surgery and Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, The Sixth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, Guangdong Province, China.
4
Zhang Z, Zhou X, Fang Y, Xiong Z, Zhang T. AI-driven 3D bioprinting for regenerative medicine: From bench to bedside. Bioact Mater 2025; 45:201-230. [PMID: 39651398] [PMCID: PMC11625302] [DOI: 10.1016/j.bioactmat.2024.11.021]
Abstract
In recent decades, 3D bioprinting has garnered significant research attention due to its ability to manipulate biomaterials and cells to create complex structures precisely. However, due to technological and cost constraints, the clinical translation of 3D bioprinted products (BPPs) from bench to bedside has been hindered by challenges in terms of personalization of design and scaling up of production. Recently, the emerging applications of artificial intelligence (AI) technologies have significantly improved the performance of 3D bioprinting. However, the existing literature remains deficient in a methodological exploration of AI technologies' potential to overcome these challenges in advancing 3D bioprinting toward clinical application. This paper aims to present a systematic methodology for AI-driven 3D bioprinting, structured within the theoretical framework of Quality by Design (QbD). This paper commences by introducing the QbD theory into 3D bioprinting, followed by summarizing the technology roadmap of AI integration in 3D bioprinting, including multi-scale and multi-modal sensing, data-driven design, and in-line process control. This paper further describes specific AI applications in 3D bioprinting's key elements, including bioink formulation, model structure, printing process, and function regulation. Finally, the paper discusses current prospects and challenges associated with AI technologies to further advance the clinical translation of 3D bioprinting.
Affiliation(s)
- Zhenrui Zhang
- Biomanufacturing Center, Department of Mechanical Engineering, Tsinghua University, Beijing, 100084, PR China
- Biomanufacturing and Rapid Forming Technology Key Laboratory of Beijing, Beijing, 100084, PR China
- “Biomanufacturing and Engineering Living Systems” Innovation International Talents Base (111 Base), Beijing, 100084, PR China
- Xianhao Zhou
- Biomanufacturing Center, Department of Mechanical Engineering, Tsinghua University, Beijing, 100084, PR China
- Biomanufacturing and Rapid Forming Technology Key Laboratory of Beijing, Beijing, 100084, PR China
- “Biomanufacturing and Engineering Living Systems” Innovation International Talents Base (111 Base), Beijing, 100084, PR China
- Yongcong Fang
- Biomanufacturing Center, Department of Mechanical Engineering, Tsinghua University, Beijing, 100084, PR China
- Biomanufacturing and Rapid Forming Technology Key Laboratory of Beijing, Beijing, 100084, PR China
- “Biomanufacturing and Engineering Living Systems” Innovation International Talents Base (111 Base), Beijing, 100084, PR China
- State Key Laboratory of Tribology in Advanced Equipment, Tsinghua University, Beijing, 100084, PR China
- Zhuo Xiong
- Biomanufacturing Center, Department of Mechanical Engineering, Tsinghua University, Beijing, 100084, PR China
- Biomanufacturing and Rapid Forming Technology Key Laboratory of Beijing, Beijing, 100084, PR China
- “Biomanufacturing and Engineering Living Systems” Innovation International Talents Base (111 Base), Beijing, 100084, PR China
- Ting Zhang
- Biomanufacturing Center, Department of Mechanical Engineering, Tsinghua University, Beijing, 100084, PR China
- Biomanufacturing and Rapid Forming Technology Key Laboratory of Beijing, Beijing, 100084, PR China
- “Biomanufacturing and Engineering Living Systems” Innovation International Talents Base (111 Base), Beijing, 100084, PR China
- State Key Laboratory of Tribology in Advanced Equipment, Tsinghua University, Beijing, 100084, PR China
5
Yucheng L, Lingyun Q, Kainan S, Yongshi J, Wenming Z, Jieni D, Weijun C. Development and validation of a deep reinforcement learning algorithm for auto-delineation of organs at risk in cervical cancer radiotherapy. Sci Rep 2025; 15:6800. [PMID: 40000766] [PMCID: PMC11861648] [DOI: 10.1038/s41598-025-91362-9]
Abstract
This study was conducted to develop and validate a novel deep reinforcement learning (DRL) algorithm incorporating the segment anything model (SAM) to enhance the accuracy of automatic contouring of organs at risk during radiotherapy for cervical cancer patients. CT images were collected from 150 cervical cancer patients treated at our hospital between 2021 and 2023. Of these, 122 CT images were used as the training set for the SAM-based DRL model, and 28 CT images were used as the test set. The model's performance was evaluated by comparing its segmentation results with the ground truth obtained through manual contouring by expert clinicians. The test results were also compared with the contouring results of commercial automatic contouring software based on a conventional deep learning (DL) model. The Dice similarity coefficient (DSC), 95th-percentile Hausdorff distance (HD95), average symmetric surface distance (ASSD), and relative absolute volume difference (RAVD) were used to quantitatively assess contouring accuracy from different perspectives, enabling a comprehensive and objective evaluation. The DRL model outperformed the DL model across all evaluated metrics. DRL achieved higher median DSC values, such as 0.97 versus 0.96 for the left kidney (P < 0.001), and demonstrated better boundary accuracy with lower HD95 values, e.g., 14.30 mm versus 17.24 mm for the rectum (P < 0.001). Moreover, DRL exhibited superior spatial agreement (median ASSD: 1.55 mm vs. 1.80 mm for the rectum, P < 0.001) and volume prediction accuracy (median RAVD: 10.25 vs. 10.64 for the duodenum, P < 0.001). These findings indicate that integrating SAM with reinforcement learning (RL) enhances segmentation accuracy and consistency compared with conventional DL methods. The proposed approach introduces a novel training strategy that improves performance without increasing model complexity, demonstrating its potential applicability in clinical practice.
Affiliation(s)
- Li Yucheng
- Cancer Center, Department of Radiation Oncology, Zhejiang Provincial People's Hospital, Affiliated People's Hospital, Hangzhou Medical College, Hangzhou, Zhejiang, China
- Qiu Lingyun
- Cancer Center, Department of Radiation Oncology, Zhejiang Provincial People's Hospital, Affiliated People's Hospital, Hangzhou Medical College, Hangzhou, Zhejiang, China
- Shao Kainan
- Cancer Center, Department of Radiation Oncology, Zhejiang Provincial People's Hospital, Affiliated People's Hospital, Hangzhou Medical College, Hangzhou, Zhejiang, China
- Jia Yongshi
- Cancer Center, Department of Radiation Oncology, Zhejiang Provincial People's Hospital, Affiliated People's Hospital, Hangzhou Medical College, Hangzhou, Zhejiang, China
- Zhan Wenming
- Cancer Center, Department of Radiation Oncology, Zhejiang Provincial People's Hospital, Affiliated People's Hospital, Hangzhou Medical College, Hangzhou, Zhejiang, China
- Ding Jieni
- Cancer Center, Department of Radiation Oncology, Zhejiang Provincial People's Hospital, Affiliated People's Hospital, Hangzhou Medical College, Hangzhou, Zhejiang, China
- Chen Weijun
- Cancer Center, Department of Radiation Oncology, Zhejiang Provincial People's Hospital, Affiliated People's Hospital, Hangzhou Medical College, Hangzhou, Zhejiang, China.
6
Dai S, Ye K, Zhan C, Tang H, Zhan L. SIN-Seg: A joint spatial-spectral information fusion model for medical image segmentation. Comput Struct Biotechnol J 2025; 27:744-752. [PMID: 40092663] [PMCID: PMC11909746] [DOI: 10.1016/j.csbj.2025.02.024]
Abstract
In recent years, the application of deep convolutional neural networks (DCNNs) to medical image segmentation has shown significant promise in computer-aided detection and diagnosis (CAD). Leveraging features from different spaces (i.e., Euclidean, non-Euclidean, and spectral spaces) and multiple modalities of data has the potential to increase the information available to the CAD system, enhancing both effectiveness and efficiency. However, directly acquiring data from different spaces across multiple modalities is often prohibitively expensive and time-consuming. Consequently, most current medical image segmentation techniques are confined to the spatial domain and limited to scanned images from MRI, CT, PET, etc. Here, we introduce an innovative Joint Spatial-Spectral Information Fusion method which requires no additional data collection for CAD. We translate existing single-modality data into a new domain to extract features from an alternative space. Specifically, we apply the Discrete Cosine Transform (DCT) to enter the spectral domain, thereby accessing supplementary feature information from an alternate space. Recognizing that combining information from different spaces typically necessitates complex alignment modules, we introduce a contrastive loss function that achieves feature alignment before synchronizing information across different feature spaces. Our empirical results illustrate the greater effectiveness of our model in harnessing additional information from the spectrum-based space and affirm its superior performance against influential state-of-the-art segmentation baselines. The code is available at https://github.com/Auroradsy/SIN-Seg.
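A minimal sketch of the two ingredients named above, assuming a 2D type-II DCT as the spectral transform and a simple cosine-similarity alignment term; the actual SIN-Seg architecture and loss are not reproduced here.

```python
import numpy as np
from scipy.fft import dctn

def spectral_view(img: np.ndarray) -> np.ndarray:
    """2D type-II DCT with orthonormal scaling: a spectral-domain view of the image."""
    return dctn(img, type=2, norm="ortho")

def alignment_loss(f_spatial: np.ndarray, f_spectral: np.ndarray) -> float:
    """1 - cosine similarity between two flattened feature vectors."""
    a, b = f_spatial.ravel(), f_spectral.ravel()
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

img = np.random.rand(64, 64).astype(np.float32)
print(spectral_view(img).shape, round(alignment_loss(img, spectral_view(img)), 4))
```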
Affiliation(s)
- Siyuan Dai
- Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, 15213, PA, USA
- Kai Ye
- Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, 15213, PA, USA
- Charlie Zhan
- Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, 15213, PA, USA
- Haoteng Tang
- Department of Computer Science, University of Texas Rio Grande Valley, Edinburg, 78582, TX, USA
- Liang Zhan
- Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, 15213, PA, USA
7
Qin J, Luo H, He F, Qin G. DSA-Former: A Network of Hybrid Variable Structures for Liver and Liver Tumour Segmentation. Int J Med Robot 2024; 20:e70004. [PMID: 39535347] [DOI: 10.1002/rcs.70004]
Abstract
BACKGROUND Accurately annotated CT images of liver tumours can effectively assist doctors in diagnosing and treating liver cancer. However, because of the relatively low density of the liver, its tumours, and the surrounding tissues, as well as multi-scale problems, accurate automatic segmentation still faces challenges.
METHODS We propose DSA-Former, a segmentation network that combines convolutional kernels and attention. By combining the morphological and edge features of liver tumour images, the network captures global and local features as well as key inter-layer information, and integrates attention mechanisms to obtain detailed information and improve segmentation accuracy.
RESULTS Compared to other methods, our approach demonstrates significant advantages in evaluation metrics such as the Dice coefficient, IoU, VOE, and HD95. Specifically, we achieve Dice coefficients of 96.8% for liver segmentation and 72.2% for liver tumour segmentation.
CONCLUSION Our method offers enhanced precision in segmenting liver and liver tumour images, laying a robust foundation for liver cancer diagnosis and treatment.
Affiliation(s)
- Jun Qin
- School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, China
- Huizhen Luo
- School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, China
- Fei He
- School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, China
- Guihe Qin
- School of Computer Science and Technology, Jilin University, Changchun, China
8
Spaanderman DJ, Starmans MPA, van Erp GCM, Hanff DF, Sluijter JH, Schut ARW, van Leenders GJLH, Verhoef C, Grünhagen DJ, Niessen WJ, Visser JJ, Klein S. Minimally interactive segmentation of soft-tissue tumors on CT and MRI using deep learning. Eur Radiol 2024. [PMID: 39560714] [DOI: 10.1007/s00330-024-11167-8]
Abstract
BACKGROUND Segmentations are crucial in medical imaging for morphological, volumetric, and radiomics biomarkers. Manual segmentation is accurate but not feasible in the clinical workflow, while automatic segmentation generally performs sub-par.
PURPOSE To develop a minimally interactive deep learning-based segmentation method for soft-tissue tumors (STTs) on CT and MRI.
MATERIAL AND METHODS The interactive method requires the user to click six points near the tumor's extreme boundaries in the image. These six points are transformed into a distance map and serve, with the image, as input for a convolutional neural network. A multi-center public dataset with 514 patients and nine STT phenotypes in seven anatomical locations, with CT or T1-weighted MRI, was used for training and internal validation. For external validation, another public dataset was employed, which included five unseen STT phenotypes in extremities on CT, T1-weighted MRI, and T2-weighted fat-saturated (FS) MRI.
RESULTS Internal validation resulted in a Dice similarity coefficient (DSC) of 0.85 ± 0.11 (mean ± standard deviation) for CT and 0.84 ± 0.12 for T1-weighted MRI. External validation resulted in DSCs of 0.81 ± 0.08 for CT, 0.84 ± 0.09 for T1-weighted MRI, and 0.88 ± 0.08 for T2-weighted FS MRI. Volumetric measurements showed consistent replication with low error internally (volume: 1 ± 28 mm3, r = 0.99; diameter: -6 ± 14 mm, r = 0.90) and externally (volume: -7 ± 23 mm3, r = 0.96; diameter: -3 ± 6 mm, r = 0.99). Interactive segmentation time was considerably shorter (CT: 364 s, T1-weighted MRI: 258 s) than manual segmentation (CT: 1639 s, T1-weighted MRI: 1895 s).
CONCLUSION The minimally interactive segmentation method effectively segments STT phenotypes on CT and MRI, with robust generalization to unseen phenotypes and imaging modalities.
KEY POINTS Question: Can this deep learning-based method segment soft-tissue tumors faster than can be done manually and more accurately than other automatic methods? Findings: The minimally interactive segmentation method achieved accurate segmentation results in internal and external validation, and generalized well across soft-tissue tumor phenotypes and imaging modalities. Clinical relevance: This minimally interactive deep learning-based segmentation method could reduce the burden of manual segmentation, facilitate the integration of imaging-based biomarkers (e.g., radiomics) into clinical practice, and provide a fast, semi-automatic solution for volume and diameter measurements (e.g., RECIST).
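To illustrate how the six boundary clicks can be turned into a distance-map input channel, a sketch using SciPy's Euclidean distance transform is given below; the image size, click coordinates, and two-channel stacking are illustrative assumptions, not the published pipeline.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def clicks_to_distance_map(shape, clicks, spacing=(1.0, 1.0)):
    """Euclidean distance (in mm) from every pixel to the nearest user click."""
    seeds = np.ones(shape, dtype=bool)
    for r, c in clicks:
        seeds[r, c] = False                       # zero-distance sites
    return distance_transform_edt(seeds, sampling=spacing)

image = np.random.rand(256, 256).astype(np.float32)   # stand-in for a CT/MRI slice
clicks = [(40, 128), (210, 128), (128, 35), (128, 220), (60, 60), (200, 200)]  # example points
dist_map = clicks_to_distance_map(image.shape, clicks)
network_input = np.stack([image, dist_map.astype(np.float32)])  # 2-channel CNN input
print(network_input.shape)  # (2, 256, 256)
```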
Affiliation(s)
- Douwe J Spaanderman
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands.
- Martijn P A Starmans
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands
- Gonnie C M van Erp
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands
- David F Hanff
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands
- Judith H Sluijter
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands
- Anne-Rose W Schut
- Department of Surgical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
- Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
- Cornelis Verhoef
- Department of Surgical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
- Dirk J Grünhagen
- Department of Surgical Oncology, Erasmus MC Cancer Institute, Rotterdam, The Netherlands
- Wiro J Niessen
- Faculty of Medical Sciences, University of Groningen, Groningen, The Netherlands
- Jacob J Visser
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands
- Stefan Klein
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands
9
Huang J, Li X, Tan H, Cheng X. Generative Adversarial Network for Trimodal Medical Image Fusion Using Primitive Relationship Reasoning. IEEE J Biomed Health Inform 2024; 28:5729-5741. [PMID: 39093669] [DOI: 10.1109/jbhi.2024.3426664]
Abstract
Medical image fusion has become a prominent biomedical image processing technology in recent years. The technique combines useful information from medical images of different modalities into a single informative fused image to provide reasonable and effective medical assistance. Currently, research has mainly focused on dual-modal medical image fusion, and little attention has been paid to trimodal medical image fusion, which has greater application requirements and clinical significance. To this end, the study proposes an end-to-end generative adversarial network for trimodal medical image fusion. Utilizing a multi-scale squeeze-and-excitation reasoning attention network, the proposed method generates an energy map for each source image, facilitating efficient trimodal medical image fusion under the guidance of an energy-ratio fusion strategy. To obtain global semantic information, we introduced squeeze-and-excitation reasoning attention blocks and enhanced the global feature by primitive relationship reasoning. Through extensive fusion experiments, we demonstrate that our method yields superior visual results and objective evaluation metric scores compared to state-of-the-art fusion methods. Furthermore, the proposed method also obtained the best accuracy in the glioma segmentation experiment.
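The energy-ratio fusion idea mentioned above can be illustrated generically as follows: each co-registered source contributes to the fused image in proportion to its local energy. The window size, the uniform averaging filter, and the epsilon term are assumptions, not the paper's exact rule.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_energy(img: np.ndarray, size: int = 7) -> np.ndarray:
    """Local energy = moving average of squared intensities."""
    return uniform_filter(img.astype(np.float64) ** 2, size=size)

def energy_ratio_fusion(sources):
    """Fuse co-registered images weighted by each image's share of local energy."""
    energies = np.stack([local_energy(s) for s in sources])
    weights = energies / (energies.sum(axis=0, keepdims=True) + 1e-12)
    return (weights * np.stack(sources)).sum(axis=0)

a, b, c = (np.random.rand(128, 128) for _ in range(3))   # e.g., three co-registered modality slices
fused = energy_ratio_fusion([a, b, c])
print(fused.shape)  # (128, 128)
```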
10
El-Ateif S, Idri A. Multimodality Fusion Strategies in Eye Disease Diagnosis. Journal of Imaging Informatics in Medicine 2024; 37:2524-2558. [PMID: 38639808] [PMCID: PMC11522204] [DOI: 10.1007/s10278-024-01105-x]
Abstract
Multimodality fusion has gained significance in medical applications, particularly in diagnosing challenging diseases such as eye diseases, notably diabetic eye diseases that pose risks of vision loss and blindness. Mono-modality eye disease diagnosis proves difficult, often missing crucial disease indicators. In response, researchers advocate multimodality-based approaches to enhance diagnostics. This study evaluates three multimodality fusion strategies (early, joint, and late) in conjunction with state-of-the-art convolutional neural network models for automated binary detection of eye disease across three datasets: fundus fluorescein angiography, macula, and a combination of digital retinal images for vessel extraction, structured analysis of the retina, and high-resolution fundus. Findings reveal the efficacy of each fusion strategy: type 0 early fusion with DenseNet121 achieves an average accuracy of 99.45%. InceptionResNetV2 emerges as the top-performing joint fusion architecture with an average accuracy of 99.58%. Late fusion with ResNet50V2 achieves a perfect score of 100% across all metrics, surpassing both early and joint fusion. Comparative analysis demonstrates that late fusion ResNet50V2 matches the accuracy of a state-of-the-art feature-level fusion model for multiview learning. In conclusion, this study substantiates late fusion as the optimal strategy for eye disease diagnosis compared with early and joint fusion, showcasing its superiority in leveraging multimodal information.
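A minimal sketch contrasting early (input-level) and late (decision-level) fusion of two modalities is shown below; the tiny backbone and the simple averaging rule are placeholders chosen for brevity, not the DenseNet/Inception/ResNet architectures evaluated in the study.

```python
import torch
import torch.nn as nn

def tiny_backbone(in_ch):
    """Placeholder feature extractor standing in for a large pretrained CNN."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 8),
    )

class EarlyFusion(nn.Module):
    """Concatenate modalities at the input (channel) level, then a single network."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(tiny_backbone(2), nn.Linear(8, 1))
    def forward(self, m1, m2):
        return torch.sigmoid(self.net(torch.cat([m1, m2], dim=1)))

class LateFusion(nn.Module):
    """One network per modality; average the per-modality predictions."""
    def __init__(self):
        super().__init__()
        self.b1 = nn.Sequential(tiny_backbone(1), nn.Linear(8, 1))
        self.b2 = nn.Sequential(tiny_backbone(1), nn.Linear(8, 1))
    def forward(self, m1, m2):
        return torch.sigmoid((self.b1(m1) + self.b2(m2)) / 2)

m1, m2 = torch.randn(4, 1, 64, 64), torch.randn(4, 1, 64, 64)
print(EarlyFusion()(m1, m2).shape, LateFusion()(m1, m2).shape)
```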
Affiliation(s)
- Sara El-Ateif
- Software Project Management Research Team, ENSIAS, Mohammed V University, BP 713, Agdal, Rabat, Morocco
- Ali Idri
- Software Project Management Research Team, ENSIAS, Mohammed V University, BP 713, Agdal, Rabat, Morocco.
- Faculty of Medical Sciences, Mohammed VI Polytechnic University, Marrakech-Rhamna, Benguerir, Morocco.
11
Rai S, Bhatt JS, Patra SK. An AI-Based Low-Risk Lung Health Image Visualization Framework Using LR-ULDCT. Journal of Imaging Informatics in Medicine 2024; 37:2047-2062. [PMID: 38491236] [PMCID: PMC11522248] [DOI: 10.1007/s10278-024-01062-5]
Abstract
In this article, we propose an AI-based low-risk visualization framework for lung health monitoring using low-resolution ultra-low-dose CT (LR-ULDCT). We present a novel deep cascade processing workflow to achieve diagnostic visualization on LR-ULDCT (<0.3 mSv) on par with high-resolution CT (HRCT) at 100 mSv. To this end, we build a low-risk and affordable deep cascade network comprising three sequential deep processes: restoration, super-resolution (SR), and segmentation. Given a degraded LR-ULDCT, the first network learns the restoration function in an unsupervised manner from augmented patch-based dictionaries and residuals. The restored version is then super-resolved to the target (sensor) resolution. Here, we combine perceptual and adversarial losses in a novel GAN to establish the closeness between the probability distributions of the generated SR-ULDCT and the restored LR-ULDCT. The SR-ULDCT is then presented to the segmentation network, which first separates the chest portion from the SR-ULDCT and then performs lobe-wise colorization. Finally, we extract the five lobes to account for the presence of ground-glass opacity (GGO) in the lung. Hence, our AI-based system provides low-risk visualization of the input degraded LR-ULDCT at various stages, i.e., restored LR-ULDCT, restored SR-ULDCT, and segmented SR-ULDCT, and achieves the diagnostic power of HRCT. We perform case studies on real datasets of COVID-19, pneumonia, and pulmonary edema/congestion while comparing our results with the state of the art. Ablation experiments are conducted to better visualize the different operating pipelines. Finally, we present a verification report by fourteen (14) experienced radiologists and pulmonologists.
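The combination of perceptual and adversarial losses used in the super-resolution stage can be sketched generically as follows; the frozen feature extractor stands in for the pretrained network usually used for perceptual loss, and the loss weights are assumptions.

```python
import torch
import torch.nn as nn

# Frozen stand-in feature extractor (a pretrained VGG is typically used instead)
feature_net = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                            nn.Conv2d(8, 8, 3, padding=1)).eval()
for p in feature_net.parameters():
    p.requires_grad_(False)

bce = nn.BCEWithLogitsLoss()
l2 = nn.MSELoss()

def generator_loss(sr, hr, disc_logits_on_sr, w_perc=1.0, w_adv=1e-3):
    perceptual = l2(feature_net(sr), feature_net(hr))          # closeness in feature space
    adversarial = bce(disc_logits_on_sr, torch.ones_like(disc_logits_on_sr))  # fool the discriminator
    return w_perc * perceptual + w_adv * adversarial

sr, hr = torch.randn(2, 1, 64, 64), torch.randn(2, 1, 64, 64)  # generated vs. reference slices
print(generator_loss(sr, hr, torch.randn(2, 1)).item())
```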
Affiliation(s)
- Swati Rai
- Indian Institute of Information Technology Vadodara, Vadodara, India.
- Jignesh S Bhatt
- Indian Institute of Information Technology Vadodara, Vadodara, India
12
Singh S, Saber E, Markopoulos PP, Heard J. Regulating Modality Utilization within Multimodal Fusion Networks. Sensors (Basel) 2024; 24:6054. [PMID: 39338798] [PMCID: PMC11435562] [DOI: 10.3390/s24186054]
Abstract
Multimodal fusion networks play a pivotal role in leveraging diverse sources of information for enhanced machine learning applications in aerial imagery. However, current approaches often suffer from a bias towards certain modalities, diminishing the potential benefits of multimodal data. This paper addresses this issue by proposing a novel modality utilization-based training method for multimodal fusion networks. The method guides the network's utilization of its input modalities, ensuring a balanced integration of complementary information streams and effectively mitigating the overutilization of dominant modalities. The method is validated on multimodal aerial imagery classification and image segmentation tasks, maintaining modality utilization within ±10% of the user-defined target utilization and demonstrating its versatility and efficacy across various applications. Furthermore, the study explores the robustness of the fusion networks against noise in the input modalities, a crucial aspect in real-world scenarios. The method showcases better noise robustness by maintaining performance amid environmental changes affecting different aerial imagery sensing modalities. The network trained with 75.0% EO utilization achieves significantly better accuracy (81.4%) in noisy conditions (noise variance = 0.12) than traditional training methods with 99.59% EO utilization (73.7%). Additionally, it maintains an average accuracy of 85.0% across different noise levels, outperforming the traditional method's average accuracy of 81.9%. Overall, the proposed approach presents a significant step towards harnessing the full potential of multimodal data fusion in diverse machine learning applications such as robotics, healthcare, satellite imagery, and defense.
Affiliation(s)
- Saurav Singh
- Department of Electrical & Microelectronic Engineering, Rochester Institute of Technology, Rochester, NY 14623, USA; (E.S.); (J.H.)
- Eli Saber
- Department of Electrical & Microelectronic Engineering, Rochester Institute of Technology, Rochester, NY 14623, USA; (E.S.); (J.H.)
- Panos P. Markopoulos
- Department of Electrical & Computer Engineering and Department of Computer Science, The University of Texas at San Antonio, San Antonio, TX 78249, USA;
- Jamison Heard
- Department of Electrical & Microelectronic Engineering, Rochester Institute of Technology, Rochester, NY 14623, USA; (E.S.); (J.H.)
13
Zhou X, Zhang Z, Du H, Qiu B. MLMFNet: A multi-level modality fusion network for multi-modal accelerated MRI reconstruction. Magn Reson Imaging 2024; 111:246-255. [PMID: 38663831] [DOI: 10.1016/j.mri.2024.04.028]
Abstract
Magnetic resonance imaging produces detailed anatomical and physiological images of the human body that can be used in the clinical diagnosis and treatment of diseases. However, MRI suffers from a comparatively longer acquisition time than other imaging methods and is thus vulnerable to motion artifacts, which can ultimately lead to failed or even incorrect diagnoses. To enable faster reconstruction, deep learning-based methods, along with traditional strategies such as parallel imaging and compressed sensing, have come into play in this field in recent years. Meanwhile, to better analyze diseases, it is often necessary to acquire images of the same region of interest under different modalities, which yield images with different contrast levels. However, most of the aforementioned methods tend to use single-modal images for reconstruction, neglecting the correlation and redundancy embedded in MR images acquired with different modalities. While there are works on multi-modal reconstruction, this information has yet to be efficiently explored. In this paper, we propose an end-to-end neural network called MLMFNet, which aids the reconstruction of the target modality by using information from the auxiliary modality across feature channels and layers. Specifically, this is highlighted by three components: (I) an encoder based on U-Net with a single-stream strategy that fuses the auxiliary and target modalities; (II) a decoder that attends to multi-level features from all layers of the encoder; and (III) a channel attention module. Quantitative and qualitative analyses are performed on a public brain dataset and a knee dataset, showing that the proposed method achieves satisfying results in MRI reconstruction within the multi-modal context and demonstrating its effectiveness and potential for use in clinical practice.
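As an illustration of the channel attention component mentioned above, here is a squeeze-and-excitation style module; the reduction ratio and tensor sizes are assumptions, and this is not the MLMFNet implementation.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style re-weighting of feature channels."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)           # squeeze: global context per channel
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.mlp(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                   # excitation: re-weight channels

x = torch.randn(2, 32, 40, 40)   # e.g., fused target + auxiliary modality features
print(ChannelAttention(32)(x).shape)  # torch.Size([2, 32, 40, 40])
```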
Affiliation(s)
- Xiuyun Zhou
- Biomedical Engineering Center, University of Science and Technology of China, Hefei, Anhui 230026, China
- Zhenxi Zhang
- Biomedical Engineering Center, University of Science and Technology of China, Hefei, Anhui 230026, China
- Hongwei Du
- Biomedical Engineering Center, University of Science and Technology of China, Hefei, Anhui 230026, China.
- Bensheng Qiu
- Biomedical Engineering Center, University of Science and Technology of China, Hefei, Anhui 230026, China
14
Stefano A. Challenges and limitations in applying radiomics to PET imaging: Possible opportunities and avenues for research. Comput Biol Med 2024; 179:108827. [PMID: 38964244] [DOI: 10.1016/j.compbiomed.2024.108827]
Abstract
Radiomics, the high-throughput extraction of quantitative imaging features from medical images, holds immense potential for advancing precision medicine in oncology and beyond. While radiomics applied to positron emission tomography (PET) imaging offers unique insights into tumor biology and treatment response, it is imperative to elucidate the challenges and constraints inherent in this domain to facilitate their translation into clinical practice. This review examines the challenges and limitations of applying radiomics to PET imaging, synthesizing findings from the last five years (2019-2023) and highlights the significance of addressing these challenges to realize the full clinical potential of radiomics in oncology and molecular imaging. A comprehensive search was conducted across multiple electronic databases, including PubMed, Scopus, and Web of Science, using keywords relevant to radiomics issues in PET imaging. Only studies published in peer-reviewed journals were eligible for inclusion in this review. Although many studies have highlighted the potential of radiomics in predicting treatment response, assessing tumor heterogeneity, enabling risk stratification, and personalized therapy selection, various challenges regarding the practical implementation of the proposed models still need to be addressed. This review illustrates the challenges and limitations of radiomics in PET imaging across various cancer types, encompassing both phantom and clinical investigations. The analyzed studies highlight the importance of reproducible segmentation methods, standardized pre-processing and post-processing methodologies, and the need to create large multicenter studies registered in a centralized database to promote the continuous validation and clinical integration of radiomics into PET imaging.
Affiliation(s)
- Alessandro Stefano
- Institute of Molecular Bioimaging and Physiology, National Research Council (IBFM-CNR), Cefalù, Italy.
15
Kalage D, Gupta P, Gulati A, Reddy KP, Sharma K, Thakur A, Yadav TD, Gupta V, Kaman L, Nada R, Singh H, Irrinki S, Gupta P, Das CK, Dutta U, Sandhu M. Contrast Enhanced CT Versus MRI for Accurate Diagnosis of Wall-thickening Type Gallbladder Cancer. J Clin Exp Hepatol 2024; 14:101397. [PMID: 38595988] [PMCID: PMC10999705] [DOI: 10.1016/j.jceh.2024.101397]
Abstract
Introduction Diagnosis of wall-thickening type gallbladder cancer (GBC) is challenging. Computed tomography (CT) and magnetic resonance imaging (MRI) are commonly utilized to evaluate gallbladder wall thickening. However, there is a lack of data comparing the performance of CT and MRI for the detection of wall-thickening type GBC.
Aim We aim to compare the diagnostic accuracy of CT and MRI in the diagnosis of wall-thickening type GBC.
Materials and methods This prospective study comprised consecutive patients suspected of wall-thickening type GBC who underwent preoperative contrast-enhanced CT and MRI. The final diagnosis was based on the histopathology of the resected gallbladder lesion. Two radiologists independently reviewed the characteristics of gallbladder wall thickening at CT and MRI. The association of CT and MRI findings with the histological diagnosis and the interobserver agreement of CT and MRI findings were assessed.
Results Thirty-three patients (malignant, 13; benign, 20) were included. None of the CT findings were significantly associated with GBC. At MRI, however, heterogeneous enhancement, an indistinct interface with the liver, and diffusion restriction were significantly associated with malignancy (P = 0.006, <0.001, and 0.005, respectively), and intramural cysts were significantly associated with benign lesions (P = 0.012). For all MRI findings, the interobserver agreement was substantial to perfect (kappa = 0.697-1.000). At CT, the interobserver agreement was substantial to perfect (kappa = 0.631-1.000).
Conclusion These findings suggest that MRI may be preferred over CT in patients with suspected wall-thickening type GBC. However, larger multicenter studies must confirm our findings.
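The interobserver agreement reported here is a kappa statistic; a minimal sketch of how Cohen's kappa is computed for two readers (with made-up ratings, not the study's data) is shown below.

```python
from sklearn.metrics import cohen_kappa_score

# 1 = finding present, 0 = absent, as rated by two radiologists on the same scans
reader1 = [1, 0, 1, 1, 0, 0, 1, 1, 0, 1]
reader2 = [1, 0, 1, 0, 0, 0, 1, 1, 0, 1]

kappa = cohen_kappa_score(reader1, reader2)
print(f"Cohen's kappa = {kappa:.3f}")   # 0.800 here, i.e., substantial agreement
```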
Affiliation(s)
- Daneshwari Kalage
- Department of Radiodiagnosis and Imaging, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- Pankaj Gupta
- Department of Radiodiagnosis and Imaging, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- Ajay Gulati
- Department of Radiodiagnosis and Imaging, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- Kakivaya P Reddy
- Department of Radiodiagnosis and Imaging, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- Kritika Sharma
- Department of Radiodiagnosis and Imaging, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- Ati Thakur
- Department of Radiodiagnosis and Imaging, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- Thakur D Yadav
- Department of Surgical Gastroenterology, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- Vikas Gupta
- Department of Surgical Gastroenterology, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- Lileswar Kaman
- Department of General Surgery, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- Ritambhra Nada
- Department of Histopathology, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- Harjeet Singh
- Department of Surgical Gastroenterology, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- Santosh Irrinki
- Department of General Surgery, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- Parikshaa Gupta
- Department of Cytology, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- Chandan K Das
- Department of Medical Oncology, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- Usha Dutta
- Department of Gastroenterology, Postgraduate Institute of Medical Education and Research, Chandigarh, India
- Manavjit Sandhu
- Department of Radiodiagnosis and Imaging, Postgraduate Institute of Medical Education and Research, Chandigarh, India
16
Gunashekar DD, Bielak L, Oerther B, Benndorf M, Nedelcu A, Hickey S, Zamboglou C, Grosu AL, Bock M. Comparison of data fusion strategies for automated prostate lesion detection using mpMRI correlated with whole mount histology. Radiat Oncol 2024; 19:96. [PMID: 39080735] [PMCID: PMC11287985] [DOI: 10.1186/s13014-024-02471-0]
Abstract
BACKGROUND In this work, we compare input-level, feature-level, and decision-level data fusion techniques for the automatic detection of clinically significant prostate lesions (csPCa).
METHODS Multiple deep learning CNN architectures were developed using the U-Net as the baseline. The CNNs use either multiparametric MRI images (T2W, ADC, and high b-value) combined with quantitative clinical data (prostate-specific antigen (PSA), PSA density (PSAD), prostate gland volume, and gross tumor volume (GTV)), or mpMRI images alone (n = 118), as input. In addition, co-registered ground truth data from whole mount histopathology images (n = 22) were used as a test set for evaluation.
RESULTS For early/intermediate/late-level fusion, the CNNs achieved a precision of 0.41/0.51/0.61, a recall of 0.18/0.22/0.25, an average precision of 0.13/0.19/0.27, and F-scores of 0.55/0.67/0.76. The Dice-Sorensen coefficient (DSC) was used to evaluate the influence of combining mpMRI with parametric clinical data on the detection of csPCa. We compared the DSC between the predictions of the CNNs trained with mpMRI and parametric clinical data and those of the CNNs trained with only mpMRI images, against the ground truth, and obtained DSCs of 0.30/0.34/0.36 and 0.26/0.33/0.34, respectively. Additionally, we evaluated the influence of each mpMRI input channel on the task of csPCa detection and obtained DSCs of 0.14/0.25/0.28.
CONCLUSION The results show that the decision-level fusion network performs better for the task of prostate lesion detection. Combining mpMRI data with quantitative clinical data does not show significant differences between these networks (p = 0.26/0.62/0.85). The results show that CNNs trained with all mpMRI data outperform CNNs with fewer input channels, which is consistent with current clinical protocols where the same input is used for PI-RADS lesion scoring.
TRIAL REGISTRATION The trial was registered retrospectively at the German Register for Clinical Studies (DRKS) under proposal number Nr. 476/14 & 476/19.
Affiliation(s)
- Deepa Darshini Gunashekar
- Division of Medical Physics, Department of Diagnostic and Interventional Radiology, University Medical Center Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany.
- German Cancer Consortium (DKTK), Partner Site Freiburg, Freiburg, Germany.
- Lars Bielak
- Division of Medical Physics, Department of Diagnostic and Interventional Radiology, University Medical Center Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- German Cancer Consortium (DKTK), Partner Site Freiburg, Freiburg, Germany
- Benedict Oerther
- Department of Diagnostic and Interventional Radiology, University Medical Center Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Matthias Benndorf
- Department of Diagnostic and Interventional Radiology, University Medical Center Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Andrea Nedelcu
- Department of Diagnostic and Interventional Radiology, University Medical Center Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Samantha Hickey
- Division of Medical Physics, Department of Diagnostic and Interventional Radiology, University Medical Center Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Constantinos Zamboglou
- Department of Diagnostic and Interventional Radiology, University Medical Center Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- German Oncology Center, European University Cyprus, Limassol, Cyprus
- Anca-Ligia Grosu
- Department of Diagnostic and Interventional Radiology, University Medical Center Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- German Cancer Consortium (DKTK), Partner Site Freiburg, Freiburg, Germany
- Michael Bock
- Division of Medical Physics, Department of Diagnostic and Interventional Radiology, University Medical Center Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- German Cancer Consortium (DKTK), Partner Site Freiburg, Freiburg, Germany
17
Rizk PA, Gonzalez MR, Galoaa BM, Girgis AG, Van Der Linden L, Chang CY, Lozano-Calderon SA. Machine Learning-Assisted Decision Making in Orthopaedic Oncology. JBJS Rev 2024; 12:01874474-202407000-00005. [PMID: 38991098] [DOI: 10.2106/jbjs.rvw.24.00057]
Abstract
» Artificial intelligence is an umbrella term for computational calculations that are designed to mimic human intelligence and problem-solving capabilities, although in the future, this may become an incomplete definition. Machine learning (ML) encompasses the development of algorithms or predictive models that generate outputs without explicit instructions, assisting in clinical predictions based on large data sets. Deep learning is a subset of ML that utilizes layers of networks that use various inter-relational connections to define and generalize data.
» ML algorithms can enhance radiomics techniques for improved image evaluation and diagnosis. While ML shows promise with the advent of radiomics, there are still obstacles to overcome.
» Several calculators leveraging ML algorithms have been developed to predict survival in primary sarcomas and metastatic bone disease utilizing patient-specific data. While these models often report exceptionally accurate performance, it is crucial to evaluate their robustness using standardized guidelines.
» While increased computing power suggests continuous improvement of ML algorithms, these advancements must be balanced against challenges such as diversifying data, addressing ethical concerns, and enhancing model interpretability.
Affiliation(s)
- Paul A Rizk
- Division of Orthopaedic Oncology, Department of Orthopaedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
- Marcos R Gonzalez
- Division of Orthopaedic Oncology, Department of Orthopaedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
- Bishoy M Galoaa
- Interdisciplinary Science & Engineering Complex (ISEC), Northeastern University, Boston, Massachusetts
| | - Andrew G Girgis
- Boston University Chobanian & Avedisian School of Medicine, Boston, Massachusetts
| | - Lotte Van Der Linden
- Division of Orthopaedic Oncology, Department of Orthopaedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
| | - Connie Y Chang
- Musculoskeletal Imaging and Intervention, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts
| | - Santiago A Lozano-Calderon
- Division of Orthopaedic Oncology, Department of Orthopaedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
| |
Collapse
|
18
|
Wang Z, Zhao D, Heidari AA, Chen Y, Chen H, Liang G. Improved Latin hypercube sampling initialization-based whale optimization algorithm for COVID-19 X-ray multi-threshold image segmentation. Sci Rep 2024; 14:13239. [PMID: 38853172 PMCID: PMC11163015 DOI: 10.1038/s41598-024-63739-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2024] [Accepted: 05/31/2024] [Indexed: 06/11/2024] Open
Abstract
Image segmentation techniques play a vital role in aiding COVID-19 diagnosis. Multi-threshold image segmentation methods are favored for their computational simplicity and operational efficiency. However, existing threshold selection techniques in multi-threshold image segmentation, such as Kapur's entropy method based on exhaustive enumeration, are often limited in efficiency and accuracy. The whale optimization algorithm (WOA) has shown promise in addressing this challenge, but issues persist in COVID-19 threshold image segmentation, including poor stability, low efficiency, and low accuracy. To tackle these issues, we introduce a Latin hypercube sampling initialization-based multi-strategy enhanced WOA (CAGWOA). It incorporates a COS sampling initialization strategy (COSI), an adaptive global search approach (GS), and an all-dimensional neighborhood mechanism (ADN). COSI leverages probability density functions created from Latin hypercube sampling, ensuring even coverage of the solution space to improve the stability of the segmentation model. GS widens the exploration scope to combat stagnation during iterations and improve segmentation efficiency. ADN refines convergence accuracy around optimal individuals to improve segmentation accuracy. CAGWOA's performance is validated through experiments on various benchmark function test sets. Furthermore, we apply CAGWOA alongside similar methods in a multi-threshold image segmentation model for comparative experiments on lung X-ray images of infected patients. The results demonstrate CAGWOA's superiority, including better image detail preservation, clear segmentation boundaries, and adaptability across different threshold levels.
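As a rough illustration of the Latin hypercube initialization idea described above, one can draw one sample per equal-width stratum in each search dimension and then shuffle the strata; the sketch below is a generic version of this step under assumed bounds and population size, not the authors' CAGWOA code.

```python
import numpy as np

def lhs_init(pop_size, dim, lower, upper, seed=None):
    """Latin hypercube sampling: one point per equal-width stratum in every dimension."""
    rng = np.random.default_rng(seed)
    cut = np.linspace(0.0, 1.0, pop_size + 1)
    # Draw one uniform sample inside each stratum, for every dimension.
    u = rng.uniform(cut[:-1, None], cut[1:, None], size=(pop_size, dim))
    # Shuffle the strata independently per dimension so the rows form a Latin hypercube.
    for d in range(dim):
        u[:, d] = rng.permutation(u[:, d])
    return lower + u * (upper - lower)

# e.g., 30 whales, each encoding 5 gray-level thresholds in [0, 255]
population = lhs_init(pop_size=30, dim=5, lower=0.0, upper=255.0, seed=0)
```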
Collapse
Affiliation(s)
- Zhen Wang
- College of Computer Science and Technology, Changchun Normal University, Changchun, 130032, Jilin, China
| | - Dong Zhao
- College of Computer Science and Technology, Changchun Normal University, Changchun, 130032, Jilin, China.
| | - Ali Asghar Heidari
- School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran, Iran
| | - Yi Chen
- Key Laboratory of Intelligent Informatics for Safety & Emergency of Zhejiang Province, Wenzhou University, Wenzhou, 325035, China
| | - Huiling Chen
- Key Laboratory of Intelligent Informatics for Safety & Emergency of Zhejiang Province, Wenzhou University, Wenzhou, 325035, China.
| | - Guoxi Liang
- Department of Artificial Intelligence, Wenzhou Polytechnic, Wenzhou, 325035, China.
| |
Collapse
|
19
|
Usha MP, Kannan G, Ramamoorthy M. Multimodal Brain Tumor Classification Using Convolutional Tumnet Architecture. Behav Neurol 2024; 2024:4678554. [PMID: 38882177 PMCID: PMC11178426 DOI: 10.1155/2024/4678554] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2022] [Revised: 12/22/2023] [Accepted: 01/10/2024] [Indexed: 06/18/2024] Open
Abstract
Brain malignancy is among the most common and aggressive tumors, and patients with grade IV disease have a short life expectancy. A sound medical plan, comprising both diagnosis and therapy, is therefore a crucial step toward improving a patient's well-being. Brain tumors are commonly imaged with magnetic resonance imaging (MRI), positron emission tomography (PET), and computed tomography (CT). In this paper, multimodal fused imaging with classification and segmentation for brain tumors is proposed using a deep learning method. The MRI and CT brain tumor images of the same slices (308 slices of meningioma and sarcoma) are combined using three different types of pixel-level fusion methods. The presence or absence of a tumor is classified using the proposed Tumnet technique, and the tumor area is then localized. Tumnet is also applied to single-modal MRI/CT images (561 image slices) for classification. The proposed Tumnet is modeled with 5 convolutional layers, 3 pooling layers with ReLU activation functions, and 3 fully connected layers. The first-order statistical fusion metrics for the averaging method on MRI-CT images are an SSIM of 83% for tissue, an SSIM of 84% for bone, an accuracy of 90%, a sensitivity of 96%, and a specificity of 95%; the second-order statistical fusion metrics are a standard deviation of the fused images of 79% and an entropy of 0.99. The entropy value confirms the presence of additional features in the fused image. The proposed Tumnet yields a sensitivity of 96%, an accuracy of 98%, a specificity of 99%, and normalized values of 0.75 for the mean, 0.4 for the standard deviation, 0.16 for the variance, and 0.90 for the entropy.
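The averaging rule mentioned above is the simplest pixel-level fusion; a minimal sketch for two co-registered slices might look as follows (the min-max normalization and equal weighting are illustrative assumptions, not the authors' implementation).

```python
import numpy as np

def average_fusion(mri_slice: np.ndarray, ct_slice: np.ndarray) -> np.ndarray:
    """Fuse two co-registered 2D slices by per-pixel averaging after min-max normalization."""
    def normalize(img):
        img = img.astype(np.float32)
        return (img - img.min()) / (img.max() - img.min() + 1e-8)
    mri_n, ct_n = normalize(mri_slice), normalize(ct_slice)
    # Equal-weight average; other pixel-level rules (max, PCA-weighted) follow the same pattern.
    return 0.5 * (mri_n + ct_n)
```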
Collapse
Affiliation(s)
- M Padma Usha
- Department of Electronics and Communication Engineering B.S. Abdur Rahman Crescent Institute of Science and Technology, Vandalur, Chennai, India
| | - G Kannan
- Department of Electronics and Communication Engineering B.S. Abdur Rahman Crescent Institute of Science and Technology, Vandalur, Chennai, India
| | - M Ramamoorthy
- Department of Artificial Intelligence and Machine Learning Saveetha School of Engineering SIMATS, Chennai, 600124, India
| |
Collapse
|
20
|
He Q, Summerfield N, Dong M, Glide-Hurst C. MODALITY-AGNOSTIC LEARNING FOR MEDICAL IMAGE SEGMENTATION USING MULTI-MODALITY SELF-DISTILLATION. PROCEEDINGS. IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING 2024; 2024:10.1109/isbi56570.2024.10635881. [PMID: 39735423 PMCID: PMC11673955 DOI: 10.1109/isbi56570.2024.10635881] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/31/2024]
Abstract
In medical image segmentation, although multi-modality training is possible, clinical translation is challenged by the limited availability of all image types for a given patient. Different from typical segmentation models, modality-agnostic (MAG) learning trains a single model based on all available modalities but remains input-agnostic, allowing a single model to produce accurate segmentation given any modality combinations. In this paper, we propose a novel frame-work, MAG learning through Multi-modality Self-distillation (MAG-MS), for medical image segmentation. MAG-MS distills knowledge from the fusion of multiple modalities and applies it to enhance representation learning for individual modalities. This makes it an adaptable and efficient solution for handling limited modalities during testing scenarios. Our extensive experiments on benchmark datasets demonstrate its superior segmentation accuracy, MAG robustness, and efficiency than the current state-of-the-art methods.
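To make the self-distillation idea concrete, the sketch below shows one way a multi-modal "teacher" forward pass could supervise single-modality "student" passes of the same network; the zero-filled missing channels, the KL-based distillation term, and the weighting are assumptions for illustration, not the MAG-MS implementation.

```python
import torch
import torch.nn.functional as F

def mag_ms_style_loss(model, modalities, labels, temperature=2.0, alpha=0.5):
    """modalities: list of (B, 1, H, W) tensors; labels: (B, H, W) integer mask."""
    # Teacher: one forward pass on the concatenation of all modalities.
    with torch.no_grad():
        teacher_logits = model(torch.cat(modalities, dim=1))
    total = 0.0
    for m in modalities:
        # Student: same model, single modality kept, other channels zero-filled (an assumption).
        padded = [mod if mod is m else torch.zeros_like(mod) for mod in modalities]
        student_logits = model(torch.cat(padded, dim=1))
        seg = F.cross_entropy(student_logits, labels)
        distill = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=1),
            F.softmax(teacher_logits / temperature, dim=1),
            reduction="batchmean",
        ) * temperature ** 2
        total = total + alpha * seg + (1 - alpha) * distill
    return total / len(modalities)
```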
Collapse
Affiliation(s)
- Qisheng He
- Department of Computer Science, Wayne State University, Detroit, MI, USA
| | - Nicholas Summerfield
- Department of Human Oncology, University of Wisconsin-Madison, Madison, WI, USA
- Department of Medical Physics, University of Wisconsin-Madison, Madison, WI, USA
| | - Ming Dong
- Department of Computer Science, Wayne State University, Detroit, MI, USA
| | - Carri Glide-Hurst
- Department of Human Oncology, University of Wisconsin-Madison, Madison, WI, USA
- Department of Medical Physics, University of Wisconsin-Madison, Madison, WI, USA
| |
Collapse
|
21
|
Chadoulos C, Tsaopoulos D, Symeonidis A, Moustakidis S, Theocharis J. Dense Multi-Scale Graph Convolutional Network for Knee Joint Cartilage Segmentation. Bioengineering (Basel) 2024; 11:278. [PMID: 38534552 DOI: 10.3390/bioengineering11030278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2024] [Revised: 03/07/2024] [Accepted: 03/11/2024] [Indexed: 03/28/2024] Open
Abstract
In this paper, we propose a dense multi-scale adaptive graph convolutional network (DMA-GCN) method for automatic segmentation of the knee joint cartilage from MR images. Under the multi-atlas setting, the suggested approach exhibits several novelties, as described in the following. First, our models integrate both local-level and global-level learning simultaneously. The local learning task aggregates spatial contextual information from aligned spatial neighborhoods of nodes, at multiple scales, while global learning explores pairwise affinities between nodes, located globally at different positions in the image. We propose two different structures of building models, whereby the local and global convolutional units are combined by following an alternating or a sequential manner. Secondly, based on the previous models, we develop the DMA-GCN network, by utilizing a densely connected architecture with residual skip connections. This is a deeper GCN structure, expanded over different block layers, thus being capable of providing more expressive node feature representations. Third, all units pertaining to the overall network are equipped with their individual adaptive graph learning mechanism, which allows the graph structures to be automatically learned during training. The proposed cartilage segmentation method is evaluated on the entire publicly available Osteoarthritis Initiative (OAI) cohort. To this end, we have devised a thorough experimental setup, with the goal of investigating the effect of several factors of our approach on the classification rates. Furthermore, we present exhaustive comparative results, considering traditional existing methods, six deep learning segmentation methods, and seven graph-based convolution methods, including the currently most representative models from this field. The obtained results demonstrate that the DMA-GCN outperforms all competing methods across all evaluation measures, providing DSC=95.71% and DSC=94.02% for the segmentation of femoral and tibial cartilage, respectively.
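As a loose illustration of the "adaptive graph learning" idea mentioned above, the sketch below shows a graph convolution whose adjacency matrix is itself a learnable parameter updated during training; the node count, softmax normalization, and single-layer form are assumptions, not the DMA-GCN architecture.

```python
import torch
import torch.nn as nn

class AdaptiveGraphConv(nn.Module):
    def __init__(self, num_nodes, in_dim, out_dim):
        super().__init__()
        # Adjacency is learned jointly with the rest of the network.
        self.adj = nn.Parameter(torch.randn(num_nodes, num_nodes) * 0.01)
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x):                       # x: (batch, num_nodes, in_dim)
        a = torch.softmax(self.adj, dim=-1)     # row-normalized, data-driven adjacency
        return torch.relu(self.linear(a @ x))   # aggregate neighbors, then transform
```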
Collapse
Affiliation(s)
- Christos Chadoulos
- Department of Electrical & Computer Engineering, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
| | - Dimitrios Tsaopoulos
- Institute for Bio-Economy and Agri-Technology, Centre for Research and Technology-Hellas, 38333 Volos, Greece
| | - Andreas Symeonidis
- Department of Electrical & Computer Engineering, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
| | - Serafeim Moustakidis
- Department of Electrical & Computer Engineering, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
| | - John Theocharis
- Department of Electrical & Computer Engineering, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
| |
Collapse
|
22
|
Ma Y, Peng Y. Mammogram mass segmentation and classification based on cross-view VAE and spatial hidden factor disentanglement. Phys Eng Sci Med 2024; 47:223-238. [PMID: 38150059 DOI: 10.1007/s13246-023-01359-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2023] [Accepted: 11/19/2023] [Indexed: 12/28/2023]
Abstract
Breast masses are the most important clinical findings of breast carcinomas. Mass segmentation and classification in mammograms remain a crucial yet challenging topic in computer-aided diagnosis systems, as masses are irregular in shape, size, and texture. In this paper, we propose a new framework for mammogram mass classification and segmentation. Specifically, to utilize the complementary information within the mammographic cross-views, craniocaudal and mediolateral oblique, a cross-view based variational autoencoder (CV-VAE) combined with a spatial hidden factor disentanglement module is presented, in which the two views can be reconstructed from each other through two explicitly disentangled hidden factors: class-related (specified) and background-common (unspecified). The specified factor is not only classified into benign and malignant categories by a newly introduced feature pyramid network-based mass classifier, but also used to predict the mass mask label through a U-Net-like decoder. By integrating the two complementary modules, more discriminative morphological and semantic features can be learned to solve the mass classification and segmentation problems simultaneously. The proposed method is evaluated on the two most widely used public mammography datasets, CBIS-DDSM and INbreast, achieving Dice similarity coefficients (DSC) of 92.46% and 93.70% for segmentation and areas under the receiver operating characteristic curve (AUC) of 93.20% and 95.01% for classification, respectively. Compared with other state-of-the-art approaches, it gives competitive results.
Collapse
Affiliation(s)
- Yingran Ma
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, CO, China
| | - Yanjun Peng
- College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, 266590, CO, China.
- Shandong Province Key Laboratory of Wisdom Mining Information Technology, Shandong University of Science and Technology, Qingdao, 266590, CO, China.
| |
Collapse
|
23
|
Li GY, Chen J, Jang SI, Gong K, Li Q. SwinCross: Cross-modal Swin transformer for head-and-neck tumor segmentation in PET/CT images. Med Phys 2024; 51:2096-2107. [PMID: 37776263 PMCID: PMC10939987 DOI: 10.1002/mp.16703] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2023] [Revised: 06/20/2023] [Accepted: 07/30/2023] [Indexed: 10/02/2023] Open
Abstract
BACKGROUND Radiotherapy (RT) combined with cetuximab is the standard treatment for patients with inoperable head and neck cancers. Segmentation of head and neck (H&N) tumors is a prerequisite for radiotherapy planning but a time-consuming process. In recent years, deep convolutional neural networks (DCNN) have become the de facto standard for automated image segmentation. However, due to the expensive computational cost associated with enlarging the field of view in DCNNs, their ability to model long-range dependency is still limited, and this can result in sub-optimal segmentation performance for objects with background context spanning over long distances. On the other hand, Transformer models have demonstrated excellent capabilities in capturing such long-range information in several semantic segmentation tasks performed on medical images. PURPOSE Despite the impressive representation capacity of vision transformer models, current vision transformer-based segmentation models still suffer from inconsistent and incorrect dense predictions when fed with multi-modal input data. We suspect that the power of their self-attention mechanism may be limited in extracting the complementary information that exists in multi-modal data. To this end, we propose a novel segmentation model, dubbed Cross-modal Swin Transformer (SwinCross), with a cross-modal attention (CMA) module to incorporate cross-modal feature extraction at multiple resolutions. METHODS We propose a novel architecture for cross-modal 3D semantic segmentation with two main components: (1) a cross-modal 3D Swin Transformer for integrating information from multiple modalities (PET and CT), and (2) a cross-modal shifted window attention block for learning complementary information from the modalities. To evaluate the efficacy of our approach, we conducted experiments and ablation studies on the HECKTOR 2021 challenge dataset. We compared our method against nnU-Net (the backbone of the top-5 methods in HECKTOR 2021) and other state-of-the-art transformer-based models, including UNETR and Swin UNETR. The experiments employed a five-fold cross-validation setup using PET and CT images. RESULTS Empirical evidence demonstrates that our proposed method consistently outperforms the comparative techniques. This success can be attributed to the CMA module's capacity to enhance inter-modality feature representations between PET and CT during head-and-neck tumor segmentation. Notably, SwinCross consistently surpasses Swin UNETR across all five folds, showcasing its proficiency in learning multi-modal feature representations at varying resolutions through the cross-modal attention modules. CONCLUSIONS We introduced a cross-modal Swin Transformer for automating the delineation of head and neck tumors in PET and CT images. Our model incorporates a cross-modality attention module, enabling the exchange of features between modalities at multiple resolutions. The experimental results establish the superiority of our method in capturing improved inter-modality correlations between PET and CT for head-and-neck tumor segmentation. Furthermore, the proposed methodology holds applicability to other semantic segmentation tasks involving different imaging modalities like SPECT/CT or PET/MRI. Code: https://github.com/yli192/SwinCross_CrossModalSwinTransformer_for_Medical_Image_Segmentation.
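A minimal sketch of cross-modal attention, in which tokens from one modality query the other, is shown below; the use of nn.MultiheadAttention, the residual-plus-norm layout, and the token shapes are assumptions for illustration rather than the SwinCross block itself.

```python
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, pet_tokens, ct_tokens):   # each: (batch, tokens, dim)
        # PET features query the CT features; the attended output complements PET.
        fused, _ = self.attn(query=pet_tokens, key=ct_tokens, value=ct_tokens)
        return self.norm(pet_tokens + fused)    # residual connection keeps the PET stream intact
```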
Collapse
Affiliation(s)
- Gary Y. Li
- Center for Advanced Medical Computing and Analysis, Massachusetts General Hospital/Harvard Medical School, Boston, MA
| | - Junyu Chen
- The Russell H Morgan Department of Radiology and Radiological Science, School of Medicine, Johns Hopkins University, Baltimore, MD
- Department of Electrical and Computer Engineering, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD
| | - Se-In Jang
- Center for Advanced Medical Computing and Analysis, Massachusetts General Hospital/Harvard Medical School, Boston, MA
| | - Kuang Gong
- Center for Advanced Medical Computing and Analysis, Massachusetts General Hospital/Harvard Medical School, Boston, MA
| | - Quanzheng Li
- Center for Advanced Medical Computing and Analysis, Massachusetts General Hospital/Harvard Medical School, Boston, MA
| |
Collapse
|
24
|
Young A, Tan K, Tariq F, Jin MX, Bluestone AY. Rogue AI: Cautionary Cases in Neuroradiology and What We Can Learn From Them. Cureus 2024; 16:e56317. [PMID: 38628986 PMCID: PMC11019475 DOI: 10.7759/cureus.56317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 03/16/2024] [Indexed: 04/19/2024] Open
Abstract
Introduction In recent years, artificial intelligence (AI) in medical imaging has undergone unprecedented innovation and advancement, sparking a revolutionary transformation in healthcare. The field of radiology is particularly implicated, as clinical radiologists are expected to interpret an ever-increasing number of complex cases in record time. Machine learning software purchased by our institution is expected to help our radiologists come to a more prompt diagnosis by delivering point-of-care quantitative analysis of suspicious findings and streamlining clinical workflow. This paper explores AI's impact on neuroradiology, an area accounting for a substantial portion of recent radiology studies. We present a case series evaluating an AI software's performance in detecting neurovascular findings, highlighting five cases where AI interpretations differed from radiologists' assessments. Our study underscores common pitfalls of AI in the context of CT head angiograms, aiming to guide future AI algorithms. Methods We conducted a retrospective case series study at Stony Brook University Hospital, a large medical center in Stony Brook, New York, spanning from October 1, 2021 to December 31, 2021, analyzing 140 randomly sampled CT angiograms using AI software. This software assessed various neurovascular parameters, and AI findings were compared with neuroradiologists' interpretations. Five cases with divergent interpretations were selected for detailed analysis. Results Five representative cases in which AI findings were discordant with radiologists' interpretations are presented with diagnoses including diffuse anoxic ischemic injury, cortical laminar necrosis, colloid cyst, right superficial temporal artery-to-middle cerebral artery (STA-MCA) bypass, and subacute bilateral subdural hematomas. Discussion The errors identified in our case series expose AI's limitations in radiology. Our case series reveals that AI's incorrect interpretations can stem from complexities in pathology, challenges in distinguishing densities, inability to identify artifacts, identifying post-surgical changes in normal anatomy, sensitivity limitations, and insufficient pattern recognition. AI's potential for improvement lies in refining its algorithms to effectively recognize and differentiate pathologies. Incorporating more diverse training datasets, multimodal data, deep-reinforcement learning, clinical context, and real-time learning capabilities are some ways to improve AI's performance in the field of radiology. Conclusion Overall, it is apparent that AI applications in radiology have much room for improvement before becoming more widely integrated into clinical workflows. While AI demonstrates remarkable potential to aid in diagnosis and streamline workflows, our case series highlights common pitfalls that underscore the need for continuous improvement. By refining algorithms, incorporating diverse datasets, embracing multimodal information, and leveraging innovative machine learning strategies, AI's diagnostic accuracy can be significantly improved.
Collapse
Affiliation(s)
- Austin Young
- Department of Radiology, Stony Brook University Hospital, Stony Brook, USA
| | - Kevin Tan
- Department of Radiology, Stony Brook University Hospital, Stony Brook, USA
| | - Faiq Tariq
- Department of Radiology, Stony Brook University Hospital, Stony Brook, USA
| | - Michael X Jin
- Department of Radiology, Stony Brook University Hospital, Stony Brook, USA
| | | |
Collapse
|
25
|
R SSRM, T J. Multi-Scale and Spatial Information Extraction for Kidney Tumor Segmentation: A Contextual Deformable Attention and Edge-Enhanced U-Net. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2024; 37:151-166. [PMID: 38343255 DOI: 10.1007/s10278-023-00900-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/23/2023] [Revised: 09/04/2023] [Accepted: 09/05/2023] [Indexed: 03/02/2024]
Abstract
Kidney tumor segmentation is a difficult task because of the complex spatial and volumetric information present in medical images. Recent advances in deep convolutional neural networks (DCNNs) have improved tumor segmentation accuracy. However, the practical usability of current CNN-based networks is constrained by their high computational complexity. Additionally, these techniques often struggle to make adaptive modifications based on the structure of the tumors, which can lead to blurred edges in segmentation results. A lightweight architecture called the contextual deformable attention and edge-enhanced U-Net (CDA2E-Net) for high-accuracy pixel-level kidney tumor segmentation is proposed to address these challenges. Rather than using complex deep encoders, the approach includes a lightweight depthwise dilated ShuffleNetV2 (LDS-Net) encoder integrated into the CDA2E-Net framework. The proposed method also contains a multiscale attention feature pyramid pooling (MAF2P) module that improves the ability of multiscale features to adapt to various tumor shapes. Finally, an edge-enhanced loss function is introduced to guide the CDA2E-Net to concentrate on tumor edge information. The CDA2E-Net is evaluated on the KiTS19 and KiTS21 datasets, and the results demonstrate its superiority over existing approaches in terms of Hausdorff distance (HD), intersection over union (IoU), and dice coefficient (DSC) metrics.
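An edge-enhanced loss of the kind described above can be sketched as a Dice term plus a penalty on the disagreement between Sobel-derived boundaries of the prediction and the ground truth; the Sobel formulation and weighting below are assumptions, not the CDA2E-Net loss.

```python
import torch
import torch.nn.functional as F

def sobel_edges(mask):  # mask: (batch, 1, H, W), values in [0, 1]
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(mask, kx.to(mask.device), padding=1)
    gy = F.conv2d(mask, ky.to(mask.device), padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)   # gradient magnitude approximates the boundary

def edge_enhanced_loss(pred_prob, target, edge_weight=0.5):
    inter = (pred_prob * target).sum()
    dice = 1 - (2 * inter + 1) / (pred_prob.sum() + target.sum() + 1)
    edge = F.l1_loss(sobel_edges(pred_prob), sobel_edges(target))
    return dice + edge_weight * edge
```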
Collapse
Affiliation(s)
- Shamija Sherryl R M R
- Department of Electronics & Communication Engineering, Ponjesly College of Engineering, Nagercoil, Tamil Nadu, India.
| | - Jaya T
- Department of Electronics & Communication Engineering, Saveetha Engineering College, Thandalam, India
| |
Collapse
|
26
|
Yoo JH, Jeong H, An JH, Chung TM. Mood Disorder Severity and Subtype Classification Using Multimodal Deep Neural Network Models. SENSORS (BASEL, SWITZERLAND) 2024; 24:715. [PMID: 38276406 PMCID: PMC10818263 DOI: 10.3390/s24020715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/20/2023] [Revised: 01/17/2024] [Accepted: 01/19/2024] [Indexed: 01/27/2024]
Abstract
Subtype diagnosis and severity classification of mood disorders have traditionally relied on the judgment of psychiatrists supported by validated assessment tools. Recently, however, many studies have used biomarker data collected from subjects to assist in diagnosis; most of these rely on heart rate variability (HRV) data, which reflects the balance of the autonomic nervous system, and perform classification through statistical analysis. In this research, three mood disorder severity or subtype classification algorithms are presented based on multimodal analysis of the collected heart-related variables together with hidden features extracted from the time- and frequency-domain variables of HRV. Comparing the classification performance of the statistical analyses widely used in existing major depressive disorder (MDD), anxiety disorder (AD), and bipolar disorder (BD) classification studies with the multimodal deep neural network analysis newly proposed in this study, the severity or subtype classification accuracy for the three diseases improved by 0.118, 0.231, and 0.125 on average, respectively. The study confirms that deep learning analysis of biomarker data such as HRV can serve as a primary screening and diagnostic aid for mental illness, and that it can help psychiatrists diagnose objectively by indicating not only the diagnosed disease but also the current mood status.
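For context, the time-domain HRV variables such models typically consume can be computed from RR intervals as in the sketch below; the feature set shown (SDNN, RMSSD, pNN50) is standard but only illustrative of the kind of inputs used, not this study's exact feature pipeline.

```python
import numpy as np

def hrv_time_features(rr_ms: np.ndarray) -> dict:
    """Common time-domain HRV features from RR intervals given in milliseconds."""
    diff = np.diff(rr_ms)
    return {
        "mean_rr": float(np.mean(rr_ms)),
        "sdnn": float(np.std(rr_ms, ddof=1)),              # overall variability
        "rmssd": float(np.sqrt(np.mean(diff ** 2))),        # short-term variability
        "pnn50": float(np.mean(np.abs(diff) > 50) * 100),   # % of successive diffs > 50 ms
    }
```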
Collapse
Affiliation(s)
- Joo Hun Yoo
- Department of Artificial Intelligence, Sungkyunkwan University, Suwon 16419, Republic of Korea;
- Hippo T&C Inc., Suwon 16419, Republic of Korea;
| | - Harim Jeong
- Hippo T&C Inc., Suwon 16419, Republic of Korea;
- Department of Interaction Science, Sungkyunkwan University, Seoul 03063, Republic of Korea
| | - Ji Hyun An
- Department of Psychiatry, Sungkyunkwan University School of Medicine, Seoul 06351, Republic of Korea
| | - Tai-Myoung Chung
- Hippo T&C Inc., Suwon 16419, Republic of Korea;
- Department of Computer Science and Engineering, Sungkyunkwan University, Suwon 16419, Republic of Korea
| |
Collapse
|
27
|
Bourazana A, Xanthopoulos A, Briasoulis A, Magouliotis D, Spiliopoulos K, Athanasiou T, Vassilopoulos G, Skoularigis J, Triposkiadis F. Artificial Intelligence in Heart Failure: Friend or Foe? Life (Basel) 2024; 14:145. [PMID: 38276274 PMCID: PMC10817517 DOI: 10.3390/life14010145] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2023] [Revised: 01/08/2024] [Accepted: 01/17/2024] [Indexed: 01/27/2024] Open
Abstract
In recent times, there have been notable changes in cardiovascular medicine, propelled by swift advancements in artificial intelligence (AI). The present work provides an overview of the current applications and challenges of AI in the field of heart failure. It emphasizes the "garbage in, garbage out" issue, whereby AI systems can produce inaccurate results from skewed data. The discussion covers issues in heart failure diagnostic algorithms, particularly discrepancies between existing models. Concerns about the reliance on the left ventricular ejection fraction (LVEF) for classification and treatment are highlighted, showcasing differences in current scientific perceptions. This review also delves into challenges in implementing AI, including variable considerations and biases in training data. It underscores the limitations of current AI models in real-world scenarios and the difficulty of interpreting their predictions, which contribute to limited physician trust in AI-based models. The overarching suggestion is that AI can be a valuable tool in clinicians' hands for treating heart failure patients, provided that existing medical inaccuracies are addressed before AI is integrated into these frameworks.
Collapse
Affiliation(s)
- Angeliki Bourazana
- Department of Cardiology, University Hospital of Larissa, 41110 Larissa, Greece
| | - Andrew Xanthopoulos
- Department of Cardiology, University Hospital of Larissa, 41110 Larissa, Greece
| | - Alexandros Briasoulis
- Division of Cardiovascular Medicine, Section of Heart Failure and Transplantation, University of Iowa, Iowa City, IA 52242, USA
| | - Dimitrios Magouliotis
- Department of Cardiothoracic Surgery, University of Thessaly, 41110 Larissa, Greece; (D.M.); (K.S.)
| | - Kyriakos Spiliopoulos
- Department of Cardiothoracic Surgery, University of Thessaly, 41110 Larissa, Greece; (D.M.); (K.S.)
| | - Thanos Athanasiou
- Department of Surgery and Cancer, Imperial College London, St Mary’s Hospital, London W2 1NY, UK
| | - George Vassilopoulos
- Department of Hematology, University Hospital of Larissa, University of Thessaly Medical School, 41110 Larissa, Greece
| | - John Skoularigis
- Department of Cardiology, University Hospital of Larissa, 41110 Larissa, Greece
| | | |
Collapse
|
28
|
Oyelade ON, Irunokhai EA, Wang H. A twin convolutional neural network with hybrid binary optimizer for multimodal breast cancer digital image classification. Sci Rep 2024; 14:692. [PMID: 38184742 PMCID: PMC10771515 DOI: 10.1038/s41598-024-51329-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2023] [Accepted: 01/03/2024] [Indexed: 01/08/2024] Open
Abstract
Deep learning techniques are widely applied to unimodal medical image analysis, with significant classification accuracy observed. However, real-world diagnosis of some chronic diseases such as breast cancer often requires multimodal data streams with different modalities of visual and textual content. Mammography, magnetic resonance imaging (MRI), and image-guided breast biopsy represent a few of the multimodal visual streams considered by physicians in isolating cases of breast cancer. Unfortunately, most studies applying deep learning techniques to classification problems in digital breast images have narrowed their scope to unimodal samples. This is understandable considering the challenging nature of multimodal image abnormality classification, where the fusion of high-dimensional heterogeneous learned features needs to be projected into a common representation space. This paper presents a novel deep learning approach combining a dual/twin convolutional neural network (TwinCNN) framework to address the challenge of breast cancer image classification from multiple modalities. First, modality-based feature learning is achieved by extracting both low- and high-level features using the networks embedded within TwinCNN. Second, to address the notorious problem of high dimensionality associated with the extracted features, a binary optimization method is adapted to effectively eliminate non-discriminant features from the search space. Furthermore, a novel feature fusion method is applied that computationally leverages the ground-truth and predicted labels for each sample to enable multimodal classification. To evaluate the proposed method, digital mammography images and digital histopathology breast biopsy samples were drawn from the benchmark MIAS and BreakHis datasets, respectively. Experimental results showed that the classification accuracy and area under the curve (AUC) for the single modalities were 0.755 and 0.861871 for histology, and 0.791 and 0.638 for mammography. The study further investigated the classification accuracy of the fused-feature method, obtaining 0.977, 0.913, and 0.667 for histology, mammography, and multimodality, respectively. The findings confirm that multimodal image classification based on the combination of image features and predicted labels improves performance. In addition, the study shows that feature dimensionality reduction based on the binary optimizer supports the elimination of non-discriminant features capable of bottlenecking the classifier.
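A dual-branch network of the kind described can be sketched as two weight-separate CNN branches whose pooled features are concatenated before a shared classifier; the layer sizes and single-channel inputs below are illustrative assumptions, not the TwinCNN architecture.

```python
import torch
import torch.nn as nn

class TwinCNN(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
                nn.Flatten(),
            )
        self.branch_a, self.branch_b = branch(), branch()   # one branch per modality
        self.classifier = nn.Linear(64, num_classes)         # fused 32 + 32 features

    def forward(self, xa, xb):
        # Concatenation is the simplest fusion; the paper additionally prunes features.
        return self.classifier(torch.cat([self.branch_a(xa), self.branch_b(xb)], dim=1))
```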
Collapse
Affiliation(s)
- Olaide N Oyelade
- School of Electronics, Electrical Engineering and Computer Science, Queen's University Belfast, Belfast, BT9 SBN, UK.
| | | | - Hui Wang
- School of Electronics, Electrical Engineering and Computer Science, Queen's University Belfast, Belfast, BT9 SBN, UK
| |
Collapse
|
29
|
Khan R, Zaman A, Chen C, Xiao C, Zhong W, Liu Y, Hassan H, Su L, Xie W, Kang Y, Huang B. MLAU-Net: Deep supervised attention and hybrid loss strategies for enhanced segmentation of low-resolution kidney ultrasound. Digit Health 2024; 10:20552076241291306. [PMID: 39559387 PMCID: PMC11571257 DOI: 10.1177/20552076241291306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2024] [Accepted: 09/25/2024] [Indexed: 11/20/2024] Open
Abstract
Objective The precise segmentation of kidneys from 2D ultrasound (US) images is crucial for diagnosing and monitoring kidney diseases. However, achieving detailed segmentation is difficult due to the low signal-to-noise ratio and low-contrast object boundaries of US images. Methods This paper presents an approach called deep supervised attention with multi-loss functions (MLAU-Net) for US segmentation. The MLAU-Net model combines the benefits of attention mechanisms and deep supervision to improve segmentation accuracy. The attention mechanism allows the model to selectively focus on relevant regions of the kidney and ignore irrelevant background information, while the deep supervision captures the high-dimensional structure of the kidney in US images. Results We conducted experiments on two datasets to evaluate the MLAU-Net model's performance. The annotated Wuerzburg Dynamic Kidney Ultrasound (WD-KUS) dataset contained kidney US images from 176 patients, split into training and testing sets totaling 44,880 images. The second dataset, the Open Kidney Dataset, has over 500 B-mode abdominal US images. The proposed approach achieved the best Dice, accuracy, specificity, Hausdorff distance (HD95), recall, and Average Symmetric Surface Distance (ASSD) scores of 90.2%, 98.26%, 98.93%, 8.90 mm, 91.78%, and 2.87 mm, respectively, upon testing and comparison with state-of-the-art U-Net series segmentation frameworks, which demonstrates the potential clinical value of our work. Conclusion The proposed MLAU-Net model has the potential to be applied to other medical image segmentation tasks that face similar challenges of low signal-to-noise ratios and low-contrast object boundaries.
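The combination of deep supervision with a hybrid loss can be sketched as a Dice-plus-BCE term summed over the final and auxiliary decoder outputs; the weights and upsampling step below are assumptions, not the MLAU-Net training code.

```python
import torch
import torch.nn.functional as F

def dice_bce(pred_logits, target, smooth=1.0):
    """Hybrid loss on one output head; target is a float mask with the same shape as the logits."""
    prob = torch.sigmoid(pred_logits)
    inter = (prob * target).sum()
    dice = 1 - (2 * inter + smooth) / (prob.sum() + target.sum() + smooth)
    return dice + F.binary_cross_entropy_with_logits(pred_logits, target)

def deep_supervised_loss(outputs, target, weights=(1.0, 0.5, 0.25)):
    """outputs: list of logits from the final and auxiliary heads, highest resolution first."""
    loss = 0.0
    for w, out in zip(weights, outputs):
        out = F.interpolate(out, size=target.shape[-2:], mode="bilinear", align_corners=False)
        loss = loss + w * dice_bce(out, target)
    return loss
```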
Collapse
Affiliation(s)
- Rashid Khan
- College of Big Data and Internet, Shenzhen Technology University, Shenzhen, China
- College of Applied Sciences, Shenzhen University, Shenzhen, China
- Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen, China
| | - Asim Zaman
- College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen, China
- School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China
| | - Chao Chen
- College of Big Data and Internet, Shenzhen Technology University, Shenzhen, China
- College of Applied Sciences, Shenzhen University, Shenzhen, China
| | - Chuda Xiao
- Wuerzburg Dynamics Inc., Shenzhen, China
| | - Wen Zhong
- Department of Urology, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Yang Liu
- Department of Urology, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Haseeb Hassan
- College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen, China
| | - Liyilei Su
- College of Big Data and Internet, Shenzhen Technology University, Shenzhen, China
- College of Applied Sciences, Shenzhen University, Shenzhen, China
- Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen, China
| | - Weiguo Xie
- Wuerzburg Dynamics Inc., Shenzhen, China
| | - Yan Kang
- College of Applied Sciences, Shenzhen University, Shenzhen, China
- College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen, China
| | - Bingding Huang
- College of Big Data and Internet, Shenzhen Technology University, Shenzhen, China
| |
Collapse
|
30
|
Sharma P, Nayak DR, Balabantaray BK, Tanveer M, Nayak R. A survey on cancer detection via convolutional neural networks: Current challenges and future directions. Neural Netw 2024; 169:637-659. [PMID: 37972509 DOI: 10.1016/j.neunet.2023.11.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2023] [Revised: 10/21/2023] [Accepted: 11/04/2023] [Indexed: 11/19/2023]
Abstract
Cancer is a condition in which abnormal cells divide uncontrollably and damage body tissues. Hence, detecting cancer at an early stage is highly essential. Currently, medical images play an indispensable role in detecting various cancers; however, manual interpretation of these images by radiologists is observer-dependent, time-consuming, and tedious. An automatic decision-making process is thus an essential need for cancer detection and diagnosis. This paper presents a comprehensive survey on automated cancer detection in various human body organs, namely, the breast, lung, liver, prostate, brain, skin, and colon, using convolutional neural networks (CNN) and medical imaging techniques. It also includes a brief discussion of deep learning-based state-of-the-art cancer detection methods, their outcomes, and the medical imaging data used. Finally, the datasets used for cancer detection, the limitations of existing solutions, and future trends and challenges in this domain are discussed. The utmost goal of this paper is to provide comprehensive and insightful information to researchers who have a keen interest in developing CNN-based models for cancer detection.
Collapse
Affiliation(s)
- Pallabi Sharma
- School of Computer Science, UPES, Dehradun, 248007, Uttarakhand, India.
| | - Deepak Ranjan Nayak
- Department of Computer Science and Engineering, Malaviya National Institute of Technology, Jaipur, 302017, Rajasthan, India.
| | - Bunil Kumar Balabantaray
- Computer Science and Engineering, National Institute of Technology Meghalaya, Shillong, 793003, Meghalaya, India.
| | - M Tanveer
- Department of Mathematics, Indian Institute of Technology Indore, Simrol, 453552, Indore, India.
| | - Rajashree Nayak
- School of Applied Sciences, Birla Global University, Bhubaneswar, 751029, Odisha, India.
| |
Collapse
|
31
|
Shiri I, Amini M, Yousefirizi F, Vafaei Sadr A, Hajianfar G, Salimi Y, Mansouri Z, Jenabi E, Maghsudi M, Mainta I, Becker M, Rahmim A, Zaidi H. Information fusion for fully automated segmentation of head and neck tumors from PET and CT images. Med Phys 2024; 51:319-333. [PMID: 37475591 DOI: 10.1002/mp.16615] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2023] [Revised: 05/16/2023] [Accepted: 06/19/2023] [Indexed: 07/22/2023] Open
Abstract
BACKGROUND PET/CT images combining anatomic and metabolic data provide complementary information that can improve clinical task performance. PET image segmentation algorithms exploiting the multi-modal information available are still lacking. PURPOSE Our study aimed to assess the performance of PET and CT image fusion for gross tumor volume (GTV) segmentation of head and neck cancers (HNCs) utilizing conventional, deep learning (DL), and output-level voting-based fusions. METHODS The current study is based on a total of 328 histologically confirmed HNCs from six different centers. The images were automatically cropped to a 200 × 200 head and neck region box, and CT and PET images were normalized for further processing. Eighteen conventional image-level fusions were implemented. In addition, a modified U2-Net architecture was used as the DL fusion model baseline. Three different input-, layer-, and decision-level information fusions were used. Simultaneous truth and performance level estimation (STAPLE) and majority voting were employed to merge different segmentation outputs (from PET and from image-level and network-level fusions), that is, output-level information fusion (voting-based fusions). Different networks were trained in a 2D manner with a batch size of 64. Twenty percent of the dataset, stratified by center (20% in each center), was used for final result reporting. Different standard segmentation metrics and conventional PET metrics, such as SUV, were calculated. RESULTS For single modalities, PET had a reasonable performance with a Dice score of 0.77 ± 0.09, while CT did not perform acceptably and reached a Dice score of only 0.38 ± 0.22. Conventional fusion algorithms obtained a Dice score range of [0.76-0.81], with guided-filter-based context enhancement (GFCE) at the low end, and anisotropic diffusion and Karhunen-Loeve transform fusion (ADF), multi-resolution singular value decomposition (MSVD), and multi-level image decomposition based on latent low-rank representation (MDLatLRR) at the high end. All DL fusion models achieved Dice scores of 0.80. Output-level voting-based models outperformed all other models, achieving superior results with a Dice score of 0.84 for Majority_ImgFus, Majority_All, and Majority_Fast. A mean error of almost zero was achieved for all fusions using SUVpeak, SUVmean, and SUVmedian. CONCLUSION PET/CT information fusion adds significant value to segmentation tasks, considerably outperforming PET-only and CT-only methods. In addition, both conventional image-level and DL fusions achieve competitive results. Meanwhile, output-level voting-based fusion using majority voting of several algorithms results in statistically significant improvements in the segmentation of HNC.
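Output-level majority voting is conceptually simple: a voxel is kept as foreground when more than half of the candidate segmentations agree, as in the sketch below (binary masks of equal shape are assumed; this illustrates the voting rule only, not the paper's STAPLE/majority pipeline).

```python
import numpy as np

def majority_vote(masks):
    """Fuse a list of binary segmentation masks by strict per-voxel majority voting."""
    stack = np.stack([m.astype(np.uint8) for m in masks], axis=0)
    # A voxel is foreground when more than half of the masks mark it as foreground.
    return (stack.sum(axis=0) >= (len(masks) / 2 + 0.5)).astype(np.uint8)
```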
Collapse
Affiliation(s)
- Isaac Shiri
- Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
| | - Mehdi Amini
- Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
| | - Fereshteh Yousefirizi
- Department of Integrative Oncology, BC Cancer Research Institute, Vancouver, British Columbia, Canada
| | - Alireza Vafaei Sadr
- Institute of Pathology, RWTH Aachen University Hospital, Aachen, Germany
- Department of Public Health Sciences, College of Medicine, The Pennsylvania State University, Hershey, USA
| | - Ghasem Hajianfar
- Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
| | - Yazdan Salimi
- Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
| | - Zahra Mansouri
- Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
| | - Elnaz Jenabi
- Research Center for Nuclear Medicine, Shariati Hospital, Tehran University of Medical Sciences, Tehran, Iran
| | - Mehdi Maghsudi
- Rajaie Cardiovascular Medical and Research Center, Iran University of Medical Sciences, Tehran, Iran
| | - Ismini Mainta
- Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
| | - Minerva Becker
- Service of Radiology, Geneva University Hospital, Geneva, Switzerland
| | - Arman Rahmim
- Department of Integrative Oncology, BC Cancer Research Institute, Vancouver, British Columbia, Canada
- Department of Radiology and Physics, University of British Columbia, Vancouver, Canada
| | - Habib Zaidi
- Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
- Geneva University Neurocenter, Geneva University, Geneva, Switzerland
- Department of Nuclear Medicine and Molecular Imaging, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
- Department of Nuclear Medicine, University of Southern Denmark, Odense, Denmark
| |
Collapse
|
32
|
Hussain D, Al-Masni MA, Aslam M, Sadeghi-Niaraki A, Hussain J, Gu YH, Naqvi RA. Revolutionizing tumor detection and classification in multimodality imaging based on deep learning approaches: Methods, applications and limitations. JOURNAL OF X-RAY SCIENCE AND TECHNOLOGY 2024; 32:857-911. [PMID: 38701131 DOI: 10.3233/xst-230429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/05/2024]
Abstract
BACKGROUND The emergence of deep learning (DL) techniques has revolutionized tumor detection and classification in medical imaging, with multimodal medical imaging (MMI) gaining recognition for its precision in diagnosis, treatment, and progression tracking. OBJECTIVE This review comprehensively examines DL methods in transforming tumor detection and classification across MMI modalities, aiming to provide insights into advancements, limitations, and key challenges for further progress. METHODS Systematic literature analysis identifies DL studies for tumor detection and classification, outlining methodologies including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and their variants. Integration of multimodality imaging enhances accuracy and robustness. RESULTS Recent advancements in DL-based MMI evaluation methods are surveyed, focusing on tumor detection and classification tasks. Various DL approaches, including CNNs, YOLO, Siamese Networks, Fusion-Based Models, Attention-Based Models, and Generative Adversarial Networks, are discussed with emphasis on PET-MRI, PET-CT, and SPECT-CT. FUTURE DIRECTIONS The review outlines emerging trends and future directions in DL-based tumor analysis, aiming to guide researchers and clinicians toward more effective diagnosis and prognosis. Continued innovation and collaboration are stressed in this rapidly evolving domain. CONCLUSION Conclusions drawn from literature analysis underscore the efficacy of DL approaches in tumor detection and classification, highlighting their potential to address challenges in MMI analysis and their implications for clinical practice.
Collapse
Affiliation(s)
- Dildar Hussain
- Department of Artificial Intelligence and Data Science, Sejong University, Seoul, Korea
| | - Mohammed A Al-Masni
- Department of Artificial Intelligence and Data Science, Sejong University, Seoul, Korea
| | - Muhammad Aslam
- Department of Artificial Intelligence and Data Science, Sejong University, Seoul, Korea
| | - Abolghasem Sadeghi-Niaraki
- Department of Computer Science & Engineering and Convergence Engineering for Intelligent Drone, XR Research Center, Sejong University, Seoul, Korea
| | - Jamil Hussain
- Department of Artificial Intelligence and Data Science, Sejong University, Seoul, Korea
| | - Yeong Hyeon Gu
- Department of Artificial Intelligence and Data Science, Sejong University, Seoul, Korea
| | - Rizwan Ali Naqvi
- Department of Intelligent Mechatronics Engineering, Sejong University, Seoul, Korea
| |
Collapse
|
33
|
Xing F, Silosky M, Ghosh D, Chin BB. Location-Aware Encoding for Lesion Detection in 68Ga-DOTATATE Positron Emission Tomography Images. IEEE Trans Biomed Eng 2024; 71:247-257. [PMID: 37471190 DOI: 10.1109/tbme.2023.3297249] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/22/2023]
Abstract
OBJECTIVE Lesion detection with positron emission tomography (PET) imaging is critical for tumor staging, treatment planning, and advancing novel therapies to improve patient outcomes, especially for neuroendocrine tumors (NETs). Current lesion detection methods often require manual cropping of regions/volumes of interest (ROIs/VOIs) a priori, or rely on multi-stage, cascaded models, or use multi-modality imaging to detect lesions in PET images. This leads to significant inefficiency, high variability and/or potential accumulative errors in lesion quantification. To tackle this issue, we propose a novel single-stage lesion detection method using only PET images. METHODS We design and incorporate a new, plug-and-play codebook learning module into a U-Net-like neural network and promote lesion location-specific feature learning at multiple scales. We explicitly regularize the codebook learning with direct supervision at the network's multi-level hidden layers and enforce the network to learn multi-scale discriminative features with respect to predicting lesion positions. The network automatically combines the predictions from the codebook learning module and other layers via a learnable fusion layer. RESULTS We evaluate the proposed method on a real-world clinical 68Ga-DOTATATE PET image dataset, and our method produces significantly better lesion detection performance than recent state-of-the-art approaches. CONCLUSION We present a novel deep learning method for single-stage lesion detection in PET imaging data, with no ROI/VOI cropping in advance, no multi-stage modeling and no multi-modality data. SIGNIFICANCE This study provides a new perspective for effective and efficient lesion identification in PET, potentially accelerating novel therapeutic regimen development for NETs and ultimately improving patient outcomes including survival.
Collapse
|
34
|
Bi L, Buehner U, Fu X, Williamson T, Choong P, Kim J. Hybrid CNN-transformer network for interactive learning of challenging musculoskeletal images. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 243:107875. [PMID: 37871450 DOI: 10.1016/j.cmpb.2023.107875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/06/2023] [Revised: 10/16/2023] [Accepted: 10/17/2023] [Indexed: 10/25/2023]
Abstract
BACKGROUND AND OBJECTIVES Segmentation of regions of interest (ROIs) such as tumors and bones plays an essential role in the analysis of musculoskeletal (MSK) images. Segmentation results can help orthopaedic surgeons with surgical outcome assessment and simulation of a patient's gait cycle. Deep learning-based automatic segmentation methods, particularly those using fully convolutional networks (FCNs), are considered the state-of-the-art. However, in scenarios where the training data are insufficient to account for all the variations in ROIs, these methods struggle to segment challenging ROIs that have less common image characteristics. Such characteristics might include low contrast to the background, inhomogeneous textures, and fuzzy boundaries. METHODS We propose a hybrid convolutional neural network - transformer network (HCTN) for semi-automatic segmentation to overcome the limitations of segmenting challenging MSK images. Specifically, we propose to fuse user inputs (manual, e.g., mouse clicks) with high-level semantic image features derived from the neural network (automatic), where the user inputs are used in interactive training for uncommon image characteristics. In addition, we propose to leverage the transformer network (TN), a deep learning model designed for handling sequence data, together with features derived from FCNs for segmentation; this addresses the limitation that FCNs can only operate on small kernels, which tends to dismiss global context and focus only on local patterns. RESULTS We purposely selected three MSK imaging datasets covering a variety of structures to evaluate the generalizability of the proposed method. Our semi-automatic HCTN method achieved a Dice coefficient (DSC) of 88.46 ± 9.41 for segmenting soft-tissue sarcoma tumors from magnetic resonance (MR) images, 73.32 ± 11.97 for segmenting osteosarcoma tumors from MR images, and 93.93 ± 1.84 for segmenting the clavicle bones from chest radiographs. When compared to the current state-of-the-art automatic segmentation method, our HCTN method is 11.7%, 19.11% and 7.36% higher in DSC on the three datasets, respectively. CONCLUSION Our experimental results demonstrate that HCTN achieved more generalizable results than the current methods, especially on challenging MSK studies.
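One common way to fuse user clicks with an image, loosely related to the interaction described above, is to render the clicks as a Gaussian hint map and feed it as an extra input channel; the encoding below is an illustrative assumption, since HCTN itself combines clicks with high-level network features rather than raw input channels.

```python
import numpy as np

def click_map(shape, clicks, sigma=10.0):
    """Render (row, col) user clicks as a Gaussian heatmap of the given 2D shape."""
    rr, cc = np.mgrid[0:shape[0], 0:shape[1]]
    heat = np.zeros(shape, dtype=np.float32)
    for r, c in clicks:
        heat = np.maximum(heat, np.exp(-((rr - r) ** 2 + (cc - c) ** 2) / (2 * sigma ** 2)))
    return heat

def with_guidance(image, clicks):
    """Stack the grayscale image and the click heatmap into a 2-channel network input."""
    return np.stack([image.astype(np.float32), click_map(image.shape, clicks)], axis=0)
```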
Collapse
Affiliation(s)
- Lei Bi
- Institute of Translational Medicine, National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, China; School of Computer Science, University of Sydney, NSW, Australia
| | | | - Xiaohang Fu
- School of Computer Science, University of Sydney, NSW, Australia
| | - Tom Williamson
- Stryker Corporation, Kalamazoo, Michigan, USA; Centre for Additive Manufacturing, School of Engineering, RMIT University, VIC, Australia
| | - Peter Choong
- Department of Surgery, University of Melbourne, VIC, Australia
| | - Jinman Kim
- School of Computer Science, University of Sydney, NSW, Australia.
| |
Collapse
|
35
|
Li M, Jiang Y, Zhang Y, Zhu H. Medical image analysis using deep learning algorithms. Front Public Health 2023; 11:1273253. [PMID: 38026291 PMCID: PMC10662291 DOI: 10.3389/fpubh.2023.1273253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2023] [Accepted: 10/05/2023] [Indexed: 12/01/2023] Open
Abstract
In the field of medical image analysis, the importance of employing advanced deep learning (DL) techniques cannot be overstated. DL has achieved impressive results in various areas, making it particularly noteworthy for medical image analysis in healthcare. The integration of DL with medical image analysis enables real-time analysis of vast and intricate datasets, yielding insights that significantly enhance healthcare outcomes and operational efficiency in the industry. This extensive review of the existing literature conducts a thorough examination of the most recent DL approaches designed to address the difficulties faced in medical healthcare, particularly focusing on the use of DL algorithms in medical image analysis. Grouping the investigated papers into five categories according to their techniques, we assessed them against several critical parameters. Through a systematic categorization of state-of-the-art DL techniques, such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs), Long Short-term Memory (LSTM) models, and hybrid models, this study explores their underlying principles, advantages, limitations, methodologies, simulation environments, and datasets. Based on our results, Python was the programming language most frequently used to implement the proposed methods in the investigated papers. Notably, the majority of the scrutinized papers were published in 2021, underscoring the contemporaneity of the research. Moreover, this review highlights the foremost advancements in DL techniques and their practical applications within medical image analysis, while also addressing the challenges that hinder the widespread implementation of DL in image analysis within medical healthcare. These insights serve as a compelling impetus for future studies aimed at the progressive advancement of image analysis in medical healthcare research. The evaluation metrics employed across the reviewed articles span a broad spectrum of features, including accuracy, sensitivity, specificity, F-score, robustness, computational complexity, and generalizability.
Collapse
Affiliation(s)
- Mengfang Li
- The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
| | - Yuanyuan Jiang
- Department of Cardiovascular Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
| | - Yanzhou Zhang
- Department of Cardiovascular Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
| | - Haisheng Zhu
- Department of Cardiovascular Medicine, Wencheng People’s Hospital, Wencheng, China
| |
Collapse
|
36
|
Kim MJ, Jeong J, Lee JW, Kim IH, Park JW, Roh JY, Kim N, Kim SJ. Screening obstructive sleep apnea patients via deep learning of knowledge distillation in the lateral cephalogram. Sci Rep 2023; 13:17788. [PMID: 37853030 PMCID: PMC10584979 DOI: 10.1038/s41598-023-42880-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2022] [Accepted: 09/15/2023] [Indexed: 10/20/2023] Open
Abstract
The lateral cephalogram in orthodontics is a valuable screening tool for undetected obstructive sleep apnea (OSA), which can lead to severe systemic disease. We hypothesized that a deep learning-based classifier might be able to differentiate OSA from anatomical features in the lateral cephalogram. Moreover, since the imaging devices used by different hospitals may differ, there is a need to overcome modality differences in radiography. Therefore, we proposed a deep learning model with knowledge distillation to classify patients into OSA and non-OSA groups using the lateral cephalogram and to overcome modality differences simultaneously. Lateral cephalograms of 500 OSA patients and 498 non-OSA patients from two different devices were included. A ResNet-50 model and a ResNet-50 model with feature-based knowledge distillation were trained, and their classification performance was compared. With knowledge distillation, analysis of the area under the receiver operating characteristic curve and gradient-weighted class activation mapping showed that the distilled model achieved high performance without being deceived by features caused by modality differences. Inspection of the predicted OSA probability values confirmed an improvement in overcoming the modality differences, which could be applied in actual clinical situations.
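Feature-based knowledge distillation of the kind described can be sketched roughly as below: a frozen teacher and a trainable student ResNet-50 whose intermediate feature maps are matched with an MSE term alongside the usual classification loss. The matched layer, loss weight, and dummy data are assumptions for illustration, not the authors' settings.

```python
import torch
import torch.nn as nn
import torchvision.models as models

teacher = models.resnet50(weights=None)   # in practice: load weights trained on device A
student = models.resnet50(weights=None)   # to be trained on device B data
teacher.eval()

feats = {}
def hook(name):
    def _hook(module, inp, out):
        feats[name] = out
    return _hook

# Match the output of the last residual stage (layer4) between teacher and student
teacher.layer4.register_forward_hook(hook("t_layer4"))
student.layer4.register_forward_hook(hook("s_layer4"))

ce_loss, mse_loss = nn.CrossEntropyLoss(), nn.MSELoss()
optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)

x = torch.randn(4, 3, 224, 224)            # dummy cephalogram batch
y = torch.randint(0, 2, (4,))              # OSA vs non-OSA labels

with torch.no_grad():
    teacher(x)                              # fills feats["t_layer4"]
logits = student(x)                         # fills feats["s_layer4"]

loss = ce_loss(logits, y) + 0.5 * mse_loss(feats["s_layer4"], feats["t_layer4"])
optimizer.zero_grad()
loss.backward()
optimizer.step()
```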
Collapse
Affiliation(s)
- Min-Jung Kim
- Department of Convergence Medicine, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88 Olympic-ro 43-Gil Songpa-gu, Seoul, 05505, Republic of Korea
| | - Jiheon Jeong
- Department of Convergence Medicine, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88 Olympic-ro 43-Gil Songpa-gu, Seoul, 05505, Republic of Korea
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, College of Medicine, University of Ulsan, Seoul, Republic of Korea
| | - Jung-Wook Lee
- Department of Convergence Medicine, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88 Olympic-ro 43-Gil Songpa-gu, Seoul, 05505, Republic of Korea
| | - In-Hwan Kim
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, College of Medicine, University of Ulsan, Seoul, Republic of Korea
| | - Jae-Woo Park
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, College of Medicine, University of Ulsan, Seoul, Republic of Korea
| | - Jae-Yon Roh
- Department of Orthodontics, Kyung Hee University Dental Hospital, Seoul, 05505, Republic of Korea
| | - Namkug Kim
- Department of Convergence Medicine, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, 88 Olympic-ro 43-Gil Songpa-gu, Seoul, 05505, Republic of Korea.
- Department of Radiology, University of Ulsan College of Medicine, Asan Medical Center, 88 Olympic-Ro 43-Gil Songpa-Gu, Seoul, 05505, Republic of Korea.
| | - Su-Jung Kim
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, College of Medicine, University of Ulsan, Seoul, Republic of Korea.
- Department of Orthodontics, School of Dentistry, Kyung Hee University, 23, Kyungheedae-ro, Dongdaemun-gu, Seoul, 02447, Republic of Korea.
| |
Collapse
|
37
|
Goyal V, Schaub NJ, Voss TC, Hotaling NA. Unbiased image segmentation assessment toolkit for quantitative differentiation of state-of-the-art algorithms and pipelines. BMC Bioinformatics 2023; 24:388. [PMID: 37828466 PMCID: PMC10568754 DOI: 10.1186/s12859-023-05486-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2022] [Accepted: 09/18/2023] [Indexed: 10/14/2023] Open
Abstract
BACKGROUND Image segmentation pipelines are commonly used in microscopy to identify cellular compartments such as the nucleus and cytoplasm, but there are few standards for comparing segmentation accuracy across pipelines. The process of selecting a segmentation assessment pipeline can seem daunting to researchers because of the number and variety of metrics available for evaluating segmentation quality. RESULTS Here we present automated pipelines to obtain a comprehensive set of 69 metrics for evaluating segmented data, and we propose a selection methodology for models based on quantitative analysis, dimension reduction or unsupervised classification techniques, and informed selection criteria. CONCLUSION We show that the metrics used here can often be reduced to a small number of metrics that give a more complete understanding of segmentation accuracy, with different groups of metrics providing sensitivity to different types of segmentation error. These tools are delivered as easy-to-use Python libraries, command-line tools, Common Workflow Language tools, and Web Image Processing Pipeline interactive plugins to ensure that a wide range of users can access and use them. We also show how our evaluation methods can be used to observe changes in segmentations across modern machine learning/deep learning workflows and use cases.
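The general pattern of computing a panel of overlap metrics per image pair and then reducing it to a few informative dimensions can be sketched as follows; the six metrics and the PCA step here are illustrative stand-ins, not the toolkit's actual 69-metric implementation.

```python
import numpy as np
from sklearn.decomposition import PCA

def metric_panel(pred, gt):
    """A small panel of pixel-level segmentation metrics for two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()
    eps = 1e-7
    return {
        "dice": 2 * tp / (2 * tp + fp + fn + eps),
        "jaccard": tp / (tp + fp + fn + eps),
        "precision": tp / (tp + fp + eps),
        "recall": tp / (tp + fn + eps),
        "specificity": tn / (tn + fp + eps),
        "accuracy": (tp + tn) / (tp + tn + fp + fn + eps),
    }

# Evaluate many (prediction, ground-truth) pairs, then look for redundant metrics
rng = np.random.default_rng(0)
rows = []
for _ in range(50):
    gt = rng.random((64, 64)) > 0.5
    pred = np.logical_xor(gt, rng.random((64, 64)) > 0.9)  # noisy copy of the ground truth
    rows.append(list(metric_panel(pred, gt).values()))

X = np.array(rows)
pca = PCA(n_components=2).fit(X)
print("variance explained by 2 components:", pca.explained_variance_ratio_.sum())
```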
Collapse
Affiliation(s)
- Vishakha Goyal
- Information Research Technology Branch (ITRB), National Center for Advancing Translational Science (NCATS), National Institutes of Health (NIH), 9800 Medical Center Dr, Rockville, MD, 20850, USA
- Axle Research and Technologies, 6116 Executive Blvd #400, Rockville, MD, 20852, USA
| | - Nick J Schaub
- Information Research Technology Branch (ITRB), National Center for Advancing Translational Science (NCATS), National Institutes of Health (NIH), 9800 Medical Center Dr, Rockville, MD, 20850, USA
- Axle Research and Technologies, 6116 Executive Blvd #400, Rockville, MD, 20852, USA
| | - Ty C Voss
- Information Research Technology Branch (ITRB), National Center for Advancing Translational Science (NCATS), National Institutes of Health (NIH), 9800 Medical Center Dr, Rockville, MD, 20850, USA
- Axle Research and Technologies, 6116 Executive Blvd #400, Rockville, MD, 20852, USA
| | - Nathan A Hotaling
- Information Research Technology Branch (ITRB), National Center for Advancing Translational Science (NCATS), National Institutes of Health (NIH), 9800 Medical Center Dr, Rockville, MD, 20850, USA.
- Axle Research and Technologies, 6116 Executive Blvd #400, Rockville, MD, 20852, USA.
| |
Collapse
|
38
|
Chen X, Liu C. Deep-learning-based methods of attenuation correction for SPECT and PET. J Nucl Cardiol 2023; 30:1859-1878. [PMID: 35680755 DOI: 10.1007/s12350-022-03007-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2022] [Accepted: 05/02/2022] [Indexed: 10/18/2022]
Abstract
Attenuation correction (AC) is essential for quantitative analysis and clinical diagnosis of single-photon emission computed tomography (SPECT) and positron emission tomography (PET). In clinical practice, computed tomography (CT) is utilized to generate attenuation maps (μ-maps) for AC of hybrid SPECT/CT and PET/CT scanners. However, CT-based AC methods frequently produce artifacts due to CT artifacts and misregistration of SPECT-CT and PET-CT scans. Segmentation-based AC methods using magnetic resonance imaging (MRI) for PET/MRI scanners are inaccurate and complicated since MRI does not contain direct information of photon attenuation. Computational AC methods for SPECT and PET estimate attenuation coefficients directly from raw emission data, but suffer from low accuracy, cross-talk artifacts, high computational complexity, and high noise level. The recently evolving deep-learning-based methods have shown promising results in AC of SPECT and PET, which can be generally divided into two categories: indirect and direct strategies. Indirect AC strategies apply neural networks to transform emission, transmission, or MR images into synthetic μ-maps or CT images which are then incorporated into AC reconstruction. Direct AC strategies skip the intermediate steps of generating μ-maps or CT images and predict AC SPECT or PET images from non-attenuation-correction (NAC) SPECT or PET images directly. These deep-learning-based AC methods show comparable and even superior performance to non-deep-learning methods. In this article, we first discussed the principles and limitations of non-deep-learning AC methods, and then reviewed the status and prospects of deep-learning-based methods for AC of SPECT and PET.
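As a rough illustration of the "direct" strategy described above (mapping NAC images to AC images without an intermediate μ-map), the sketch below trains a tiny residual CNN with an L1 loss on synthetic data; the architecture, loss, and data are placeholders rather than any published model.

```python
import torch
import torch.nn as nn

class DirectACNet(nn.Module):
    """Tiny residual CNN: NAC PET slice in, attenuation-corrected PET slice out."""
    def __init__(self, channels=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),
        )

    def forward(self, nac):
        return nac + self.body(nac)   # predict a residual correction on top of the NAC input

model = DirectACNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
l1 = nn.L1Loss()

nac = torch.rand(8, 1, 128, 128)          # dummy NAC PET slices
ac = nac * 1.3 + 0.05                     # dummy "ground-truth" AC targets

pred = model(nac)
loss = l1(pred, ac)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"L1 loss: {loss.item():.4f}")
```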
Collapse
Affiliation(s)
- Xiongchao Chen
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA
| | - Chi Liu
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA.
- Department of Radiology and Biomedical Imaging, Yale University, PO Box 208048, New Haven, CT, 06520, USA.
| |
Collapse
|
39
|
Gu F, Wu Q. Quantitation of dynamic total-body PET imaging: recent developments and future perspectives. Eur J Nucl Med Mol Imaging 2023; 50:3538-3557. [PMID: 37460750 PMCID: PMC10547641 DOI: 10.1007/s00259-023-06299-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2022] [Accepted: 06/05/2023] [Indexed: 10/04/2023]
Abstract
BACKGROUND Positron emission tomography (PET) scanning is an important diagnostic imaging technique used in disease diagnosis, therapy planning, treatment monitoring, and medical research. The standardized uptake value (SUV) obtained at a single time frame has been widely employed in clinical practice. Well beyond this simple static measure, more detailed metabolic information can be recovered from dynamic PET scans, followed by recovery of the arterial input function and application of appropriate tracer kinetic models. Many efforts have been devoted to the development of quantitative techniques over the last couple of decades. CHALLENGES The advent of new-generation total-body PET scanners characterized by ultra-high sensitivity and a long axial field of view, i.e., uEXPLORER (United Imaging Healthcare), PennPET Explorer (University of Pennsylvania), and Biograph Vision Quadra (Siemens Healthineers), provides further opportunities to derive kinetics for multiple organs simultaneously. However, some emerging issues also need to be addressed, e.g., the large data size and organ-specific physiology. Direct implementation of classical methods for total-body PET imaging without proper validation may lead to less accurate results. CONCLUSIONS In this contribution, the published dynamic total-body PET datasets are outlined, and several challenges and opportunities for quantitation of such studies are presented. An overview of the basic equation, calculation of the input function (based on blood sampling, image, population or mathematical model), and kinetic analysis encompassing parametric (compartmental model, graphical plot and spectral analysis) and non-parametric (B-spline and piece-wise basis elements) approaches is provided. The discussion mainly focuses on the feasibility, recent developments, and future perspectives of these methodologies for a diverse-tissue environment.
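The Patlak graphical plot mentioned above reduces to a linear regression once the plasma input function C_p(t) is known: C_T(t)/C_p(t) = K_i * (integral of C_p up to t)/C_p(t) + V_0 for times beyond equilibration, with the slope giving the net influx rate K_i. A minimal NumPy sketch on synthetic curves (not a validated kinetic analysis tool) is:

```python
import numpy as np

t = np.linspace(0.1, 60, 120)                      # minutes
cp = 10 * np.exp(-0.3 * t) + 1.0                   # synthetic plasma input function
Ki_true, V0_true = 0.05, 0.4
int_cp = np.cumsum(cp) * (t[1] - t[0])             # running integral of Cp
ct = Ki_true * int_cp + V0_true * cp               # synthetic irreversible-uptake tissue curve

# Patlak transform: y = Ct/Cp, x = int(Cp)/Cp; fit only the late, linear portion
x, y = int_cp / cp, ct / cp
late = t > 20
Ki_est, V0_est = np.polyfit(x[late], y[late], 1)
print(f"estimated Ki = {Ki_est:.4f} (true {Ki_true}), V0 = {V0_est:.3f} (true {V0_true})")
```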
Collapse
Affiliation(s)
- Fengyun Gu
- School of Mathematics and Physics, North China Electric Power University, 102206, Beijing, China.
- School of Mathematical Sciences, University College Cork, T12XF62, Cork, Ireland.
| | - Qi Wu
- School of Mathematical Sciences, University College Cork, T12XF62, Cork, Ireland
| |
Collapse
|
40
|
Fan J, Zhang L, Lv T, Liu Y, Sun H, Miao K, Jiang C, Li L, Pan X. MEAI: an artificial intelligence platform for predicting distant and lymph node metastases directly from primary breast cancer. J Cancer Res Clin Oncol 2023; 149:9229-9241. [PMID: 37199837 DOI: 10.1007/s00432-023-04787-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2023] [Accepted: 04/15/2023] [Indexed: 05/19/2023]
Abstract
PURPOSE Breast cancer patients typically have decent prognoses, with a 5-year survival rate of more than 90%, but when the disease metastasizes to lymph nodes or distant sites, the prognosis declines drastically. Therefore, quickly and accurately identifying tumor metastasis is essential for subsequent treatment and patient survival. An artificial intelligence system was developed to recognize lymph node and distant tumor metastases on whole-slide images (WSIs) of primary breast cancer. METHODS In this study, a total of 832 WSIs from 520 patients without tumor metastases and 312 patients with breast cancer metastases (including lymph node, bone, lung, liver, and other sites) were gathered. The WSIs were randomly divided into training and testing cohorts, and a new artificial intelligence system called MEAI was built to identify lymph node and distant metastases in primary breast cancer. RESULTS The final AI system attained an area under the receiver operating characteristic curve (AUROC) of 0.934 in a test set of 187 patients. In addition, the AI system achieved an AUROC higher than the average of six board-certified pathologists (AUROC 0.811) in a retrospective pathologist evaluation, highlighting its potential to increase the precision, consistency, and efficiency of tumor metastasis detection in patients with breast cancer. CONCLUSION The proposed MEAI system provides a non-invasive approach to assessing the metastatic probability of patients with primary breast cancer.
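The AUROC figures quoted above correspond to the standard area under the ROC curve; a generic scikit-learn sketch with synthetic labels and scores (not the MEAI pipeline) is:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(42)
y_true = rng.integers(0, 2, size=187)                      # 0 = no metastasis, 1 = metastasis
# Synthetic model scores: metastatic cases shifted towards higher probabilities
y_score = np.clip(0.35 * y_true + rng.normal(0.4, 0.2, size=187), 0, 1)

auc = roc_auc_score(y_true, y_score)
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(f"AUROC = {auc:.3f}; operating points available at {len(thresholds)} thresholds")
```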
Collapse
Affiliation(s)
- Jiansong Fan
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu, China
| | - Lei Zhang
- Department of Vascular Surgery, The Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan, China
- Cancer Center, Faculty of Health Sciences, University of Macau, Macau SAR, China
| | - Tianxu Lv
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu, China
| | - Yuan Liu
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu, China
| | - Heng Sun
- Cancer Center, Faculty of Health Sciences, University of Macau, Macau SAR, China
| | - Kai Miao
- MOE Frontier Science Centre for Precision Oncology, University of Macau, Macau SAR, China
| | - Chunjuan Jiang
- Department of Nuclear Medicine, PET Image Center, The Second Xiangya Hospital of Central South University, Changsha, Hunan, China
| | - Lihua Li
- Institute of Biomedical Engineering and Instrumentation, Hangzhou Dianzi University, Hangzhou, Zhejiang, China
| | - Xiang Pan
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu, China.
- Cancer Center, Faculty of Health Sciences, University of Macau, Macau SAR, China.
- MOE Frontier Science Centre for Precision Oncology, University of Macau, Macau SAR, China.
| |
Collapse
|
41
|
Liu H, Zhuang Y, Song E, Xu X, Ma G, Cetinkaya C, Hung CC. A modality-collaborative convolution and transformer hybrid network for unpaired multi-modal medical image segmentation with limited annotations. Med Phys 2023; 50:5460-5478. [PMID: 36864700 DOI: 10.1002/mp.16338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2022] [Revised: 02/07/2023] [Accepted: 02/22/2023] [Indexed: 03/04/2023] Open
Abstract
BACKGROUND Multi-modal learning is widely adopted to learn the latent complementary information between different modalities in multi-modal medical image segmentation tasks. Nevertheless, traditional multi-modal learning methods require spatially well-aligned and paired multi-modal images for supervised training, and therefore cannot leverage unpaired multi-modal images with spatial misalignment and modality discrepancy. For training accurate multi-modal segmentation networks using easily accessible and low-cost unpaired multi-modal images in clinical practice, unpaired multi-modal learning has received considerable attention recently. PURPOSE Existing unpaired multi-modal learning methods usually focus on the intensity distribution gap but ignore the scale variation problem between different modalities. In addition, shared convolutional kernels are frequently employed in existing methods to capture common patterns in all modalities, but they are typically inefficient at learning global contextual information. Moreover, existing methods rely heavily on a large number of labeled unpaired multi-modal scans for training, ignoring the practical scenario in which labeled data are limited. To solve these problems, we propose a modality-collaborative convolution and transformer hybrid network (MCTHNet) using semi-supervised learning for unpaired multi-modal segmentation with limited annotations, which not only collaboratively learns modality-specific and modality-invariant representations but also automatically leverages extensive unlabeled scans to improve performance. METHODS We make three main contributions in the proposed method. First, to alleviate the intensity distribution gap and scale variation problems across modalities, we develop a modality-specific scale-aware convolution (MSSC) module that can adaptively adjust the receptive field sizes and feature normalization parameters according to the input. Second, we propose a modality-invariant vision transformer (MIViT) module as the shared bottleneck layer for all modalities, which implicitly incorporates convolution-like local operations with the global processing of transformers for learning generalizable modality-invariant representations. Third, we design a multi-modal cross pseudo supervision (MCPS) method for semi-supervised learning, which enforces consistency between the pseudo segmentation maps generated by two perturbed networks to acquire abundant annotation information from unlabeled unpaired multi-modal scans. RESULTS Extensive experiments were performed on two unpaired CT and MR segmentation datasets, including a cardiac substructure dataset derived from the MMWHS-2017 dataset and an abdominal multi-organ dataset consisting of the BTCV and CHAOS datasets. Experimental results show that our proposed method significantly outperforms other existing state-of-the-art methods under various labeling ratios and achieves segmentation performance close to that of single-modal methods with fully labeled data while leveraging only a small portion of labeled data. Specifically, when the labeling ratio is 25%, our proposed method achieves overall mean DSC values of 78.56% and 76.18% in cardiac and abdominal segmentation, respectively, which significantly improves the average DSC value of the two tasks by 12.84% compared with single-modal U-Net models. CONCLUSIONS Our proposed method is beneficial for reducing the annotation burden of unpaired multi-modal medical images in clinical applications.
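The cross pseudo supervision idea in the MCPS component, two perturbed networks supervising each other with hard pseudo-labels on unlabeled scans, can be sketched roughly as below; the toy networks and loss weight are assumptions for illustration, not the published module.

```python
import torch
import torch.nn as nn

def tiny_segnet():
    return nn.Sequential(
        nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(16, 2, 1),                      # 2-class logits
    )

net_a, net_b = tiny_segnet(), tiny_segnet()       # two differently initialised networks
ce = nn.CrossEntropyLoss()
opt = torch.optim.Adam(list(net_a.parameters()) + list(net_b.parameters()), lr=1e-3)

unlabeled = torch.randn(4, 1, 64, 64)             # unlabeled CT or MR slices

logits_a, logits_b = net_a(unlabeled), net_b(unlabeled)
pseudo_a = logits_a.argmax(dim=1).detach()        # hard pseudo-labels, gradients blocked
pseudo_b = logits_b.argmax(dim=1).detach()

# Each network is supervised by the other's pseudo-labels
cps_loss = ce(logits_a, pseudo_b) + ce(logits_b, pseudo_a)
opt.zero_grad()
(0.1 * cps_loss).backward()                       # in practice weighted and added to the supervised loss
opt.step()
```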
Collapse
Affiliation(s)
- Hong Liu
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Yuzhou Zhuang
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Enmin Song
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Xiangyang Xu
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Guangzhi Ma
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Coskun Cetinkaya
- Center for Machine Vision and Security Research, Kennesaw State University, Kennesaw, Georgia, USA
| | - Chih-Cheng Hung
- Center for Machine Vision and Security Research, Kennesaw State University, Kennesaw, Georgia, USA
| |
Collapse
|
42
|
Nagendram S, Singh A, Harish Babu G, Joshi R, Pande SD, Ahammad SKH, Dhabliya D, Bisht A. Stochastic gradient descent optimisation for convolutional neural network for medical image segmentation. Open Life Sci 2023; 18:20220665. [PMID: 37589001 PMCID: PMC10426722 DOI: 10.1515/biol-2022-0665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2023] [Revised: 06/16/2023] [Accepted: 07/02/2023] [Indexed: 08/18/2023] Open
Abstract
Dermoscopic images are affected by hair artefacts and illumination variation, and chest X-ray images by varying acquisition conditions, both of which complicate clinical segmentation. This study proposes a deep convolutional neural network (CNN)-based methodology for medical image segmentation of chest X-ray and dermoscopic clinical images. The technique compares U-Net and fully convolutional network (FCN) architectures, trained with loss functions based on the Jaccard distance and binary cross-entropy and optimised with stochastic gradient descent plus Nesterov momentum. Digital imaging plays a significant role in clinical diagnosis and in determining the best treatment for a patient's condition, even though medical digital images are subject to noise, quality degradation, and other disturbances, so that segmentation precision depends on the enhancement applied to the images before the optimised process. Finally, a thresholding technique was employed on the output of the pre- and post-processing stages to sharpen the contrast of the resulting images. The data sources are the well-known PH2 database for melanoma lesion segmentation and a chest X-ray image set, both of which contain variations in hair artefacts and illumination. The experimental outcomes outperform other U-Net and FCN architectures. The predictions produced by the model on test images were post-processed using the thresholding technique to remove blurry boundaries around the predicted lesions. The experimental results showed that the present model is more effective than existing ones such as U-Net and FCN, with the segmented images yielding sensitivity = 0.9913, accuracy = 0.9883, and Dice coefficient = 0.0246.
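The optimisation setup described, binary cross-entropy plus a Jaccard-distance term minimised with stochastic gradient descent and Nesterov momentum, can be sketched as follows; the stand-in network, learning rate, and loss weighting are illustrative assumptions, not the study's configuration.

```python
import torch
import torch.nn as nn

def soft_jaccard_loss(logits, target, eps=1e-7):
    """Differentiable Jaccard distance (1 - IoU) on sigmoid probabilities."""
    probs = torch.sigmoid(logits)
    inter = (probs * target).sum()
    union = probs.sum() + target.sum() - inter
    return 1.0 - (inter + eps) / (union + eps)

model = nn.Sequential(                      # stand-in for a U-Net/FCN segmentation network
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(16, 1, 1),
)
bce = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, nesterov=True)

x = torch.randn(4, 1, 128, 128)             # dermoscopy / chest X-ray patches
y = (torch.rand(4, 1, 128, 128) > 0.7).float()

logits = model(x)
loss = bce(logits, y) + soft_jaccard_loss(logits, y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```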
Collapse
Affiliation(s)
- Sanam Nagendram
- Department of Artificial Intelligence, KKR & KSR Institute of Technology and Sciences, Guntur, India
| | - Arunendra Singh
- Department of Information Technology, Pranveer Singh Institute of Technology, Kanpur, 209305, Uttar Pradesh, India
| | - Gade Harish Babu
- Department of E.C.E, CVR College of Engineering, Hyderabad, India
| | - Rahul Joshi
- CSE Department, Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune, India
| | | | - S. K. Hasane Ahammad
- Department of E.C.E., Koneru Lakshmaiah Education Foundation, Vaddeswaram, 522302, India
| | - Dharmesh Dhabliya
- Department of Information Technology, Vishwakarma Institute of Information Technology, Pune, India
| | - Aadarsh Bisht
- University Institute of Engineering, Chandigarh University, Mohali, India
| |
Collapse
|
43
|
Wu B, Zhang F, Xu L, Shen S, Shao P, Sun M, Liu P, Yao P, Xu RX. Modality preserving U-Net for segmentation of multimodal medical images. Quant Imaging Med Surg 2023; 13:5242-5257. [PMID: 37581055 PMCID: PMC10423364 DOI: 10.21037/qims-22-1367] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2022] [Accepted: 05/19/2023] [Indexed: 08/16/2023]
Abstract
Background Recent advances in artificial intelligence and digital image processing have inspired the use of deep neural networks for segmentation tasks in multimodal medical imaging. Unlike natural images, multimodal medical images contain much richer information about the properties of the different modalities and therefore present more challenges for semantic segmentation. However, no systematic research has been reported that integrates multiscale and structured analysis of single-modal and multimodal medical images. Methods We propose a deep neural network, named Modality Preserving U-Net (MPU-Net), for modality-preserving analysis and segmentation of medical targets from multimodal medical images. The proposed MPU-Net consists of a modality preservation encoder (MPE) module that preserves feature independence among the modalities and a modality fusion decoder (MFD) module that performs multiscale feature fusion analysis for each modality in order to provide a rich feature representation for the final task. The effectiveness of this single-modal preservation and multimodal fusion feature extraction approach is verified by multimodal segmentation experiments and an ablation study using the brain tumor and prostate datasets from the Medical Segmentation Decathlon (MSD). Results The segmentation experiments demonstrated the superiority of MPU-Net over other methods in segmentation tasks for multimodal medical images. In the brain tumor segmentation tasks, the Dice scores (DSCs) for the whole tumor (WT), tumor core (TC) and enhancing tumor (ET) regions were 89.42%, 86.92%, and 84.59%, respectively, while the 95th-percentile Hausdorff distance (HD95) results were 3.530, 4.899 and 2.555, respectively. In the prostate segmentation tasks, the DSCs for the peripheral zone (PZ) and the transitional zone (TZ) of the prostate were 71.20% and 90.38%, respectively, while the HD95 results were 6.367 and 4.766, respectively. The ablation study showed that the combination of single-modal preservation and multimodal fusion methods improved the performance of multimodal medical image feature analysis. Conclusions In the segmentation tasks using the brain tumor and prostate datasets, the MPU-Net method achieved improved performance compared with conventional methods, indicating its potential for other segmentation tasks in multimodal medical images.
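The modality-preservation idea, independent per-modality encoders whose features are only combined in a shared fusion decoder, can be sketched roughly as below; this skeleton is a simplification and not the published MPU-Net architecture.

```python
import torch
import torch.nn as nn

class ToyModalityPreservingNet(nn.Module):
    """One small encoder per modality; features are concatenated for a shared decoder."""
    def __init__(self, n_modalities=4, n_classes=3, width=16):
        super().__init__()
        self.encoders = nn.ModuleList([
            nn.Sequential(nn.Conv3d(1, width, 3, padding=1), nn.ReLU(inplace=True))
            for _ in range(n_modalities)
        ])
        self.decoder = nn.Sequential(
            nn.Conv3d(width * n_modalities, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(width, n_classes, 1),
        )

    def forward(self, x):                        # x: (B, n_modalities, D, H, W)
        feats = [enc(x[:, i:i + 1]) for i, enc in enumerate(self.encoders)]
        return self.decoder(torch.cat(feats, dim=1))

net = ToyModalityPreservingNet()
mri = torch.randn(1, 4, 32, 64, 64)              # e.g. four MR sequences of one brain tumour case
print(net(mri).shape)                            # torch.Size([1, 3, 32, 64, 64])
```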
Collapse
Affiliation(s)
- Bingxuan Wu
- Department of Precision Machinery and Precision Instrumentation, University of Science and Technology of China, Hefei, China
| | - Fan Zhang
- Department of Precision Machinery and Precision Instrumentation, University of Science and Technology of China, Hefei, China
| | - Liang Xu
- Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, China
| | - Shuwei Shen
- Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, China
| | - Pengfei Shao
- Department of Precision Machinery and Precision Instrumentation, University of Science and Technology of China, Hefei, China
| | - Mingzhai Sun
- Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, China
| | - Peng Liu
- Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, China
| | - Peng Yao
- School of Microelectronics, University of Science and Technology of China, Hefei, China
| | - Ronald X. Xu
- Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, China
| |
Collapse
|
44
|
Bakasa W, Viriri S. VGG16 Feature Extractor with Extreme Gradient Boost Classifier for Pancreas Cancer Prediction. J Imaging 2023; 9:138. [PMID: 37504815 PMCID: PMC10381878 DOI: 10.3390/jimaging9070138] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2023] [Revised: 06/19/2023] [Accepted: 07/04/2023] [Indexed: 07/29/2023] Open
Abstract
The prognosis of patients with pancreatic ductal adenocarcinoma (PDAC) is greatly improved by an early and accurate diagnosis. Several studies have created automated methods to forecast PDAC development using various medical imaging modalities. These papers give a general overview of the classification, segmentation, or grading of many cancer types, including pancreatic cancer, using conventional machine learning techniques and hand-engineered features. This study uses cutting-edge deep learning techniques to identify PDAC from computed tomography (CT) medical images. We propose the hybrid model VGG16-XGBoost (a VGG16 backbone feature extractor with an Extreme Gradient Boosting classifier) for PDAC images. The proposed hybrid model performs better than comparable approaches, obtaining an accuracy of 0.97 and a weighted F1 score of 0.97 on the dataset under study. The experimental validation of the VGG16-XGBoost model uses the public-access Cancer Imaging Archive (TCIA) dataset, which contains pancreas CT images. The results of this study can be extremely helpful for PDAC diagnosis from CT pancreas images, categorising them into five tumour (T) class labels of the tumour (T), node (N), and metastasis (M) (TNM) staging system: T0, T1, T2, T3, and T4.
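The backbone-plus-booster pattern (frozen VGG16 features fed to an XGBoost classifier) can be sketched with Keras and XGBoost as below; the input size, pooling choice, hyperparameters, and random placeholder images are assumptions, not the study's TCIA setup.

```python
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from xgboost import XGBClassifier

# Frozen VGG16 backbone as a fixed feature extractor (global average pooling -> 512-D vectors)
backbone = VGG16(weights="imagenet", include_top=False, pooling="avg",
                 input_shape=(224, 224, 3))

rng = np.random.default_rng(0)
images = rng.random((40, 224, 224, 3), dtype=np.float32) * 255.0   # placeholder CT slices
labels = rng.integers(0, 5, size=40)                               # T0..T4 stage labels

features = backbone.predict(preprocess_input(images), verbose=0)   # shape (40, 512)

clf = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1,
                    objective="multi:softprob", eval_metric="mlogloss")
clf.fit(features, labels)
print(clf.predict(features[:5]))
```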
Collapse
Affiliation(s)
- Wilson Bakasa
- School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Durban 4041, South Africa
| | - Serestina Viriri
- School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Durban 4041, South Africa
| |
Collapse
|
45
|
Astley JR, Biancardi AM, Marshall H, Hughes PJC, Collier GJ, Smith LJ, Eaden JA, Hughes R, Wild JM, Tahir BA. A Dual-Channel Deep Learning Approach for Lung Cavity Estimation From Hyperpolarized Gas and Proton MRI. J Magn Reson Imaging 2023; 57:1878-1890. [PMID: 36373828 PMCID: PMC10947587 DOI: 10.1002/jmri.28519] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2022] [Revised: 10/24/2022] [Accepted: 10/24/2022] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND Hyperpolarized gas MRI can quantify regional lung ventilation via biomarkers, including the ventilation defect percentage (VDP). VDP is computed from segmentations derived from spatially co-registered functional hyperpolarized gas and structural proton (1H)-MRI. Although acquired at similar lung inflation levels, they are frequently misaligned, requiring a lung cavity estimation (LCE). Recently, single-channel, mono-modal deep learning (DL)-based methods have shown promise for pulmonary image segmentation problems. Multichannel, multimodal approaches may outperform single-channel alternatives. PURPOSE We hypothesized that a DL-based dual-channel approach, leveraging both 1H-MRI and xenon-129 MRI (129Xe-MRI), can generate LCEs more accurately than single-channel alternatives. STUDY TYPE Retrospective. POPULATION A total of 480 corresponding 1H-MRI and 129Xe-MRI scans from 26 healthy participants (median age [range]: 11 [8-71]; 50% females) and 289 patients with pulmonary pathologies (median age [range]: 47 [6-83]; 51% females) were split into training (422 scans [88%]; 257 participants [82%]) and testing (58 scans [12%]; 58 participants [18%]) sets. FIELD STRENGTH/SEQUENCE 1.5-T, three-dimensional (3D) spoiled gradient-recalled 1H-MRI and 3D steady-state free-precession 129Xe-MRI. ASSESSMENT We developed a multimodal DL approach, integrating 129Xe-MRI and 1H-MRI, in a dual-channel convolutional neural network. We compared this approach to single-channel alternatives using manually edited LCEs as a benchmark. We further assessed a fully automatic DL-based framework to calculate VDPs and compared it to manually generated VDPs. STATISTICAL TESTS Friedman tests with post hoc Bonferroni correction for multiple comparisons compared single-channel and dual-channel DL approaches using Dice similarity coefficient (DSC), average boundary Hausdorff distance (average HD), and relative error (XOR) metrics. Bland-Altman analysis and paired t-tests compared manual and DL-generated VDPs. A P value < 0.05 was considered statistically significant. RESULTS The dual-channel approach significantly outperformed single-channel approaches, achieving a median (range) DSC, average HD, and XOR of 0.967 (0.867-0.978), 1.68 mm (37.0-0.778), and 0.066 (0.246-0.045), respectively. DL-generated VDPs were statistically indistinguishable from manually generated VDPs (P = 0.710). DATA CONCLUSION Our dual-channel approach generated LCEs, which could be integrated with ventilated lung segmentations to produce biomarkers such as the VDP without manual intervention. EVIDENCE LEVEL 4. TECHNICAL EFFICACY Stage 1.
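Architecturally, the dual-channel approach amounts to stacking the co-registered 1H and 129Xe volumes along the channel dimension before the first convolution; a rough sketch with a toy network (not the authors' model) is:

```python
import torch
import torch.nn as nn

class DualChannelLCENet(nn.Module):
    """Toy 3D CNN taking co-registered 1H-MRI and 129Xe-MRI as two input channels."""
    def __init__(self, width=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(2, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(width, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(width, 1, 1),              # logits for the lung cavity mask
        )

    def forward(self, h1, xe129):
        x = torch.cat([h1, xe129], dim=1)        # (B, 2, D, H, W)
        return self.net(x)

model = DualChannelLCENet()
h1 = torch.randn(1, 1, 24, 96, 96)               # structural proton volume
xe = torch.randn(1, 1, 24, 96, 96)               # ventilation volume
lce_logits = model(h1, xe)
print(lce_logits.shape)                          # torch.Size([1, 1, 24, 96, 96])
```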
Collapse
Affiliation(s)
- Joshua R. Astley
- POLARIS, Department of Infection, Immunity & Cardiovascular DiseaseThe University of SheffieldSheffieldUK
- Department of Oncology and MetabolismThe University of SheffieldSheffieldUK
| | - Alberto M. Biancardi
- POLARIS, Department of Infection, Immunity & Cardiovascular DiseaseThe University of SheffieldSheffieldUK
| | - Helen Marshall
- POLARIS, Department of Infection, Immunity & Cardiovascular DiseaseThe University of SheffieldSheffieldUK
| | - Paul J. C. Hughes
- POLARIS, Department of Infection, Immunity & Cardiovascular DiseaseThe University of SheffieldSheffieldUK
| | - Guilhem J. Collier
- POLARIS, Department of Infection, Immunity & Cardiovascular DiseaseThe University of SheffieldSheffieldUK
| | - Laurie J. Smith
- POLARIS, Department of Infection, Immunity & Cardiovascular DiseaseThe University of SheffieldSheffieldUK
| | - James A. Eaden
- POLARIS, Department of Infection, Immunity & Cardiovascular DiseaseThe University of SheffieldSheffieldUK
| | - Rod Hughes
- Early Development RespiratoryAstraZenecaCambridgeUK
| | - Jim M. Wild
- POLARIS, Department of Infection, Immunity & Cardiovascular DiseaseThe University of SheffieldSheffieldUK
- Insigneo Institute for in silico medicine, The University of SheffieldSheffieldUK
| | - Bilal A. Tahir
- POLARIS, Department of Infection, Immunity & Cardiovascular DiseaseThe University of SheffieldSheffieldUK
- Department of Oncology and MetabolismThe University of SheffieldSheffieldUK
- Insigneo Institute for in silico medicine, The University of SheffieldSheffieldUK
| |
Collapse
|
46
|
Černý M, Kybic J, Májovský M, Sedlák V, Pirgl K, Misiorzová E, Lipina R, Netuka D. Fully automated imaging protocol independent system for pituitary adenoma segmentation: a convolutional neural network-based model on sparsely annotated MRI. Neurosurg Rev 2023; 46:116. [PMID: 37162632 DOI: 10.1007/s10143-023-02014-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2023] [Revised: 03/08/2023] [Accepted: 04/28/2023] [Indexed: 05/11/2023]
Abstract
This study aims to develop a fully automated, imaging-protocol-independent system for pituitary adenoma segmentation from magnetic resonance imaging (MRI) scans that can work without user interaction, and to evaluate its accuracy and utility for clinical applications. We trained two independent artificial neural networks on MRI scans of 394 patients. The scans were acquired according to various imaging protocols over the course of 11 years on 1.5T and 3T MRI systems. The segmentation model assigned a class label to each input pixel (pituitary adenoma, internal carotid artery, normal pituitary gland, background). The slice segmentation model classified slices as clinically relevant (structures of interest in the slice) or irrelevant (anterior or posterior to the sella turcica). We used MRI data from another 99 patients to evaluate the performance of the model during training. We validated the model on a prospective cohort of 28 patients; Dice coefficients of 0.910, 0.719, and 0.240 were achieved for the tumour, internal carotid artery, and normal gland labels, respectively. The slice selection model achieved 82.5% accuracy, 88.7% sensitivity, 76.7% specificity, and an AUC of 0.904. A human expert rated 71.4% of the segmentation results as accurate, 21.4% as slightly inaccurate, and 7.1% as coarsely inaccurate. Our model achieved good results, comparable with recent work by other authors, on the largest dataset to date and generalized well across various imaging protocols. We discuss future clinical applications and their considerations. Models and frameworks for clinical use have yet to be developed and evaluated.
Collapse
Affiliation(s)
- Martin Černý
- Department of Neurosurgery and Neurooncology, 1st Faculty of Medicine, Charles University, Central Military Hospital Prague, U Vojenské nemocnice 1200, 169 02, Praha 6, Czech Republic.
- 1st Faculty of Medicine, Charles University Prague, Kateřinská 1660/32, 121 08, Praha 2, Czech Republic.
| | - Jan Kybic
- Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University in Prague, Technická 2, 166 27, Praha 6, Czech Republic
| | - Martin Májovský
- Department of Neurosurgery and Neurooncology, 1st Faculty of Medicine, Charles University, Central Military Hospital Prague, U Vojenské nemocnice 1200, 169 02, Praha 6, Czech Republic
| | - Vojtěch Sedlák
- Department of Radiodiagnostics, Central Military Hospital Prague, U Vojenské nemocnice 1200, 169 02, Praha 6, Czech Republic
| | - Karin Pirgl
- Department of Neurosurgery and Neurooncology, 1st Faculty of Medicine, Charles University, Central Military Hospital Prague, U Vojenské nemocnice 1200, 169 02, Praha 6, Czech Republic
- 3rd Faculty of Medicine, Charles University Prague, Ruská 87, 100 00, Praha 10, Czech Republic
| | - Eva Misiorzová
- Department of Neurosurgery, Faculty of Medicine, University of Ostrava, University Hospital Ostrava, 17. listopadu 1790/5, 708 52, Ostrava-Poruba, Czech Republic
| | - Radim Lipina
- Department of Neurosurgery, Faculty of Medicine, University of Ostrava, University Hospital Ostrava, 17. listopadu 1790/5, 708 52, Ostrava-Poruba, Czech Republic
| | - David Netuka
- Department of Neurosurgery and Neurooncology, 1st Faculty of Medicine, Charles University, Central Military Hospital Prague, U Vojenské nemocnice 1200, 169 02, Praha 6, Czech Republic
| |
Collapse
|
47
|
Singh A, Kwiecinski J, Cadet S, Killekar A, Tzolos E, Williams MC, Dweck MR, Newby DE, Dey D, Slomka PJ. Automated nonlinear registration of coronary PET to CT angiography using pseudo-CT generated from PET with generative adversarial networks. J Nucl Cardiol 2023; 30:604-615. [PMID: 35701650 PMCID: PMC9747983 DOI: 10.1007/s12350-022-03010-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2022] [Accepted: 05/04/2022] [Indexed: 12/15/2022]
Abstract
BACKGROUND Coronary 18F-sodium fluoride (18F-NaF) positron emission tomography (PET) has shown promise in imaging coronary artery disease activity. Currently, image processing remains subjective because PET and computed tomography (CT) angiography data must be registered manually. We aimed to develop a novel, fully automated method to register coronary 18F-NaF PET to CT angiography using pseudo-CT generated by generative adversarial networks (GAN). METHODS A total of 169 patients, 139 in the training and 30 in the testing sets, were considered for generation of pseudo-CT from non-attenuation-corrected (NAC) PET using GAN. Non-rigid registration was used to register pseudo-CT to CT angiography, and the resulting transformation was used to align PET with CT angiography. We compared translations, maximal standard uptake value (SUVmax), and target-to-background ratio (TBRmax) at the location of plaques obtained after observer-based and automated alignment. RESULTS Automatic end-to-end registration was performed for 30 patients with 88 coronary vessels and took 27.5 seconds per patient. The difference in displacement motion vectors between GAN-based and observer-based registration in the x-, y-, and z-directions was 0.8 ± 3.0, 0.7 ± 3.0, and 1.7 ± 3.9 mm, respectively. TBRmax had a coefficient of repeatability (CR) of 0.31, a mean bias of 0.03, and narrow limits of agreement (LOA) (95% LOA: -0.29 to 0.33). SUVmax had a CR of 0.26, a mean bias of 0, and narrow LOA (95% LOA: -0.26 to 0.26). CONCLUSION Pseudo-CT images generated by GAN are inherently registered to PET and can be used to facilitate quick and fully automated registration of PET with CT angiography.
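Once a pseudo-CT has been generated from NAC PET, the remaining step is a conventional non-rigid registration of pseudo-CT to CT angiography, with the recovered transform then applied to the PET volume. A rough SimpleITK sketch under assumed file names (not the authors' pipeline) is:

```python
import SimpleITK as sitk

# Assumed file names for illustration only
fixed = sitk.ReadImage("cta.nii.gz", sitk.sitkFloat32)        # CT angiography
moving = sitk.ReadImage("pseudo_ct.nii.gz", sitk.sitkFloat32) # GAN output from NAC PET
pet = sitk.ReadImage("nac_pet.nii.gz", sitk.sitkFloat32)      # PET, inherently aligned with pseudo-CT

# B-spline (non-rigid) registration of the pseudo-CT to the CT angiography
initial_tx = sitk.BSplineTransformInitializer(fixed, [8, 8, 8])
reg = sitk.ImageRegistrationMethod()
reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=50)
reg.SetOptimizerAsLBFGSB(gradientConvergenceTolerance=1e-5, numberOfIterations=100)
reg.SetInitialTransform(initial_tx, inPlace=False)
reg.SetInterpolator(sitk.sitkLinear)
final_tx = reg.Execute(fixed, moving)

# Apply the same transform to the PET volume so it lands in CT angiography space
pet_aligned = sitk.Resample(pet, fixed, final_tx, sitk.sitkLinear, 0.0, pet.GetPixelID())
sitk.WriteImage(pet_aligned, "pet_on_cta.nii.gz")
```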
Collapse
Affiliation(s)
- Ananya Singh
- Departments of Medicine (Division of Artificial Intelligence in Medicine), Imaging and Biomedical Imaging Research Institute, Cedars-Sinai Medical Center, 8700 Beverly Blvd, Suite Metro 203, Los Angeles, CA, 90048, USA
| | - Jacek Kwiecinski
- Departments of Medicine (Division of Artificial Intelligence in Medicine), Imaging and Biomedical Imaging Research Institute, Cedars-Sinai Medical Center, 8700 Beverly Blvd, Suite Metro 203, Los Angeles, CA, 90048, USA
- Department of Interventional Cardiology and Angiology, Institute of Cardiology, Warsaw, Poland
| | - Sebastien Cadet
- Departments of Medicine (Division of Artificial Intelligence in Medicine), Imaging and Biomedical Imaging Research Institute, Cedars-Sinai Medical Center, 8700 Beverly Blvd, Suite Metro 203, Los Angeles, CA, 90048, USA
| | - Aditya Killekar
- Departments of Medicine (Division of Artificial Intelligence in Medicine), Imaging and Biomedical Imaging Research Institute, Cedars-Sinai Medical Center, 8700 Beverly Blvd, Suite Metro 203, Los Angeles, CA, 90048, USA
| | - Evangelos Tzolos
- BHF Centre for Cardiovascular Science, University of Edinburgh, Edinburgh, UK
| | - Michelle C Williams
- BHF Centre for Cardiovascular Science, University of Edinburgh, Edinburgh, UK
| | - Marc R Dweck
- BHF Centre for Cardiovascular Science, University of Edinburgh, Edinburgh, UK
| | - David E Newby
- BHF Centre for Cardiovascular Science, University of Edinburgh, Edinburgh, UK
| | - Damini Dey
- Departments of Medicine (Division of Artificial Intelligence in Medicine), Imaging and Biomedical Imaging Research Institute, Cedars-Sinai Medical Center, 8700 Beverly Blvd, Suite Metro 203, Los Angeles, CA, 90048, USA
| | - Piotr J Slomka
- Departments of Medicine (Division of Artificial Intelligence in Medicine), Imaging and Biomedical Imaging Research Institute, Cedars-Sinai Medical Center, 8700 Beverly Blvd, Suite Metro 203, Los Angeles, CA, 90048, USA.
| |
Collapse
|
48
|
Park J, Kang SK, Hwang D, Choi H, Ha S, Seo JM, Eo JS, Lee JS. Automatic Lung Cancer Segmentation in [ 18F]FDG PET/CT Using a Two-Stage Deep Learning Approach. Nucl Med Mol Imaging 2023; 57:86-93. [PMID: 36998591 PMCID: PMC10043063 DOI: 10.1007/s13139-022-00745-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2021] [Revised: 03/10/2022] [Accepted: 03/12/2022] [Indexed: 10/18/2022] Open
Abstract
Purpose Since accurate lung cancer segmentation is required to determine the functional volume of a tumor in [18F]FDG PET/CT, we propose a two-stage U-Net architecture to enhance the performance of lung cancer segmentation using [18F]FDG PET/CT. Methods The whole-body [18F]FDG PET/CT scan data of 887 patients with lung cancer were retrospectively used for network training and evaluation. The ground-truth tumor volume of interest (VOI) was drawn using the LifeX software. The dataset was randomly partitioned into training, validation, and test sets. Among the 887 PET/CT and VOI datasets, 730 were used to train the proposed models, 81 were used as the validation set, and the remaining 76 were used to evaluate the model. In Stage 1, the global U-Net receives the 3D PET/CT volume as input and extracts the preliminary tumor area, generating a 3D binary volume as output. In Stage 2, the regional U-Net receives eight consecutive PET/CT slices around the slice selected by the global U-Net in Stage 1 and generates a 2D binary image as output. Results The proposed two-stage U-Net architecture outperformed the conventional one-stage 3D U-Net in primary lung cancer segmentation. The two-stage U-Net model successfully predicted the detailed margins of the tumors, which were determined by manually drawing spherical VOIs and applying an adaptive threshold. Quantitative analysis using the Dice similarity coefficient confirmed the advantages of the two-stage U-Net. Conclusion The proposed method will be useful for reducing the time and effort required for accurate lung cancer segmentation in [18F]FDG PET/CT.
Collapse
Affiliation(s)
- Junyoung Park
- Department of Electrical and Computer Engineering, Seoul National University College of Engineering, Seoul, 08826 Korea
- Department of Nuclear Medicine, Seoul National University College of Medicine, 103 Daehak-ro, Jongno-gu, Seoul, 03080 Korea
| | - Seung Kwan Kang
- Department of Nuclear Medicine, Seoul National University College of Medicine, 103 Daehak-ro, Jongno-gu, Seoul, 03080 Korea
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, 03080 Korea
- Artificial Intelligence Institute, Seoul National University, Seoul, 08826 Korea
- Brightonix Imaging Inc., Seoul, 03080 Korea
| | - Donghwi Hwang
- Department of Nuclear Medicine, Seoul National University College of Medicine, 103 Daehak-ro, Jongno-gu, Seoul, 03080 Korea
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, 03080 Korea
- Artificial Intelligence Institute, Seoul National University, Seoul, 08826 Korea
| | - Hongyoon Choi
- Department of Nuclear Medicine, Seoul National University College of Medicine, 103 Daehak-ro, Jongno-gu, Seoul, 03080 Korea
| | - Seunggyun Ha
- Division of Nuclear Medicine, Department of Radiology, Seoul St Mary’s Hospital, The Catholic University of Korea, Seoul, 06591 Korea
| | - Jong Mo Seo
- Department of Electrical and Computer Engineering, Seoul National University College of Engineering, Seoul, 08826 Korea
| | - Jae Seon Eo
- Department of Nuclear Medicine, Korea University Guro Hospital, 148 Gurodong-ro, Guro-gu, Seoul, 08308 Korea
| | - Jae Sung Lee
- Department of Nuclear Medicine, Seoul National University College of Medicine, 103 Daehak-ro, Jongno-gu, Seoul, 03080 Korea
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, 03080 Korea
- Artificial Intelligence Institute, Seoul National University, Seoul, 08826 Korea
- Brightonix Imaging Inc., Seoul, 03080 Korea
- Institute of Radiation Medicine, Medical Research Center, Seoul National University College of Medicine, Seoul, 03080 Korea
| |
Collapse
|
49
|
He Q, Dong M, Summerfield N, Glide-Hurst C. MAGNET: A MODALITY-AGNOSTIC NETWORK FOR 3D MEDICAL IMAGE SEGMENTATION. PROCEEDINGS. IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING 2023; 2023:10.1109/isbi53787.2023.10230587. [PMID: 38169907 PMCID: PMC10760993 DOI: 10.1109/isbi53787.2023.10230587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/05/2024]
Abstract
In this paper, we propose MAGNET, a novel modality-agnostic network for 3D medical image segmentation. Unlike existing learning methods, MAGNET is specifically designed to handle real clinical situations in which multiple modalities/sequences are available during model training, but fewer are available or used at the time of clinical practice. Our results on multiple datasets show that MAGNET trained on multi-modality data has the unique ability to perform predictions using any subset of the training imaging modalities. It outperforms individually trained uni-modality models and can further boost performance when more modalities are available at testing.
Collapse
Affiliation(s)
- Qisheng He
- Wayne State University Department of Computer Science 5057 Woodward Ave, Detroit, MI 48202
| | - Ming Dong
- Wayne State University Department of Computer Science 5057 Woodward Ave, Detroit, MI 48202
| | - Nicholas Summerfield
- University of Wisconsin-Madison Department of Human Oncology Department of Medical Physics 600 Highland Ave, Madison, WI 53792
| | - Carri Glide-Hurst
- University of Wisconsin-Madison Department of Human Oncology Department of Medical Physics 600 Highland Ave, Madison, WI 53792
| |
Collapse
|
50
|
Li K, Chen C, Cao W, Wang H, Han S, Wang R, Ye Z, Wu Z, Wang W, Cai L, Ding D, Yuan Z. DeAF: A multimodal deep learning framework for disease prediction. Comput Biol Med 2023; 156:106715. [PMID: 36867898 DOI: 10.1016/j.compbiomed.2023.106715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2022] [Revised: 02/05/2023] [Accepted: 02/26/2023] [Indexed: 03/05/2023]
Abstract
Multimodal deep learning models have been applied to disease prediction tasks, but difficulties exist in training owing to the conflict between sub-models and fusion modules. To alleviate this issue, we propose a framework for decoupling feature alignment and fusion (DeAF), which separates multimodal model training into two stages. In the first stage, unsupervised representation learning is conducted, and the modality adaptation (MA) module is used to align the features from the various modalities. In the second stage, the self-attention fusion (SAF) module combines the medical image features and clinical data using supervised learning. Moreover, we apply the DeAF framework to predict the postoperative efficacy of CRS for colorectal cancer and whether patients with mild cognitive impairment (MCI) progress to Alzheimer's disease. The DeAF framework achieves a significant improvement compared with previous methods. Furthermore, extensive ablation experiments are conducted to demonstrate the rationality and effectiveness of our framework. In conclusion, our framework enhances the interaction between local medical image features and clinical data, and derives more discriminative multimodal features for disease prediction. The framework implementation is available at https://github.com/cchencan/DeAF.
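The second-stage fusion of image features with clinical data via self-attention can be sketched with a standard multi-head attention layer as below; the token dimensions, clinical feature count, and classifier head are placeholder assumptions, not the published SAF module.

```python
import torch
import torch.nn as nn

class ToySelfAttentionFusion(nn.Module):
    """Concatenate image-feature tokens and a clinical-data token, then self-attend."""
    def __init__(self, dim=128, n_clinical=10, n_heads=4, n_classes=2):
        super().__init__()
        self.clinical_proj = nn.Linear(n_clinical, dim)
        self.attn = nn.MultiheadAttention(embed_dim=dim, num_heads=n_heads, batch_first=True)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, img_tokens, clinical):
        # img_tokens: (B, T, dim) from an image encoder; clinical: (B, n_clinical)
        clin_token = self.clinical_proj(clinical).unsqueeze(1)      # (B, 1, dim)
        tokens = torch.cat([clin_token, img_tokens], dim=1)         # (B, T+1, dim)
        fused, _ = self.attn(tokens, tokens, tokens)                # self-attention over all tokens
        return self.head(fused[:, 0])                               # classify from the clinical token

model = ToySelfAttentionFusion()
img_tokens = torch.randn(2, 16, 128)      # e.g. a pooled CNN feature map flattened to 16 tokens
clinical = torch.randn(2, 10)             # e.g. age, lab values, staging variables
print(model(img_tokens, clinical).shape)  # torch.Size([2, 2])
```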
Collapse
Affiliation(s)
- Kangshun Li
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, 510000, China.
| | - Can Chen
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, 510000, China
| | - Wuteng Cao
- Department of Radiology, The Sixth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, 510000, China
| | - Hui Wang
- Department of Colorectal Surgery, Department of General Surgery, Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510000, China
| | - Shuai Han
- General Surgery Center, Department of Gastrointestinal Surgery, Zhujiang Hospital, Southern Medical University, Guangzhou, 510000, China
| | - Renjie Wang
- Department of Colorectal Surgery, Fudan University Shanghai Cancer Center, Shanghai, 200000, China
| | - Zaisheng Ye
- Department of Gastrointestinal Surgical Oncology, Fujian Cancer Hospital and Fujian Medical University Cancer Hospital, Fuzhou, 350000, China
| | - Zhijie Wu
- Department of Colorectal Surgery, Department of General Surgery, Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510000, China
| | - Wenxiang Wang
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, 510000, China
| | - Leng Cai
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, 510000, China
| | - Deyu Ding
- Department of Economics, University of Konstanz, Konstanz, 350000, Germany
| | - Zixu Yuan
- Department of Colorectal Surgery, Department of General Surgery, Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510000, China.
| |
Collapse
|