1
Quan Q, Yao Q, Zhu H, Wang Q, Zhou SK. Which images to label for few-shot medical image analysis? Med Image Anal 2024; 96:103200. [PMID: 38801797] [DOI: 10.1016/j.media.2024.103200] [Received: 09/07/2023] [Revised: 03/26/2024] [Accepted: 05/06/2024]
Abstract
The success of deep learning methodologies hinges on the availability of meticulously labeled, extensive datasets. For medical images, however, annotating such abundant training data often requires experienced radiologists, consuming their limited time. To alleviate this burden, few-shot learning approaches have been developed that achieve competitive performance with only several labeled images. Nevertheless, a crucial yet previously overlooked problem in few-shot learning is the selection of template images for annotation before learning, which affects the final performance. In this study, we propose a novel TEmplate Choosing Policy (TECP) that identifies and selects "the most worthy" images for annotation across multiple few-shot medical tasks, including landmark detection, anatomy detection, and anatomy segmentation. TECP comprises four integral components: (1) Self-supervised Training, which trains a pre-existing deep model to extract salient features from radiological images; (2) Alternative Proposals for localizing informative regions within the images; (3) Representative Score Estimation, which evaluates and identifies the most representative samples or templates; and (4) Ranking, which ranks all candidates and selects the one with the highest representative score. The efficacy of TECP is demonstrated through comprehensive experiments on multiple public datasets. Across all three medical tasks, TECP yields noticeable improvements in model performance.
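The paper's exact Representative Score is not reproduced here, but one common way to formalize "most representative" is to score each candidate by its mean cosine similarity to all other candidates' self-supervised features and rank by that score. A minimal sketch under that assumption:

```python
import numpy as np

def representative_scores(features: np.ndarray) -> np.ndarray:
    """Mean cosine similarity of each candidate to all other candidates."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T            # pairwise cosine similarities
    np.fill_diagonal(sim, 0.0)         # exclude self-similarity
    return sim.sum(axis=1) / (len(features) - 1)

def choose_template(features: np.ndarray) -> int:
    """Rank all candidates and return the index with the highest score."""
    return int(np.argmax(representative_scores(features)))
```

With features from a self-supervised encoder, the chosen index is a medoid-like sample that best covers the unlabeled pool; the actual TECP scoring and proposal steps differ in detail.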
Affiliation(s)
- Quan Quan
- Institute of Computing Technology, Chinese Academy of Sciences (CAS), Beijing, 100080, China; University of Chinese Academy of Sciences (UCAS), Beijing, 101408, China
- Qingsong Yao
- Institute of Computing Technology, Chinese Academy of Sciences (CAS), Beijing, 100080, China; University of Chinese Academy of Sciences (UCAS), Beijing, 101408, China
- Heqin Zhu
- School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC), Hefei, 230026, China
- Qiyuan Wang
- School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC), Hefei, 230026, China
- S Kevin Zhou
- Institute of Computing Technology, Chinese Academy of Sciences (CAS), Beijing, 100080, China; School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC), Hefei, 230026, China; Center for Medical Imaging, Robotics, Analytic Computing and Learning (MIRACLE), Suzhou Institute for Advanced Research, USTC, Suzhou, 215000, China; Key Laboratory of Precision and Intelligent Chemistry, USTC, Hefei, 230026, China.
2
Li P, Hu Y. Deep magnetic resonance fingerprinting based on Local and Global Vision Transformer. Med Image Anal 2024; 95:103198. [PMID: 38759259] [DOI: 10.1016/j.media.2024.103198] [Received: 06/30/2023] [Revised: 04/28/2024] [Accepted: 05/02/2024]
Abstract
To mitigate systematic errors in magnetic resonance fingerprinting (MRF), the precomputed dictionary is usually computed with minimal granularity across the entire range of tissue parameters. However, the dictionary grows exponentially as the number of parameters increases, posing significant challenges to the computational efficiency and matching accuracy of pattern-matching algorithms. Existing works, primarily based on convolutional neural networks (CNNs), focus solely on local information to reconstruct multiple parameter maps, without in-depth investigation of the MRF mechanism. These methods may not exploit long-distance redundancies and the contextual information within voxel fingerprints introduced by the Bloch equation dynamics, leading to limited reconstruction speed and accuracy. To overcome these limitations, we propose a novel end-to-end neural network called the Local and Global Vision Transformer (LG-ViT) for MRF parameter reconstruction. Our proposed LG-ViT employs a multi-stage architecture that effectively reduces the computational overhead associated with the high-dimensional MRF data and the transformer model. Specifically, a local Transformer encoder is proposed to capture contextual information embedded within voxel fingerprints and local correlations introduced by the interconnected human tissues. Additionally, a global Transformer encoder is proposed to leverage long-distance dependencies arising from shared characteristics among different tissues across various spatial regions. By incorporating MRF physics-based data priors and effectively capturing local and global correlations, our proposed LG-ViT can achieve fast and accurate MRF parameter reconstruction. Experiments on both simulation and in vivo data demonstrate that the proposed method enables faster and more accurate MRF parameter reconstruction compared to state-of-the-art deep learning-based methods.
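The conventional baseline the abstract contrasts with is dictionary pattern matching: each atom is a simulated fingerprint for one tissue-parameter combination, and a voxel is matched by maximum normalized inner product. A minimal numpy sketch of that baseline (the signal model below is a hypothetical stand-in, not a Bloch simulation):

```python
import numpy as np

def match_fingerprint(voxel: np.ndarray, dictionary: np.ndarray) -> int:
    """Return the index of the dictionary atom that best matches a voxel
    fingerprint, using the magnitude of the normalized inner product."""
    d = dictionary / np.linalg.norm(dictionary, axis=1, keepdims=True)
    v = voxel / np.linalg.norm(voxel)
    return int(np.argmax(np.abs(d @ v)))
```

The dictionary has shape (n_atoms, n_timepoints), and the matched index maps back to a parameter tuple; the exponential growth of n_atoms with the number of parameters is exactly the bottleneck the paper targets.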
Affiliation(s)
- Peng Li
- The School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, China
- Yue Hu
- The School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, China.
3
Ma Z, Li C, Du T, Zhang L, Tang D, Ma D, Huang S, Liu Y, Sun Y, Chen Z, Yuan J, Nie Q, Grzegorzek M, Sun H. AATCT-IDS: A benchmark Abdominal Adipose Tissue CT Image Dataset for image denoising, semantic segmentation, and radiomics evaluation. Comput Biol Med 2024; 177:108628. [PMID: 38810476] [DOI: 10.1016/j.compbiomed.2024.108628] [Received: 01/22/2024] [Revised: 04/14/2024] [Accepted: 05/18/2024]
Abstract
BACKGROUND AND OBJECTIVE The metabolic syndrome induced by obesity is closely associated with cardiovascular disease, and its prevalence is increasing globally year by year. Obesity is a risk marker for detecting this disease. However, current research on computer-aided detection of adipose distribution is hampered by the lack of open-source, large abdominal adipose datasets. METHODS In this study, a benchmark Abdominal Adipose Tissue CT Image Dataset (AATCT-IDS) containing 300 subjects is prepared and published. AATCT-IDS makes 13,732 raw CT slices publicly available, and the researchers individually annotated the subcutaneous and visceral adipose tissue regions of 3213 of those slices with the same slice distance, in order to validate denoising methods, train semantic segmentation models, and study radiomics. For each task, this paper compares and analyzes the performance of various methods on AATCT-IDS by combining visualization results and evaluation data, thereby verifying the research potential of this dataset in the three types of tasks above. RESULTS In the comparative study of image denoising, algorithms using a smoothing strategy suppress mixed noise at the expense of image details and obtain better evaluation scores. Methods such as BM3D preserve the original image structure better, although their evaluation scores are slightly lower. The results show significant differences among the methods. In the comparative study of semantic segmentation of abdominal adipose tissue, the segmentation results of each model show distinct structural characteristics. Among them, BiSeNet obtains segmentation results only slightly inferior to U-Net in the shortest training time and effectively separates small, isolated adipose tissue. In addition, the radiomics study based on AATCT-IDS reveals three adipose distributions in the subject population. CONCLUSION AATCT-IDS contains the ground truth of adipose tissue regions in abdominal CT slices. This open-source dataset can help researchers explore the multi-dimensional characteristics of abdominal adipose tissue and thus assist physicians and patients in clinical practice. AATCT-IDS is freely published for non-commercial purposes at https://figshare.com/articles/dataset/AATTCT-IDS/23807256.
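Segmentation benchmarks such as the U-Net/BiSeNet comparison above are typically scored with overlap metrics; a minimal sketch of the standard Dice coefficient on binary masks:

```python
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for two boolean masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return float((2.0 * inter + eps) / (pred.sum() + gt.sum() + eps))
```

The `eps` term only guards the empty-mask edge case; the dataset's own evaluation protocol may use additional metrics.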
Affiliation(s)
- Zhiyu Ma
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China
- Chen Li
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China.
- Tianming Du
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China
- Le Zhang
- Department of Radiology, Qingdao Municipal Hospital, Qingdao University, Qingdao, China
- Dechao Tang
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China
- Deguo Ma
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China
- Shanchuan Huang
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China
- Yan Liu
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China
- Yihao Sun
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China
- Zhihao Chen
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China
- Jin Yuan
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China
- Qianqing Nie
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China
- Marcin Grzegorzek
- Institute of Medical Informatics, University of Luebeck, Luebeck, Germany
- Hongzan Sun
- Shengjing Hospital, China Medical University, Shenyang 110122, China.
4
Lee JM, Bae JS. Enhancing diagnostic precision in liver lesion analysis using a deep learning-based system: opportunities and challenges. Nat Rev Clin Oncol 2024; 21:485-486. [PMID: 38519602] [DOI: 10.1038/s41571-024-00887-x]
Affiliation(s)
- Jeong Min Lee
- Department of Radiology, Seoul National University Hospital, Seoul, South Korea.
- Department of Radiology, Seoul National University College of Medicine, Seoul, South Korea.
- Institute of Radiation Medicine, Seoul National University Medical Research Center, Seoul, South Korea.
- Jae Seok Bae
- Department of Radiology, Seoul National University Hospital, Seoul, South Korea
5
Vakli P, Weiss B, Rozmann D, Erőss G, Nárai Á, Hermann P, Vidnyánszky Z. The effect of head motion on brain age prediction using deep convolutional neural networks. Neuroimage 2024; 294:120646. [PMID: 38750907] [DOI: 10.1016/j.neuroimage.2024.120646] [Received: 04/16/2024] [Revised: 05/10/2024] [Accepted: 05/12/2024]
Abstract
Deep learning can be used effectively to predict participants' age from brain magnetic resonance imaging (MRI) data, and a growing body of evidence suggests that the difference between predicted and chronological age, referred to as brain-predicted age difference (brain-PAD), is related to various neurological and neuropsychiatric disease states. A crucial aspect of the applicability of brain-PAD as a biomarker of individual brain health is whether and how brain-predicted age is affected by MR image artifacts commonly encountered in clinical settings. To investigate this issue, we trained and validated two different 3D convolutional neural network (CNN) architectures from scratch and tested the models on a separate dataset consisting of motion-free and motion-corrupted T1-weighted MRI scans from the same participants, the quality of which was rated by neuroradiologists from a clinical diagnostic point of view. Our results revealed a systematic increase in brain-PAD with worsening image quality for both models. This effect was also observed for images that were deemed usable from a clinical perspective, with brains appearing older in medium-quality than in good-quality images. These findings were also supported by significant associations between brain-PAD and standard image quality metrics, indicating larger brain-PAD for lower-quality images. Our results demonstrate a spurious effect of advanced brain aging as a result of head motion and underline the importance of controlling for image quality when using brain-predicted age based on structural neuroimaging data as a proxy measure for brain health.
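The quantity under study is simple to state: brain-PAD is predicted minus chronological age, and the analysis above compares it across image-quality groups. A sketch with hypothetical data (variable names are illustrative, not from the paper):

```python
import numpy as np

def brain_pad(predicted_age: np.ndarray, chronological_age: np.ndarray) -> np.ndarray:
    """Brain-predicted age difference (brain-PAD) per participant."""
    return predicted_age - chronological_age

def mean_pad_by_group(pad: np.ndarray, quality: np.ndarray) -> dict:
    """Mean brain-PAD per image-quality label (e.g. 'good', 'medium')."""
    return {q: float(pad[quality == q].mean()) for q in np.unique(quality)}
```

A systematic quality effect of the kind reported would show up as a higher mean brain-PAD in the lower-quality group.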
Affiliation(s)
- Pál Vakli
- Brain Imaging Centre, HUN-REN Research Centre for Natural Sciences, Budapest 1117, Hungary.
- Béla Weiss
- Brain Imaging Centre, HUN-REN Research Centre for Natural Sciences, Budapest 1117, Hungary; Biomatics and Applied Artificial Intelligence Institute, John von Neumann Faculty of Informatics, Óbuda University, Budapest 1034, Hungary.
- Dorina Rozmann
- Brain Imaging Centre, HUN-REN Research Centre for Natural Sciences, Budapest 1117, Hungary
- György Erőss
- Brain Imaging Centre, HUN-REN Research Centre for Natural Sciences, Budapest 1117, Hungary
- Ádám Nárai
- Brain Imaging Centre, HUN-REN Research Centre for Natural Sciences, Budapest 1117, Hungary; Doctoral School of Biology and Sportbiology, Institute of Biology, Faculty of Sciences, University of Pécs, Pécs 7624, Hungary
- Petra Hermann
- Brain Imaging Centre, HUN-REN Research Centre for Natural Sciences, Budapest 1117, Hungary
- Zoltán Vidnyánszky
- Brain Imaging Centre, HUN-REN Research Centre for Natural Sciences, Budapest 1117, Hungary.
6
Wang H, Jin Q, Li S, Liu S, Wang M, Song Z. A comprehensive survey on deep active learning in medical image analysis. Med Image Anal 2024; 95:103201. [PMID: 38776841] [DOI: 10.1016/j.media.2024.103201] [Received: 10/20/2023] [Revised: 04/25/2024] [Accepted: 05/06/2024]
Abstract
Deep learning has achieved widespread success in medical image analysis, leading to an increasing demand for large-scale expert-annotated medical image datasets. Yet the high cost of annotating medical images severely hampers the development of deep learning in this field. To reduce annotation costs, active learning aims to select the most informative samples for annotation and train high-performance models with as few labeled samples as possible. In this survey, we review the core methods of active learning, including the evaluation of informativeness and the sampling strategy. For the first time, we provide a detailed summary of the integration of active learning with other label-efficient techniques, such as semi-supervised and self-supervised learning. We also summarize active learning works specifically tailored to medical image analysis. Additionally, we conduct a thorough experimental comparison of the performance of different active learning methods in medical image analysis. Finally, we offer our perspectives on the future trends and challenges of active learning and its applications in medical image analysis. An accompanying paper list and the code for the comparative analysis are available at https://github.com/LightersWang/Awesome-Active-Learning-for-Medical-Image-Analysis.
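The core loop the survey describes, scoring informativeness and then sampling, can be sketched with entropy-based uncertainty sampling, one of the standard strategies such surveys review:

```python
import numpy as np

def predictive_entropy(probs: np.ndarray) -> np.ndarray:
    """Shannon entropy of each sample's predicted class distribution."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def select_for_annotation(probs: np.ndarray, budget: int) -> np.ndarray:
    """Indices of the `budget` most uncertain unlabeled samples."""
    return np.argsort(-predictive_entropy(probs))[:budget]
```

In a full active learning loop, the selected indices are sent to an annotator, added to the labeled set, and the model is retrained; the surveyed methods differ mainly in how the informativeness score is defined.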
Affiliation(s)
- Haoran Wang
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, China
- Qiuye Jin
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
- Shiman Li
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, China
- Siyu Liu
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, China
- Manning Wang
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, China.
- Zhijian Song
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China; Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Shanghai 200032, China.
7
Tao Y, Ge L, Su N, Li M, Fan W, Jiang L, Yuan S, Chen Q. Exploration on OCT biomarker candidate related to macular edema caused by diabetic retinopathy and retinal vein occlusion in SD-OCT images. Sci Rep 2024; 14:14317. [PMID: 38906954] [PMCID: PMC11192959] [DOI: 10.1038/s41598-024-63144-2] [Received: 11/19/2023] [Accepted: 05/24/2024]
Abstract
To improve the understanding of potential pathological mechanisms of macular edema (ME), we try to discover biomarker candidates related to ME caused by diabetic retinopathy (DR) and retinal vein occlusion (RVO) in spectral-domain optical coherence tomography (SD-OCT) images by means of deep learning (DL). Thirty-two eyes of 26 subjects with non-proliferative DR (NPDR), 77 eyes of 61 subjects with proliferative DR (PDR), 120 eyes of 116 subjects with branch RVO (BRVO), and 17 eyes of 15 subjects with central RVO (CRVO) were collected. A DL model was implemented to guide biomarker candidate discovery. The disorganization of the retinal outer layers (DROL) was measured, i.e., the gray value of the retinal tissues between the external limiting membrane (ELM) and the retinal pigment epithelium (RPE), and the disrupted and obscured rate of the ELM, ellipsoid zone (EZ), and RPE. In addition, the occurrence, number, volume, and projected area of hyperreflective foci (HRF) were recorded. The ELM, EZ, and RPE are more likely to be obscured in the RVO group, and HRFs are observed more frequently in the DR group (all P ≤ 0.001). In conclusion, the DROL and HRF features are possible biomarkers related to ME caused by DR and RVO in the OCT modality.
Affiliation(s)
- Yuhui Tao
- School of Computer Science and Engineering, Nanjing University of Science and Technology, No.200 Xiao Lingwei, Nanjing, 210094, China
- Lexin Ge
- Department of Ophthalmology, The First Affiliated Hospital of Nanjing Medical University, No.300 Guangzhou Road, Nanjing, 210029, China
- Na Su
- Department of Ophthalmology, The First Affiliated Hospital of Nanjing Medical University, No.300 Guangzhou Road, Nanjing, 210029, China
- Mingchao Li
- School of Computer Science and Engineering, Nanjing University of Science and Technology, No.200 Xiao Lingwei, Nanjing, 210094, China
- Wen Fan
- Department of Ophthalmology, The First Affiliated Hospital of Nanjing Medical University, No.300 Guangzhou Road, Nanjing, 210029, China
- Lin Jiang
- Department of Endocrinology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Songtao Yuan
- Department of Ophthalmology, The First Affiliated Hospital of Nanjing Medical University, No.300 Guangzhou Road, Nanjing, 210029, China.
- Qiang Chen
- School of Computer Science and Engineering, Nanjing University of Science and Technology, No.200 Xiao Lingwei, Nanjing, 210094, China.
8
Das SR, Ilesanmi A, Wolk DA, Gee JC. Beyond Macrostructure: Is There a Role for Radiomics Analysis in Neuroimaging? Magn Reson Med Sci 2024:rev.2024-0053. [PMID: 38880615] [DOI: 10.2463/mrms.rev.2024-0053]
Abstract
The most commonly used neuroimaging biomarkers of brain structure, particularly in neurodegenerative diseases, have traditionally been summary measurements from ROIs derived from structural MRI, such as volume and thickness. Advances in MR acquisition techniques, including high-field imaging, and the emergence of learning-based methods have opened up opportunities to interrogate brain structure in finer detail, allowing investigators to move beyond macrostructural measurements. On the one hand, superior signal contrast has the potential to make appearance-based metrics that directly analyze intensity patterns, such as texture analysis and radiomics features, more reliable. Quantitative MRI, particularly at high field, can also provide a richer set of measures with greater interpretability. On the other hand, neural network-based techniques have the potential to exploit subtle patterns in images that can now be mined with advanced imaging. Finally, there are opportunities for the integration of multimodal data at different spatial scales, enabled by developments in many of the above techniques, for example by combining digital histopathology with high-resolution ex vivo and in vivo MRI. Some of these approaches are at early stages of development and present their own set of challenges. Nonetheless, they hold promise to drive the next generation of validation and biomarker studies. This article surveys recent developments in this area, with a particular focus on Alzheimer's disease and related disorders. However, most of the discussion is equally relevant to imaging of other neurological disorders, and even to other organ systems of interest. It is not meant to be an exhaustive review of the available literature, but rather a summary of recent trends through the discussion of a collection of representative studies, with an eye towards what the future may hold.
Affiliation(s)
- Sandhitsu R Das
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, USA
- Penn Image Computing and Science Laboratory (PICSL), Department of Radiology, University of Pennsylvania, Philadelphia, PA, USA
- Penn Memory Center, University of Pennsylvania, Philadelphia, PA, USA
- Ademola Ilesanmi
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, USA
- Penn Image Computing and Science Laboratory (PICSL), Department of Radiology, University of Pennsylvania, Philadelphia, PA, USA
- David A Wolk
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, USA
- Penn Memory Center, University of Pennsylvania, Philadelphia, PA, USA
- James C Gee
- Penn Image Computing and Science Laboratory (PICSL), Department of Radiology, University of Pennsylvania, Philadelphia, PA, USA
9
Yu Y, She K, Shi K, Cai X, Kwon OM, Soh Y. Analysis of medical images super-resolution via a wavelet pyramid recursive neural network constrained by wavelet energy entropy. Neural Netw 2024; 178:106460. [PMID: 38906052] [DOI: 10.1016/j.neunet.2024.106460] [Received: 12/25/2023] [Revised: 05/13/2024] [Accepted: 06/10/2024]
Abstract
Recently, multi-resolution pyramid-based techniques have emerged as the prevailing research approach to image super-resolution. However, these methods typically rely on a single mode of information transmission between levels. In our approach, a wavelet pyramid recursive neural network (WPRNN) constrained by wavelet energy entropy (WEE) is proposed. This network transmits previous-level wavelet coefficients and additional shallow coefficient features to capture local details. Moreover, the parameters of the low- and high-frequency wavelet coefficients are shared within each pyramid level and across pyramid levels. A multi-resolution wavelet pyramid fusion (WPF) module is devised to facilitate information transfer across the network's pyramid levels. Additionally, a wavelet energy entropy loss is proposed to constrain the reconstruction of wavelet coefficients from the perspective of signal energy distribution. Finally, our method achieves competitive reconstruction performance with minimal parameters in an extensive series of experiments on publicly available datasets, demonstrating its practical utility.
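The quantity behind the WEE loss is the Shannon entropy of the normalized energies of an image's wavelet subbands. A minimal numpy-only sketch using a single-level 2D Haar transform (the paper's exact wavelet and formulation may differ):

```python
import numpy as np

def haar2d(img: np.ndarray):
    """Single-level 2D Haar DWT; returns (LL, LH, HL, HH) subbands.
    Assumes even height and width."""
    a = (img[0::2] + img[1::2]) / 2.0   # row averages
    d = (img[0::2] - img[1::2]) / 2.0   # row differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def wavelet_energy_entropy(img: np.ndarray) -> float:
    """Shannon entropy of the normalized per-subband energies."""
    energies = np.array([float((s ** 2).sum()) for s in haar2d(img)])
    p = energies / energies.sum()
    p = p[p > 0]                        # ignore empty subbands
    return float(-(p * np.log(p)).sum())
```

A loss built on this quantity would penalize reconstructions whose energy distribution over subbands deviates from that of the ground truth, which is the general idea the abstract describes.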
Affiliation(s)
- Yue Yu
- School of Electronic Information and Electrical Engineering, Chengdu University, Chengdu, 610106, Sichuan, China; School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, Sichuan, China.
- Kun She
- School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, Sichuan, China.
- Kaibo Shi
- School of Electronic Information and Electrical Engineering, Chengdu University, Chengdu, 610106, Sichuan, China; College of Electrical Engineering, Sichuan University, Chengdu, 610065, Sichuan, China.
- Xiao Cai
- School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, Sichuan, China.
- Oh-Min Kwon
- School of Electrical Engineering, Chungbuk National University, Cheongju, 28644, Chungbuk, South Korea.
- YengChai Soh
- School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, 639798, Singapore.
10
Fang X, Zhang S, Wei Z, Wang K, Yang G, Li C, Han M, Du M. Automatic detection of the third molar and mandibular canal on panoramic radiographs based on deep learning. J Stomatol Oral Maxillofac Surg 2024:101946. [PMID: 38857691] [DOI: 10.1016/j.jormas.2024.101946] [Received: 03/26/2024] [Revised: 05/12/2024] [Accepted: 06/08/2024]
Abstract
PURPOSE This study aims to develop a deep learning framework for automatically detecting the positional relationship between the mandibular third molar (M3) and the mandibular canal (MC) on panoramic radiographs (PRs), to assist doctors in assessing and planning appropriate surgical interventions. METHODS Datasets D1 and D2 were obtained by collecting 253 PRs from a hospital and 197 PRs from online platforms. The RPIFormer model proposed in this study was trained and validated on D1 to create a segmentation model. The CycleGAN model was trained and validated on both D1 and D2 to develop an image enhancement model. Finally, the segmentation and enhancement models were integrated with an object detection model to create a fully automated framework for M3 and MC detection in PRs. The experimental evaluation included calculating the Dice coefficient, IoU, recall, and precision. RESULTS The RPIFormer model proposed in this study achieved an average Dice coefficient of 92.56% for segmenting M3 and MC, a 3.06% improvement over the previous best study. The deep learning framework developed in this research enables automatic detection of M3 and MC in PRs without manual cropping, demonstrating superior detection accuracy and generalization capability. CONCLUSION The framework developed in this study can be applied to PRs captured in different hospitals without model fine-tuning. This feature is significant for aiding doctors in accurately assessing the spatial relationship between M3 and MC, thereby determining the optimal treatment plan to ensure patients' oral health and surgical safety.
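Detection precision and recall of the kind reported above are typically computed by matching predicted boxes to ground-truth boxes at an IoU threshold. A minimal sketch with greedy one-to-one matching (boxes as hypothetical [x1, y1, x2, y2] lists; the paper's exact protocol is not specified here):

```python
def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def precision_recall(preds, gts, thr=0.5):
    """Greedy matching of predictions to ground truth at IoU >= thr.
    Assumes both lists are non-empty."""
    matched, tp = set(), 0
    for p in preds:
        best = max(range(len(gts)), key=lambda i: iou(p, gts[i]))
        if best not in matched and iou(p, gts[best]) >= thr:
            matched.add(best)
            tp += 1
    return tp / len(preds), tp / len(gts)
```

Each ground-truth box can be matched at most once, so a duplicate detection of the same structure counts as a false positive.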
Affiliation(s)
- Xinle Fang
- School of Information Science and Engineering, Shandong University, Qingdao, China
- Shengben Zhang
- Department of Implantology, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University, Jinan, China
- Zhiyuan Wei
- School of Information Science and Engineering, Shandong University, Qingdao, China
- Kaixin Wang
- School of Information Science and Engineering, Shandong University, Qingdao, China
- Guanghui Yang
- School of Information Science and Engineering, Shandong University, Qingdao, China
- Chengliang Li
- School of Information Science and Engineering, Shandong University, Qingdao, China
- Min Han
- School of Information Science and Engineering, Shandong University, Qingdao, China.
- Mi Du
- Department of Implantology, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University, Jinan, China; Shandong Key Laboratory of Oral Tissue Regeneration, Jinan, China; Shandong Engineering Laboratory for Dental Materials and Oral Tissue Regeneration, Jinan, China; Shandong Provincial Clinical Research Center for Oral Diseases, Jinan, China.
11
Shahzadi I, Madni TM, Janjua UI, Batool G, Naz B, Ali MQ. CSAMDT: Conditional Self Attention Memory-Driven Transformers for Radiology Report Generation from Chest X-Ray. J Imaging Inform Med 2024. [PMID: 38831189] [DOI: 10.1007/s10278-024-01126-6] [Received: 01/05/2024] [Revised: 03/21/2024] [Accepted: 04/11/2024]
Abstract
A radiology report plays a crucial role in guiding patient treatment, but writing these reports is a time-consuming task that demands a radiologist's expertise. In response to this challenge, researchers in the subfields of artificial intelligence for healthcare have explored techniques for automatically interpreting radiographic images and generating free-text reports, although much of the research on medical report generation has focused on image captioning methods without adequately addressing particular report aspects. This study introduces a Conditional Self Attention Memory-Driven Transformer model for generating radiology reports. The model operates in two phases: in the first, a multi-label classification model with ResNet152 v2 as an encoder is employed for feature extraction and multiple-disease diagnosis; in the second, the Conditional Self Attention Memory-Driven Transformer serves as a decoder, using self-attention memory-driven transformers to generate the text report. Comprehensive experiments compared existing and proposed techniques on Bilingual Evaluation Understudy (BLEU) scores from BLEU-1 to BLEU-4. The model outperforms the other state-of-the-art techniques, achieving BLEU-1 (0.475), BLEU-2 (0.358), BLEU-3 (0.229), and BLEU-4 (0.165) scores. These findings can alleviate radiologists' workloads and enhance clinical workflows by introducing an autonomous radiology report generation system.
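BLEU scores like those reported combine clipped n-gram precision with a brevity penalty. A minimal single-reference sketch (no smoothing, so any zero n-gram precision yields a score of 0; published evaluations often use corpus-level BLEU with smoothing):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, max_n=4):
    """Single-reference BLEU with uniform weights over 1..max_n grams."""
    log_p = 0.0
    for n in range(1, max_n + 1):
        hyp, ref = ngrams(hypothesis, n), ngrams(reference, n)
        clipped = sum(min(c, ref[g]) for g, c in hyp.items())  # clip by reference counts
        total = max(sum(hyp.values()), 1)
        if clipped == 0:
            return 0.0
        log_p += math.log(clipped / total) / max_n
    # brevity penalty for hypotheses shorter than the reference
    bp = 1.0 if len(hypothesis) >= len(reference) else math.exp(1 - len(reference) / len(hypothesis))
    return bp * math.exp(log_p)
```

Calling `bleu(tokens, tokens)` on any sentence of four or more tokens returns 1.0, and shortening the hypothesis lowers the score through the brevity penalty.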
Affiliation(s)
- Iqra Shahzadi
- Department of Computer Science, COMSATS University Islamabad, Islamabad, Pakistan
- Tahir Mustafa Madni
- Department of Computer Science, COMSATS University Islamabad, Islamabad, Pakistan
- Uzair Iqbal Janjua
- Department of Computer Science, COMSATS University Islamabad, Islamabad, Pakistan
- Ghanwa Batool
- Department of Computer Science, COMSATS University Islamabad, Islamabad, Pakistan
- Bushra Naz
- Department of Computer Science, COMSATS University Islamabad, Islamabad, Pakistan
- Muhammad Qasim Ali
- Rehabilitation Department, Yusra Medical and Dental College, Rawalpindi, Pakistan
12
Cao P, Derhaag J, Coonen E, Brunner H, Acharya G, Salumets A, Zamani Esteki M. Generative artificial intelligence to produce high-fidelity blastocyst-stage embryo images. Hum Reprod 2024; 39:1197-1207. [PMID: 38600621 PMCID: PMC11145014 DOI: 10.1093/humrep/deae064]
Abstract
STUDY QUESTION Can generative artificial intelligence (AI) models produce high-fidelity images of human blastocysts? SUMMARY ANSWER Generative AI models exhibit the capability to generate high-fidelity human blastocyst images, thereby providing substantial training datasets crucial for the development of robust AI models. WHAT IS KNOWN ALREADY The integration of AI into IVF procedures holds the potential to enhance objectivity and automate embryo selection for transfer. However, the effectiveness of AI is limited by data scarcity and ethical concerns related to patient data privacy. Generative adversarial networks (GAN) have emerged as a promising approach to alleviate data limitations by generating synthetic data that closely approximate real images. STUDY DESIGN, SIZE, DURATION Blastocyst images were included as training data from a public dataset of time-lapse microscopy (TLM) videos (n = 136). A style-based GAN was fine-tuned as the generative model. PARTICIPANTS/MATERIALS, SETTING, METHODS We curated a total of 972 blastocyst images as training data, where frames were captured within the time window of 110-120 h post-insemination at 1-h intervals from TLM videos. We configured the style-based GAN model with data augmentation (AUG) and pretrained weights (Pretrained-T: with translation equivariance; Pretrained-R: with translation and rotation equivariance) to compare their optimization on image synthesis. We then applied quantitative metrics, including Fréchet Inception Distance (FID) and Kernel Inception Distance (KID), to assess the quality and fidelity of the generated images. Subsequently, we evaluated qualitative performance through a visual Turing test, in which 60 individuals with diverse backgrounds and expertise in clinical embryology and IVF evaluated the quality of the synthetic embryo images.
MAIN RESULTS AND THE ROLE OF CHANCE During training, we observed consistent improvement in image quality as measured by FID and KID scores. The Pretrained and AUG + Pretrained models started with markedly lower FID and KID values than the Baseline and AUG + Baseline models. After 5000 training iterations, the AUG + Pretrained-R model showed the highest performance of the five evaluated configurations, with FID and KID scores of 15.2 and 0.004, respectively. In the subsequent visual Turing test, IVF embryologists, IVF laboratory technicians, and non-experts evaluated the synthetic blastocyst-stage embryo images and achieved similar specificity, with marginal differences in accuracy and sensitivity. LIMITATIONS, REASONS FOR CAUTION In this study, we focused the training data on blastocyst images, as IVF embryos are primarily assessed at the blastocyst stage. However, generating images across different preimplantation stages would offer further insight into preimplantation embryo development and IVF success. In addition, we resized training images to a resolution of 256 × 256 pixels to moderate the computational cost of training the style-based GAN models. Further research is needed involving a more extensive and diverse dataset spanning development from the zygote to the blastocyst stage, e.g. video generation, and improved image resolution, to facilitate the development of comprehensive AI algorithms and produce higher-quality images. WIDER IMPLICATIONS OF THE FINDINGS Generative AI models hold promising potential for generating high-fidelity human blastocyst images, which enables the development of robust AI models by providing sufficient training data while safeguarding patient privacy.
Additionally, this may help produce sufficient embryo imaging training data with different (rare) abnormal features, such as embryonic arrest or tripolar cell division, to avoid class imbalance and achieve balanced datasets. Thus, generative models may offer a compelling opportunity to transform embryo selection procedures and substantially enhance IVF outcomes. STUDY FUNDING/COMPETING INTEREST(S) This study was supported by a Horizon 2020 innovation grant (ERIN, grant no. EU952516) and a Horizon Europe grant (NESTOR, grant no. 101120075) of the European Commission to A.S. and M.Z.E., the Estonian Research Council (grant no. PRG1076) to A.S., and the EVA (Erfelijkheid Voortplanting & Aanleg) specialty program (grant no. KP111513) of Maastricht University Medical Centre (MUMC+) to M.Z.E. TRIAL REGISTRATION NUMBER Not applicable.
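For reference, the Kernel Inception Distance used above is the squared maximum mean discrepancy between Inception features of real and generated images under a cubic polynomial kernel. A small numpy sketch on synthetic feature vectors (random data stands in for Inception features; this is an illustration of the metric, not the study's pipeline):

```python
import numpy as np

def kid(X, Y):
    """Unbiased MMD^2 estimate with the polynomial kernel k(x, y) = (x.y/d + 1)^3."""
    d = X.shape[1]
    k = lambda A, B: (A @ B.T / d + 1.0) ** 3
    kxx, kyy, kxy = k(X, X), k(Y, Y), k(X, Y)
    m, n = len(X), len(Y)
    return ((kxx.sum() - np.trace(kxx)) / (m * (m - 1))     # off-diagonal mean over X
            + (kyy.sum() - np.trace(kyy)) / (n * (n - 1))   # off-diagonal mean over Y
            - 2.0 * kxy.mean())                             # cross term

rng = np.random.default_rng(0)
real = rng.normal(size=(500, 16))
fake_good = rng.normal(size=(500, 16))          # same distribution as "real"
fake_bad = rng.normal(loc=1.0, size=(500, 16))  # shifted distribution
print(kid(real, fake_good))  # close to 0
print(kid(real, fake_bad))   # clearly larger
```

Lower KID indicates generated features that are statistically closer to the real ones, which is why 0.004 in the abstract signals high fidelity.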
Affiliation(s)
- Ping Cao
- Department of Clinical Genetics, Maastricht University Medical Center+ (MUMC+), Maastricht, The Netherlands
- Department of Genetics and Cell Biology, GROW Research Institute for Oncology and Reproduction, Faculty of Health, Medicine and Life Sciences (FHML), Maastricht University, Maastricht, The Netherlands
- Josien Derhaag
- Department of Reproductive Medicine, Maastricht University Medical Center+ (MUMC+), Maastricht, The Netherlands
- Edith Coonen
- Department of Clinical Genetics, Maastricht University Medical Center+ (MUMC+), Maastricht, The Netherlands
- Department of Reproductive Medicine, Maastricht University Medical Center+ (MUMC+), Maastricht, The Netherlands
- Han Brunner
- Department of Clinical Genetics, Maastricht University Medical Center+ (MUMC+), Maastricht, The Netherlands
- Department of Genetics and Cell Biology, GROW Research Institute for Oncology and Reproduction, Faculty of Health, Medicine and Life Sciences (FHML), Maastricht University, Maastricht, The Netherlands
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, The Netherlands
- Ganesh Acharya
- Division of Obstetrics and Gynecology, Department of Clinical Science, Intervention and Technology (CLINTEC), Karolinska Institutet, and Karolinska University Hospital, Stockholm, Sweden
- Women’s Health and Perinatology Research Group, Department of Clinical Medicine, UiT—The Arctic University of Norway, Tromsø, Norway
- Andres Salumets
- Division of Obstetrics and Gynecology, Department of Clinical Science, Intervention and Technology (CLINTEC), Karolinska Institutet, and Karolinska University Hospital, Stockholm, Sweden
- Competence Centre on Health Technologies, Tartu, Estonia
- Department of Obstetrics and Gynecology, Institute of Clinical Medicine, University of Tartu, Tartu, Estonia
- Masoud Zamani Esteki
- Department of Clinical Genetics, Maastricht University Medical Center+ (MUMC+), Maastricht, The Netherlands
- Department of Genetics and Cell Biology, GROW Research Institute for Oncology and Reproduction, Faculty of Health, Medicine and Life Sciences (FHML), Maastricht University, Maastricht, The Netherlands
- Division of Obstetrics and Gynecology, Department of Clinical Science, Intervention and Technology (CLINTEC), Karolinska Institutet, and Karolinska University Hospital, Stockholm, Sweden
13
Shi Q, Song F, Zhou X, Chen X, Cao J, Na J, Fan Y, Zhang G, Zheng L. Early Predicting Osteogenic Differentiation of Mesenchymal Stem Cells Based on Deep Learning Within One Day. Ann Biomed Eng 2024; 52:1706-1718. [PMID: 38488988 DOI: 10.1007/s10439-024-03483-3]
Abstract
Osteogenic differentiation of mesenchymal stem cells (MSCs) is proposed to be critical for bone tissue engineering and regenerative medicine. However, the current approach for evaluating osteogenic differentiation mainly involves immunohistochemical staining of specific markers, which typically become detectable only at day 5-7 of osteogenic induction. Deep learning (DL) is a significant technology for realizing artificial intelligence (AI). Computer vision, a branch of AI, has been shown to achieve high-precision image recognition using convolutional neural networks (CNNs). Our goal was to train CNNs to quantitatively measure the osteogenic differentiation of MSCs. To this end, bright-field images of MSCs during early osteogenic differentiation (day 0, 1, 3, 5, and 7) were captured using a simple optical phase contrast microscope to train the CNNs. The results showed that the CNNs could be trained to recognize undifferentiated and differentiating cells with an accuracy of 0.961 on the independent test set. In addition, we found that the CNNs successfully distinguished differentiated cells at a very early stage (only 1 day). Further analysis showed that the overall morphological features of MSCs were the main basis for the CNN classification. In conclusion, MSC differentiation can be detected early and accurately from simple bright-field images and DL networks, which may also provide a potential and novel method for the field of cell detection in the near future.
Affiliation(s)
- Qiusheng Shi
- Key Laboratory of Biomechanics and Mechanobiology (Beihang University), Ministry of Education, Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing, 100191, China
- Fan Song
- Key Laboratory of Biomechanics and Mechanobiology (Beihang University), Ministry of Education, Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing, 100191, China
- Xiaocheng Zhou
- Department of Statistics, The Chinese University of Hong Kong, Sha Tin, Hong Kong SAR, China
- Xinyuan Chen
- Key Laboratory of Biomechanics and Mechanobiology (Beihang University), Ministry of Education, Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing, 100191, China
- Jingqi Cao
- Key Laboratory of Biomechanics and Mechanobiology (Beihang University), Ministry of Education, Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing, 100191, China
- Jing Na
- Key Laboratory of Biomechanics and Mechanobiology (Beihang University), Ministry of Education, Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing, 100191, China
- Yubo Fan
- Key Laboratory of Biomechanics and Mechanobiology (Beihang University), Ministry of Education, Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing, 100191, China
- Guanglei Zhang
- Key Laboratory of Biomechanics and Mechanobiology (Beihang University), Ministry of Education, Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing, 100191, China
- Lisha Zheng
- Key Laboratory of Biomechanics and Mechanobiology (Beihang University), Ministry of Education, Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing, 100191, China
14
Ay S, Cardei M, Meyer AM, Zhang W, Topaloglu U. Improving Equity in Deep Learning Medical Applications with the Gerchberg-Saxton Algorithm. J Healthc Inform Res 2024; 8:225-243. [PMID: 38681756 PMCID: PMC11052977 DOI: 10.1007/s41666-024-00163-8]
Abstract
Deep learning (DL) has gained prominence in healthcare for its ability to facilitate early diagnosis, treatment identification with associated prognosis, and varied patient outcome predictions. However, because of highly variable medical practices and unsystematic data collection approaches, DL can unfortunately exacerbate biases and distort estimates. For example, the presence of sampling bias poses a significant challenge to the efficacy and generalizability of any statistical model. Even with DL approaches, selection bias can lead to inconsistent, suboptimal, or inaccurate model results, especially for underrepresented populations. Therefore, without addressing bias, wider implementation of DL approaches can potentially cause unintended harm. In this paper, we study a novel bias-reduction method that leverages frequency-domain transformation via the Gerchberg-Saxton algorithm, and we evaluate its impact on model outcomes from a racio-ethnic bias perspective.
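The Gerchberg-Saxton algorithm itself is a classic phase-retrieval iteration that alternates between the spatial and frequency domains, enforcing a known amplitude in each. A minimal numpy sketch on synthetic 2D arrays (not the paper's implementation or medical data):

```python
import numpy as np

def gerchberg_saxton(src_amp, tgt_amp, iters=100):
    """Find a spatial phase so that |FFT(src_amp * e^{i*phase})| approximates tgt_amp."""
    phase = np.zeros_like(src_amp)
    for _ in range(iters):
        F = np.fft.fft2(src_amp * np.exp(1j * phase))  # forward to frequency domain
        F = tgt_amp * np.exp(1j * np.angle(F))         # enforce target amplitude, keep phase
        g = np.fft.ifft2(F)                            # back to spatial domain
        phase = np.angle(g)                            # keep phase, re-enforce src_amp next pass
    return phase

rng = np.random.default_rng(1)
src = np.ones((32, 32))
# Build a target that is exactly reachable from some phase, so the iteration can converge.
tgt = np.abs(np.fft.fft2(src * np.exp(1j * rng.uniform(0, 2 * np.pi, src.shape))))
ph = gerchberg_saxton(src, tgt)
err = np.linalg.norm(np.abs(np.fft.fft2(src * np.exp(1j * ph))) - tgt)
print(err)  # much smaller than the zero-phase starting error
```

The iteration's frequency-domain error is non-increasing, which is why a simple fixed iteration count suffices for a sketch like this.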
Affiliation(s)
- Seha Ay
- Department of Biomedical Engineering, Wake Forest School of Medicine, Winston-Salem, NC USA
- Michael Cardei
- Department of Cancer Biology, Wake Forest School of Medicine, Winston-Salem, NC USA
- Anne-Marie Meyer
- Department of Epidemiology, University of North Carolina, Chapel Hill, NC USA
- Wei Zhang
- Department of Cancer Biology, Wake Forest School of Medicine, Winston-Salem, NC USA
- Umit Topaloglu
- National Cancer Institute, Shady Grove, Rockville, MD USA
15
Berris T, Myronakis M, Stratakis J, Perisinakis K, Karantanas A, Damilakis J. Is deep learning-enabled real-time personalized CT dosimetry feasible using only patient images as input? Phys Med 2024; 122:103381. [PMID: 38810391 DOI: 10.1016/j.ejmp.2024.103381]
Abstract
PURPOSE To propose a novel deep-learning based dosimetry method that allows quick and accurate estimation of organ doses for individual patients, using only their computed tomography (CT) images as input. METHODS Despite recent advances in medical dosimetry, personalized CT dosimetry remains a labour-intensive process. Current state-of-the-art methods utilize time-consuming Monte Carlo (MC) based simulations for individual organ dose estimation in CT. The proposed method uses conditional generative adversarial networks (cGANs) to substitute MC simulations with fast dose image generation, based on image-to-image translation. The pix2pix architecture in conjunction with a regression model was utilized for the generation of the synthetic dose images. The lungs, heart, breast, bone and skin were manually segmented to estimate and compare organ doses calculated using both the original and synthetic dose images, respectively. RESULTS The average organ dose estimation error for the proposed method was 8.3% and did not exceed 20% for any of the organs considered. The performance of the method in the clinical environment was also assessed. Using segmentation tools developed in-house, an automatic organ dose calculation pipeline was set up. Calculation of organ doses for heart and lung for each CT slice took about 2 s. CONCLUSIONS This work shows that deep learning-enabled personalized CT dosimetry is feasible in real-time, using only patient CT images as input.
Affiliation(s)
- Theocharis Berris
- Department of Medical Physics, School of Medicine, University of Crete, P.O. Box 2208, 71003 Iraklion, Crete, Greece
- Marios Myronakis
- Department of Medical Physics, School of Medicine, University of Crete, P.O. Box 2208, 71003 Iraklion, Crete, Greece
- John Stratakis
- Department of Medical Physics, University Hospital of Iraklion, 71110 Iraklion, Crete, Greece
- Kostas Perisinakis
- Department of Medical Physics, School of Medicine, University of Crete, P.O. Box 2208, 71003 Iraklion, Crete, Greece
- Apostolos Karantanas
- Department of Radiology, School of Medicine, University of Crete, P.O. Box 2208, 71003 Iraklion, Crete, Greece
- John Damilakis
- Department of Medical Physics, School of Medicine, University of Crete, P.O. Box 2208, 71003 Iraklion, Crete, Greece
16
Holste G, Zhou Y, Wang S, Jaiswal A, Lin M, Zhuge S, Yang Y, Kim D, Nguyen-Mau TH, Tran MT, Jeong J, Park W, Ryu J, Hong F, Verma A, Yamagishi Y, Kim C, Seo H, Kang M, Celi LA, Lu Z, Summers RM, Shih G, Wang Z, Peng Y. Towards long-tailed, multi-label disease classification from chest X-ray: Overview of the CXR-LT challenge. Med Image Anal 2024; 97:103224. [PMID: 38850624 DOI: 10.1016/j.media.2024.103224]
Abstract
Many real-world image recognition problems, such as diagnostic medical imaging exams, are "long-tailed" - there are a few common findings followed by many more relatively rare conditions. In chest radiography, diagnosis is both a long-tailed and multi-label problem, as patients often present with multiple findings simultaneously. While researchers have begun to study the problem of long-tailed learning in medical image recognition, few have studied the interaction of label imbalance and label co-occurrence posed by long-tailed, multi-label disease classification. To engage with the research community on this emerging topic, we conducted an open challenge, CXR-LT, on long-tailed, multi-label thorax disease classification from chest X-rays (CXRs). We publicly release a large-scale benchmark dataset of over 350,000 CXRs, each labeled with at least one of 26 clinical findings following a long-tailed distribution. We synthesize common themes of top-performing solutions, providing practical recommendations for long-tailed, multi-label medical image classification. Finally, we use these insights to propose a path forward involving vision-language foundation models for few- and zero-shot disease classification.
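One common baseline for the label imbalance described above is to weight each finding's loss term by its inverse frequency, so rare findings contribute as strongly as common ones. A small numpy sketch (the toy label matrix below is invented, not CXR-LT data):

```python
import numpy as np

def inverse_freq_weights(labels):
    """Per-class weights proportional to N / positives, normalized to mean 1.

    labels: (N, C) binary multi-label matrix; rarer classes get larger weights.
    """
    pos = labels.sum(axis=0)
    w = labels.shape[0] / np.maximum(pos, 1.0)  # guard against classes with no positives
    return w / w.mean()

# 6 samples, 3 findings: class 0 common, class 1 moderate, class 2 rare.
y = np.array([[1, 1, 0],
              [1, 0, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 0, 0],
              [1, 1, 0]])
w = inverse_freq_weights(y)
print(w)  # increasing weights: the rare class weighs the most
```

In practice these weights would multiply the per-class binary cross-entropy terms during training.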
Affiliation(s)
- Gregory Holste
- Department of Electrical and Computer Engineering, The University of Texas at Austin, 78712, Austin, TX, USA
- Yiliang Zhou
- Department of Population Health Sciences, Weill Cornell Medicine, 10065, New York, NY, USA
- Song Wang
- Department of Electrical and Computer Engineering, The University of Texas at Austin, 78712, Austin, TX, USA
- Ajay Jaiswal
- Department of Electrical and Computer Engineering, The University of Texas at Austin, 78712, Austin, TX, USA
- Mingquan Lin
- Department of Population Health Sciences, Weill Cornell Medicine, 10065, New York, NY, USA
- Sherry Zhuge
- School of Information Systems, Carnegie Mellon University, 15213, Pittsburgh, PA, USA
- Yuzhe Yang
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 02139, Cambridge, MA, USA
- Dongkyun Kim
- School of Computer Science, Carnegie Mellon University, 15213, Pittsburgh, PA, USA
- Jaehyup Jeong
- KT Research & Development Center, KT Corporation, 06763, Seoul, South Korea
- Wongi Park
- Department of Software and Computer Engineering, Ajou University, 16499, Suwon, South Korea
- Jongbin Ryu
- Department of Software and Computer Engineering, Ajou University, 16499, Suwon, South Korea
- Feng Hong
- Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, 200240, Shanghai, China
- Arsh Verma
- Wadhwani Institute for Artificial Intelligence, 400079, Mumbai, India
- Yosuke Yamagishi
- Division of Radiology and Biomedical Engineering, Graduate School of Medicine, The University of Tokyo, 113-0033, Tokyo, Japan
- Changhyun Kim
- BioMedical AI Team, AIX Future R&D Center, SK Telecom, 04539, Seoul, South Korea
- Hyeryeong Seo
- Interdisciplinary Program in AI (IPAI), Seoul National University, 02504, Seoul, South Korea
- Myungjoo Kang
- Department of Mathematical Sciences, Seoul National University, 02504, Seoul, South Korea
- Leo Anthony Celi
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, 02139, Cambridge, MA, USA; Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, 02215, Boston, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, 02115, Boston, MA, USA
- Zhiyong Lu
- National Center for Biotechnology Information, National Library of Medicine, 20894, Bethesda, MD, USA
- Ronald M Summers
- Clinical Center, National Institutes of Health, 20892, Bethesda, MD, USA
- George Shih
- Department of Radiology, Weill Cornell Medicine, 10065, New York, NY, USA
- Zhangyang Wang
- Department of Electrical and Computer Engineering, The University of Texas at Austin, 78712, Austin, TX, USA
- Yifan Peng
- Department of Population Health Sciences, Weill Cornell Medicine, 10065, New York, NY, USA
17
Yıldız Potter İ, Yeritsyan D, Mahar S, Kheir N, Vaziri A, Putman M, Rodriguez EK, Wu J, Nazarian A, Vaziri A. Proximal femur fracture detection on plain radiography via feature pyramid networks. Sci Rep 2024; 14:12046. [PMID: 38802519 PMCID: PMC11130146 DOI: 10.1038/s41598-024-63001-2]
Abstract
Hip fractures exceed 250,000 cases annually in the United States, with the worldwide incidence projected to increase by 240-310% by 2050. Hip fractures are predominantly diagnosed by radiologist review of radiographs. In this study, we developed a deep learning model by extending the VarifocalNet Feature Pyramid Network (FPN) for detection and localization of proximal femur fractures from plain radiography, evaluated with clinically relevant metrics. We used a dataset of 823 hip radiographs of 150 subjects with proximal femur fractures and 362 controls to develop and evaluate the deep learning model. Our model attained 0.94 specificity and 0.95 sensitivity in fracture detection over the diverse imaging dataset. We compared the performance of our model against five benchmark FPN models, demonstrating improvements of 6-14% in sensitivity and 1-9% in accuracy. In addition, we demonstrated that our model outperforms a state-of-the-art transformer model based on the DINO network by 17% in sensitivity and 5% in accuracy, while taking half the time on average to process a radiograph. The developed model can aid radiologists and supports on-premise integration with hospital cloud services to enable automatic, opportunistic screening for hip fractures.
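The sensitivity and specificity figures reported above reduce to simple confusion-matrix ratios. A pure-Python sketch with invented labels (1 = fracture present, 0 = control):

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

y_true = [1, 1, 1, 0, 0, 0]   # toy ground truth
y_pred = [1, 1, 0, 0, 0, 1]   # toy model output
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # 2/3 and 2/3: one missed fracture, one false alarm
```

For screening applications such as the one above, sensitivity (missed fractures) is usually the metric of primary clinical concern.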
Affiliation(s)
- Diana Yeritsyan
- Carl J. Shapiro Department of Orthopaedic Surgery, Beth Israel Deaconess Medical Center (BIDMC) and Harvard Medical School, 330 Brookline Avenue, Stoneman 10, Boston, MA, 02215, USA
- Musculoskeletal Translational Innovation Initiative, Beth Israel Deaconess Medical Center and Harvard Medical School, 330 Brookline Avenue RN123, Boston, MA, 02215, USA
- Sarah Mahar
- Carl J. Shapiro Department of Orthopaedic Surgery, Beth Israel Deaconess Medical Center (BIDMC) and Harvard Medical School, 330 Brookline Avenue, Stoneman 10, Boston, MA, 02215, USA
- Musculoskeletal Translational Innovation Initiative, Beth Israel Deaconess Medical Center and Harvard Medical School, 330 Brookline Avenue RN123, Boston, MA, 02215, USA
- Nadim Kheir
- Carl J. Shapiro Department of Orthopaedic Surgery, Beth Israel Deaconess Medical Center (BIDMC) and Harvard Medical School, 330 Brookline Avenue, Stoneman 10, Boston, MA, 02215, USA
- Musculoskeletal Translational Innovation Initiative, Beth Israel Deaconess Medical Center and Harvard Medical School, 330 Brookline Avenue RN123, Boston, MA, 02215, USA
- Aidin Vaziri
- BioSensics, LLC, 57 Chapel Street, Newton, MA, 02458, USA
- Melissa Putman
- Division of Endocrinology, Massachusetts General Hospital and Harvard Medical School, 55 Fruit Street, Boston, MA, 02114, USA
- Edward K Rodriguez
- Carl J. Shapiro Department of Orthopaedic Surgery, Beth Israel Deaconess Medical Center (BIDMC) and Harvard Medical School, 330 Brookline Avenue, Stoneman 10, Boston, MA, 02215, USA
- Musculoskeletal Translational Innovation Initiative, Beth Israel Deaconess Medical Center and Harvard Medical School, 330 Brookline Avenue RN123, Boston, MA, 02215, USA
- Jim Wu
- Department of Radiology, Massachusetts General Brigham (MGB) and Harvard Medical School, 75 Francis Street, Boston, MA, 02215, USA
- Ara Nazarian
- Carl J. Shapiro Department of Orthopaedic Surgery, Beth Israel Deaconess Medical Center (BIDMC) and Harvard Medical School, 330 Brookline Avenue, Stoneman 10, Boston, MA, 02215, USA
- Musculoskeletal Translational Innovation Initiative, Beth Israel Deaconess Medical Center and Harvard Medical School, 330 Brookline Avenue RN123, Boston, MA, 02215, USA
- Department of Orthopaedic Surgery, Yerevan State University, Yerevan, Armenia
- Ashkan Vaziri
- BioSensics, LLC, 57 Chapel Street, Newton, MA, 02458, USA
18
Zhang X, Chen S, Zhang P, Wang C, Wang Q, Zhou X. Staging of Liver Fibrosis Based on Energy Valley Optimization Multiple Stacking (EVO-MS) Model. Bioengineering (Basel) 2024; 11:485. [PMID: 38790352 PMCID: PMC11117710 DOI: 10.3390/bioengineering11050485]
Abstract
Currently, staging the degree of liver fibrosis predominantly relies on liver biopsy, a method fraught with potential risks, such as bleeding and infection. With the rapid development of medical imaging devices, quantification of liver fibrosis through image processing technology has become feasible. Stacking is an effective ensemble technique, but manually tuning it to find the optimal configuration is challenging. Therefore, this paper proposes a novel EVO-MS model: a multiple stacking ensemble learning model optimized by the energy valley optimization (EVO) algorithm to select the most informative features for fibrosis quantification. Liver contours are profiled from 415 biopsy-proven CT cases, from which 10 shape features are calculated and input into a Support Vector Machine (SVM) classifier to generate predictions; the EVO algorithm is then applied to find the optimal parameter combination for fusing six base models: K-Nearest Neighbors (KNN), Decision Tree (DT), Naive Bayes (NB), Extreme Gradient Boosting (XGB), Gradient Boosting Decision Tree (GBDT), and Random Forest (RF), to create a well-performing ensemble model. Experimental results indicate that selecting 3-5 feature parameters yields satisfactory classification results, with features such as contour roundness non-uniformity (Rmax), maximum peak height of the contour (Rp), and maximum valley depth of the contour (Rm) significantly influencing classification accuracy. The improved EVO algorithm, combined with the multiple stacking model, achieves an accuracy of 0.864, a precision of 0.813, a sensitivity of 0.912, a specificity of 0.824, and an F1-score of 0.860, demonstrating the effectiveness of our EVO-MS model in staging the degree of liver fibrosis.
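As a simplified illustration of the stacking idea (not the paper's EVO optimization), the base models' predicted probabilities can be treated as features for a logistic meta-learner fit by gradient descent. Everything below is synthetic: six "base models" are simulated as noisy views of the label.

```python
import numpy as np

rng = np.random.default_rng(42)
n, n_base = 300, 6
y = rng.integers(0, 2, size=n).astype(float)
# Synthetic base-model probabilities: each loosely tracks the true label.
base_probs = np.clip(y[:, None] + rng.normal(0, 0.35, size=(n, n_base)), 0, 1)

# Meta-learner: logistic regression on the stacked base outputs.
w, b = np.zeros(n_base), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(base_probs @ w + b)))  # fused probability
    grad = p - y                                     # gradient of log loss
    w -= 0.1 * base_probs.T @ grad / n
    b -= 0.1 * grad.mean()

fused = 1.0 / (1.0 + np.exp(-(base_probs @ w + b)))
acc = ((fused >= 0.5) == (y == 1)).mean()
print(acc)  # fused accuracy on the synthetic training data
```

The paper replaces this hand-rolled meta-step with EVO-searched fusion parameters; the sketch only conveys why combining several base learners can beat any single one.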
Affiliation(s)
- Xuejun Zhang
- School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China
- Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning 530004, China
- Shengxiang Chen
- School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China
- Pengfei Zhang
- School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China
- Chun Wang
- School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China
- Qibo Wang
- School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China
- Xiangrong Zhou
- Department of Electrical, Electronic and Computer Engineering, Gifu University, Gifu 501-1193, Japan
19
Daneshpajooh V, Ahmad D, Toth J, Bascom R, Higgins WE. Automatic lesion detection for narrow-band imaging bronchoscopy. J Med Imaging (Bellingham) 2024; 11:036002. [PMID: 38827776 PMCID: PMC11138083 DOI: 10.1117/1.jmi.11.3.036002]
Abstract
Purpose Early detection of cancer is crucial for lung cancer patients, as it determines disease prognosis. Lung cancer typically starts as bronchial lesions along the airway walls. Recent research has indicated that narrow-band imaging (NBI) bronchoscopy enables more effective bronchial lesion detection than other bronchoscopic modalities. Unfortunately, NBI video can be hard to interpret because physicians currently are forced to perform a time-consuming subjective visual search to detect bronchial lesions in a long airway-exam video. As a result, NBI bronchoscopy is not regularly used in practice. To alleviate this problem, we propose an automatic two-stage real-time method for bronchial lesion detection in NBI video and perform a first-of-its-kind pilot study of the method using NBI airway exam video collected at our institution. Approach Given a patient's NBI video, the first method stage entails a deep-learning-based object detection network coupled with a multiframe abnormality measure to locate candidate lesions on each video frame. The second method stage then draws upon a Siamese network and a Kalman filter to track candidate lesions over multiple frames to arrive at final lesion decisions. Results Tests drawing on 23 patient NBI airway exam videos indicate that the method can process an incoming video stream at a real-time frame rate, thereby making the method viable for real-time inspection during a live bronchoscopic airway exam. Furthermore, our studies showed a 93% sensitivity and 86% specificity for lesion detection; this compares favorably to a sensitivity and specificity of 80% and 84% achieved over a series of recent pooled clinical studies using the current time-consuming subjective clinical approach. Conclusion The method shows potential for robust lesion detection in NBI video at a real-time frame rate. Therefore, it could help enable more common use of NBI bronchoscopy for bronchial lesion detection.
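The second-stage tracking can be illustrated with the standard constant-velocity Kalman filter: predict the lesion position from the previous state, then correct with each frame's detection. A minimal 1D numpy sketch with idealized noise-free measurements (an illustration of the filter, not the authors' implementation):

```python
import numpy as np

F = np.array([[1.0, 1.0], [0.0, 1.0]])  # constant-velocity state transition
H = np.array([[1.0, 0.0]])              # we observe position only
Q = 0.01 * np.eye(2)                    # process noise covariance
R = np.array([[1.0]])                   # measurement noise covariance

x = np.zeros(2)          # state: [position, velocity]
P = 100.0 * np.eye(2)    # large initial uncertainty

true_positions = [2.0 * t for t in range(20)]  # lesion drifting at 2 px/frame
for z in true_positions:
    # Predict step.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update step with the measured position z.
    y_res = z - H @ x                    # innovation
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + (K @ y_res).ravel()
    P = (np.eye(2) - K @ H) @ P

print(x)  # position near 38, velocity near 2
```

In the tracking setting described above, the predicted position gates which detections on the next frame can belong to the same candidate lesion.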
Affiliation(s)
- Vahid Daneshpajooh
- The Pennsylvania State University, School of Electrical Engineering and Computer Science, University Park, Pennsylvania, United States
- Danish Ahmad
- The Pennsylvania State University, College of Medicine, Hershey, Pennsylvania, United States
- Jennifer Toth
- The Pennsylvania State University, College of Medicine, Hershey, Pennsylvania, United States
- Rebecca Bascom
- The Pennsylvania State University, College of Medicine, Hershey, Pennsylvania, United States
- William E. Higgins
- The Pennsylvania State University, School of Electrical Engineering and Computer Science, University Park, Pennsylvania, United States

20
Ying S, Huang F, Shen X, Liu W, He F. Performance comparison of multifarious deep networks on caries detection with tooth X-ray images. J Dent 2024; 144:104970. [PMID: 38556194] [DOI: 10.1016/j.jdent.2024.104970]
Abstract
OBJECTIVES Deep networks have been preliminarily studied for caries diagnosis from clinical X-ray images. However, the relative performance of different deep networks on caries detection remains unclear. This study aims to comprehensively compare the caries detection performance of recent deep networks, using clinical dentist-level performance as a bridge. METHODS Based on a self-collected clinical periapical radiograph dataset, four popular deep networks of two types were included in the comparison: the YOLOv5 and DETR object detection networks, and the UNet and Trans-UNet segmentation networks. Five dentists carried out caries detection on the same testing dataset for reference. Key tooth-level metrics, including precision, sensitivity, specificity, F1-score, and Youden index, were obtained, and statistical analysis was conducted on them. RESULTS Ranked by F1-score, the deep networks ordered as YOLOv5 (0.87), Trans-UNet (0.86), DETR (0.82), and UNet (0.80) in caries detection. The same ranking holds for the Youden index, which combines sensitivity and specificity: 0.76, 0.73, 0.69, and 0.64, respectively. A moderate level of concordance was observed between all networks and the gold standard. No significant difference (p > 0.05) was found between the deep networks, or between the well-trained network and the dentists, in caries detection. CONCLUSIONS Among the investigated deep networks, YOLOv5 is recommended as the first choice for caries detection on account of its high metrics. A well-trained deep network can serve as an effective aid for dentists in detecting and diagnosing caries. CLINICAL SIGNIFICANCE The well-trained deep network shows promising potential for clinical application, providing valuable support to healthcare professionals in the detection and diagnosis of dental caries.
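The tooth-level metrics reported above follow their standard definitions; a small helper (function and key names are hypothetical) makes the relationships between them explicit:

```python
def tooth_level_metrics(tp, fp, tn, fn):
    """Standard detection metrics computed from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)            # a.k.a. recall
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    youden = sensitivity + specificity - 1  # balances sensitivity and specificity
    return {"precision": precision, "sensitivity": sensitivity,
            "specificity": specificity, "f1": f1, "youden": youden}
```

For example, 90 true positives, 20 false positives, 80 true negatives, and 10 false negatives give a sensitivity of 0.9, a specificity of 0.8, and hence a Youden index of 0.7, matching the way the study's two ranking criteria are derived from the same counts.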
Affiliation(s)
- Shunv Ying
- Stomatology Hospital, School of Stomatology, Zhejiang University School of Medicine, Clinical Research Center for Oral Diseases of Zhejiang Province, Key Laboratory of Oral Biomedical Research of Zhejiang Province, Cancer Center of Zhejiang University, Hangzhou, 310006, China
- Feng Huang
- School of Mechanical & Energy Engineering, Zhejiang University of Science & Technology, Hangzhou, 310023, China
- Xiaoting Shen
- Stomatology Hospital, School of Stomatology, Zhejiang University School of Medicine, Clinical Research Center for Oral Diseases of Zhejiang Province, Key Laboratory of Oral Biomedical Research of Zhejiang Province, Cancer Center of Zhejiang University, Hangzhou, 310006, China
- Wei Liu
- Stomatology Hospital, School of Stomatology, Zhejiang University School of Medicine, Clinical Research Center for Oral Diseases of Zhejiang Province, Key Laboratory of Oral Biomedical Research of Zhejiang Province, Cancer Center of Zhejiang University, Hangzhou, 310006, China
- Fuming He
- Stomatology Hospital, School of Stomatology, Zhejiang University School of Medicine, Clinical Research Center for Oral Diseases of Zhejiang Province, Key Laboratory of Oral Biomedical Research of Zhejiang Province, Cancer Center of Zhejiang University, Hangzhou, 310006, China

21
Jiang Q, Ye H, Yang B, Cao F. Label-Decoupled Medical Image Segmentation With Spatial-Channel Graph Convolution and Dual Attention Enhancement. IEEE J Biomed Health Inform 2024; 28:2830-2841. [PMID: 38376972] [DOI: 10.1109/jbhi.2024.3367756]
Abstract
Deep learning-based methods have recently been widely used in medical image segmentation. However, existing methods usually struggle to simultaneously capture global long-range information from images and topological correlations among feature maps; further, medical images often suffer from blurred target edges. Accordingly, this paper proposes a novel medical image segmentation framework: a label-decoupled network with spatial-channel graph convolution and a dual attention enhancement mechanism (LADENet for short). It constructs learnable adjacency matrices and utilizes graph convolutions to effectively capture global long-range information across spatial locations and topological dependencies between different channels in an image. A label-decoupling strategy based on the distance transform is then introduced to split an original segmentation label into a body label and an edge label, which supervise a body branch and an edge branch. A dual attention enhancement mechanism, with a body attention block in the body branch and an edge attention block in the edge branch, is built to promote the learning of spatial-region and boundary features. In addition, a feature interactor is devised to fully exploit the information exchange between the body and edge branches to improve segmentation performance. Experiments on benchmark datasets reveal the superiority of LADENet over state-of-the-art approaches.
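The label-decoupling step can be sketched with a brute-force distance transform: foreground pixels that lie farther from the background than a threshold form the body label, and the remaining foreground forms the edge label. The threshold value and helper names below are illustrative assumptions, not LADENet's actual configuration.

```python
import numpy as np

def distance_to_background(mask):
    """Brute-force Euclidean distance transform (fine for small masks)."""
    H, W = mask.shape
    bg = np.argwhere(mask == 0)               # background pixel coordinates
    dist = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            if mask[i, j]:
                # distance from this foreground pixel to the nearest background pixel
                dist[i, j] = np.sqrt(((bg - [i, j]) ** 2).sum(axis=1)).min()
    return dist

def decouple_label(mask, edge_width=1.5):
    """Split a binary segmentation label into body and edge labels."""
    dist = distance_to_background(mask)
    body = (dist > edge_width).astype(np.uint8)   # interior, far from the boundary
    edge = mask.astype(np.uint8) - body           # thin rim along the boundary
    return body, edge
```

By construction the two decoupled labels partition the original mask, so the body branch and edge branch are supervised on complementary regions.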
22
Zhu M, Fu Q, Liu B, Zhang M, Li B, Luo X, Zhou F. RT-SRTS: Angle-agnostic real-time simultaneous 3D reconstruction and tumor segmentation from single X-ray projection. Comput Biol Med 2024; 173:108390. [PMID: 38569234] [DOI: 10.1016/j.compbiomed.2024.108390]
Abstract
Radiotherapy is one of the primary treatments for tumors, but organ movement caused by respiration limits its accuracy. Recently, 3D imaging from a single X-ray projection has received extensive attention as a promising way to address this issue. However, current methods can only reconstruct 3D images without directly locating the tumor, and they are validated only for fixed-angle imaging, which fails to fully meet the requirements of motion control in radiotherapy. In this study, a novel imaging method, RT-SRTS, is proposed that integrates 3D imaging and tumor segmentation into one network based on multi-task learning (MTL) and achieves real-time simultaneous 3D reconstruction and tumor segmentation from a single X-ray projection at any angle. Furthermore, attention-enhanced calibrator (AEC) and uncertain-region elaboration (URE) modules are proposed to aid feature extraction and improve segmentation accuracy. The proposed method was evaluated on fifteen patient cases and compared with three state-of-the-art methods. It not only delivers superior 3D reconstruction but also demonstrates commendable tumor segmentation results. Simultaneous reconstruction and segmentation complete in approximately 70 ms, significantly faster than the time threshold required for real-time tumor tracking. The efficacy of both AEC and URE has also been validated in ablation studies. The code for this work is available at https://github.com/ZywooSimple/RT-SRTS.
Affiliation(s)
- Miao Zhu
- Image Processing Center, Beihang University, Beijing, 100191, PR China
- Qiming Fu
- Image Processing Center, Beihang University, Beijing, 100191, PR China
- Bo Liu
- Image Processing Center, Beihang University, Beijing, 100191, PR China
- Mengxi Zhang
- Image Processing Center, Beihang University, Beijing, 100191, PR China
- Bojian Li
- Image Processing Center, Beihang University, Beijing, 100191, PR China
- Xiaoyan Luo
- Image Processing Center, Beihang University, Beijing, 100191, PR China
- Fugen Zhou
- Image Processing Center, Beihang University, Beijing, 100191, PR China

23
Thakur GK, Thakur A, Kulkarni S, Khan N, Khan S. Deep Learning Approaches for Medical Image Analysis and Diagnosis. Cureus 2024; 16:e59507. [PMID: 38826977] [PMCID: PMC11144045] [DOI: 10.7759/cureus.59507]
Abstract
In addition to enhancing diagnostic accuracy, deep learning techniques offer the potential to streamline workflows, reduce interpretation time, and ultimately improve patient outcomes. The scalability and adaptability of deep learning algorithms enable their deployment across diverse clinical settings, from radiology departments to point-of-care facilities. Ongoing research efforts focus on addressing the challenges of data heterogeneity, model interpretability, and regulatory compliance, paving the way for seamless integration of deep learning solutions into routine clinical practice. As the field continues to evolve, collaboration between clinicians, data scientists, and industry stakeholders will be paramount in harnessing the full potential of deep learning for medical image analysis and diagnosis. Furthermore, integrating deep learning algorithms with other technologies, including natural language processing and computer vision, may foster multimodal medical data analysis and clinical decision support systems that improve patient care. The future of deep learning in medical image analysis and diagnosis is promising: with each success and advancement, the technology moves closer to routine medical use. Beyond medical image analysis, patient care pathways such as multimodal imaging, imaging genomics, and intelligent operating rooms or intensive care units can benefit from deep learning models.
Affiliation(s)
- Gopal Kumar Thakur
- Department of Data Sciences, Harrisburg University of Science and Technology, Harrisburg, USA
- Abhishek Thakur
- Department of Data Sciences, Harrisburg University of Science and Technology, Harrisburg, USA
- Shridhar Kulkarni
- Department of Data Sciences, Harrisburg University of Science and Technology, Harrisburg, USA
- Naseebia Khan
- Department of Data Sciences, Harrisburg University of Science and Technology, Harrisburg, USA
- Shahnawaz Khan
- Department of Computer Application, Bundelkhand University, Jhansi, IND

24
Arora S, Jariwala SP, Balsari S. Artificial intelligence in medicine: A primer and recommendation. J Hosp Med 2024. [PMID: 38639172] [DOI: 10.1002/jhm.13371]
Affiliation(s)
- Shitij Arora
- Montefiore Medical Center, Bronx, New York, USA
- Albert Einstein College of Medicine, Bronx, New York, USA
- Sunit P Jariwala
- Montefiore Medical Center, Bronx, New York, USA
- Albert Einstein College of Medicine, Bronx, New York, USA
- Satchit Balsari
- Harvard Medical School and Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA

25
Mohsin ASM, Choudhury SH. Label-free quantification of gold nanoparticles at the single-cell level using a multi-column convolutional neural network (MC-CNN). Analyst 2024; 149:2412-2419. [PMID: 38487894] [DOI: 10.1039/d3an01982a]
Abstract
Gold nanoparticles (AuNPs) are extensively used in cellular imaging, single-particle tracking, disease diagnosis, the study of membrane protein interactions, and drug delivery. Understanding the dynamics of AuNP uptake in live cells is crucial for optimizing their efficacy and safety. Traditional manual methods for quantifying AuNP uptake are time-consuming and subjective, limiting their scalability and accuracy. Available fluorescence-based techniques are limited by photobleaching and photoblinking, optical microscopy techniques are constrained by the diffraction limit, and electron microscopy-based imaging techniques are destructive and unsuitable for live-cell imaging. Furthermore, the resulting images may contain hundreds of particles with varied intensities, blurring, and substantial occlusion, making manual quantification of AuNP uptake difficult. To overcome these issues and measure AuNP uptake by live cells, we annotated a dataset of dark-field images of 50 nm-radius AuNPs at different incubation durations. Then, to count the number of particles present in a cell, we created a customized multi-column convolutional neural network (MC-CNN). The customized MC-CNN outperformed typical particle-counting architectures when compared against spectroscopy-based counting. This will allow researchers to gain a better understanding of AuNP behavior and interactions with cells, paving the way for advancements in nanomedicine, drug delivery, and biomedical research. The code for this paper is available at the following link: https://github.com/Namerlight/LabelFree_AuNP_Quantification.
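Counting networks of this kind are commonly trained to regress a density map whose integral equals the object count, which copes with overlap and occlusion better than detecting each particle individually. The sketch below shows only that framing; the Gaussian kernel, sizes, and function names are assumptions, not the paper's architecture.

```python
import numpy as np

def gaussian_kernel(size=7, sigma=1.5):
    """Normalized 2D Gaussian: total mass 1, so each particle adds 1 to the integral."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def density_map(points, shape, size=7, sigma=1.5):
    """Ground-truth density map: one normalized Gaussian per annotated particle."""
    k = gaussian_kernel(size, sigma)
    pad = size // 2
    dm = np.zeros((shape[0] + 2 * pad, shape[1] + 2 * pad))
    for (r, c) in points:
        dm[r:r + size, c:c + size] += k   # kernel centered at (r, c) in padded coords
    return dm

def count(dm):
    """The predicted count is simply the integral of the density map."""
    return dm.sum()
```

A counting network would then be trained to predict such a map from the dark-field image, and the cell-level count read off as the map's sum.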
Affiliation(s)
- Abu S M Mohsin
- Nanotechnology, IoT and Applied Machine Learning Research Group, Brac University, Dhaka, Bangladesh
- Shadab H Choudhury
- Nanotechnology, IoT and Applied Machine Learning Research Group, Brac University, Dhaka, Bangladesh

26
Song Z, Wu H, Chen W, Slowik A. Improving automatic segmentation of liver tumor images using a deep learning model. Heliyon 2024; 10:e28538. [PMID: 38571625] [PMCID: PMC10988037] [DOI: 10.1016/j.heliyon.2024.e28538]
Abstract
Liver tumors are among the most aggressive malignancies in the human body. Computer-aided technology and liver interventional surgery are effective in the prediction, identification, and management of liver neoplasms, and one important prerequisite is to accurately grasp the morphological structure of the liver and its blood vessels. However, accurate identification and segmentation of hepatic blood vessels in CT images poses a formidable challenge: manually locating and segmenting liver vessels is time-consuming and impractical, so there is an imperative clinical requirement for a precise and efficient segmentation algorithm. In response to this demand, this paper proposes a liver vessel segmentation approach that employs an enhanced 3D fully convolutional neural network, V-Net. The model improves the basic network structure according to the characteristics of liver vessels. First, a pyramidal convolution block is introduced between the encoder and decoder of the network to improve the network's localization ability. Then, multi-resolution deep supervision is introduced, resulting in more robust segmentation. Finally, feature maps of different resolutions are fused to predict the overall segmentation result. Evaluation experiments on public datasets demonstrate that the improved scheme increases the ability of existing network models to segment liver vessels. Compared with existing work, the presented technique attains superior performance on the Dice coefficient, which can support the treatment of liver tumors.
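Multi-resolution deep supervision, as used above, typically sums a weighted segmentation loss over decoder outputs at several scales, with the ground truth downsampled to match each scale. The Dice loss, nearest-neighbour downsampling, and weights below are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss: 0 for a perfect overlap, approaching 1 for no overlap."""
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def downsample2x(x):
    return x[::2, ::2]  # nearest-neighbour subsampling, sketch only

def deep_supervision_loss(preds, target, weights=(1.0, 0.5, 0.25)):
    """preds: decoder outputs ordered from full resolution to coarser scales."""
    total, t = 0.0, target
    for w, p in zip(weights, preds):
        total += w * dice_loss(p, t)   # supervise this scale
        t = downsample2x(t)            # match the next, coarser output
    return total
```

Weighting the finest scale most heavily keeps the full-resolution output dominant while the coarser heads regularize intermediate features.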
Affiliation(s)
- Zhendong Song
- School of Mechanical and Electrical Engineering, Shenzhen Polytechnic University, Shenzhen, 518055, China
- Huiming Wu
- School of Mechanical and Electrical Engineering, Shenzhen Polytechnic University, Shenzhen, 518055, China
- Wei Chen
- School of Mechanical and Electrical Engineering, Shenzhen Polytechnic University, Shenzhen, 518055, China
- Adam Slowik
- Koszalin University of Technology, Koszalin, Poland

27
Zhan F, Wang W, Chen Q, Guo Y, He L, Wang L. Three-Direction Fusion for Accurate Volumetric Liver and Tumor Segmentation. IEEE J Biomed Health Inform 2024; 28:2175-2186. [PMID: 38109246] [DOI: 10.1109/jbhi.2023.3344392]
Abstract
Biomedical image segmentation of organs, tissues, and lesions has gained increasing attention in clinical treatment planning and navigation, and involves exploring two-dimensional (2D) and three-dimensional (3D) contexts in the biomedical image. Compared to 2D methods, 3D methods pay more attention to inter-slice correlations, which offer additional spatial information for segmentation. An organ or tumor has a 3D structure that can be observed from three directions, yet previous studies focus only on the vertical axis, limiting the understanding of the relationship between a tumor and its surrounding tissues; important information can also be obtained from the sagittal and coronal axes. Gathering spatial information of organs and tumors from all three directions, i.e., the sagittal, coronal, and vertical axes, gives a better understanding of the invasion depth of a tumor and its relationship with the surrounding tissues. Moreover, the edges of organs and tumors in biomedical images may be blurred. To address these problems, we propose a three-direction fusion volumetric segmentation (TFVS) model that segments 3D biomedical images from three perspectives: the sagittal, coronal, and transverse planes. We train our model on the liver task of the Medical Segmentation Decathlon challenge, and the TFVS method demonstrates competitive performance on the 3D-IRCADB dataset. In addition, the t-test and Wilcoxon signed-rank test confirm the statistical significance of the improvement of the proposed method over the baseline methods. The proposed method is expected to be beneficial in guiding and facilitating clinical diagnosis and treatment.
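The three-direction idea can be sketched by running a 2D segmenter over every slice along each of the three axes, re-assembling the per-slice outputs into three probability volumes, and fusing them per voxel. Here the per-plane segmenter is a stand-in threshold function and simple averaging replaces the paper's learned fusion; both are illustrative assumptions.

```python
import numpy as np

def fuse_three_directions(seg2d, volume):
    """Apply a 2D segmenter slice-wise along each axis and average the results."""
    H, W, D = volume.shape
    probs = np.zeros((3,) + volume.shape)
    for k in range(D):                       # transverse (axial) slices
        probs[0][:, :, k] = seg2d(volume[:, :, k])
    for j in range(W):                       # coronal slices
        probs[1][:, j, :] = seg2d(volume[:, j, :])
    for i in range(H):                       # sagittal slices
        probs[2][i, :, :] = seg2d(volume[i, :, :])
    return probs.mean(axis=0)                # per-voxel fusion of the three views
```

In practice the three views disagree near ambiguous boundaries, and the fusion step is where the complementary inter-slice context from each plane is combined.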
28
Bai J, Jin A, Adams M, Yang C, Nabavi S. Unsupervised feature correlation model to predict breast abnormal variation maps in longitudinal mammograms. Comput Med Imaging Graph 2024; 113:102341. [PMID: 38277769] [DOI: 10.1016/j.compmedimag.2024.102341]
Abstract
Breast cancer continues to be a significant cause of mortality among women globally, and timely identification and precise diagnosis of breast abnormalities are critical for enhancing patient prognosis. To address the limitations of traditional screening methods, a novel unsupervised feature correlation network was developed to predict maps of abnormal breast variations from longitudinal 2D mammograms. The proposed model uses the reconstruction of current-year and prior-year mammograms to extract tissue from different areas and analyze the differences between them, identifying abnormal variations that may indicate the presence of cancer. The model incorporates a feature correlation module, an attention suppression gate, and a breast abnormality detection module, which work together to improve prediction accuracy. The proposed model not only provides breast abnormal variation maps but also distinguishes between normal and cancer mammograms, making it more advanced than the state-of-the-art baseline models. The results show that the proposed model outperforms the baselines in terms of accuracy, sensitivity, specificity, Dice score, and cancer detection rate.
Affiliation(s)
- Jun Bai
- Department of Computer Science and Engineering, University of Connecticut, 371 Fairfield Way, Storrs, CT 06269, USA
- Annie Jin
- University of Connecticut School of Medicine, 263 Farmington Ave. Farmington, CT 06030, USA
- Madison Adams
- University of Connecticut School of Medicine, 263 Farmington Ave. Farmington, CT 06030, USA
- Clifford Yang
- University of Connecticut School of Medicine, 263 Farmington Ave. Farmington, CT 06030, USA; Department of Radiology, UConn Health, 263 Farmington Ave. Farmington, CT 06030, USA
- Sheida Nabavi
- Department of Computer Science and Engineering, University of Connecticut, 371 Fairfield Way, Storrs, CT 06269, USA

29
Holste G, Zhou Y, Wang S, Jaiswal A, Lin M, Zhuge S, Yang Y, Kim D, Nguyen-Mau TH, Tran MT, Jeong J, Park W, Ryu J, Hong F, Verma A, Yamagishi Y, Kim C, Seo H, Kang M, Celi LA, Lu Z, Summers RM, Shih G, Wang Z, Peng Y. Towards long-tailed, multi-label disease classification from chest X-ray: Overview of the CXR-LT challenge. arXiv 2024; arXiv:2310.16112v2. [PMID: 37986726] [PMCID: PMC10659524]
Abstract
Many real-world image recognition problems, such as diagnostic medical imaging exams, are "long-tailed" - there are a few common findings followed by many more relatively rare conditions. In chest radiography, diagnosis is both a long-tailed and multi-label problem, as patients often present with multiple findings simultaneously. While researchers have begun to study the problem of long-tailed learning in medical image recognition, few have studied the interaction of label imbalance and label co-occurrence posed by long-tailed, multi-label disease classification. To engage with the research community on this emerging topic, we conducted an open challenge, CXR-LT, on long-tailed, multi-label thorax disease classification from chest X-rays (CXRs). We publicly release a large-scale benchmark dataset of over 350,000 CXRs, each labeled with at least one of 26 clinical findings following a long-tailed distribution. We synthesize common themes of top-performing solutions, providing practical recommendations for long-tailed, multi-label medical image classification. Finally, we use these insights to propose a path forward involving vision-language foundation models for few- and zero-shot disease classification.
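One common tactic for the label imbalance described above is to re-weight the positive term of the per-label binary cross-entropy by inverse class frequency, so that rare findings are not drowned out by common ones. This is a generic illustration of that idea, not a specific CXR-LT solution; the weighting scheme and function name are assumptions.

```python
import numpy as np

def balanced_bce(logits, targets, pos_freq, eps=1e-12):
    """Multi-label BCE with positives up-weighted by inverse label frequency."""
    w = 1.0 / np.maximum(np.asarray(pos_freq, float), eps)  # rare labels get larger weights
    w = w / w.mean()                                        # keep the overall loss scale comparable
    p = 1.0 / (1.0 + np.exp(-np.asarray(logits, float)))    # per-label sigmoid probabilities
    t = np.asarray(targets, float)
    loss = -(w * t * np.log(p + eps) + (1.0 - t) * np.log(1.0 - p + eps))
    return loss.mean()
```

With uniform frequencies this reduces to ordinary multi-label BCE; with a long-tailed frequency vector, missing a rare positive costs proportionally more.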
Affiliation(s)
- Gregory Holste
- Department of Electrical and Computer Engineering, The University of Texas at Austin, 78712, Austin, TX USA
- Yiliang Zhou
- Department of Population Health Sciences, Weill Cornell Medicine, 10065, New York, NY USA
- Song Wang
- Department of Electrical and Computer Engineering, The University of Texas at Austin, 78712, Austin, TX USA
- Ajay Jaiswal
- Department of Electrical and Computer Engineering, The University of Texas at Austin, 78712, Austin, TX USA
- Mingquan Lin
- Department of Population Health Sciences, Weill Cornell Medicine, 10065, New York, NY USA
- Sherry Zhuge
- School of Information Systems, Carnegie Mellon University, 15213, Pittsburgh, PA USA
- Yuzhe Yang
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 02139, Cambridge, MA USA
- Dongkyun Kim
- School of Computer Science, Carnegie Mellon University, 15213, Pittsburgh, PA USA
- Minh-Triet Tran
- University of Science, VNU-HCM, 70000, Ho Chi Minh City, Vietnam
- Jaehyup Jeong
- KT Research & Development Center, KT Corporation, 06763, Seoul, South Korea
- Wongi Park
- Department of Software and Computer Engineering, Ajou University, 16499, Suwon, South Korea
- Jongbin Ryu
- Department of Software and Computer Engineering, Ajou University, 16499, Suwon, South Korea
- Feng Hong
- Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, 200240, Shanghai, China
- Arsh Verma
- Wadhwani Institute for Artificial Intelligence, 400079, Mumbai, India
- Yosuke Yamagishi
- Division of Radiology and Biomedical Engineering, Graduate School of Medicine, The University of Tokyo, 113-0033, Tokyo, Japan
- Changhyun Kim
- BioMedical AI Team, AIX Future R&D Center, SK Telecom, 04539, Seoul, South Korea
- Hyeryeong Seo
- Interdisciplinary Program in AI (IPAI), Seoul National University, 02504, Seoul, South Korea
- Myungjoo Kang
- Department of Mathematical Sciences, Seoul National University, 02504, Seoul, South Korea
- Leo Anthony Celi
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, 02139, Cambridge, MA USA
- Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, 02215, Boston, MA USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, 02115, Boston, MA USA
- Zhiyong Lu
- National Center for Biotechnology Information, National Library of Medicine, 20894, Bethesda, MD USA
- Ronald M. Summers
- Clinical Center, National Institutes of Health, 20892, Bethesda, MD USA
- George Shih
- Department of Radiology, Weill Cornell Medicine, 10065, New York, NY USA
- Zhangyang Wang
- Department of Electrical and Computer Engineering, The University of Texas at Austin, 78712, Austin, TX USA
- Yifan Peng
- Department of Population Health Sciences, Weill Cornell Medicine, 10065, New York, NY USA

30
Pan W, Shan Y, Li C, Huang S, Li T, Li Y, Zhu H. FPLS-DC: functional partial least squares through distance covariance for imaging genetics. Bioinformatics 2024; 40:btae173. [PMID: 38552322] [PMCID: PMC11034987] [DOI: 10.1093/bioinformatics/btae173]
Abstract
MOTIVATION Imaging genetics integrates imaging and genetic techniques to examine how genetic variation influences the function and structure of organs such as the brain or heart, providing insight into their impact on behavior and disease phenotypes. Organ-wide imaging endophenotypes are increasingly used to identify potential genes associated with complex disorders. However, analyzing organ-wide imaging data alongside genetic data presents two significant challenges: high dimensionality and complex relationships. To address these challenges, we propose a novel, nonlinear inference framework designed to partially mitigate these issues. RESULTS We propose a functional partial least squares through distance covariance (FPLS-DC) framework for efficient genome-wide analyses of imaging phenotypes. It consists of two components. The first utilizes FPLS-derived basis functions to reduce image dimensionality while screening genetic markers. The second maximizes the distance correlation between genetic markers and the projected imaging data, a linear combination of the FPLS basis functions, using a simulated annealing algorithm. In addition, we propose an iterative FPLS-DC method based on this framework, which effectively overcomes the influence of inter-gene correlation on inference analysis. We efficiently approximate the null distribution of the test statistics using a gamma approximation. Compared to existing methods, FPLS-DC offers computational and statistical efficiency for handling large-scale imaging genetics. In real-world applications, our method successfully detected genetic variants associated with the hippocampus, demonstrating its value as a statistical toolbox for imaging genetic studies. AVAILABILITY AND IMPLEMENTATION The FPLS-DC method opens new research avenues and offers valuable insights for analyzing functional, high-dimensional data; it also serves as a useful tool for scientific analysis in practical imaging genetics applications. The R package FPLS-DC is available on GitHub: https://github.com/BIG-S2/FPLSDC.
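The distance covariance at the heart of FPLS-DC can be computed with the standard biased sample estimator: double-center the pairwise distance matrices of each variable and average their elementwise product (Székely-style estimator; the helper names below are illustrative). Perfect linear dependence yields a distance correlation of 1, and a constant variable yields 0.

```python
import numpy as np

def _centered_dists(x):
    """Pairwise Euclidean distance matrix, double-centered (rows, columns, grand mean)."""
    x = np.asarray(x, float)
    if x.ndim == 1:
        x = x[:, None]
    d = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(-1))
    return d - d.mean(axis=0) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_covariance(x, y):
    """Biased sample distance covariance of two samples of equal length."""
    A, B = _centered_dists(x), _centered_dists(y)
    return np.sqrt(max((A * B).mean(), 0.0))

def distance_correlation(x, y):
    """Distance correlation in [0, 1]; 0 characterizes independence in the population."""
    dcov = distance_covariance(x, y)
    denom = np.sqrt(distance_covariance(x, x) * distance_covariance(y, y))
    return dcov / denom if denom > 0 else 0.0
```

Unlike Pearson correlation, distance correlation also detects nonlinear dependence, which is why the framework above can maximize it to capture complex marker-image relationships.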
Affiliation(s)
- Wenliang Pan
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- Yue Shan
- Departments of Biostatistics, Statistics, Genetics, and Computer Science and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA
- Chuang Li
- Department of Statistical Science, School of Mathematics, Sun Yat-sen University, Guangzhou 510275, China
- Shuai Huang
- Departments of Biostatistics, Statistics, Genetics, and Computer Science and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA
- Tengfei Li
- Departments of Radiology and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA
- Yun Li
- Departments of Biostatistics, Statistics, Genetics, and Computer Science and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA
- Hongtu Zhu
- Departments of Biostatistics, Statistics, Genetics, and Computer Science and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA
- Departments of Radiology and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA

31
Liu Z, Kainth K, Zhou A, Deyer TW, Fayad ZA, Greenspan H, Mei X. A review of self-supervised, generative, and few-shot deep learning methods for data-limited magnetic resonance imaging segmentation. NMR Biomed 2024; e5143. [PMID: 38523402] [DOI: 10.1002/nbm.5143]
Abstract
Magnetic resonance imaging (MRI) is a ubiquitous medical imaging technology with applications in disease diagnostics, intervention, and treatment planning. Accurate MRI segmentation is critical for diagnosing abnormalities, monitoring diseases, and deciding on a course of treatment. With the advent of advanced deep learning frameworks, fully automated and accurate MRI segmentation is advancing. Traditional supervised deep learning techniques have advanced tremendously, reaching clinical-level accuracy in segmentation. However, these algorithms still require a large amount of annotated data, which is often unavailable or impractical to obtain. One way to circumvent this issue is to use algorithms that exploit a limited amount of labeled data. This paper reviews such state-of-the-art algorithms. We explain the fundamental principles of self-supervised learning, generative models, few-shot learning, and semi-supervised learning, and summarize their applications in cardiac, abdominal, and brain MRI segmentation. Throughout this review, we highlight algorithms that can be employed depending on the quantity of annotated data available. We also present a comprehensive list of notable publicly available MRI segmentation datasets. To conclude, we discuss possible future directions for the field, including emerging algorithms such as contrastive language-image pretraining and potential combinations across the methods discussed, that can further increase the efficacy of image segmentation with limited labels.
Affiliation(s)
- Zelong Liu
  - BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Komal Kainth
  - BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Alexander Zhou
  - BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Timothy W Deyer
  - East River Medical Imaging, New York, New York, USA
  - Department of Radiology, Cornell Medicine, New York, New York, USA
- Zahi A Fayad
  - BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Diagnostic, Molecular, and Interventional Radiology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Hayit Greenspan
  - BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Diagnostic, Molecular, and Interventional Radiology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Xueyan Mei
  - BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Diagnostic, Molecular, and Interventional Radiology, Icahn School of Medicine at Mount Sinai, New York, New York, USA

32
Kabir MM, Mridha M, Rahman A, Hamid MA, Monowar MM. Detection of COVID-19, pneumonia, and tuberculosis from radiographs using AI-driven knowledge distillation. Heliyon 2024; 10:e26801. [PMID: 38444490 PMCID: PMC10912466 DOI: 10.1016/j.heliyon.2024.e26801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2023] [Revised: 01/30/2024] [Accepted: 02/20/2024] [Indexed: 03/07/2024] Open
Abstract
Chest radiography is an essential diagnostic tool for respiratory diseases such as COVID-19, pneumonia, and tuberculosis because it accurately depicts the structures of the chest. However, accurate detection of these diseases from radiographs is a complex task that requires the availability of medical imaging equipment and trained personnel. Conventional deep learning models offer a viable automated solution for this task. However, the high complexity of these models often poses a significant obstacle to their practical deployment within automated medical applications, including mobile apps, web apps, and cloud-based platforms. This study addresses and resolves this dilemma by reducing the complexity of neural networks using knowledge distillation techniques (KDT). The proposed technique trains a neural network on an extensive collection of chest X-ray images and propagates the knowledge to a smaller network capable of real-time detection. To create a comprehensive dataset, we integrated three popular chest radiograph datasets covering COVID-19, pneumonia, and tuberculosis. Our experiments show that this knowledge distillation approach outperforms conventional deep learning methods in terms of computational complexity and performance for real-time respiratory disease detection. Specifically, our system achieves an impressive average accuracy of 0.97, precision of 0.94, and recall of 0.97.
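The distillation step described above, training a compact student against a larger teacher's softened outputs, follows the standard Hinton-style formulation. A minimal NumPy sketch of that loss (illustrative only; the temperature T=4 and weight alpha=0.7 are assumed values, not the paper's settings):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax, numerically stabilized."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Weighted sum of a soft-target term (cross-entropy against the
    teacher's temperature-softened distribution, scaled by T^2) and a
    hard-label cross-entropy term on the ground-truth classes."""
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T) + 1e-12)
    soft = -(p_teacher * log_p_student).sum(axis=1).mean() * (T ** 2)
    hard = -np.log(
        softmax(student_logits)[np.arange(len(labels)), labels] + 1e-12
    ).mean()
    return alpha * soft + (1 - alpha) * hard
```

Minimizing this loss pushes the student's class probabilities toward the teacher's "dark knowledge" while still fitting the ground-truth labels, which is how the small real-time network inherits the large network's behavior.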
Affiliation(s)
- Md Mohsin Kabir
  - Department of Computer Science & Engineering, Bangladesh University of Business & Technology, Dhaka-1216, Bangladesh
- M.F. Mridha
  - Department of Computer Science, American International University-Bangladesh, Dhaka-1229, Bangladesh
- Ashifur Rahman
  - Department of Computer Science & Engineering, Bangladesh University of Business & Technology, Dhaka-1216, Bangladesh
- Md. Abdul Hamid
  - Department of Information Technology, Faculty of Computing & Information Technology, King Abdulaziz University, Jeddah-21589, Kingdom of Saudi Arabia
- Muhammad Mostafa Monowar
  - Department of Information Technology, Faculty of Computing & Information Technology, King Abdulaziz University, Jeddah-21589, Kingdom of Saudi Arabia

33
Zhang Y, Shen Z, Jiao R. Segment anything model for medical image segmentation: Current applications and future directions. Comput Biol Med 2024; 171:108238. [PMID: 38422961 DOI: 10.1016/j.compbiomed.2024.108238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2023] [Revised: 02/06/2024] [Accepted: 02/25/2024] [Indexed: 03/02/2024]
Abstract
Due to the inherent flexibility of prompting, foundation models have emerged as the predominant force in the fields of natural language processing and computer vision. The recent introduction of the Segment Anything Model (SAM) signifies a noteworthy expansion of the prompt-driven paradigm into the domain of image segmentation, thereby introducing a plethora of previously unexplored capabilities. However, the viability of its application to medical image segmentation remains uncertain, given the substantial distinctions between natural and medical images. In this work, we provide a comprehensive overview of recent endeavors aimed at extending the efficacy of SAM to medical image segmentation tasks, encompassing both empirical benchmarking and methodological adaptations. Additionally, we explore potential avenues for future research directions in SAM's role within medical image segmentation. While direct application of SAM to medical image segmentation does not yield satisfactory performance on multi-modal and multi-target medical datasets so far, numerous insights gleaned from these efforts serve as valuable guidance for shaping the trajectory of foundational models in the realm of medical image analysis. To support ongoing research endeavors, we maintain an active repository that contains an up-to-date paper list and a succinct summary of open-source projects at https://github.com/YichiZhang98/SAM4MIS.
Affiliation(s)
- Yichi Zhang
  - School of Data Science, Fudan University, Shanghai, China
- Zhenrong Shen
  - School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Rushi Jiao
  - School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China

34
Hossen MM, Ashraf A, Hasan M, Majid ME, Nashbat M, Kashem SBA, Kunju AKA, Khandakar A, Mahmud S, Chowdhury MEH. GCDN-Net: Garbage classifier deep neural network for recyclable urban waste management. WASTE MANAGEMENT (NEW YORK, N.Y.) 2024; 174:439-450. [PMID: 38113669 DOI: 10.1016/j.wasman.2023.12.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/26/2023] [Revised: 11/10/2023] [Accepted: 12/06/2023] [Indexed: 12/21/2023]
Abstract
The escalating waste volume due to urbanization and population growth has underscored the need for advanced waste sorting and recycling methods to ensure sustainable waste management. Deep learning models, adept at image recognition tasks, offer potential solutions for waste sorting applications. These models, trained on extensive waste image datasets, possess the ability to discern unique features of diverse waste types. Automating waste sorting hinges on robust deep learning models capable of accurately categorizing a wide range of waste types. In this study, a multi-stage machine learning approach is proposed to classify different waste categories using the "Garbage In, Garbage Out" (GIGO) dataset of 25,000 images. The novel Garbage Classifier Deep Neural Network (GCDN-Net) is introduced as a comprehensive solution, adept in both single-label and multi-label classification tasks. Single-label classification distinguishes between garbage and non-garbage images, while multi-label classification identifies distinct garbage categories within single or multiple images. The performance of GCDN-Net is rigorously evaluated and compared against state-of-the-art waste classification methods. Results demonstrate GCDN-Net's excellence, achieving 95.77% accuracy, 95.78% precision, 95.77% recall, 95.77% F1-score, and 95.54% specificity when classifying waste images, outperforming existing models in single-label classification. In multi-label classification, GCDN-Net attains an overall Mean Average Precision (mAP) of 0.69 and an F1-score of 75.01%. The reliability of network performance is affirmed through saliency map-based visualization generated by Score-CAM (class activation mapping). In conclusion, deep learning-based models exhibit efficacy in categorizing diverse waste types, paving the way for automated waste sorting and recycling systems that can mitigate costs and processing times.
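The multi-label results above are summarized as mean Average Precision (mAP). A minimal NumPy sketch of how mAP follows from per-class confidence scores (illustrative; this is the standard ranking-based definition, not the authors' evaluation code):

```python
import numpy as np

def average_precision(scores, labels):
    """AP for one class: labels are 0/1 ground truth, scores are predicted
    confidences. Precision is accumulated at every rank holding a positive."""
    order = np.argsort(-scores)            # rank samples by descending score
    labels = labels[order]
    cum_tp = np.cumsum(labels)
    precision = cum_tp / (np.arange(len(labels)) + 1)
    return (precision * labels).sum() / max(labels.sum(), 1)

def mean_average_precision(score_matrix, label_matrix):
    """mAP: the per-class AP averaged over all classes.
    Both inputs are (n_samples, n_classes) arrays."""
    aps = [average_precision(score_matrix[:, c], label_matrix[:, c])
           for c in range(label_matrix.shape[1])]
    return float(np.mean(aps))
```

A perfect ranking per class yields mAP = 1.0; the 0.69 reported above means the garbage categories are, on average, ranked well but not perfectly by the predicted confidences.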
Affiliation(s)
- Md Mosarrof Hossen
  - Department of Electrical and Electronics Engineering, University of Dhaka, Dhaka, Bangladesh
- Azad Ashraf
  - Chemical Engineering Department, University of Doha for Science and Technology, Doha, Qatar
- Mazhar Hasan
  - Chemical Engineering Department, University of Doha for Science and Technology, Doha, Qatar
- Molla E Majid
  - Computer Applications Department, Academic Bridge Program, Qatar Foundation, Doha, Qatar
- Mohammad Nashbat
  - Chemical Engineering Department, University of Doha for Science and Technology, Doha, Qatar
- Saad Bin Abul Kashem
  - Department of Computing Science, AFG College with the University of Aberdeen, Doha, Qatar
- Ali K Ansaruddin Kunju
  - Chemical Engineering Department, University of Doha for Science and Technology, Doha, Qatar
- Amith Khandakar
  - Department of Electrical Engineering, Qatar University, Doha, Qatar
- Sakib Mahmud
  - Department of Electrical Engineering, Qatar University, Doha, Qatar

35
Takahashi K, Ozawa E, Shimakura A, Mori T, Miyaaki H, Nakao K. Recent Advances in Endoscopic Ultrasound for Gallbladder Disease Diagnosis. Diagnostics (Basel) 2024; 14:374. [PMID: 38396413 PMCID: PMC10887964 DOI: 10.3390/diagnostics14040374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/27/2023] [Revised: 02/01/2024] [Accepted: 02/05/2024] [Indexed: 02/25/2024] Open
Abstract
Gallbladder (GB) disease is classified into two broad categories: GB wall-thickening and protuberant lesions, which include various lesions, such as adenomyomatosis, cholecystitis, GB polyps, and GB carcinoma. This review summarizes recent advances in the differential diagnosis of GB lesions, focusing primarily on endoscopic ultrasound (EUS) and related technologies. Fundamental B-mode EUS and contrast-enhanced harmonic EUS (CH-EUS) have been reported to be useful for the diagnosis of GB diseases because they can evaluate the thickening of the GB wall and protuberant lesions in detail. We also outline the current status of EUS-guided fine-needle aspiration (EUS-FNA) for GB lesions, as there have been scattered reports on EUS-FNA in recent years. Furthermore, artificial intelligence (AI) technologies, ranging from machine learning to deep learning, have become popular in healthcare for disease diagnosis, drug discovery, drug development, and patient risk identification. In this review, we outline the current status of AI in the diagnosis of GB disease.
Affiliation(s)
- Kosuke Takahashi
  - Department of Gastroenterology and Hepatology, Graduate School of Biomedical Sciences, Nagasaki University, Nagasaki 852-8501, Japan; (E.O.); (T.M.); (H.M.); (K.N.)

36
Doo FX, Kulkarni P, Siegel EL, Toland M, Yi PH, Carlos RC, Parekh VS. Economic and Environmental Costs of Cloud Technologies for Medical Imaging and Radiology Artificial Intelligence. J Am Coll Radiol 2024; 21:248-256. [PMID: 38072221 DOI: 10.1016/j.jacr.2023.11.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2023] [Revised: 11/07/2023] [Accepted: 11/10/2023] [Indexed: 01/18/2024]
Abstract
Radiology is on the verge of a technological revolution driven by artificial intelligence (including large language models), which requires robust computing and storage capabilities, often beyond the capacity of current non-cloud-based informatics systems. The cloud presents a potential solution for radiology, and we should weigh its economic and environmental implications. Recently, cloud technologies have become a cost-effective strategy by providing necessary infrastructure while reducing expenditures associated with hardware ownership, maintenance, and upgrades. Simultaneously, given the optimized energy consumption in modern cloud data centers, this transition is expected to reduce the environmental footprint of radiologic operations. The path to cloud integration comes with its own challenges, and radiology informatics leaders must consider elements such as cloud architectural choices, pricing, data security, uptime service agreements, user training and support, and broader interoperability. With the increasing importance of data-driven tools in radiology, understanding and navigating the cloud landscape will be essential for the future of radiology and its various stakeholders.
Affiliation(s)
- Florence X Doo
  - University of Maryland Medical Intelligent Imaging (UM2ii) Center, Department of Radiology and Nuclear Medicine, University of Maryland, Baltimore, Maryland
- Pranav Kulkarni
  - University of Maryland Medical Intelligent Imaging (UM2ii) Center, Department of Radiology and Nuclear Medicine, University of Maryland, Baltimore, Maryland. https://twitter.com/itsPranavK
- Eliot L Siegel
  - University of Maryland Medical Intelligent Imaging (UM2ii) Center, Department of Radiology and Nuclear Medicine, University of Maryland, Baltimore, Maryland; Associate Vice Chair, University of Maryland, Baltimore, Maryland. https://twitter.com/EliotSiegel
- Michael Toland
  - Senior Director of IT, Department of Diagnostic Imaging and Nuclear Medicine, University of Maryland Medical System, Baltimore, Maryland
- Paul H Yi
  - University of Maryland Medical Intelligent Imaging (UM2ii) Center, Department of Radiology and Nuclear Medicine, University of Maryland, Baltimore, Maryland. https://twitter.com/PaulYiMD
- Ruth C Carlos
  - University of Michigan, Ann Arbor, Michigan; and Editor-in-Chief, Journal of the American College of Radiology. https://twitter.com/ruthcarlosmd
- Vishwa S Parekh
  - University of Maryland Medical Intelligent Imaging (UM2ii) Center, Department of Radiology and Nuclear Medicine, University of Maryland, Baltimore, Maryland. https://twitter.com/vishwa_parekh

37
Zhou J, Zhou L, Wang D, Xu X, Li H, Chu Y, Han W, Gao X. Personalized and privacy-preserving federated heterogeneous medical image analysis with PPPML-HMI. Comput Biol Med 2024; 169:107861. [PMID: 38141449 DOI: 10.1016/j.compbiomed.2023.107861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2023] [Revised: 12/13/2023] [Accepted: 12/14/2023] [Indexed: 12/25/2023]
Abstract
Heterogeneous data is endemic in medical imaging due to the diverse device models and settings used by hospitals. However, there are few open-source frameworks for federated heterogeneous medical image analysis with personalization and privacy protection without the demand to modify the existing model structures or to share any private data. Here, we proposed PPPML-HMI, a novel open-source learning paradigm for personalized and privacy-preserving federated heterogeneous medical image analysis. To the best of our knowledge, personalization and privacy protection were discussed simultaneously for the first time under the federated scenario, by integrating the PerFedAvg algorithm and designing a novel cyclic secure aggregation with the homomorphic encryption algorithm. To show the utility of PPPML-HMI, we applied it to a simulated classification task, namely the classification of healthy people and patients from the RAD-ChestCT Dataset, and one real-world segmentation task, namely the segmentation of lung infections from COVID-19 CT scans. Meanwhile, we applied the improved deep leakage from gradients to simulate adversarial attacks and showed the strong privacy-preserving capability of PPPML-HMI. By applying PPPML-HMI to both tasks with different neural networks, a varied number of users, and sample sizes, we demonstrated the strong generalizability of PPPML-HMI in privacy-preserving federated learning on heterogeneous medical images.
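The federated backbone that PerFedAvg-style personalization builds on is weighted FedAvg aggregation: the server averages client model parameters, weighted by local sample counts. A minimal sketch (illustrative; dicts of NumPy arrays stand in for real model state, and the paper's secure aggregation and homomorphic encryption layers are omitted):

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """Weighted FedAvg aggregation.

    client_params: list of {name: np.ndarray} parameter dicts, one per client.
    client_sizes: local training-set size of each client.
    Returns the sample-count-weighted average of every parameter.
    """
    total = sum(client_sizes)
    keys = client_params[0].keys()
    return {k: sum(w * p[k] for w, p in zip(client_sizes, client_params)) / total
            for k in keys}
```

In the personalized variant, each client then takes one or more local adaptation steps from this shared average rather than deploying it directly, which is what lets heterogeneous hospitals keep hospital-specific behavior.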
Affiliation(s)
- Juexiao Zhou
  - Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
  - Computational Bioscience Research Center, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Longxi Zhou, Di Wang, Xiaopeng Xu, Haoyang Li, Yuetan Chu, Wenkai Han, Xin Gao
  - Same two KAUST affiliations as above.

38
Nanou A, Stoecklein NH, Doerr D, Driemel C, Terstappen LWMM, Coumans FAW. Training an automated circulating tumor cell classifier when the true classification is uncertain. PNAS NEXUS 2024; 3:pgae048. [PMID: 38371418 PMCID: PMC10873494 DOI: 10.1093/pnasnexus/pgae048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/18/2023] [Accepted: 01/17/2024] [Indexed: 02/20/2024]
Abstract
Circulating tumor cell (CTC) and tumor-derived extracellular vesicle (tdEV) loads are prognostic factors of survival in patients with carcinoma. The current method of CTC enumeration relies on operator review and, unfortunately, has moderate interoperator agreement (Fleiss' kappa 0.60) due to difficulties in classifying CTC-like events. We compared operator review, ACCEPT automated image processing, and refined output of a deep-learning algorithm to identify CTC and tdEV for the prediction of survival in patients with metastatic and nonmetastatic cancers. Operator review is only defined for CTC. Refinement was performed using automatic contrast maximization of events detected in cancer and in benign samples (CM-CTC). We used 418 samples from benign diseases, 6,293 from nonmetastatic breast, 2,408 from metastatic breast, and 698 from metastatic prostate cancer to train, test, optimize, and evaluate CTC and tdEV enumeration. For CTC identification, CM-CTC performed best on metastatic/nonmetastatic breast cancer, respectively, with a hazard ratio (HR) for overall survival of 2.6/2.1 vs. 2.4/1.4 for operator CTC and 1.2/0.8 for ACCEPT-CTC. For tdEV identification, CM-tdEV performed best with an HR of 1.6/2.9 vs. 1.5/1.0 with ACCEPT-tdEV. In conclusion, contrast maximization is effective even though it does not utilize domain knowledge.
Affiliation(s)
- Afroditi Nanou
  - Department of Medical Cell BioPhysics, Faculty of Science and Technology, University of Twente, Enschede 7522 NH, The Netherlands
- Nikolas H Stoecklein
  - Department of General, Visceral and Pediatric Surgery, Heinrich-Heine University, University Hospital Düsseldorf, Düsseldorf 40225, Germany
- Daniel Doerr
  - Institute for Medical Biometry and Bioinformatics, Heinrich Heine University, Düsseldorf, Germany
- Christiane Driemel
  - Department of General, Visceral and Pediatric Surgery, Heinrich-Heine University, University Hospital Düsseldorf, Düsseldorf 40225, Germany
- Leon W M M Terstappen
  - Department of Medical Cell BioPhysics, Faculty of Science and Technology, University of Twente, Enschede 7522 NH, The Netherlands
  - Decisive Science, Amsterdam 1019 BB, The Netherlands
- Frank A W Coumans
  - Department of Medical Cell BioPhysics, Faculty of Science and Technology, University of Twente, Enschede 7522 NH, The Netherlands
  - Decisive Science, Amsterdam 1019 BB, The Netherlands

39
Fuchs M, Gonzalez C, Frisch Y, Hahn P, Matthies P, Gruening M, Pinto Dos Santos D, Dratsch T, Kim M, Nensa F, Trenz M, Mukhopadhyay A. Closing the loop for AI-ready radiology. ROFO-FORTSCHR RONTG 2024; 196:154-162. [PMID: 37582385 DOI: 10.1055/a-2124-1958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/17/2023]
Abstract
BACKGROUND In recent years, AI has made significant advancements in medical diagnosis and prognosis. However, the incorporation of AI into clinical practice is still challenging and under-appreciated. We aim to demonstrate a possible vertical integration approach to close the loop for AI-ready radiology. METHOD This study highlights the importance of two-way communication for AI-assisted radiology. As a key part of the methodology, it demonstrates the integration of AI systems into clinical practice with structured reports and AI visualization, giving more insight into the AI system. By integrating cooperative lifelong learning into the AI system, we ensure the long-term effectiveness of the AI system, while keeping the radiologist in the loop. RESULTS We demonstrate the use of lifelong learning for AI systems by incorporating AI visualization and structured reports. We evaluate the Memory Aware Synapses and rehearsal approaches and find that both work in practice. Furthermore, we see the advantage of lifelong learning algorithms that do not require storing or maintaining samples from previous datasets. CONCLUSION In conclusion, incorporating AI into the clinical routine of radiology requires a two-way communication approach and seamless integration of the AI system, which we achieve with structured reports and visualization of the insight gained by the model. Closing the loop for radiology leads to successful integration, enabling lifelong learning for the AI system, which is crucial for sustainable long-term performance. KEY POINTS · AI systems can be integrated into the clinical routine with structured reports and AI visualization. · Two-way communication between AI and radiologists is necessary to keep the radiologist in the loop. · Closing the loop enables lifelong learning, which is crucial for long-term, high-performing AI in radiology.
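The rehearsal approach evaluated above needs a memory of past samples to replay during continual training; a common way to keep that memory bounded is reservoir sampling. A minimal sketch (illustrative; the buffer stores arbitrary Python objects and is not the authors' implementation):

```python
import random

class RehearsalBuffer:
    """Fixed-capacity rehearsal memory filled by reservoir sampling, so every
    sample seen in the stream has an equal chance of being retained."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, sample):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(sample)
        else:
            # keep the new sample with probability capacity / seen
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = sample

    def sample_batch(self, k):
        """Draw a replay batch to mix into the current task's training data."""
        return self.rng.sample(self.buffer, min(k, len(self.buffer)))
```

The contrast drawn in the abstract is that regularization methods such as Memory Aware Synapses avoid this buffer entirely, which matters when past radiology data cannot be stored for legal or privacy reasons.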
Affiliation(s)
- Maximilian Gruening
  - Interorganisational Information Systems, Georg-August-Universität Göttingen, Goettingen, Germany
- Daniel Pinto Dos Santos
  - Institute for Diagnostic and Interventional Radiology, Uniklinik Köln, Germany
  - Institute for Diagnostic and Interventional Radiology, Universitätsklinikum Frankfurt, Frankfurt am Main, Germany
- Thomas Dratsch
  - Institute for Diagnostic and Interventional Radiology, Uniklinik Köln, Germany
- Moon Kim
  - Institute for Diagnostic and Interventional Radiology and Neuroradiology, Universitätsklinikum Essen, Germany
  - Institute for Artificial Intelligence in Medicine, Universitätsklinikum Essen, Germany
- Felix Nensa
  - Institute for Diagnostic and Interventional Radiology and Neuroradiology, Universitätsklinikum Essen, Germany
  - Institute for Artificial Intelligence in Medicine, Universitätsklinikum Essen, Germany
- Manuel Trenz
  - Interorganisational Information Systems, Georg-August-Universität Göttingen, Goettingen, Germany

40
Tripathi S, Tabari A, Mansur A, Dabbara H, Bridge CP, Daye D. From Machine Learning to Patient Outcomes: A Comprehensive Review of AI in Pancreatic Cancer. Diagnostics (Basel) 2024; 14:174. [PMID: 38248051 PMCID: PMC10814554 DOI: 10.3390/diagnostics14020174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2023] [Revised: 12/28/2023] [Accepted: 12/29/2023] [Indexed: 01/23/2024] Open
Abstract
Pancreatic cancer is a highly aggressive and difficult-to-detect cancer with a poor prognosis. Late diagnosis is common due to a lack of early symptoms, specific markers, and the challenging location of the pancreas. Imaging technologies have improved diagnosis, but there is still room for improvement in standardizing guidelines. Biopsies and histopathological analysis are challenging due to tumor heterogeneity. Artificial Intelligence (AI) revolutionizes healthcare by improving diagnosis, treatment, and patient care. AI algorithms can analyze medical images with precision, aiding in early disease detection. AI also plays a role in personalized medicine by analyzing patient data to tailor treatment plans. It streamlines administrative tasks, such as medical coding and documentation, and provides patient assistance through AI chatbots. However, challenges include data privacy, security, and ethical considerations. This review article focuses on the potential of AI in transforming pancreatic cancer care, offering improved diagnostics, personalized treatments, and operational efficiency, leading to better patient outcomes.
Affiliation(s)
- Satvik Tripathi
  - Department of Radiology, Massachusetts General Hospital, Boston, MA 02114, USA
  - Athinoula A. Martinos Center for Biomedical Imaging, Charlestown, MA 02129, USA
  - Harvard Medical School, Boston, MA 02115, USA
- Azadeh Tabari
  - Department of Radiology, Massachusetts General Hospital, Boston, MA 02114, USA
  - Harvard Medical School, Boston, MA 02115, USA
- Arian Mansur
  - Department of Radiology, Massachusetts General Hospital, Boston, MA 02114, USA
  - Harvard Medical School, Boston, MA 02115, USA
- Harika Dabbara
  - Boston University Chobanian & Avedisian School of Medicine, Boston, MA 02118, USA
- Christopher P. Bridge
  - Department of Radiology, Massachusetts General Hospital, Boston, MA 02114, USA
  - Athinoula A. Martinos Center for Biomedical Imaging, Charlestown, MA 02129, USA
  - Harvard Medical School, Boston, MA 02115, USA
- Dania Daye
  - Department of Radiology, Massachusetts General Hospital, Boston, MA 02114, USA
  - Athinoula A. Martinos Center for Biomedical Imaging, Charlestown, MA 02129, USA
  - Harvard Medical School, Boston, MA 02115, USA

41
Su X, Liu W, Jiang S, Gao X, Chu Y, Ma L. Deep learning-based anatomical position recognition for gastroscopic examination. Technol Health Care 2024; 32:39-48. [PMID: 38669495 PMCID: PMC11191429 DOI: 10.3233/thc-248004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/28/2024]
Abstract
BACKGROUND The gastroscopic examination is a preferred method for the detection of upper gastrointestinal lesions. However, gastroscopic examination places high demands on doctors, especially regarding the strict position and number of archived images. These requirements are challenging for the education and training of junior doctors. OBJECTIVE The purpose of this study is to use deep learning to develop automatic position recognition technology for gastroscopic examination. METHODS A total of 17182 gastroscopic images in eight anatomical position categories were collected. The convolutional neural network model MogaNet is used to identify all the anatomical positions of the stomach for gastroscopic examination. The performance of four models is evaluated by sensitivity, precision, and F1 score. RESULTS The average sensitivity of the proposed method is 0.963, which is 0.074, 0.066 and 0.065 higher than ResNet, GoogleNet and SqueezeNet, respectively. The average precision of the proposed method is 0.964, which is 0.072, 0.067 and 0.068 higher than ResNet, GoogleNet, and SqueezeNet, respectively. The average F1 score of the proposed method is 0.964, which is 0.074, 0.067 and 0.067 higher than ResNet, GoogleNet, and SqueezeNet, respectively. The results of the t-test show that the proposed method is significantly different from the other methods (p < 0.05). CONCLUSION The proposed method exhibits the best performance for anatomical position recognition and can help junior doctors quickly meet the requirements on the completeness of gastroscopic examination and on the number and position of archived images.
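The sensitivity, precision, and F1 figures above are averages over the eight anatomical position classes. A minimal sketch of how such macro-averaged metrics follow from a multi-class confusion matrix (illustrative NumPy version, not the study's evaluation code; it assumes every class appears and is predicted at least once, so no division-by-zero guard is included):

```python
import numpy as np

def macro_metrics(conf):
    """Macro-averaged sensitivity (recall), precision, and F1.

    conf[i, j] = number of samples of true class i predicted as class j.
    Per-class metrics are computed from the matrix's rows/columns and then
    averaged with equal weight per class.
    """
    tp = np.diag(conf).astype(float)
    sensitivity = tp / conf.sum(axis=1)   # recall: row sums are true counts
    precision = tp / conf.sum(axis=0)     # column sums are predicted counts
    f1 = 2 * sensitivity * precision / (sensitivity + precision)
    return sensitivity.mean(), precision.mean(), f1.mean()
```

Macro averaging weights each anatomical position equally, so rare positions count as much as common ones, which suits a completeness requirement where every position must be covered.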
Affiliation(s)
- Xiufeng Su
- Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, Shandong, China
- Weiyu Liu
- Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, Shandong, China
- Suyi Jiang
- Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, Shandong, China
- Xiaozhong Gao
- Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, Shandong, China
- Yanliu Chu
- Weihai Municipal Hospital, Cheeloo College of Medicine, Shandong University, Weihai, Shandong, China
- Liyong Ma
- School of Information Science and Engineering, Harbin Institute of Technology, Weihai, Shandong, China
42
Betshrine Rachel R, Khanna Nehemiah H, Singh VK, Manoharan RMV. Diagnosis of Covid-19 from CT slices using Whale Optimization Algorithm, Support Vector Machine and Multi-Layer Perceptron. JOURNAL OF X-RAY SCIENCE AND TECHNOLOGY 2024; 32:253-269. [PMID: 38189732 DOI: 10.3233/xst-230196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/09/2024]
Abstract
BACKGROUND The coronavirus disease 2019 is a serious and highly contagious disease caused by infection with a newly discovered virus, named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). OBJECTIVE A Computer Aided Diagnosis (CAD) system to assist physicians in diagnosing Covid-19 from chest Computed Tomography (CT) slices is modelled and evaluated. METHODS The lung tissues are segmented using Otsu's thresholding method. The Covid-19 lesions are annotated as Regions of Interest (ROIs), followed by texture and shape feature extraction. The obtained features are stored as feature vectors and split into 80:20 train and test sets. To choose the optimal features, the Whale Optimization Algorithm (WOA) is employed with the accuracy of a Support Vector Machine (SVM) classifier as the fitness function. A Multi-Layer Perceptron (MLP) classifier is then trained on the selected features. RESULTS Comparative experiments of the proposed system against eight existing benchmark machine learning classifiers on a real-time dataset demonstrate that the proposed system, with 88.94% accuracy, outperforms the benchmark classifiers. Statistical analyses, namely the Friedman test, the Mann-Whitney U test, and Kendall's rank correlation coefficient test, indicate that the proposed method has a significant impact on the novel dataset considered. CONCLUSION The MLP classifier's accuracy without feature selection was 80.40%, whereas with feature selection using WOA it reached 88.94%.
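The wrapper-style feature selection described here (an optimizer scoring feature subsets by a classifier's accuracy) can be sketched as follows. This is an assumption-laden stand-in, not the paper's code: a random binary search replaces the whale optimizer and a nearest-centroid classifier replaces the SVM, and all data is illustrative:

```python
import random

def accuracy(train, test, subset):
    """Held-out accuracy of a nearest-centroid classifier on `subset` features."""
    groups = {}
    for x, y in train:
        groups.setdefault(y, []).append(x)
    # Per-class centroid restricted to the selected feature indices
    cents = {y: [sum(v[i] for v in xs) / len(xs) for i in subset]
             for y, xs in groups.items()}
    correct = 0
    for x, y in test:
        pred = min(cents, key=lambda c: sum((x[i] - m) ** 2
                                            for i, m in zip(subset, cents[c])))
        correct += pred == y
    return correct / len(test)

def select_features(train, test, n_features, iters=200, seed=0):
    """Random binary search over subsets, scored by wrapper accuracy."""
    rng = random.Random(seed)
    best = list(range(n_features))
    best_acc = accuracy(train, test, best)
    for _ in range(iters):
        subset = [i for i in range(n_features) if rng.random() < 0.5] or [0]
        acc = accuracy(train, test, subset)
        # Prefer higher accuracy; break ties toward fewer features
        if acc > best_acc or (acc == best_acc and len(subset) < len(best)):
            best, best_acc = subset, acc
    return best, best_acc

# Feature 0 separates the classes; feature 1 is uninformative noise
train = [([0.0, 5.0], 0), ([0.1, -5.0], 0), ([1.0, 4.0], 1), ([0.9, -4.0], 1)]
test = [([0.05, -3.0], 0), ([0.95, 3.0], 1)]
best, best_acc = select_features(train, test, n_features=2)
```

The tie-breaking rule toward smaller subsets mirrors why wrapper selection can match full-feature accuracy with fewer inputs, as in the abstract's 80.40% vs. 88.94% comparison.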
Affiliation(s)
- R Betshrine Rachel
- Ramanujan Computing Centre, College of Engineering Guindy, Anna University, Chennai, Tamil Nadu, India
- H Khanna Nehemiah
- Ramanujan Computing Centre, College of Engineering Guindy, Anna University, Chennai, Tamil Nadu, India
- Vaibhav Kumar Singh
- Alumna, Department of Information Science and Technology, College of Engineering Guindy, Anna University, Chennai, Tamil Nadu, India
- Rebecca Mercy Victoria Manoharan
- Alumna, Department of Computer Science and Engineering, College of Engineering Guindy, Anna University, Chennai, Tamil Nadu, India
43
Yang X, Huang K, Yang D, Zhao W, Zhou X. Biomedical Big Data Technologies, Applications, and Challenges for Precision Medicine: A Review. GLOBAL CHALLENGES (HOBOKEN, NJ) 2024; 8:2300163. [PMID: 38223896 PMCID: PMC10784210 DOI: 10.1002/gch2.202300163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 07/02/2023] [Revised: 09/20/2023] [Indexed: 01/16/2024]
Abstract
The explosive growth of biomedical Big Data presents both significant opportunities and challenges in the realm of knowledge discovery and translational applications within precision medicine. Efficient management, analysis, and interpretation of big data can pave the way for groundbreaking advancements in precision medicine. However, the unprecedented strides in the automated collection of large-scale molecular and clinical data have also introduced formidable challenges in terms of data analysis and interpretation, necessitating the development of novel computational approaches. Some potential challenges include the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues. This overview article focuses on the recent progress and breakthroughs in the application of big data within precision medicine. Key aspects are summarized, including content, data sources, technologies, tools, challenges, and existing gaps. Nine fields are discussed: data warehouse and data management, electronic medical records, biomedical imaging informatics, artificial intelligence-aided surgical design and surgery optimization, omics data, health monitoring data, knowledge graphs, public health informatics, and security and privacy.
Affiliation(s)
- Xue Yang
- Department of Pancreatic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
- Kexin Huang
- Department of Pancreatic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
- Dewei Yang
- College of Advanced Manufacturing Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400000, China
- Weiling Zhao
- Center for Systems Medicine, School of Biomedical Informatics, UTHealth at Houston, Houston, TX 77030, USA
- Xiaobo Zhou
- Center for Systems Medicine, School of Biomedical Informatics, UTHealth at Houston, Houston, TX 77030, USA
44
Panagiotidis E, Papachristou K, Makridou A, Zoglopitou LA, Paschali A, Kalathas T, Chatzimarkou M, Chatzipavlidou V. Review of artificial intelligence clinical applications in Nuclear Medicine. Nucl Med Commun 2024; 45:24-34. [PMID: 37901920 DOI: 10.1097/mnm.0000000000001786] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/31/2023]
Abstract
This paper provides an in-depth analysis of the clinical applications of artificial intelligence (AI) in Nuclear Medicine, focusing on three key areas: neurology, cardiology, and oncology. Beginning with neurology, specifically Alzheimer's disease and Parkinson's disease, the paper examines reviews on diagnosis and treatment planning. The same pattern is followed in cardiology studies. In the final section on oncology, the paper explores the various AI applications in multiple cancer types, including lung, head and neck, lymphoma, and pancreatic cancer.
Affiliation(s)
- Anna Makridou
- Medical Physics Department, Cancer Hospital of Thessaloniki 'Theagenio', Thessaloniki, Greece
- Anna Paschali
- Nuclear Medicine Department, Cancer Hospital of Thessaloniki 'Theagenio' and
- Theodoros Kalathas
- Nuclear Medicine Department, Cancer Hospital of Thessaloniki 'Theagenio' and
- Michael Chatzimarkou
- Medical Physics Department, Cancer Hospital of Thessaloniki 'Theagenio', Thessaloniki, Greece
45
Jin Y, Yin H, Zhang H, Wang Y, Liu S, Yang L, Song B. Predicting tumor deposits in rectal cancer: a combined deep learning model using T2-MR imaging and clinical features. Insights Imaging 2023; 14:221. [PMID: 38117396 PMCID: PMC10733230 DOI: 10.1186/s13244-023-01564-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2023] [Accepted: 11/05/2023] [Indexed: 12/21/2023] Open
Abstract
BACKGROUND Tumor deposits (TDs) are associated with poor prognosis in rectal cancer (RC). This study aims to develop and validate a deep learning (DL) model incorporating T2-MR images and clinical factors for the preoperative prediction of TDs in RC patients. MATERIALS AND METHODS A total of 327 RC patients with pathologically confirmed TD status from January 2016 to December 2019 were retrospectively recruited, and the T2-MR images and clinical variables were collected. Patients were randomly split into a development dataset (n = 246) and an independent testing dataset (n = 81). A single-channel DL model, a multi-channel DL model, a hybrid DL model, and a clinical model were constructed. The performance of these predictive models was assessed using receiver operating characteristic (ROC) analysis and decision curve analysis (DCA). RESULTS The areas under the curves (AUCs) of the clinical, single-DL, multi-DL, and hybrid-DL models were 0.734 (95% CI, 0.674-0.788), 0.710 (95% CI, 0.649-0.766), 0.767 (95% CI, 0.710-0.819), and 0.857 (95% CI, 0.807-0.898) in the development dataset. The AUC of the hybrid-DL model was significantly higher than those of the single-DL and multi-DL models (both p < 0.001) in the development dataset, and than that of the single-DL model (p = 0.028) in the testing dataset. Decision curve analysis demonstrated that the hybrid-DL model had higher net benefit than the other models across the majority of the threshold probability range. CONCLUSIONS The proposed hybrid-DL model achieved good predictive efficacy and could be used to predict tumor deposits in rectal cancer. CRITICAL RELEVANCE STATEMENT The proposed hybrid-DL model achieved good predictive efficacy and could be used to predict tumor deposits in rectal cancer. KEY POINTS • Preoperative non-invasive identification of TDs is of great clinical significance. • The combined hybrid-DL model achieved good predictive efficacy and could be used to predict tumor deposits in rectal cancer.
• A preoperative nomogram provides gastroenterologists with an accurate and effective tool.
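The AUC values compared above can be computed with the rank (Mann-Whitney) formulation of the area under the ROC curve, sketched below with illustrative labels and scores (not the study's data):

```python
def auc(labels, scores):
    """Probability that a random positive is scored above a random negative,
    counting ties as 1/2 (equivalent to the area under the ROC curve)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative: 1 = tumor deposit present, score = model probability
y = [1, 1, 1, 0, 0, 0]
s = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
model_auc = auc(y, s)
```

This pairwise formulation makes clear why AUC is threshold-free, which is why the abstract reports it alongside decision curve analysis rather than a single operating point.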
Affiliation(s)
- Yumei Jin
- Department of Medical Imaging Center, Qujing First People's Hospital, Qujing, 655000, Yunnan Province, China.
- Department of Radiology, West China Hospital of Sichuan University, Chengdu, 610041, Sichuan Province, China.
- Hongkun Yin
- Beijing Infervision Technology Co., Ltd., Beijing, China
- Huiling Zhang
- Beijing Infervision Technology Co., Ltd., Beijing, China
- Yewu Wang
- Department of Joint and Sports Medicine, Qujing First People's Hospital, Qujing, 655000, Yunnan Province, China
- Shengmei Liu
- Department of Radiology, West China Hospital of Sichuan University, Chengdu, 610041, Sichuan Province, China
- Ling Yang
- Department of Radiology, West China Hospital of Sichuan University, Chengdu, 610041, Sichuan Province, China
- Bin Song
- Department of Radiology, West China Hospital of Sichuan University, Chengdu, 610041, Sichuan Province, China.
- Functional and Molecular Imaging Key Laboratory of Sichuan Province, West China Hospital of Sichuan University, Chengdu, 610041, Sichuan Province, China.
- Department of Radiology, Sanya People's Hospital, Sanya, Hainan Province, 572000, China.
46
Liu F, Zhu T, Wu X, Yang B, You C, Wang C, Lu L, Liu Z, Zheng Y, Sun X, Yang Y, Clifton L, Clifton DA. A medical multimodal large language model for future pandemics. NPJ Digit Med 2023; 6:226. [PMID: 38042919 PMCID: PMC10693607 DOI: 10.1038/s41746-023-00952-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2023] [Accepted: 10/24/2023] [Indexed: 12/04/2023] Open
Abstract
Deep neural networks have been integrated into the whole clinical decision procedure, where they can improve the efficiency of diagnosis and alleviate the heavy workload of physicians. Since most neural networks are supervised, their performance heavily depends on the volume and quality of available labels. However, few such labels exist for rare diseases (e.g., new pandemics). Here we report a medical multimodal large language model (Med-MLLM) for radiograph representation learning, which can learn broad medical knowledge (e.g., image understanding, text semantics, and clinical phenotypes) from unlabelled data. As a result, when encountering a rare disease, our Med-MLLM can be rapidly deployed and easily adapted to it with limited labels. Furthermore, our model supports medical data across the visual modality (e.g., chest X-ray and CT) and the textual modality (e.g., medical reports and free-text clinical notes); therefore, it can be used for clinical tasks that involve both visual and textual data. We demonstrate the effectiveness of our Med-MLLM by showing how it would perform using the COVID-19 pandemic "in replay". In the retrospective setting, we test the model on the early COVID-19 datasets; and in the prospective setting, we test the model on the new variant COVID-19-Omicron. The experiments are conducted on 1) three kinds of input data; 2) three kinds of downstream tasks, including disease reporting, diagnosis, and prognosis; 3) five COVID-19 datasets; and 4) three different languages, including English, Chinese, and Spanish. All experiments show that our model can provide accurate and robust COVID-19 decision support with little labelled data.
Affiliation(s)
- Fenglin Liu
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK.
- Tingting Zhu
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
- Xian Wu
- Jarvis Research Center, Tencent YouTu Lab, Beijing, China
- Bang Yang
- School of Computer Science, Peking University, Beijing, China
- Chenyang Wang
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
- Lei Lu
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
- Zhangdaihong Liu
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
- Oxford-Suzhou Centre for Advanced Research, Suzhou, China
- Yefeng Zheng
- Jarvis Research Center, Tencent YouTu Lab, Beijing, China
- Xu Sun
- School of Computer Science, Peking University, Beijing, China
- Yang Yang
- School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Lei Clifton
- Nuffield Department of Population Health, University of Oxford, Oxford, UK
- David A Clifton
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK.
- Oxford-Suzhou Centre for Advanced Research, Suzhou, China.
47
Wang Q, Lai MW, Bin G, Ding Q, Wu S, Zhou Z, Tsui PH. MBR-Net: A multi-branch residual network based on ultrasound backscattered signals for characterizing pediatric hepatic steatosis. ULTRASONICS 2023; 135:107093. [PMID: 37482038 DOI: 10.1016/j.ultras.2023.107093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/13/2023] [Revised: 05/18/2023] [Accepted: 06/23/2023] [Indexed: 07/25/2023]
Abstract
The evaluation of pediatric hepatic steatosis and early detection of fatty liver in children are of critical importance. In this paper, a convolutional neural network (CNN) model operating on ultrasound backscattered signals, the multi-branch residual network (MBR-Net), was proposed for characterizing pediatric hepatic steatosis. The MBR-Net is composed of three convolutional branches. Each branch uses a different convolution block size to enhance local feature acquisition and leverages the residual mechanism with skip connections to guide the network to effectively capture features. A total of 393 frames of ultrasound backscattered signals collected from 131 children were included in the experiments. The hepatic steatosis index was used as the reference standard for diagnosing the steatosis grade, G0-G3. The ultrasound backscattered signals within the liver regions of interest (ROIs) were normalized and augmented using a sliding gate method. The gated ROI signals were randomly divided into training, validation, and test sets in an 8:1:1 ratio. The area under the receiver operating characteristic curve (AUC), accuracy (ACC), sensitivity (SEN), and specificity (SPE) were used as the evaluation metrics. Experimental results showed that the MBR-Net yields AUCs for diagnosing pediatric hepatic steatosis grade ≥G1, ≥G2, and ≥G3 of 0.94 (ACC: 93.65%; SEN: 89.79%; SPE: 84.48%), 0.93 (ACC: 90.48%; SEN: 87.75%; SPE: 82.65%), and 0.93 (ACC: 87.76%; SEN: 84.84%; SPE: 86.55%), respectively, which were superior to conventional one-branch CNNs without residual mechanisms. The proposed MBR-Net can be used as a new deep learning method for ultrasound backscattered signal analysis to characterize pediatric hepatic steatosis.
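The "sliding gate" normalization-and-augmentation step described here can be sketched as cutting a long 1-D backscattered signal into overlapping windows. The gate length, stride, and normalization scheme below are illustrative assumptions, not the paper's parameters:

```python
def sliding_gates(signal, gate_len, stride):
    """Normalize a 1-D signal (zero mean, unit peak) and return
    overlapping gated segments of length `gate_len` every `stride` samples."""
    mean = sum(signal) / len(signal)
    centred = [s - mean for s in signal]
    peak = max(abs(s) for s in centred) or 1.0  # guard against constant signals
    norm = [s / peak for s in centred]
    return [norm[i:i + gate_len]
            for i in range(0, len(norm) - gate_len + 1, stride)]

# A toy 10-sample "signal" cut into gates of 4 samples with stride 2
gates = sliding_gates(list(range(10)), gate_len=4, stride=2)
```

An overlapping stride multiplies the number of training segments per ROI, which is the augmentation effect the abstract relies on given only 393 frames.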
Affiliation(s)
- Qian Wang
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing University of Technology, Beijing, China
- Ming-Wei Lai
- Division of Pediatric Gastroenterology, Department of Pediatrics, Chang Gung Children's Medical Center, Chang Gung Memorial Hospital, Linkou, College of Medicine, Chang Gung University, Taoyuan, Taiwan; Liver Research Center, Chang Gung Memorial Hospital, Linkou, Taoyuan, Taiwan
- Guangyu Bin
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing University of Technology, Beijing, China
- Qiying Ding
- Department of Ultrasound, BJUT Hospital, Beijing University of Technology, Beijing, China
- Shuicai Wu
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing University of Technology, Beijing, China
- Zhuhuang Zhou
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing University of Technology, Beijing, China.
- Po-Hsiang Tsui
- Department of Medical Imaging and Radiological Sciences, College of Medicine, Chang Gung University, Taoyuan, Taiwan; Institute for Radiological Research, Chang Gung University, Taoyuan, Taiwan; Division of Pediatric Gastroenterology, Department of Pediatrics, Chang Gung Memorial Hospital at Linkou, Taoyuan, Taiwan.
48
Nicolaes J, Skjødt MK, Raeymaeckers S, Smith CD, Abrahamsen B, Fuerst T, Debois M, Vandermeulen D, Libanati C. Towards Improved Identification of Vertebral Fractures in Routine Computed Tomography (CT) Scans: Development and External Validation of a Machine Learning Algorithm. J Bone Miner Res 2023; 38:1856-1866. [PMID: 37747147 DOI: 10.1002/jbmr.4916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/17/2023] [Revised: 09/06/2023] [Accepted: 09/17/2023] [Indexed: 09/26/2023]
Abstract
Vertebral fractures (VFs) are the hallmark of osteoporosis, being one of the most frequent types of fragility fracture and an early sign of the disease, and they are associated with significant morbidity and mortality. VFs are found incidentally in one out of five imaging studies; however, more than half of VFs are neither identified nor reported in patient computed tomography (CT) scans. Our study aimed to develop a machine learning algorithm to identify VFs in abdominal/chest CT scans and evaluate its performance. We acquired two independent data sets of routine abdominal/chest CT scans of patients aged 50 years or older: a training set of 1011 scans from a non-interventional, prospective proof-of-concept study at the Universitair Ziekenhuis (UZ) Brussel and a validation set of 2000 subjects from an observational cohort study at the Hospital of Holbaek. Both data sets were externally reevaluated to identify reference standard VF readings using the Genant semiquantitative (SQ) grading. Four independent models were trained in a cross-validation experiment using the training set, and an ensemble of the four models was applied to the external validation set. The validation set contained 15.3% scans with one or more VFs (SQ2-3), whereas 663 of 24,930 evaluable vertebrae (2.7%) were fractured (SQ2-3) per the reference standard readings. Comparison of the ensemble model with the reference standard readings in identifying subjects with one or more moderate or severe VFs resulted in an area under the receiver operating characteristic curve (AUROC) of 0.88 (95% confidence interval [CI], 0.85-0.90), accuracy of 0.92 (95% CI, 0.91-0.93), kappa of 0.72 (95% CI, 0.67-0.76), sensitivity of 0.81 (95% CI, 0.76-0.85), and specificity of 0.95 (95% CI, 0.93-0.96). We demonstrated that a machine learning algorithm trained for VF detection achieved strong performance on an external validation set. It has the potential to support healthcare professionals with the early identification of VFs and prevention of future fragility fractures. © 2023 UCB S.A. and The Authors. Journal of Bone and Mineral Research published by Wiley Periodicals LLC on behalf of American Society for Bone and Mineral Research (ASBMR).
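The ensemble step and the subject-level sensitivity/specificity evaluation described above can be sketched as follows. The averaging rule, threshold, and all probabilities are illustrative assumptions, not the study's values:

```python
def ensemble_predict(prob_lists, threshold=0.5):
    """Average per-subject probabilities across models, then threshold."""
    avg = [sum(ps) / len(ps) for ps in zip(*prob_lists)]
    return [int(p >= threshold) for p in avg]

def sens_spec(y_true, y_pred):
    """Sensitivity and specificity from binary labels and predictions."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    tn = sum((not t) and (not p) for t, p in zip(y_true, y_pred))
    fn = sum(t and not p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)

# Four models' illustrative VF probabilities for four subjects
models = [[0.9, 0.2, 0.6, 0.1],
          [0.8, 0.4, 0.5, 0.2],
          [0.7, 0.1, 0.7, 0.3],
          [0.9, 0.3, 0.4, 0.2]]
y_true = [1, 0, 1, 0]
y_pred = ensemble_predict(models)
sensitivity, specificity = sens_spec(y_true, y_pred)
```

Probability averaging is one common way to combine cross-validated models; the abstract does not specify which ensembling rule the authors used.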
Affiliation(s)
- Joeri Nicolaes
- Department of Electrical Engineering (ESAT), Center for Processing Speech and Images, KU Leuven, Leuven, Belgium
- UCB Pharma, Brussels, Belgium
- Michael Kriegbaum Skjødt
- Department of Medicine, Hospital of Holbaek, Holbaek, Denmark
- OPEN-Open Patient Data Explorative Network, Department of Clinical Research, University of Southern Denmark and Odense University Hospital, Odense, Denmark
- Christopher Dyer Smith
- OPEN-Open Patient Data Explorative Network, Department of Clinical Research, University of Southern Denmark and Odense University Hospital, Odense, Denmark
- Bo Abrahamsen
- Department of Medicine, Hospital of Holbaek, Holbaek, Denmark
- OPEN-Open Patient Data Explorative Network, Department of Clinical Research, University of Southern Denmark and Odense University Hospital, Odense, Denmark
- NDORMS, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Oxford University Hospitals, Oxford, UK
- Dirk Vandermeulen
- Department of Electrical Engineering (ESAT), Center for Processing Speech and Images, KU Leuven, Leuven, Belgium
49
Chen H, Wang R, Wang X, Li J, Fang Q, Li H, Bai J, Peng Q, Meng D, Wang L. Unsupervised Local Discrimination for Medical Images. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:15912-15929. [PMID: 37494162 DOI: 10.1109/tpami.2023.3299038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/28/2023]
Abstract
Contrastive learning, which aims to capture general representations from unlabeled images to initialize medical analysis models, has been proven effective in alleviating the high demand for expensive annotations. Current methods mainly focus on instance-wise comparisons to learn global discriminative features, while neglecting the local details needed to distinguish tiny anatomical structures, lesions, and tissues. To address this challenge, in this paper, we propose a general unsupervised representation learning framework, named local discrimination (LD), to learn local discriminative features for medical images by closely embedding semantically similar pixels and identifying regions of similar structures across different images. Specifically, this model is equipped with an embedding module for pixel-wise embedding and a clustering module for generating segmentation. These two modules are unified by optimizing our novel region discrimination loss function in a mutually beneficial mechanism, which enables our model to reflect structure information as well as measure pixel-wise and region-wise similarity. Furthermore, based on LD, we propose a center-sensitive one-shot landmark localization algorithm and a shape-guided cross-modality segmentation model to foster the generalizability of our model. When transferred to downstream tasks, the representation learned by our method shows better generalization, outperforming representations from 18 state-of-the-art (SOTA) methods and winning 9 out of 12 downstream tasks. Especially for the challenging lesion segmentation tasks, the proposed method achieves significantly better performance.
50
Candemir S, Moranville R, Wong KA, Campbell W, Bigelow MT, Prevedello LM, Makary MS. Detecting and Characterizing Inferior Vena Cava Filters on Abdominal Computed Tomography with Data-Driven Computational Frameworks. J Digit Imaging 2023; 36:2507-2518. [PMID: 37770730 PMCID: PMC10584764 DOI: 10.1007/s10278-023-00882-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2022] [Revised: 06/27/2023] [Accepted: 07/05/2023] [Indexed: 09/30/2023] Open
Abstract
Two data-driven algorithms were developed for detecting and characterizing inferior vena cava (IVC) filters on abdominal computed tomography (CT) to assist healthcare providers with the appropriate management of these devices and decrease complications: one based on 2-dimensional data and transfer learning (2D + TL) and an augmented version of the same algorithm that accounts for 3-dimensional information by leveraging recurrent convolutional neural networks (3D + RCNN). The dataset contains 2048 abdominal CT studies obtained from 439 patients who underwent IVC filter placement during the 10-year period from January 1st, 2009, to January 1st, 2019. Among these, 399 patients had retrievable filters, and 40 had non-retrievable filter types. The reference annotations for the filter location were obtained through a custom-developed interface. The ground truth annotations for the filter types were determined based on the electronic medical record and physician review of imaging. The initial stage of the framework returns a list of locations containing metallic objects based on the density of the structure. The second stage processes the candidate locations and determines which one contains an IVC filter. The final stage of the pipeline classifies the filter types as retrievable vs. non-retrievable. The computational models were trained using the TensorFlow Keras API on an Nvidia Quadro GV100 system, with a fine-tuning supervised training strategy. The system achieves high sensitivity in detecting the filter locations with high confidence values: the 2D + TL model achieved a sensitivity of 0.911 and a precision of 0.804, and the 3D + RCNN model achieved a sensitivity of 0.923 and a precision of 0.853 for filter detection, with confidence values for the IVC location predictions of 0.993 for 2D + TL and 0.996 for 3D + RCNN. The filter type prediction component of the system achieved 0.945 sensitivity, 0.882 specificity, and 0.97 AUC score with 2D + TL, and 0.940 sensitivity, 0.927 specificity, and 0.975 AUC score with 3D + RCNN. With the intent to create tools that improve patient outcomes, this study describes the initial phase of a computational framework to support healthcare providers in detecting patients with retained IVC filters, so that an individualized decision can be made to remove these devices when appropriate and decrease complications. To our knowledge, this is the first study that curates abdominal CT scans and presents an algorithm for automated detection and characterization of IVC filters.
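The first pipeline stage, flagging candidate locations of metallic objects by density, can be sketched as thresholding Hounsfield units on a slice and grouping connected pixels into candidate blobs. The 2500 HU threshold and the toy slice are illustrative assumptions, not the study's parameters:

```python
from collections import deque

def metal_candidates(slice_hu, threshold=2500):
    """Return one (row, col) centroid per connected above-threshold blob."""
    rows, cols = len(slice_hu), len(slice_hu[0])
    seen, centroids = set(), []
    for r in range(rows):
        for c in range(cols):
            if slice_hu[r][c] < threshold or (r, c) in seen:
                continue
            # Breadth-first flood fill over 4-connected above-threshold pixels
            blob, queue = [], deque([(r, c)])
            seen.add((r, c))
            while queue:
                pr, pc = queue.popleft()
                blob.append((pr, pc))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = pr + dr, pc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and (nr, nc) not in seen
                            and slice_hu[nr][nc] >= threshold):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            centroids.append((sum(p[0] for p in blob) / len(blob),
                              sum(p[1] for p in blob) / len(blob)))
    return centroids

# Toy 4x4 "slice" with one metal-density blob in the center
grid = [[0, 0, 0, 0],
        [0, 3000, 3000, 0],
        [0, 3000, 3000, 0],
        [0, 0, 0, 0]]
candidates = metal_candidates(grid)
```

Each returned centroid would then be handed to the later stages (IVC-filter vs. other metal, then retrievable vs. non-retrievable), which in the study are learned classifiers rather than rules.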
Affiliation(s)
- Sema Candemir
- Department of Radiology, The Ohio State University Wexner Medical Center, Columbus, OH, 43210, USA.
- Laboratory for Augmented Intelligence in Imaging, The Ohio State University, Columbus, OH, 43210, USA.
- Robert Moranville
- Department of Radiology, The Ohio State University Wexner Medical Center, Columbus, OH, 43210, USA
- Kelvin A Wong
- Department of Radiology, The Ohio State University Wexner Medical Center, Columbus, OH, 43210, USA
- Laboratory for Augmented Intelligence in Imaging, The Ohio State University, Columbus, OH, 43210, USA
- Warren Campbell
- Department of Radiology, The Ohio State University Wexner Medical Center, Columbus, OH, 43210, USA
- Matthew T Bigelow
- Department of Radiology, The Ohio State University Wexner Medical Center, Columbus, OH, 43210, USA
- Laboratory for Augmented Intelligence in Imaging, The Ohio State University, Columbus, OH, 43210, USA
- Luciano M Prevedello
- Department of Radiology, The Ohio State University Wexner Medical Center, Columbus, OH, 43210, USA
- Laboratory for Augmented Intelligence in Imaging, The Ohio State University, Columbus, OH, 43210, USA
- Mina S Makary
- Department of Radiology, The Ohio State University Wexner Medical Center, Columbus, OH, 43210, USA