1. Chen M, Zhang M, Yin L, Ma L, Ding R, Zheng T, Yue Q, Lui S, Sun H. Medical image foundation models in assisting diagnosis of brain tumors: a pilot study. Eur Radiol 2024;34:6667-6679. [PMID: 38627290] [DOI: 10.1007/s00330-024-10728-1]
Abstract
OBJECTIVES To build self-supervised foundation models for multicontrast MRI of the whole brain and evaluate their efficacy in assisting diagnosis of brain tumors. METHODS In this retrospective study, foundation models were developed using 57,621 enhanced head MRI scans through self-supervised learning with a pretext task of cross-contrast context restoration under two different content-dropout schemes. Downstream classifiers were constructed based on the pretrained foundation models and fine-tuned for brain tumor detection, discrimination, and molecular status prediction. Metrics including accuracy, sensitivity, specificity, and area under the ROC curve (AUC) were used to evaluate performance. Convolutional neural networks trained exclusively on downstream task data were employed for comparison. RESULTS The pretrained foundation models demonstrated their ability to extract effective representations from multicontrast whole-brain volumes. The best classifiers, endowed with pretrained weights, achieved accuracies of 94.9%, 92.3%, and 80.4%, with corresponding AUC values of 0.981, 0.972, and 0.852, on independent test datasets for brain tumor detection, discrimination, and molecular status prediction, respectively. The classifiers with pretrained weights outperformed convolutional classifiers trained from scratch by approximately 10% in both accuracy and AUC across all tasks. Saliency regions in the correctly predicted cases were clustered mainly around the tumors. Classifiers derived from the two dropout schemes differed significantly only in brain tumor detection. CONCLUSIONS Foundation models obtained through self-supervised learning demonstrate encouraging potential for scalability and interpretability in downstream brain tumor-related tasks and hold promise for extension to neurological diseases with diffusely distributed lesions. CLINICAL RELEVANCE STATEMENT Applying the proposed method to the prediction of key molecular status in gliomas is expected to improve treatment planning and patient outcomes. The foundation model developed here could also serve as a cornerstone for advancing AI applications in the diagnosis of brain-related diseases.
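The pretext task lends itself to a compact illustration. Below is a minimal PyTorch sketch, assuming a toy 3D encoder-decoder and a single random-cube content-dropout scheme; the backbone, the paper's two dropout schemes, and all hyperparameters here are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn

class Restorer3D(nn.Module):
    """Toy 3D encoder-decoder standing in for the foundation backbone."""
    def __init__(self, contrasts=4, width=16):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv3d(contrasts, width, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(width, 2 * width, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose3d(2 * width, width, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(width, contrasts, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.dec(self.enc(x))

def drop_random_cubes(vol, n_cubes=8, size=8):
    """One hypothetical content-dropout scheme: zero out random cubes so the
    network must restore them from spatial and cross-contrast context."""
    out = vol.clone()
    _, _, D, H, W = vol.shape
    for _ in range(n_cubes):
        d, h, w = (torch.randint(0, s - size, (1,)).item() for s in (D, H, W))
        out[:, :, d:d + size, h:h + size, w:w + size] = 0.0
    return out

model = Restorer3D()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
volume = torch.randn(2, 4, 32, 32, 32)   # batch of 4-contrast brain volumes (toy size)
loss = nn.functional.l1_loss(model(drop_random_cubes(volume)), volume)
loss.backward()
opt.step()   # the pretrained encoder later seeds the downstream classifiers
```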
Affiliation(s)
- Mengyao Chen
- Department of Radiology, West China Hospital of Sichuan University, Chengdu, China
- Huaxi MR Research Center (HMRRC), West China Hospital of Sichuan University, Chengdu, China
- Lijuan Yin
- Department of Pathology, West China Hospital of Sichuan University, Chengdu, China
- Lu Ma
- Department of Neurosurgery, West China Hospital of Sichuan University, Chengdu, China
- Renxing Ding
- IT Center, West China Hospital of Sichuan University, Chengdu, China
- Tao Zheng
- IT Center, West China Hospital of Sichuan University, Chengdu, China
- Qiang Yue
- Department of Radiology, West China Hospital of Sichuan University, Chengdu, China
- Huaxi MR Research Center (HMRRC), West China Hospital of Sichuan University, Chengdu, China
- Su Lui
- Department of Radiology, West China Hospital of Sichuan University, Chengdu, China
- Huaxi MR Research Center (HMRRC), West China Hospital of Sichuan University, Chengdu, China
- Huaiqiang Sun
- Department of Radiology, West China Hospital of Sichuan University, Chengdu, China.
- Huaxi MR Research Center (HMRRC), West China Hospital of Sichuan University, Chengdu, China.
2. Martin E, Cook AG, Frost SM, Turner AW, Chen FK, McAllister IL, Nolde JM, Schlaich MP. Ocular biomarkers: useful incidental findings by deep learning algorithms in fundus photographs. Eye (Lond) 2024;38:2581-2588. [PMID: 38734746] [PMCID: PMC11385472] [DOI: 10.1038/s41433-024-03085-2]
Abstract
BACKGROUND/OBJECTIVES Artificial intelligence can assist with ocular image analysis for screening and diagnosis, but it is not yet capable of autonomous full-spectrum screening. Hypothetically, false-positive results may have unrealized screening potential, arising from signals that persist despite training and/or from ambiguous signals such as biomarker overlap or high comorbidity. The study aimed to explore the potential to detect clinically useful incidental ocular biomarkers by screening fundus photographs of hypertensive adults using diabetic deep learning algorithms. SUBJECTS/METHODS Patients referred for treatment-resistant hypertension were imaged at a hospital unit in Perth, Australia, between 2016 and 2022. For each of the 433 participants imaged, the same 45° colour fundus photograph was processed by three deep learning algorithms. Two expert retinal specialists graded all false-positive results for diabetic retinopathy in non-diabetic participants. RESULTS Of the 29 non-diabetic participants misclassified as positive for diabetic retinopathy, 28 (97%) had clinically useful retinal biomarkers. The models designed to screen for fewer diseases captured more incidental disease. All three algorithms showed a positive correlation between the severity of hypertensive retinopathy and misclassified diabetic retinopathy. CONCLUSIONS The results suggest that diabetic deep learning models may be responsive to hypertensive and other clinically useful retinal biomarkers within an at-risk, hypertensive cohort. The observation that models trained for fewer diseases captured more incidental pathology strengthens hypotheses aligned with using self-supervised learning to develop autonomous comprehensive screening. Meanwhile, non-referable and false-positive outputs of other deep learning screening models could be explored for immediate clinical use in other populations.
Affiliation(s)
- Eve Martin
- Commonwealth Scientific and Industrial Research Organisation (CSIRO), Kensington, WA, Australia.
- School of Population and Global Health, The University of Western Australia, Crawley, Australia.
- Dobney Hypertension Centre - Royal Perth Hospital Unit, Medical School, The University of Western Australia, Perth, Australia.
- Australian e-Health Research Centre, Floreat, WA, Australia.
- Angus G Cook
- School of Population and Global Health, The University of Western Australia, Crawley, Australia
- Shaun M Frost
- Commonwealth Scientific and Industrial Research Organisation (CSIRO), Kensington, WA, Australia
- Australian e-Health Research Centre, Floreat, WA, Australia
- Angus W Turner
- Lions Eye Institute, Nedlands, WA, Australia
- Centre for Ophthalmology and Visual Science, The University of Western Australia, Perth, Australia
- Fred K Chen
- Lions Eye Institute, Nedlands, WA, Australia
- Centre for Ophthalmology and Visual Science, The University of Western Australia, Perth, Australia
- Centre for Eye Research Australia, The Royal Victorian Eye and Ear Hospital, East Melbourne, VIC, Australia
- Ophthalmology, Department of Surgery, The University of Melbourne, East Melbourne, VIC, Australia
- Ophthalmology Department, Royal Perth Hospital, Perth, Australia
- Ian L McAllister
- Lions Eye Institute, Nedlands, WA, Australia
- Centre for Ophthalmology and Visual Science, The University of Western Australia, Perth, Australia
- Janis M Nolde
- Dobney Hypertension Centre - Royal Perth Hospital Unit, Medical School, The University of Western Australia, Perth, Australia
- Departments of Cardiology and Nephrology, Royal Perth Hospital, Perth, Australia
- Markus P Schlaich
- Dobney Hypertension Centre - Royal Perth Hospital Unit, Medical School, The University of Western Australia, Perth, Australia
- Departments of Cardiology and Nephrology, Royal Perth Hospital, Perth, Australia
3. Bhattacharya D, Behrendt F, Becker BT, Maack L, Beyersdorff D, Petersen E, Petersen M, Cheng B, Eggert D, Betz C, Hoffmann AS, Schlaefer A. Self-supervised learning for classifying paranasal anomalies in the maxillary sinus. Int J Comput Assist Radiol Surg 2024;19:1713-1721. [PMID: 38850438] [PMCID: PMC11365849] [DOI: 10.1007/s11548-024-03172-5]
Abstract
PURPOSE Paranasal anomalies, frequently identified in routine radiological screenings, exhibit diverse morphological characteristics. Because of this diversity, supervised learning methods require a large labelled dataset covering varied anomaly morphologies. Self-supervised learning (SSL) can be used to learn representations from unlabelled data; however, no SSL method has been designed for the downstream task of classifying paranasal anomalies in the maxillary sinus (MS). METHODS Our approach uses a 3D convolutional autoencoder (CAE) trained in an unsupervised anomaly detection (UAD) framework. Initially, we train the 3D CAE to reduce reconstruction errors when reconstructing normal maxillary sinus (MS) images. This CAE is then applied to an unlabelled dataset to generate coarse anomaly locations in the form of residual MS images. Next, a 3D convolutional neural network (CNN) is trained to reconstruct these residual images, which forms our SSL task. Lastly, we fine-tune the encoder of the 3D CNN on a labelled dataset of normal and anomalous MS images. RESULTS The proposed SSL technique outperforms existing generic self-supervised methods, especially in scenarios with limited annotated data. When trained on just 10% of the annotated dataset, our method achieves an area under the precision-recall curve (AUPRC) of 0.79 for the downstream classification task, surpassing BYOL (AUPRC 0.75), SimSiam (0.74), SimCLR (0.73), and masked autoencoding using SparK (0.75). CONCLUSION A self-supervised learning approach that inherently focuses on localizing paranasal anomalies proves advantageous, particularly when the subsequent task involves differentiating normal from anomalous maxillary sinuses. Our code is available at https://github.com/mtec-tuhh/self-supervised-paranasal-anomaly.
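The three-stage pipeline can be sketched compactly. The snippet below uses toy architectures and shapes as assumptions to show how the UAD-trained autoencoder turns unlabelled volumes into residual targets for the SSL stage; the authors' actual networks live in the linked repository.

```python
import torch
import torch.nn as nn

# Stage 1 (UAD): a 3D CAE trained to reconstruct *normal* maxillary-sinus
# volumes only. The toy module below stands in for that trained model.
cae = nn.Sequential(
    nn.Conv3d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose3d(8, 1, 4, stride=2, padding=1),
)

def residual_volume(x):
    """Coarse anomaly map: normal anatomy reconstructs well, anomalies do not."""
    with torch.no_grad():
        return (x - cae(x)).abs()

# Stage 2 (the SSL pretext task): a 3D CNN learns to reproduce the residual
# maps of unlabelled volumes, implicitly learning to localise anomalies.
ssl_net = nn.Sequential(
    nn.Conv3d(1, 8, 3, stride=2, padding=1), nn.ReLU(),   # encoder (kept)
    nn.ConvTranspose3d(8, 1, 4, stride=2, padding=1),     # decoder (discarded)
)
opt = torch.optim.Adam(ssl_net.parameters(), lr=1e-4)
x = torch.randn(2, 1, 32, 32, 32)                         # unlabelled MS volumes
loss = nn.functional.mse_loss(ssl_net(x), residual_volume(x))
loss.backward()
opt.step()
# Stage 3: fine-tune the encoder (ssl_net[:2]) with a classification head on
# the small labelled set of normal vs. anomalous maxillary sinuses.
```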
Affiliation(s)
- Debayan Bhattacharya
- Institute of Medical Technology and Intelligent Systems, Technische Universitaet Hamburg, Hamburg, Germany.
- Department of Otorhinolaryngology, Head and Neck Surgery and Oncology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany.
- Finn Behrendt
- Institute of Medical Technology and Intelligent Systems, Technische Universitaet Hamburg, Hamburg, Germany
- Benjamin Tobias Becker
- Department of Otorhinolaryngology, Head and Neck Surgery and Oncology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Lennart Maack
- Institute of Medical Technology and Intelligent Systems, Technische Universitaet Hamburg, Hamburg, Germany
- Dirk Beyersdorff
- Clinic and Polyclinic for Diagnostic and Interventional Radiology and Nuclear Medicine, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Elina Petersen
- Population Health Research Department, University Heart and Vascular Center, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Marvin Petersen
- Clinic and Polyclinic for Neurology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Bastian Cheng
- Clinic and Polyclinic for Neurology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Dennis Eggert
- Department of Otorhinolaryngology, Head and Neck Surgery and Oncology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Christian Betz
- Department of Otorhinolaryngology, Head and Neck Surgery and Oncology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Anna Sophie Hoffmann
- Department of Otorhinolaryngology, Head and Neck Surgery and Oncology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Alexander Schlaefer
- Institute of Medical Technology and Intelligent Systems, Technische Universitaet Hamburg, Hamburg, Germany
4. Li K, Yang J, Liang W, Li X, Zhang C, Chen L, Wu C, Zhang X, Xu Z, Wang Y, Meng L, Zhang Y, Chen Y, Zhou SK. O-PRESS: Boosting OCT axial resolution with Prior guidance, Recurrence, and Equivariant Self-Supervision. Med Image Anal 2024;99:103319. [PMID: 39270466] [DOI: 10.1016/j.media.2024.103319]
Abstract
Optical coherence tomography (OCT) is a noninvasive technology that enables real-time imaging of tissue microanatomy. The axial resolution of OCT is intrinsically constrained by the spectral bandwidth of the employed light source while the center wavelength remains fixed for a given application, and physically extending this bandwidth faces strong limitations and substantial cost. We present a novel computational approach, called O-PRESS, for boosting the axial resolution of OCT with Prior guidance, a Recurrent mechanism, and Equivariant Self-Supervision. Diverging from conventional deconvolution methods that rely on physical models or data-driven techniques, our method seamlessly integrates OCT modeling and deep learning, enabling real-time axial-resolution enhancement exclusively from measurements, without the need for paired images. Our approach addresses the two primary tasks of resolution enhancement and noise reduction in a single treatment; both are executed in a self-supervised manner, with equivariant imaging and free-space priors guiding their respective processes. Experimental evaluations, encompassing both quantitative metrics and visual assessments, consistently verify the efficacy and superiority of our approach, which performs on par with fully supervised methods. Importantly, the robustness of our model is affirmed by its dual capability to enhance axial resolution while concurrently improving the signal-to-noise ratio.
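The equivariant self-supervision idea admits a compact sketch: the estimate must be consistent with the known axial blur (measurement consistency), and the network should commute with transformations that leave the blur invariant. Everything below (the assumed symmetric PSF, the one-layer stand-in network, and the loss weighting) is illustrative, not the O-PRESS recurrent architecture or its priors.

```python
import torch
import torch.nn.functional as F

# y: measured low-axial-resolution A-scans; A: assumed known axial blur (PSF).
psf = torch.ones(1, 1, 7) / 7.0                   # symmetric PSF (free-space prior)

def A(x):
    """Forward model: 1D axial blur applied to each A-scan."""
    return F.conv1d(x, psf, padding=3)

net = torch.nn.Conv1d(1, 1, 9, padding=4)         # toy stand-in for the network
opt = torch.optim.Adam(net.parameters(), lr=1e-4)

y = A(torch.randn(8, 1, 128))                     # simulated measurements
x_hat = net(y)
# (1) measurement consistency: the re-blurred estimate must match the data
loss_mc = F.mse_loss(A(x_hat), y)
# (2) equivariance: an axial flip commutes with the symmetric blur, so the
# estimate from flipped measurements should equal the flipped estimate
loss_eq = F.mse_loss(net(torch.flip(y, dims=[-1])), torch.flip(x_hat, dims=[-1]))
(loss_mc + 0.1 * loss_eq).backward()
opt.step()
```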
Affiliation(s)
- Kaiyan Li
- School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC), Hefei Anhui, 230026, China; Center for Medical Imaging, Robotics, Analytic Computing & Learning (MIRACLE), Suzhou Institute for Advanced Research, USTC, Suzhou Jiangsu, 215123, China
- Jingyuan Yang
- Department of Ophthalmology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, 100730, China; Key Laboratory of Ocular Fundus Diseases, Chinese Academy of Medical Sciences, Beijing, 100730, China
- Wenxuan Liang
- School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC), Hefei Anhui, 230026, China; Center for Medical Imaging, Robotics, Analytic Computing & Learning (MIRACLE), Suzhou Institute for Advanced Research, USTC, Suzhou Jiangsu, 215123, China; School of Physical Sciences, University of Science and Technology of China, Hefei Anhui, 230026, China
- Xingde Li
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, 21287, USA
- Chenxi Zhang
- Department of Ophthalmology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, 100730, China; Key Laboratory of Ocular Fundus Diseases, Chinese Academy of Medical Sciences, Beijing, 100730, China
- Lulu Chen
- Department of Ophthalmology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, 100730, China; Key Laboratory of Ocular Fundus Diseases, Chinese Academy of Medical Sciences, Beijing, 100730, China
- Chan Wu
- Department of Ophthalmology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, 100730, China; Key Laboratory of Ocular Fundus Diseases, Chinese Academy of Medical Sciences, Beijing, 100730, China
- Xiao Zhang
- Department of Ophthalmology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, 100730, China; Key Laboratory of Ocular Fundus Diseases, Chinese Academy of Medical Sciences, Beijing, 100730, China
- Zhiyan Xu
- Department of Ophthalmology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, 100730, China; Key Laboratory of Ocular Fundus Diseases, Chinese Academy of Medical Sciences, Beijing, 100730, China
- Yueling Wang
- Department of Ophthalmology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, 100730, China; Key Laboratory of Ocular Fundus Diseases, Chinese Academy of Medical Sciences, Beijing, 100730, China
- Lihui Meng
- Department of Ophthalmology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, 100730, China; Key Laboratory of Ocular Fundus Diseases, Chinese Academy of Medical Sciences, Beijing, 100730, China
- Yue Zhang
- School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC), Hefei Anhui, 230026, China; Center for Medical Imaging, Robotics, Analytic Computing & Learning (MIRACLE), Suzhou Institute for Advanced Research, USTC, Suzhou Jiangsu, 215123, China
- Youxin Chen
- Department of Ophthalmology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, 100730, China; Key Laboratory of Ocular Fundus Diseases, Chinese Academy of Medical Sciences, Beijing, 100730, China.
- S Kevin Zhou
- School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China (USTC), Hefei Anhui, 230026, China; Center for Medical Imaging, Robotics, Analytic Computing & Learning (MIRACLE), Suzhou Institute for Advanced Research, USTC, Suzhou Jiangsu, 215123, China; Key Laboratory of Precision and Intelligent Chemistry, USTC, Hefei Anhui, 230026, China; Key Laboratory of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing, 100190, China.
5. Bluethgen C, Chambon P, Delbrouck JB, van der Sluijs R, Połacin M, Zambrano Chaves JM, Abraham TM, Purohit S, Langlotz CP, Chaudhari AS. A vision-language foundation model for the generation of realistic chest X-ray images. Nat Biomed Eng 2024. [PMID: 39187663] [DOI: 10.1038/s41551-024-01246-y]
Abstract
The paucity of high-quality medical imaging datasets could be mitigated by machine learning models that generate compositionally diverse images that faithfully represent medical concepts and pathologies. However, large vision-language models are trained on natural images, and the distribution of the images they generate differs substantially from that of medical images. Moreover, medical language involves specific and semantically rich vocabulary. Here we describe a domain-adaptation strategy for large vision-language models that overcomes these distributional shifts. Specifically, by leveraging publicly available datasets of chest X-ray images and the corresponding radiology reports, we adapted a latent diffusion model pre-trained on pairs of natural images and text descriptors to generate diverse and visually plausible synthetic chest X-ray images (as confirmed by board-certified radiologists) whose appearance can be controlled with free-form medical text prompts. This domain-adaptation strategy for the text-conditioned synthesis of medical images can be used to augment training datasets and is a viable alternative to the sharing of real medical images for model training and fine-tuning.
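Conceptually, the adaptation amounts to continuing the latent-diffusion training objective on (image, report) pairs. The sketch below shows that text-conditioned noise-prediction objective with stand-in modules; the real system uses a pre-trained VAE, text encoder, U-Net denoiser, and noise scheduler, none of which are reproduced here, and the simplified noising schedule is an assumption.

```python
import torch
import torch.nn as nn

# Stand-ins, not the authors' checkpoints: a frozen-vocabulary text encoder
# and a tiny "denoiser" in place of the latent diffusion U-Net.
text_encoder = nn.Embedding(30522, 256)
unet = nn.Sequential(nn.Linear(64 + 256, 512), nn.ReLU(), nn.Linear(512, 64))
opt = torch.optim.AdamW(unet.parameters(), lr=1e-5)

latents = torch.randn(4, 64)                         # VAE-encoded CXR latents
tokens = torch.randint(0, 30522, (4,))               # tokenised report text
cond = text_encoder(tokens)                          # text conditioning vectors

t = torch.rand(4, 1)                                 # diffusion time in (0, 1)
noise = torch.randn_like(latents)
noisy = (1 - t).sqrt() * latents + t.sqrt() * noise  # simplified noising step
pred = unet(torch.cat([noisy, cond], dim=-1))        # predict the added noise
loss = nn.functional.mse_loss(pred, noise)           # epsilon-prediction objective
loss.backward()
opt.step()
```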
Affiliation(s)
- Christian Bluethgen
- Center for Artificial Intelligence in Medicine and Imaging, Stanford University, Palo Alto, CA, USA.
- Department of Radiology, Stanford University, Palo Alto, CA, USA.
- Diagnostic and Interventional Radiology, University Hospital Zurich, University of Zurich, Zurich, Switzerland.
- Pierre Chambon
- Center for Artificial Intelligence in Medicine and Imaging, Stanford University, Palo Alto, CA, USA
- Department of Radiology, Stanford University, Palo Alto, CA, USA
- Jean-Benoit Delbrouck
- Center for Artificial Intelligence in Medicine and Imaging, Stanford University, Palo Alto, CA, USA
- Department of Radiology, Stanford University, Palo Alto, CA, USA
- Rogier van der Sluijs
- Center for Artificial Intelligence in Medicine and Imaging, Stanford University, Palo Alto, CA, USA
- Department of Radiology, Stanford University, Palo Alto, CA, USA
- Małgorzata Połacin
- Department of Radiology, Stanford University, Palo Alto, CA, USA
- Diagnostic and Interventional Radiology, University Hospital Zurich, University of Zurich, Zurich, Switzerland
- Juan Manuel Zambrano Chaves
- Center for Artificial Intelligence in Medicine and Imaging, Stanford University, Palo Alto, CA, USA
- Department of Biomedical Data Science, Stanford University, Palo Alto, CA, USA
- Curtis P Langlotz
- Center for Artificial Intelligence in Medicine and Imaging, Stanford University, Palo Alto, CA, USA
- Department of Radiology, Stanford University, Palo Alto, CA, USA
- Department of Biomedical Data Science, Stanford University, Palo Alto, CA, USA
- Akshay S Chaudhari
- Center for Artificial Intelligence in Medicine and Imaging, Stanford University, Palo Alto, CA, USA
- Department of Radiology, Stanford University, Palo Alto, CA, USA
- Department of Biomedical Data Science, Stanford University, Palo Alto, CA, USA
6. Zhang J, Xiao F, Zou H, Feng R, He J. Self-supervised learning-enhanced deep learning method for identifying myopic maculopathy in high myopia patients. iScience 2024;27:110566. [PMID: 39211543] [PMCID: PMC11359982] [DOI: 10.1016/j.isci.2024.110566]
Abstract
Accurate detection and timely care for patients with high myopia present significant challenges. We developed a deep learning (DL) system enhanced by a self-supervised learning (SSL) approach to improve the automatic diagnosis of myopic maculopathy (MM). Using a dataset of 7,906 images from the Shanghai High Myopia Screening Project and a public validation set of 1,391 images from MMAC2023, our method significantly outperformed conventional techniques. Internally, it achieved 96.8% accuracy, 83.1% sensitivity, and 95.6% specificity, with AUC values of 0.982 and 0.999. Externally, it maintained 89.0% accuracy, 71.7% sensitivity, and 87.8% specificity, with AUC values of 0.978 and 0.973. The model's Cohen's kappa values exceeded 0.8, indicating substantial agreement with retinal experts. Our SSL-enhanced DL approach offers high accuracy and potential to enhance large-scale myopia screenings, demonstrating broader significance in improving early detection and treatment of MM.
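The agreement statistic reported above is straightforward to compute with scikit-learn; the grade vectors below are made-up stand-ins for expert and model MM gradings.

```python
from sklearn.metrics import cohen_kappa_score

expert = [0, 1, 2, 0, 3, 1, 0, 2]   # hypothetical retinal-expert MM grades
model = [0, 1, 2, 0, 3, 0, 0, 2]    # hypothetical model predictions
kappa = cohen_kappa_score(expert, model)
print(kappa)   # the study reads kappa > 0.8 as substantial agreement with experts
```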
Affiliation(s)
- Juzhao Zhang
- Shanghai Eye Disease Prevention & Treatment Center/Shanghai Eye Hospital, School of Medicine, Tongji University, Shanghai, China
- National Clinical Research Center for Eye Diseases, Shanghai, China
- Department of Ophthalmology, Shanghai General Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Shanghai Engineering Center for Precise Diagnosis and Treatment of Eye Diseases, Shanghai, China
- Fan Xiao
- School of Computer Science, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai, China
- Academy for Engineering and Technology, Fudan University, Shanghai, China
- Haidong Zou
- Shanghai Eye Disease Prevention & Treatment Center/Shanghai Eye Hospital, School of Medicine, Tongji University, Shanghai, China
- National Clinical Research Center for Eye Diseases, Shanghai, China
- Department of Ophthalmology, Shanghai General Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Shanghai Engineering Center for Precise Diagnosis and Treatment of Eye Diseases, Shanghai, China
- Rui Feng
- Shanghai Engineering Center for Precise Diagnosis and Treatment of Eye Diseases, Shanghai, China
- School of Computer Science, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai, China
- Academy for Engineering and Technology, Fudan University, Shanghai, China
- Jiangnan He
- Shanghai Eye Disease Prevention & Treatment Center/Shanghai Eye Hospital, School of Medicine, Tongji University, Shanghai, China
- Shanghai Engineering Center for Precise Diagnosis and Treatment of Eye Diseases, Shanghai, China
7. Imagawa K, Shiomoto K. Evaluation of effectiveness of self-supervised learning in chest X-ray imaging to reduce annotated images. J Imaging Inform Med 2024;37:1618-1624. [PMID: 38459399] [PMCID: PMC11300406] [DOI: 10.1007/s10278-024-00975-5]
Abstract
A significant challenge in machine learning-based medical image analysis is the scarcity of medical images. Obtaining a large number of labeled medical images is difficult because annotation is time consuming and requires specialized knowledge; in addition, inappropriate annotation processes can increase model bias. Self-supervised learning (SSL) is an unsupervised learning approach that extracts image representations and can therefore reduce the number of labeled images required. In this study, we investigated the feasibility of reducing the number of labeled images given a limited set of unlabeled medical images. Unlabeled chest X-ray (CXR) images were pretrained using the SimCLR framework, and the learned representations were then fine-tuned with supervised learning for the target task. A total of 2,000 task-specific CXR images were used to perform binary classification of coronavirus disease 2019 (COVID-19) versus normal cases. The results demonstrate that with pretraining on task-specific unlabeled CXR images, performance can be maintained even when the number of labeled CXR images is reduced by approximately 40%, and is significantly better than without pretraining. In contrast, when only a small number of labeled CXR images is available, a large number of unlabeled pretraining images is required to maintain performance, regardless of task specificity. In summary, to reduce the number of labeled images using SimCLR, both the number of unlabeled pretraining images and their task-specific characteristics must be considered.
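SimCLR's core is the NT-Xent contrastive loss over paired augmentations; a minimal PyTorch version is sketched below. The projection dimension, batch size, and temperature are arbitrary choices, and the encoder plus projection head that would produce these embeddings from CXR crops is omitted.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent (SimCLR) loss: two augmented views of the same CXR form a
    positive pair; every other image in the batch acts as a negative."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2]), dim=1)      # 2N unit-norm projections
    sim = z @ z.t() / tau                            # scaled cosine similarities
    sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool), float("-inf"))
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)             # positive = the other view

# Projections of two augmentations (e.g., crop/flip/jitter) of 16 unlabeled CXRs:
z1, z2 = torch.randn(16, 128), torch.randn(16, 128)
loss = nt_xent(z1, z2)
```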
Affiliation(s)
- Kuniki Imagawa
- Faculty of Information Technology, Tokyo City University, 1-28-1 Tamazutsumi, Setagaya-ku, Tokyo, 158-8557, Japan.
- Kohei Shiomoto
- Faculty of Information Technology, Tokyo City University, 1-28-1 Tamazutsumi, Setagaya-ku, Tokyo, 158-8557, Japan
8. Paverd H, Zormpas-Petridis K, Clayton H, Burge S, Crispin-Ortuzar M. Radiology and multi-scale data integration for precision oncology. NPJ Precis Oncol 2024;8:158. [PMID: 39060351] [PMCID: PMC11282284] [DOI: 10.1038/s41698-024-00656-0]
Abstract
In this Perspective paper we explore the potential of integrating radiological imaging with other data types, a critical yet underdeveloped area in comparison to the fusion of other multi-omic data. Radiological images provide a comprehensive, three-dimensional view of cancer, capturing features that would be missed by biopsies or other data modalities. This paper explores the complexities and challenges of incorporating medical imaging into data integration models, in the context of precision oncology. We present the different categories of imaging-omics integration and discuss recent progress, highlighting the opportunities that arise from bringing together spatial data on different scales.
Affiliation(s)
- Hania Paverd
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Department of Oncology, University of Cambridge, Cambridge, UK
- Cancer Research UK Cambridge Centre, University of Cambridge, Cambridge, UK
- Hannah Clayton
- Department of Oncology, University of Cambridge, Cambridge, UK
- Cancer Research UK Cambridge Centre, University of Cambridge, Cambridge, UK
- Sarah Burge
- Cancer Research UK Cambridge Centre, University of Cambridge, Cambridge, UK
- Mireia Crispin-Ortuzar
- Department of Oncology, University of Cambridge, Cambridge, UK.
- Cancer Research UK Cambridge Centre, University of Cambridge, Cambridge, UK.
9. Corponi F, Li BM, Anmella G, Valenzuela-Pascual C, Mas A, Pacchiarotti I, Valentí M, Grande I, Benabarre A, Garriga M, Vieta E, Young AH, Lawrie SM, Whalley HC, Hidalgo-Mazzei D, Vergari A. Wearable data from subjects playing Super Mario, taking university exams, or performing physical exercise help detect acute mood disorder episodes via self-supervised learning: prospective, exploratory, observational study. JMIR Mhealth Uhealth 2024;12:e55094. [PMID: 39018100] [PMCID: PMC11292167] [DOI: 10.2196/55094]
Abstract
BACKGROUND Personal sensing, which leverages data passively and near-continuously collected with wearables from patients in their ecological environment, is a promising paradigm for monitoring mood disorders (MDs), a major determinant of the worldwide disease burden. However, collecting and annotating wearable data is resource intensive, so studies of this kind can typically afford to recruit only a few dozen patients. This constitutes one of the major obstacles to applying modern supervised machine learning techniques to MD detection. OBJECTIVE In this paper, we overcame this data bottleneck and advanced the detection of acute MD episodes from wearables' data on the back of recent advances in self-supervised learning (SSL), which leverages unlabeled data to learn representations during pretraining that are subsequently exploited for a supervised task. METHODS We collected open access datasets recorded with the Empatica E4 wristband, spanning personal sensing tasks unrelated to MD monitoring (from emotion recognition in Super Mario players to stress detection in undergraduates), and devised a preprocessing pipeline performing on-/off-body detection, sleep/wake detection, segmentation, and (optionally) feature extraction. With 161 E4-recorded subjects, we introduced E4SelfLearning, the largest-to-date open access collection, along with its preprocessing pipeline. We developed a novel E4-tailored transformer (E4mer) architecture, serving as the blueprint for both SSL and fully supervised learning, and assessed whether and under which conditions self-supervised pretraining improved on fully supervised baselines (the fully supervised E4mer and pre-deep learning algorithms) in detecting acute MD episodes from recording segments taken in 64 patients (n=32, 50%, acute; n=32, 50%, stable). RESULTS SSL significantly outperformed fully supervised pipelines using either our novel E4mer or extreme gradient boosting (XGBoost): 3,353 (81.23%) correctly classified recording segments, against 3,110 (75.35%; E4mer) and 2,973 (72.02%; XGBoost), out of 4,128 segments in total. SSL performance was strongly associated with the specific surrogate task used for pretraining, as well as with unlabeled data availability. CONCLUSIONS We showed that SSL, a paradigm in which a model is pretrained on unlabeled data without the need for human annotations before deployment on the supervised target task of interest, helps overcome the annotation bottleneck, and that the choice of pretraining surrogate task and the size of the unlabeled pretraining set are key determinants of SSL success. We introduced E4mer, which can be used for SSL, and shared the E4SelfLearning collection, along with its preprocessing pipeline, to foster and expedite future research into SSL for personal sensing.
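As a rough illustration of transformer pretraining on E4 channels, the sketch below uses masked-segment reconstruction as the surrogate task. The toy model, the masking ratio, and the choice of surrogate task are all stand-in assumptions; the actual E4mer and the surrogate tasks compared in the paper are specified in the study and its repository.

```python
import torch
import torch.nn as nn

class TinyE4mer(nn.Module):
    """Toy transformer over wearable time series, loosely E4mer-shaped."""
    def __init__(self, channels=4, d_model=64):
        super().__init__()
        self.proj = nn.Linear(channels, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, channels)   # reconstruction head

    def forward(self, x):
        return self.head(self.encoder(self.proj(x)))

model = TinyE4mer()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
x = torch.randn(8, 120, 4)            # 8 segments x 120 time steps x 4 E4 channels
mask = torch.rand(8, 120, 1) < 0.15   # hide 15% of time steps
recon = model(x * ~mask)              # reconstruct from the unmasked context
loss = nn.functional.mse_loss(recon[mask.expand_as(x)], x[mask.expand_as(x)])
loss.backward()
opt.step()
# After pretraining on unlabeled E4SelfLearning segments, the encoder is
# fine-tuned with a classification head on acute vs. stable MD labels.
```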
Affiliation(s)
- Filippo Corponi
- School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
- Bryan M Li
- School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
- The Alan Turing Institute, London, United Kingdom
- Gerard Anmella
- Bipolar and Depressive Disorders Unit, Department of Psychiatry and Psychology, Hospital Clínic de Barcelona, Barcelona, Spain
- Institut d'Investigacions Biomèdiques August Pi i Sunyer, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Salud Mental, Instituto de Salud Carlos III, Madrid, Spain
- Departament de Medicina, Facultat de Medicina i Ciències de la Salut, Universitat de Barcelona, Barcelona, Spain
- Clàudia Valenzuela-Pascual
- Bipolar and Depressive Disorders Unit, Department of Psychiatry and Psychology, Hospital Clínic de Barcelona, Barcelona, Spain
- Institut d'Investigacions Biomèdiques August Pi i Sunyer, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Salud Mental, Instituto de Salud Carlos III, Madrid, Spain
- Departament de Medicina, Facultat de Medicina i Ciències de la Salut, Universitat de Barcelona, Barcelona, Spain
- Ariadna Mas
- Bipolar and Depressive Disorders Unit, Department of Psychiatry and Psychology, Hospital Clínic de Barcelona, Barcelona, Spain
- Institut d'Investigacions Biomèdiques August Pi i Sunyer, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Salud Mental, Instituto de Salud Carlos III, Madrid, Spain
- Departament de Medicina, Facultat de Medicina i Ciències de la Salut, Universitat de Barcelona, Barcelona, Spain
- Isabella Pacchiarotti
- Bipolar and Depressive Disorders Unit, Department of Psychiatry and Psychology, Hospital Clínic de Barcelona, Barcelona, Spain
- Institut d'Investigacions Biomèdiques August Pi i Sunyer, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Salud Mental, Instituto de Salud Carlos III, Madrid, Spain
- Departament de Medicina, Facultat de Medicina i Ciències de la Salut, Universitat de Barcelona, Barcelona, Spain
- Marc Valentí
- Bipolar and Depressive Disorders Unit, Department of Psychiatry and Psychology, Hospital Clínic de Barcelona, Barcelona, Spain
- Institut d'Investigacions Biomèdiques August Pi i Sunyer, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Salud Mental, Instituto de Salud Carlos III, Madrid, Spain
- Departament de Medicina, Facultat de Medicina i Ciències de la Salut, Universitat de Barcelona, Barcelona, Spain
- Iria Grande
- Bipolar and Depressive Disorders Unit, Department of Psychiatry and Psychology, Hospital Clínic de Barcelona, Barcelona, Spain
- Institut d'Investigacions Biomèdiques August Pi i Sunyer, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Salud Mental, Instituto de Salud Carlos III, Madrid, Spain
- Departament de Medicina, Facultat de Medicina i Ciències de la Salut, Universitat de Barcelona, Barcelona, Spain
- Antoni Benabarre
- Bipolar and Depressive Disorders Unit, Department of Psychiatry and Psychology, Hospital Clínic de Barcelona, Barcelona, Spain
- Institut d'Investigacions Biomèdiques August Pi i Sunyer, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Salud Mental, Instituto de Salud Carlos III, Madrid, Spain
- Departament de Medicina, Facultat de Medicina i Ciències de la Salut, Universitat de Barcelona, Barcelona, Spain
- Marina Garriga
- Bipolar and Depressive Disorders Unit, Department of Psychiatry and Psychology, Hospital Clínic de Barcelona, Barcelona, Spain
- Institut d'Investigacions Biomèdiques August Pi i Sunyer, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Salud Mental, Instituto de Salud Carlos III, Madrid, Spain
- Departament de Medicina, Facultat de Medicina i Ciències de la Salut, Universitat de Barcelona, Barcelona, Spain
- Eduard Vieta
- Bipolar and Depressive Disorders Unit, Department of Psychiatry and Psychology, Hospital Clínic de Barcelona, Barcelona, Spain
- Institut d'Investigacions Biomèdiques August Pi i Sunyer, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Salud Mental, Instituto de Salud Carlos III, Madrid, Spain
- Departament de Medicina, Facultat de Medicina i Ciències de la Salut, Universitat de Barcelona, Barcelona, Spain
- Allan H Young
- Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
- Stephen M Lawrie
- Division of Psychiatry, Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Heather C Whalley
- Division of Psychiatry, Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Generation Scotland, Institute for Genetics and Cancer, University of Edinburgh, Edinburgh, United Kingdom
- Diego Hidalgo-Mazzei
- Bipolar and Depressive Disorders Unit, Department of Psychiatry and Psychology, Hospital Clínic de Barcelona, Barcelona, Spain
- Institut d'Investigacions Biomèdiques August Pi i Sunyer, Barcelona, Spain
- Centro de Investigación Biomédica en Red de Salud Mental, Instituto de Salud Carlos III, Madrid, Spain
- Departament de Medicina, Facultat de Medicina i Ciències de la Salut, Universitat de Barcelona, Barcelona, Spain
- Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom
- Antonio Vergari
- School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
10. Kudus K, Wagner M, Ertl-Wagner BB, Khalvati F. Applications of machine learning to MR imaging of pediatric low-grade gliomas. Childs Nerv Syst 2024. [PMID: 38972953] [DOI: 10.1007/s00381-024-06522-5]
Abstract
INTRODUCTION Machine learning (ML) shows promise for the automation of routine tasks related to the treatment of pediatric low-grade gliomas (pLGG), such as tumor grading, typing, and segmentation. Moreover, it has been shown that ML can identify crucial information from medical images that is otherwise currently unattainable; for example, ML appears to be capable of preoperatively identifying the underlying genetic status of pLGG. METHODS In this chapter, we reviewed, to the best of our knowledge, all published works that have used ML techniques for the imaging-based evaluation of pLGGs. Additionally, we aimed to provide some context on what it will take to go from the exploratory studies we reviewed to clinically deployed models. RESULTS Multiple studies have demonstrated that ML can accurately grade, type, and segment pLGGs and detect their genetic status. We compared the approaches used across studies and observed a high degree of variability in the methodologies. Standardization and cooperation between the numerous groups working on these approaches will be key to accelerating the clinical deployment of these models. CONCLUSION The studies reviewed in this chapter detail the potential for ML techniques to transform the treatment of pLGG, but challenges remain to be overcome prior to clinical deployment.
Affiliation(s)
- Kareem Kudus
- Neurosciences & Mental Health Research Program, The Hospital for Sick Children, Toronto, Canada
- Institute of Medical Science, University of Toronto, Toronto, Canada
- Matthias Wagner
- Department of Diagnostic & Interventional Radiology, The Hospital for Sick Children, Toronto, Canada
- Department of Diagnostic and Interventional Neuroradiology, University Hospital Augsburg, Augsburg, Germany
- Birgit Betina Ertl-Wagner
- Neurosciences & Mental Health Research Program, The Hospital for Sick Children, Toronto, Canada
- Institute of Medical Science, University of Toronto, Toronto, Canada
- Department of Diagnostic & Interventional Radiology, The Hospital for Sick Children, Toronto, Canada
- Department of Medical Imaging, University of Toronto, Toronto, Canada
- Farzad Khalvati
- Neurosciences & Mental Health Research Program, The Hospital for Sick Children, Toronto, Canada.
- Institute of Medical Science, University of Toronto, Toronto, Canada.
- Department of Diagnostic & Interventional Radiology, The Hospital for Sick Children, Toronto, Canada.
- Department of Medical Imaging, University of Toronto, Toronto, Canada.
- Department of Computer Science, University of Toronto, Toronto, Canada.
- Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, Canada.
11. Gryshchuk V, Singh D, Teipel S, Dyrba M. Contrastive self-supervised learning for neurodegenerative disorder classification. medRxiv 2024 (preprint). [PMID: 39006425] [PMCID: PMC11245060] [DOI: 10.1101/2024.07.03.24309882]
Abstract
Neurodegenerative diseases such as Alzheimer's disease (AD) or frontotemporal lobar degeneration (FTLD) involve specific loss of brain volume, detectable in vivo using T1-weighted MRI scans. Supervised machine learning approaches to classifying neurodegenerative diseases require diagnostic labels for each sample, but expert labels can be difficult to obtain for large amounts of data. Self-supervised learning (SSL) offers an alternative for training machine learning models without data labels. We investigated whether SSL models can be applied to distinguish between different neurodegenerative disorders in an interpretable manner. Our method comprises a feature extractor and a downstream classification head. A deep convolutional neural network trained in a contrastive self-supervised way serves as the feature extractor, learning a latent representation, while the classifier head is a single-layer perceptron. We used N=2694 T1-weighted MRI scans from four data cohorts: two ADNI datasets, AIBL, and FTLDNI, including cognitively normal controls (CN), cases with prodromal and clinical AD, and FTLD cases differentiated into its subtypes. Our results showed that the feature extractor trained in a self-supervised way provides generalizable and robust representations for the downstream classification. For AD vs. CN, our model achieves 82% balanced accuracy on the test subset and 80% on an independent holdout dataset. Similarly, the behavioral variant of frontotemporal dementia (BV) vs. CN model attains 88% balanced accuracy on the test subset. The average feature-attribution heatmaps obtained with the Integrated Gradients method highlighted hallmark regions: temporal gray matter atrophy for AD and insular atrophy for BV. In conclusion, our models perform comparably to state-of-the-art supervised deep learning approaches, suggesting that the SSL methodology can successfully make use of unannotated neuroimaging datasets as training data while remaining robust and interpretable.
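The attribution step maps directly onto Captum's Integrated Gradients API. In the sketch below, the toy model, the volume size, and the class index for AD are placeholders for the frozen contrastive encoder plus linear head described above.

```python
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

# Toy stand-in for the frozen contrastive encoder + single-layer head
# (the real model is a deep 3D CNN trained as described in the abstract).
model = nn.Sequential(nn.Flatten(), nn.Linear(16 * 16 * 16, 3))  # CN/AD/BV classes
model.eval()

scan = torch.randn(1, 1, 16, 16, 16)                 # T1-weighted volume (toy size)
ig = IntegratedGradients(model)
attr = ig.attribute(scan, baselines=torch.zeros_like(scan), target=1)  # target=1: AD (assumed index)
heatmap = attr.abs().squeeze()   # voxel-wise relevance; averaging such maps over
                                 # correctly classified cases yields the reported
                                 # temporal (AD) and insular (BV) atrophy patterns
```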
12. Linguraru MG, Bakas S, Aboian M, Chang PD, Flanders AE, Kalpathy-Cramer J, Kitamura FC, Lungren MP, Mongan J, Prevedello LM, Summers RM, Wu CC, Adewole M, Kahn CE. Clinical, cultural, computational, and regulatory considerations to deploy AI in radiology: perspectives of RSNA and MICCAI experts. Radiol Artif Intell 2024;6:e240225. [PMID: 38984986] [PMCID: PMC11294958] [DOI: 10.1148/ryai.240225]
Abstract
The Radiological Society of North America (RSNA) and the Medical Image Computing and Computer Assisted Intervention (MICCAI) Society have led a series of joint panels and seminars focused on the present impact and future directions of artificial intelligence (AI) in radiology. These conversations have collected viewpoints from multidisciplinary experts in radiology, medical imaging, and machine learning on the current clinical penetration of AI technology in radiology and how it is affected by trust, reproducibility, explainability, and accountability. The collective points, both practical and philosophical, define the cultural changes needed for radiologists and AI scientists to work together and describe the challenges ahead for AI technologies to meet broad approval. This article presents the perspectives of experts from MICCAI and RSNA on the clinical, cultural, computational, and regulatory considerations, coupled with recommended reading materials, essential to adopt AI technology successfully in radiology and, more generally, in clinical practice. The report emphasizes the importance of collaboration to improve clinical deployment, highlights the need to integrate clinical and medical imaging data, and introduces strategies to ensure smooth and incentivized integration. Keywords: Adults and Pediatrics, Computer Applications-General (Informatics), Diagnosis, Prognosis © RSNA, 2024.
Affiliation(s)
- Marius George Linguraru
- From the Sheikh Zayed Institute for Pediatric Surgical Innovation,
Children’s National Hospital, Washington, DC (M.G.L.); Divisions of
Radiology and Pediatrics, George Washington University School of Medicine and
Health Sciences, Washington, DC (M.G.L.); Division of Computational Pathology,
Department of Pathology & Laboratory Medicine, School of Medicine,
Indiana University, Indianapolis, Ind (S.B.); Department of Radiology,
Children’s Hospital of Philadelphia, Philadelphia, Pa (M.A.); Department
of Radiological Sciences, University of California Irvine, Irvine, Calif
(P.D.C.); Department of Radiology, Thomas Jefferson University, Philadelphia, Pa
(A.E.F.); Department of Ophthalmology, University of Colorado Anschutz Medical
Campus, Aurora, Colo (J.K.C.); Department of Applied Innovation and AI,
Diagnósticos da América SA (DasaInova), São Paulo, Brazil
(F.C.K.); Department of Diagnostic Imaging, Universidade Federal de São
Paulo, São Paulo, Brazil (F.C.K.); Microsoft, Nuance, Burlington, Mass
(M.P.L.); Department of Radiology and Biomedical Imaging and Center for
Intelligent Imaging, University of California San Francisco, San Francisco,
Calif (J.M.); Department of Radiology, The Ohio State University Wexner Medical
Center, Columbus, Ohio (L.M.P.); Department of Radiology and Imaging Sciences,
National Institutes of Health Clinical Center, Bethesda, Md (R.M.S.); Division
of Diagnostic Imaging, University of Texas MD Anderson Cancer Center, Houston,
Tex (C.C.W.); Medical Artificial Intelligence Laboratory, University of Lagos
College of Medicine, Lagos, Nigeria (M.A.); and Department of Radiology,
University of Pennsylvania, 3400 Spruce St, 1 Silverstein, Philadelphia, PA
19104-6243 (C.E.K.)
| | - Spyridon Bakas
- From the Sheikh Zayed Institute for Pediatric Surgical Innovation,
Children’s National Hospital, Washington, DC (M.G.L.); Divisions of
Radiology and Pediatrics, George Washington University School of Medicine and
Health Sciences, Washington, DC (M.G.L.); Division of Computational Pathology,
Department of Pathology & Laboratory Medicine, School of Medicine,
Indiana University, Indianapolis, Ind (S.B.); Department of Radiology,
Children’s Hospital of Philadelphia, Philadelphia, Pa (M.A.); Department
of Radiological Sciences, University of California Irvine, Irvine, Calif
(P.D.C.); Department of Radiology, Thomas Jefferson University, Philadelphia, Pa
(A.E.F.); Department of Ophthalmology, University of Colorado Anschutz Medical
Campus, Aurora, Colo (J.K.C.); Department of Applied Innovation and AI,
Diagnósticos da América SA (DasaInova), São Paulo, Brazil
(F.C.K.); Department of Diagnostic Imaging, Universidade Federal de São
Paulo, São Paulo, Brazil (F.C.K.); Microsoft, Nuance, Burlington, Mass
(M.P.L.); Department of Radiology and Biomedical Imaging and Center for
Intelligent Imaging, University of California San Francisco, San Francisco,
Calif (J.M.); Department of Radiology, The Ohio State University Wexner Medical
Center, Columbus, Ohio (L.M.P.); Department of Radiology and Imaging Sciences,
National Institutes of Health Clinical Center, Bethesda, Md (R.M.S.); Division
of Diagnostic Imaging, University of Texas MD Anderson Cancer Center, Houston,
Tex (C.C.W.); Medical Artificial Intelligence Laboratory, University of Lagos
College of Medicine, Lagos, Nigeria (M.A.); and Department of Radiology,
University of Pennsylvania, 3400 Spruce St, 1 Silverstein, Philadelphia, PA
19104-6243 (C.E.K.)
| | - Mariam Aboian
- From the Sheikh Zayed Institute for Pediatric Surgical Innovation,
Children’s National Hospital, Washington, DC (M.G.L.); Divisions of
Radiology and Pediatrics, George Washington University School of Medicine and
Health Sciences, Washington, DC (M.G.L.); Division of Computational Pathology,
Department of Pathology & Laboratory Medicine, School of Medicine,
Indiana University, Indianapolis, Ind (S.B.); Department of Radiology,
Children’s Hospital of Philadelphia, Philadelphia, Pa (M.A.); Department
of Radiological Sciences, University of California Irvine, Irvine, Calif
(P.D.C.); Department of Radiology, Thomas Jefferson University, Philadelphia, Pa
(A.E.F.); Department of Ophthalmology, University of Colorado Anschutz Medical
Campus, Aurora, Colo (J.K.C.); Department of Applied Innovation and AI,
Diagnósticos da América SA (DasaInova), São Paulo, Brazil
(F.C.K.); Department of Diagnostic Imaging, Universidade Federal de São
Paulo, São Paulo, Brazil (F.C.K.); Microsoft, Nuance, Burlington, Mass
(M.P.L.); Department of Radiology and Biomedical Imaging and Center for
Intelligent Imaging, University of California San Francisco, San Francisco,
Calif (J.M.); Department of Radiology, The Ohio State University Wexner Medical
Center, Columbus, Ohio (L.M.P.); Department of Radiology and Imaging Sciences,
National Institutes of Health Clinical Center, Bethesda, Md (R.M.S.); Division
of Diagnostic Imaging, University of Texas MD Anderson Cancer Center, Houston,
Tex (C.C.W.); Medical Artificial Intelligence Laboratory, University of Lagos
College of Medicine, Lagos, Nigeria (M.A.); and Department of Radiology,
University of Pennsylvania, 3400 Spruce St, 1 Silverstein, Philadelphia, PA
19104-6243 (C.E.K.)
| | - Peter D. Chang
- From the Sheikh Zayed Institute for Pediatric Surgical Innovation,
Children’s National Hospital, Washington, DC (M.G.L.); Divisions of
Radiology and Pediatrics, George Washington University School of Medicine and
Health Sciences, Washington, DC (M.G.L.); Division of Computational Pathology,
Department of Pathology & Laboratory Medicine, School of Medicine,
Indiana University, Indianapolis, Ind (S.B.); Department of Radiology,
Children’s Hospital of Philadelphia, Philadelphia, Pa (M.A.); Department
of Radiological Sciences, University of California Irvine, Irvine, Calif
(P.D.C.); Department of Radiology, Thomas Jefferson University, Philadelphia, Pa
(A.E.F.); Department of Ophthalmology, University of Colorado Anschutz Medical
Campus, Aurora, Colo (J.K.C.); Department of Applied Innovation and AI,
Diagnósticos da América SA (DasaInova), São Paulo, Brazil
(F.C.K.); Department of Diagnostic Imaging, Universidade Federal de São
Paulo, São Paulo, Brazil (F.C.K.); Microsoft, Nuance, Burlington, Mass
(M.P.L.); Department of Radiology and Biomedical Imaging and Center for
Intelligent Imaging, University of California San Francisco, San Francisco,
Calif (J.M.); Department of Radiology, The Ohio State University Wexner Medical
Center, Columbus, Ohio (L.M.P.); Department of Radiology and Imaging Sciences,
National Institutes of Health Clinical Center, Bethesda, Md (R.M.S.); Division
of Diagnostic Imaging, University of Texas MD Anderson Cancer Center, Houston,
Tex (C.C.W.); Medical Artificial Intelligence Laboratory, University of Lagos
College of Medicine, Lagos, Nigeria (M.A.); and Department of Radiology,
University of Pennsylvania, 3400 Spruce St, 1 Silverstein, Philadelphia, PA
19104-6243 (C.E.K.)
| | - Adam E. Flanders
| | - Jayashree Kalpathy-Cramer
| | - Felipe C. Kitamura
| | - Matthew P. Lungren
| | - John Mongan
| | - Luciano M. Prevedello
| | - Ronald M. Summers
| | - Carol C. Wu
| | - Maruf Adewole
| | - Charles E. Kahn
| |
Collapse
|
13
|
Misera L, Müller-Franzes G, Truhn D, Kather JN. Weakly Supervised Deep Learning in Radiology. Radiology 2024; 312:e232085. [PMID: 39041937 DOI: 10.1148/radiol.232085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/24/2024]
Abstract
Deep learning (DL) is currently the standard artificial intelligence tool for computer-based image analysis in radiology. Traditionally, DL models have been trained with strongly supervised learning methods. These methods depend on reference standard labels, typically applied manually by experts. In contrast, weakly supervised learning is more scalable. Weak supervision comprises situations in which only a portion of the data are labeled (incomplete supervision), labels refer to a whole region or case as opposed to a precisely delineated image region (inexact supervision), or labels contain errors (inaccurate supervision). In many applications, weak labels are sufficient to train useful models. Thus, weakly supervised learning can unlock a large amount of otherwise unusable data for training DL models. One example of this is using large language models to automatically extract weak labels from free-text radiology reports. Here, we outline the key concepts in weakly supervised learning and provide an overview of applications in radiologic image analysis. With more fundamental and clinical translational work, weakly supervised learning could facilitate the uptake of DL in radiology and research workflows by enabling large-scale image analysis and advancing the development of new DL-based biomarkers.
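As an illustration of the weak-supervision idea summarized above, the following is a minimal sketch of rule-based weak labeling of free-text reports. The keyword and negation lists are invented for the example; the cited review discusses, among other approaches, large language models for this extraction step.

```python
# Hypothetical sketch: deriving weak labels from free-text radiology reports
# with simple keyword rules (inexact/inaccurate supervision). The phrase
# lists below are illustrative placeholders, not from the cited review.
import re

POSITIVE = [r"\bpneumothorax\b", r"\beffusion\b"]
NEGATION = [r"\bno\b", r"\bwithout\b", r"\bnegative for\b"]

def weak_label(report: str) -> int:
    """Return 1 if a finding is mentioned without nearby negation, else 0."""
    text = report.lower()
    for pattern in POSITIVE:
        for match in re.finditer(pattern, text):
            window = text[max(0, match.start() - 30):match.start()]
            if not any(re.search(neg, window) for neg in NEGATION):
                return 1  # noisy positive label
    return 0  # assumed negative (may itself be inaccurate supervision)

print(weak_label("No pneumothorax. Small right pleural effusion."))  # -> 1
```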
Collapse
Affiliation(s)
- Leo Misera
- From the Institute and Polyclinic for Diagnostic and Interventional Radiology (L.M.), Else Kröner Fresenius Center for Digital Health (L.M., J.N.K.), and Department of Medicine I (J.N.K.), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Fetscherstrasse 74, 01307 Dresden, Germany; Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany (G.M.F., D.T.); and Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany (J.N.K.)
| | - Gustav Müller-Franzes
| | - Daniel Truhn
| | - Jakob Nikolas Kather
| |
Collapse
|
14
|
Hu Y, Lui A, Goldstein M, Sudarshan M, Tinsay A, Tsui C, Maidman SD, Medamana J, Jethani N, Puli A, Nguy V, Aphinyanaphongs Y, Kiefer N, Smilowitz NR, Horowitz J, Ahuja T, Fishman GI, Hochman J, Katz S, Bernard S, Ranganath R. Development and external validation of a dynamic risk score for early prediction of cardiogenic shock in cardiac intensive care units using machine learning. EUROPEAN HEART JOURNAL. ACUTE CARDIOVASCULAR CARE 2024; 13:472-480. [PMID: 38518758 PMCID: PMC11214586 DOI: 10.1093/ehjacc/zuae037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/31/2023] [Revised: 03/11/2024] [Accepted: 03/19/2024] [Indexed: 03/24/2024]
Abstract
AIMS Myocardial infarction and heart failure are major cardiovascular diseases that affect millions of people in the USA, with morbidity and mortality highest among patients who develop cardiogenic shock. Early recognition of cardiogenic shock allows prompt implementation of treatment measures. Our objective was to develop a new dynamic risk score, called CShock, to improve early detection of cardiogenic shock in the cardiac intensive care unit (ICU). METHODS AND RESULTS We developed and externally validated a deep learning-based risk stratification tool, called CShock, for patients admitted to the cardiac ICU with acute decompensated heart failure and/or myocardial infarction, to predict the onset of cardiogenic shock. We prepared a cardiac ICU dataset using the Medical Information Mart for Intensive Care-III database, annotated with physician-adjudicated outcomes. This dataset, which consisted of 1500 patients (204 with cardiogenic/mixed shock), was then used to train CShock. The features used to train the model included patient demographics, cardiac ICU admission diagnoses, routinely measured laboratory values and vital signs, and relevant features manually extracted from echocardiogram and left heart catheterization reports. We externally validated the risk model on the New York University (NYU) Langone Health cardiac ICU database, which was also annotated with physician-adjudicated outcomes. The external validation cohort consisted of 131 patients, 25 of whom experienced cardiogenic/mixed shock. CShock achieved an area under the receiver operating characteristic curve (AUROC) of 0.821 (95% CI 0.792-0.850). CShock was externally validated in the more contemporary NYU cohort and achieved an AUROC of 0.800 (95% CI 0.717-0.884), demonstrating its generalizability to other cardiac ICUs. Based on Shapley values, an elevated heart rate was the feature most predictive of cardiogenic shock development. The remaining top 10 predictors were an admission diagnosis of myocardial infarction with ST-segment elevation, an admission diagnosis of acute decompensated heart failure, Braden Scale, Glasgow Coma Scale, blood urea nitrogen, systolic blood pressure, serum chloride, serum sodium, and arterial blood pH. CONCLUSION The novel CShock score has the potential to provide automated detection and early warning of cardiogenic shock and to improve outcomes for the millions of patients who suffer from myocardial infarction and heart failure.
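The abstract names several tabular predictors (heart rate, systolic blood pressure, blood urea nitrogen, Glasgow Coma Scale). A toy sketch of the general risk-scoring and AUROC-evaluation workflow, using synthetic data and a logistic regression stand-in rather than the authors' deep learning model, might look like this:

```python
# Illustrative sketch only: a simple tabular risk model with AUROC evaluation,
# standing in for the (more complex) deep learning CShock model. Feature names
# mirror predictors mentioned in the abstract; the data here are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1500
X = np.column_stack([
    rng.normal(90, 20, n),    # heart rate
    rng.normal(115, 25, n),   # systolic blood pressure
    rng.normal(25, 12, n),    # blood urea nitrogen
    rng.normal(13, 2, n),     # Glasgow Coma Scale
])
# Synthetic outcome loosely tied to heart rate and BUN for demonstration
logit = 0.03 * (X[:, 0] - 90) + 0.04 * (X[:, 2] - 25) - 2.0
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("AUROC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```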
Collapse
Affiliation(s)
- Yuxuan Hu
- Leon H. Charney Division of Cardiology, NYU Langone Health, 550 1st Avenue, New York, NY 10016, USA
| | - Albert Lui
- NYU Grossman School of Medicine, New York, USA
| | - Mark Goldstein
- Courant Institute of Mathematics, New York University, New York, USA
| | - Mukund Sudarshan
- Courant Institute of Mathematics, New York University, New York, USA
| | - Andrea Tinsay
- Department of Medicine, NYU Langone Health, New York, USA
| | - Cindy Tsui
- Department of Medicine, NYU Langone Health, New York, USA
| | | | - John Medamana
- Department of Medicine, NYU Langone Health, New York, USA
| | - Neil Jethani
- NYU Grossman School of Medicine, New York, USA
- Courant Institute of Mathematics, New York University, New York, USA
| | - Aahlad Puli
- Courant Institute of Mathematics, New York University, New York, USA
| | - Vuthy Nguy
- Department of Population Health, NYU Langone Health, New York, USA
| | | | - Nicholas Kiefer
- Leon H. Charney Division of Cardiology, NYU Langone Health, 550 1st Avenue, New York, NY 10016, USA
| | - Nathaniel R Smilowitz
- Leon H. Charney Division of Cardiology, NYU Langone Health, 550 1st Avenue, New York, NY 10016, USA
| | - James Horowitz
- Leon H. Charney Division of Cardiology, NYU Langone Health, 550 1st Avenue, New York, NY 10016, USA
| | - Tania Ahuja
- Department of Pharmacy, NYU Langone Health, New York, USA
| | - Glenn I Fishman
- Leon H. Charney Division of Cardiology, NYU Langone Health, 550 1st Avenue, New York, NY 10016, USA
| | - Judith Hochman
- Leon H. Charney Division of Cardiology, NYU Langone Health, 550 1st Avenue, New York, NY 10016, USA
| | - Stuart Katz
- Leon H. Charney Division of Cardiology, NYU Langone Health, 550 1st Avenue, New York, NY 10016, USA
| | - Samuel Bernard
- Leon H. Charney Division of Cardiology, NYU Langone Health, 550 1st Avenue, New York, NY 10016, USA
| | - Rajesh Ranganath
- Courant Institute of Mathematics, New York University, New York, USA
- Department of Population Health, NYU Langone Health, New York, USA
- Center for Data Science, New York University, New York, USA
| |
Collapse
|
15
|
Blankemeier L, Cohen JP, Kumar A, Van Veen D, Gardezi SJS, Paschali M, Chen Z, Delbrouck JB, Reis E, Truyts C, Bluethgen C, Jensen MEK, Ostmeier S, Varma M, Valanarasu JMJ, Fang Z, Huo Z, Nabulsi Z, Ardila D, Weng WH, Amaro E, Ahuja N, Fries J, Shah NH, Johnston A, Boutin RD, Wentland A, Langlotz CP, Hom J, Gatidis S, Chaudhari AS. Merlin: A Vision Language Foundation Model for 3D Computed Tomography. RESEARCH SQUARE 2024:rs.3.rs-4546309. [PMID: 38978576 PMCID: PMC11230513 DOI: 10.21203/rs.3.rs-4546309/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/10/2024]
Abstract
Over 85 million computed tomography (CT) scans are performed annually in the US, of which approximately one quarter focus on the abdomen. Given the current shortage of both general and specialized radiologists, there is a large impetus to use artificial intelligence to alleviate the burden of interpreting these complex imaging studies while simultaneously using the images to extract novel physiological insights. Prior state-of-the-art approaches for automated medical image interpretation leverage vision language models (VLMs) that utilize both the image and the corresponding textual radiology reports. However, current medical VLMs are generally limited to 2D images and short reports. To overcome these shortcomings for abdominal CT interpretation, we introduce Merlin, a 3D VLM that leverages both structured electronic health records (EHR) and unstructured radiology reports for pretraining without requiring additional manual annotations. We train Merlin using a high-quality clinical dataset of paired CT scans (6+ million images from 15,331 CTs), EHR diagnosis codes (1.8+ million codes), and radiology reports (6+ million tokens). We comprehensively evaluate Merlin on 6 task types and 752 individual tasks. The non-adapted (off-the-shelf) tasks include zero-shot findings classification (31 findings), phenotype classification (692 phenotypes), and zero-shot cross-modal retrieval (image to findings and image to impressions), while model-adapted tasks include 5-year chronic disease prediction (6 diseases), radiology report generation, and 3D semantic segmentation (20 organs). We perform internal validation on a test set of 5,137 CTs, and external validation on 7,000 clinical CTs and on two public CT datasets (VerSe, TotalSegmentator). Beyond these clinically relevant evaluations, we assess the efficacy of various network architectures and training strategies, showing that Merlin performs favorably relative to existing task-specific baselines. We derive data scaling laws to empirically assess training data needs for requisite downstream task performance. Furthermore, unlike conventional VLMs that require hundreds of GPUs for training, we perform all training on a single GPU. This computationally efficient design can help democratize foundation model training, especially for health systems with compute constraints. We plan to release our trained models, code, and dataset, pending manual removal of all protected health information.
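A common pattern behind vision language pretraining of this kind is a CLIP-style contrastive objective that aligns scan embeddings with report embeddings. The sketch below shows that generic loss under assumed embedding shapes; it is not Merlin's published training code, and the encoders are omitted.

```python
# Minimal CLIP-style contrastive objective between image (scan) embeddings
# and text (report) embeddings, sketching the general VLM pretraining idea.
# Embedding dimensions and batch size are placeholder assumptions.
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb: torch.Tensor, txt_emb: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature          # (B, B) similarity matrix
    targets = torch.arange(img.size(0))           # matching pairs on diagonal
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

loss = contrastive_loss(torch.randn(8, 512), torch.randn(8, 512))
print(loss.item())
```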
Collapse
Affiliation(s)
- Louis Blankemeier
- Department of Electrical Engineering, Stanford University
- Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University
- Department of Radiology, Stanford University
| | - Joseph Paul Cohen
- Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University
| | - Ashwin Kumar
- Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University
- Department of Radiology, Stanford University
| | - Dave Van Veen
- Department of Electrical Engineering, Stanford University
- Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University
- Department of Radiology, Stanford University
| | | | - Magdalini Paschali
- Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University
- Department of Radiology, Stanford University
| | - Zhihong Chen
- Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University
- Department of Radiology, Stanford University
| | - Jean-Benoit Delbrouck
- Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University
- Department of Radiology, Stanford University
| | - Eduardo Reis
- Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University
- Department of Radiology, Stanford University
| | - Cesar Truyts
- Department of Radiology, Hospital Israelita Albert Einstein
| | - Christian Bluethgen
- Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University
- Department of Radiology, University Hospital Zurich
| | - Malte Engmann Kjeldskov Jensen
- Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University
- Department of Radiology, Stanford University
| | - Sophie Ostmeier
- Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University
- Department of Radiology, Stanford University
| | - Maya Varma
- Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University
- Department of Radiology, Stanford University
- Department of Computer Science, Stanford University
| | - Jeya Maria Jose Valanarasu
- Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University
- Department of Radiology, Stanford University
- Department of Computer Science, Stanford University
| | | | - Zepeng Huo
- Department of Biomedical Data Science, Stanford University
| | | | | | | | - Edson Amaro
- Department of Radiology, Hospital Israelita Albert Einstein
| | | | - Jason Fries
- Department of Computer Science, Stanford University
- Department of Biomedical Data Science, Stanford University
| | - Nigam H. Shah
- Department of Radiology, Stanford University
- Department of Biomedical Data Science, Stanford University
| | | | | | | | - Curtis P. Langlotz
- Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University
- Department of Radiology, Stanford University
| | - Jason Hom
- Department of Medicine, Stanford University
| | | | - Akshay S. Chaudhari
- Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University
- Department of Radiology, Stanford University
- Department of Biomedical Data Science, Stanford University
| |
Collapse
|
16
|
Qi L, Jiang Z, Shi W, Qu F, Feng G. GMIM: Self-supervised pre-training for 3D medical image segmentation with adaptive and hierarchical masked image modeling. Comput Biol Med 2024; 176:108547. [PMID: 38728994 DOI: 10.1016/j.compbiomed.2024.108547] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2023] [Revised: 04/07/2024] [Accepted: 04/28/2024] [Indexed: 05/12/2024]
Abstract
Self-supervised pre-training and fully supervised fine-tuning paradigms have received much attention as a solution to the data annotation problem in deep learning. Compared with traditional pre-training on large natural image datasets, medical self-supervised learning methods learn rich representations from the unlabeled data itself, thus avoiding the distribution shift between different image domains. However, current state-of-the-art medical pre-training methods are designed for specific downstream tasks, making them less flexible and difficult to apply to new tasks. In this paper, we propose grid mask image modeling, a flexible and general self-supervised method to pre-train medical vision transformers for 3D medical image segmentation. Our goal is to guide networks to learn the correlations between organs and tissues by reconstructing original images from partial observations. These relationships are consistent within the human body and invariant to disease type or imaging modality. To achieve this, we design a Siamese framework consisting of an online branch and a target branch. An adaptive and hierarchical masking strategy is employed in the online branch to (1) learn the boundaries or small contextual mutation regions within images and (2) learn high-level semantic representations from deeper layers of the multiscale encoder. In addition, the target branch provides representations for contrastive learning to further reduce representation redundancy. We evaluate our method through segmentation performance on two public datasets. The experimental results demonstrate that our method outperforms other self-supervised methods. Code is available at https://github.com/mobiletomb/Gmim.
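To make the masked-image-modeling idea concrete, here is a toy sketch that grid-masks random patches of a 3D volume and trains a small network to reconstruct the original. The masking ratio and the tiny convolutional net are placeholders, not the GMIM architecture or its adaptive, hierarchical strategy.

```python
# Toy sketch of masked image modeling on a 3D volume: randomly zero out grid
# patches and train a network to reconstruct them. Patch size, masking ratio,
# and the small conv net are illustrative assumptions only.
import torch
import torch.nn as nn

def grid_mask(vol: torch.Tensor, patch: int = 8, ratio: float = 0.5):
    """Zero out a random subset of non-overlapping cubic patches."""
    masked = vol.clone()
    _, _, D, H, W = vol.shape
    for z in range(0, D, patch):
        for y in range(0, H, patch):
            for x in range(0, W, patch):
                if torch.rand(1).item() < ratio:
                    masked[:, :, z:z+patch, y:y+patch, x:x+patch] = 0.0
    return masked

net = nn.Sequential(nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
                    nn.Conv3d(16, 1, 3, padding=1))
vol = torch.randn(2, 1, 32, 32, 32)
recon = net(grid_mask(vol))
loss = nn.functional.mse_loss(recon, vol)  # reconstruct original from masked input
loss.backward()
```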
Collapse
Affiliation(s)
- Liangce Qi
- Department of Computer Science and Technology, Changchun University of Science and Technology, Changchun, 130022, Jilin, China.
| | - Zhengang Jiang
- Department of Computer Science and Technology, Changchun University of Science and Technology, Changchun, 130022, Jilin, China; Zhongshan Institute of Changchun University of Science and Technology, Zhongshan, 528400, Guangdong, China.
| | - Weili Shi
- Department of Computer Science and Technology, Changchun University of Science and Technology, Changchun, 130022, Jilin, China; Zhongshan Institute of Changchun University of Science and Technology, Zhongshan, 528400, Guangdong, China
| | - Feng Qu
- Department of Computer Science and Technology, Changchun University of Science and Technology, Changchun, 130022, Jilin, China
| | - Guanyuan Feng
- Department of Computer Science and Technology, Changchun University of Science and Technology, Changchun, 130022, Jilin, China
| |
Collapse
|
17
|
Rousta F, Esteki A, Shalbaf A, Sadeghi A, Moghadam PK, Voshagh A. Application of artificial intelligence in pancreas endoscopic ultrasound imaging- A systematic review. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 250:108205. [PMID: 38703435 DOI: 10.1016/j.cmpb.2024.108205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/04/2024] [Revised: 04/13/2024] [Accepted: 04/24/2024] [Indexed: 05/06/2024]
Abstract
The pancreas is a vital organ in the digestive system with significant health implications. It is imperative to evaluate and identify malignant pancreatic lesions promptly in light of the high mortality rate linked to such malignancies. Endoscopic ultrasound (EUS) is a precise, non-invasive technique to detect pancreatic disorders, but it is highly operator dependent. Artificial intelligence (AI), including traditional machine learning (ML) and deep learning (DL) techniques, can play a pivotal role in enhancing the performance of EUS regardless of the operator. AI performs a critical function in the detection, classification, and segmentation of medical images. The utilization of AI-assisted systems has improved the accuracy and productivity of pancreatic analysis, including the detection of diverse pancreatic disorders (e.g., pancreatitis, masses, and cysts) as well as landmarks and parenchyma. This systematic review examines the rapidly developing domain of AI-assisted systems in EUS of the pancreas. Its objective is to present a thorough study of the present research status and developments in this area. This paper explores the significant challenges of AI-assisted systems in pancreatic EUS imaging, highlights the potential of AI techniques in addressing these challenges, and suggests the scope for future research in the domain of AI-assisted EUS systems.
Collapse
Affiliation(s)
- Fatemeh Rousta
- Department of Biomedical Engineering and Physics, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Ali Esteki
- Department of Biomedical Engineering and Physics, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Ahmad Shalbaf
- Department of Biomedical Engineering and Physics, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
| | - Amir Sadeghi
- Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Pardis Ketabi Moghadam
- Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Ardalan Voshagh
- Faculty of Electrical Engineering, Shahid Beheshti University, Tehran, Iran
| |
Collapse
|
18
|
Zeng M, Wang X, Chen W. Worldwide research landscape of artificial intelligence in lung disease: A scientometric study. Heliyon 2024; 10:e31129. [PMID: 38826704 PMCID: PMC11141367 DOI: 10.1016/j.heliyon.2024.e31129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2023] [Revised: 05/09/2024] [Accepted: 05/10/2024] [Indexed: 06/04/2024] Open
Abstract
Purpose To perform a comprehensive bibliometric analysis of the application of artificial intelligence (AI) in lung disease to understand the current status and emerging trends of this field. Materials and methods AI-based lung disease research publications were selected from the Web of Science Core Collection. CiteSpace, VOSviewer, and Excel were used to analyze and visualize co-authorship, co-citation, and co-occurrence patterns among authors, keywords, countries/regions, references, and institutions in this field. Results Our study included a total of 5210 papers. The number of publications on AI in lung disease has shown explosive growth since 2017. China and the United States lead in publication numbers. The most productive authors were Li Weimin and Qian Wei, with Shanghai Jiaotong University as the most productive institution. Radiology was the most co-cited journal. Lung cancer and COVID-19 emerged as the most studied diseases. Deep learning, convolutional neural networks, lung cancer, and radiomics will be the focus of future research. Conclusions AI-based diagnosis and treatment of lung disease has become a research hotspot in recent years, yielding significant results. Future work should focus on establishing multimodal AI models that incorporate clinical, imaging, and laboratory information. Enhanced visualization of deep learning, AI-driven differential diagnosis models for lung disease, and the creation of international large-scale lung disease databases should also be considered.
Collapse
Affiliation(s)
| | | | - Wei Chen
- Department of Radiology, Southwest Hospital, Third Military Medical University, Chongqing, China
| |
Collapse
|
19
|
Go J, Moon N. Cleaned Meta Pseudo Labels-Based Pet Behavior Recognition Using Time-Series Sensor Data. SENSORS (BASEL, SWITZERLAND) 2024; 24:3391. [PMID: 38894180 PMCID: PMC11175053 DOI: 10.3390/s24113391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/25/2024] [Revised: 05/15/2024] [Accepted: 05/23/2024] [Indexed: 06/21/2024]
Abstract
With the increasing number of households owning pets, the importance of sensor data for recognizing pet behavior has grown significantly. However, challenges arise due to the costs and reliability issues associated with data collection. This paper proposes a method for classifying pet behavior using cleaned meta pseudo labels to overcome these issues. The data for this study were collected using wearable devices equipped with accelerometers, gyroscopes, and magnetometers, and pet behaviors were classified into five categories. Utilizing this data, we analyzed the impact of the quantity of labeled data on accuracy and further enhanced the learning process by integrating an additional Distance Loss. This method effectively improves the learning process by removing noise from unlabeled data. Experimental results demonstrated that while the conventional supervised learning method achieved an accuracy of 82.9%, the existing meta pseudo labels method showed an accuracy of 86.2%, and the cleaned meta pseudo labels method proposed in this study surpassed these with an accuracy of 88.3%. These results hold significant implications for the development of pet monitoring systems, and the approach of this paper provides an effective solution for recognizing and classifying pet behavior in environments with insufficient labels.
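A minimal sketch of the underlying pseudo-labeling step follows, assuming a simple confidence threshold as the noise filter; the paper's actual cleaning additionally integrates a Distance Loss, which is omitted here.

```python
# Hedged sketch of the pseudo-labeling idea: keep only high-confidence
# predictions on unlabeled data as training targets, discarding likely-noisy
# ones. The threshold and toy model are placeholders, not the paper's method.
import torch
import torch.nn.functional as F

def clean_pseudo_labels(model, unlabeled: torch.Tensor, threshold: float = 0.9):
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(unlabeled), dim=1)
        conf, labels = probs.max(dim=1)
    keep = conf >= threshold            # drop low-confidence (noisy) samples
    return unlabeled[keep], labels[keep]

# Usage with a toy 5-class model on random "sensor" features:
model = torch.nn.Linear(10, 5)
kept, pseudo = clean_pseudo_labels(model, torch.randn(100, 10))
print(kept.shape, pseudo.shape)
```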
Collapse
Affiliation(s)
| | - Nammee Moon
- Department of Computer Science and Engineering, Hoseo University, Asan-si 31499, Republic of Korea;
| |
Collapse
|
20
|
Shin M, Seo M, Lee K, Yoon K. Super-resolution techniques for biomedical applications and challenges. Biomed Eng Lett 2024; 14:465-496. [PMID: 38645589 PMCID: PMC11026337 DOI: 10.1007/s13534-024-00365-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2023] [Revised: 02/12/2024] [Accepted: 02/18/2024] [Indexed: 04/23/2024] Open
Abstract
Super-resolution (SR) techniques have revolutionized the field of biomedical applications by detailing the structures at resolutions beyond the limits of imaging or measuring tools. These techniques have been applied in various biomedical applications, including microscopy, magnetic resonance imaging (MRI), computed tomography (CT), X-ray, electroencephalogram (EEG), ultrasound, etc. SR methods are categorized into two main types: traditional non-learning-based methods and modern learning-based approaches. In both applications, SR methodologies have been effectively utilized on biomedical images, enhancing the visualization of complex biological structures. Additionally, these methods have been employed on biomedical data, leading to improvements in computational precision and efficiency for biomedical simulations. The use of SR techniques has resulted in more detailed and accurate analyses in diagnostics and research, essential for early disease detection and treatment planning. However, challenges such as computational demands, data interpretation complexities, and the lack of unified high-quality data persist. The article emphasizes these issues, underscoring the need for ongoing development in SR technologies to further improve biomedical research and patient care outcomes.
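As a small illustration of the two SR families the review describes, the snippet below contrasts a classical (non-learning) interpolation upscale with a learned sub-pixel convolution layer of the kind used in many DL super-resolution models; the layer sizes are illustrative only.

```python
# Minimal example contrasting a classical upscale with the learned sub-pixel
# convolution (ESPCN-style) used by many DL super-resolution models. The
# channel counts are illustrative assumptions, not a published architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

lr = torch.randn(1, 1, 32, 32)                       # low-resolution input

bicubic = F.interpolate(lr, scale_factor=2, mode="bicubic")  # traditional SR

upscaler = nn.Sequential(                            # learned SR layer
    nn.Conv2d(1, 4, kernel_size=3, padding=1),
    nn.PixelShuffle(2),                              # rearranges 4 channels -> 2x2 pixels
)
learned = upscaler(lr)
print(bicubic.shape, learned.shape)                  # both (1, 1, 64, 64)
```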
Collapse
Affiliation(s)
- Minwoo Shin
- School of Mathematics and Computing (Computational Science and Engineering), Yonsei University, 50 Yonsei-Ro, Seodaemun-Gu, Seoul, 03722 Republic of Korea
| | - Minjee Seo
- School of Mathematics and Computing (Computational Science and Engineering), Yonsei University, 50 Yonsei-Ro, Seodaemun-Gu, Seoul, 03722 Republic of Korea
| | - Kyunghyun Lee
- School of Mathematics and Computing (Computational Science and Engineering), Yonsei University, 50 Yonsei-Ro, Seodaemun-Gu, Seoul, 03722 Republic of Korea
| | - Kyungho Yoon
- School of Mathematics and Computing (Computational Science and Engineering), Yonsei University, 50 Yonsei-Ro, Seodaemun-Gu, Seoul, 03722 Republic of Korea
| |
Collapse
|
21
|
Migliorelli G, Fiorentino MC, Di Cosmo M, Villani FP, Mancini A, Moccia S. On the use of contrastive learning for standard-plane classification in fetal ultrasound imaging. Comput Biol Med 2024; 174:108430. [PMID: 38613892 DOI: 10.1016/j.compbiomed.2024.108430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2023] [Revised: 03/06/2024] [Accepted: 04/07/2024] [Indexed: 04/15/2024]
Abstract
BACKGROUND To investigate the effectiveness of contrastive learning, in particular SimCLR, in reducing the need for large annotated ultrasound (US) image datasets for fetal standard plane identification. METHODS We explore the advantage of SimCLR in cases of both low and high inter-class variability, considering at the same time how classification performance varies with the amount of labels used. This evaluation is performed by exploiting contrastive learning through different training strategies. We apply both quantitative and qualitative analyses, using standard metrics (F1-score, sensitivity, and precision), Class Activation Mapping (CAM), and t-Distributed Stochastic Neighbor Embedding (t-SNE). RESULTS When dealing with high inter-class variability classification tasks, contrastive learning does not bring a significant advantage, whereas it proves relevant for low inter-class variability classification, specifically when initialized with ImageNet weights. CONCLUSIONS Contrastive learning approaches are typically used when a large amount of unlabeled data is available, which is not representative of US datasets. We show that SimCLR, either as pre-training with a backbone initialized via ImageNet weights or used in an end-to-end dual-task setting, can improve performance over standard transfer learning approaches when the dataset is small and characterized by low inter-class variability.
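For reference, the core of SimCLR is the NT-Xent contrastive loss computed over two augmented views of each image in a batch. A minimal sketch follows, with the encoder and any ultrasound-specific augmentations omitted; it is illustrative rather than the authors' training code.

```python
# Sketch of the NT-Xent (SimCLR) loss for two augmented views of a batch.
# z1 and z2 are assumed projection-head outputs; batch size and embedding
# dimension below are arbitrary placeholders.
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    z = F.normalize(torch.cat([z1, z2]), dim=1)       # (2B, d)
    sim = z @ z.t() / tau                             # pairwise similarities
    sim.fill_diagonal_(float("-inf"))                 # exclude self-similarity
    B = z1.size(0)
    # Positive for view i is its counterpart at i+B (and vice versa)
    targets = torch.cat([torch.arange(B, 2 * B), torch.arange(0, B)])
    return F.cross_entropy(sim, targets)

print(nt_xent(torch.randn(16, 128), torch.randn(16, 128)).item())
```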
Collapse
Affiliation(s)
| | | | - Mariachiara Di Cosmo
- Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy
| | | | - Adriano Mancini
- Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy
| | - Sara Moccia
- The BioRobotics Institute and Department of Excellence in Robotics and AI, Scuola Superiore Sant'Anna, Pisa, Italy
| |
Collapse
|
22
|
Wang Y, Ni H, Zhou J, Liu L, Lin J, Yin M, Gao J, Zhu S, Yin Q, Zhu J, Li R. A Semi-Supervised Learning Framework for Classifying Colorectal Neoplasia Based on the NICE Classification. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2024:10.1007/s10278-024-01123-9. [PMID: 38653910 DOI: 10.1007/s10278-024-01123-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/21/2024] [Revised: 04/02/2024] [Accepted: 04/12/2024] [Indexed: 04/25/2024]
Abstract
Labelling medical images is an arduous and costly task that necessitates clinical expertise and large numbers of qualified images. Insufficient samples can lead to underfitting during training and poor performance of supervised learning models. In this study, we aim to develop a SimCLR-based semi-supervised learning framework to classify colorectal neoplasia based on the NICE classification. First, the proposed framework was trained under self-supervised learning using a large unlabelled dataset; subsequently, it was fine-tuned on a limited labelled dataset based on the NICE classification. The model was evaluated on an independent dataset and compared with models based on supervised transfer learning and with endoscopists, using accuracy, the Matthews correlation coefficient (MCC), and Cohen's kappa. Finally, Grad-CAM and t-SNE were applied to visualize the models' interpretations. A ResNet-backboned SimCLR model (accuracy of 0.908, MCC of 0.862, and Cohen's kappa of 0.896) outperformed supervised transfer learning-based models (means: 0.803, 0.698, and 0.742) and junior endoscopists (0.816, 0.724, and 0.863), while performing only slightly worse than senior endoscopists (0.916, 0.875, and 0.944). Moreover, t-SNE showed better clustering of the three classes with self-supervised learning in SimCLR than with supervised transfer learning. Compared with traditional supervised learning, semi-supervised learning enables deep learning models to achieve improved performance with limited labelled endoscopic images.
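The fine-tuning stage described here typically amounts to swapping a classification head onto the pretrained backbone. A hedged sketch follows, with a hypothetical weights file and toy tensors standing in for the endoscopic images; it is not the study's code.

```python
# Hedged sketch of the fine-tuning stage: take a self-supervised pretrained
# ResNet backbone and replace its head with a 3-class classifier (the three
# NICE types). The weights path and data below are hypothetical placeholders.
import torch
import torch.nn as nn
from torchvision.models import resnet50

backbone = resnet50()
# backbone.load_state_dict(torch.load("simclr_pretrained.pt"))  # hypothetical checkpoint
backbone.fc = nn.Linear(backbone.fc.in_features, 3)   # new head for NICE types 1-3

optimizer = torch.optim.Adam(backbone.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

images, labels = torch.randn(4, 3, 224, 224), torch.randint(0, 3, (4,))
loss = criterion(backbone(images), labels)            # one fine-tuning step
loss.backward()
optimizer.step()
```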
Collapse
Affiliation(s)
- Yu Wang
- Department of Hepatobiliary Surgery, Jintan Affiliated Hospital of Jiangsu University, Changzhou, Jiangsu, 213200, China
| | - Haoxiang Ni
- Department of Gastroenterology, The First Affiliated Hospital of Soochow University, # 899 Pinghai St., Suzhou, Jiangsu, 215006, China
- Suzhou Clinical Center of Digestive Disease, Suzhou, Jiangsu, 215006, China
| | - Jielu Zhou
- Department of Gastroenterology, The First Affiliated Hospital of Soochow University, # 899 Pinghai St., Suzhou, Jiangsu, 215006, China
- Department of Geriatrics, Kowloon Affiliated Hospital of Shanghai Jiao Tong University, Suzhou, Jiangsu, 215006, China
| | - Lihe Liu
- Department of Gastroenterology, The First Affiliated Hospital of Soochow University, # 899 Pinghai St., Suzhou, Jiangsu, 215006, China
- Suzhou Clinical Center of Digestive Disease, Suzhou, Jiangsu, 215006, China
| | - Jiaxi Lin
- Department of Gastroenterology, The First Affiliated Hospital of Soochow University, # 899 Pinghai St., Suzhou, Jiangsu, 215006, China
- Suzhou Clinical Center of Digestive Disease, Suzhou, Jiangsu, 215006, China
| | - Minyue Yin
- Department of Gastroenterology, Beijing Friendship Hospital, Capital Medical University, Beijing, 100050, China
- National Clinical Research Center for Digestive Disease, Beijing Digestive Disease Center, State Key Laboratory of Digestive Health, Beijing, 100050, China
| | - Jingwen Gao
- Department of Gastroenterology, The First Affiliated Hospital of Soochow University, # 899 Pinghai St., Suzhou, Jiangsu, 215006, China
- Suzhou Clinical Center of Digestive Disease, Suzhou, Jiangsu, 215006, China
| | - Shiqi Zhu
- Department of Gastroenterology, The First Affiliated Hospital of Soochow University, # 899 Pinghai St., Suzhou, Jiangsu, 215006, China
- Suzhou Clinical Center of Digestive Disease, Suzhou, Jiangsu, 215006, China
| | - Qi Yin
- Department of Anesthesiology, Jintan Affiliated Hospital of Jiangsu University, Changzhou, Jiangsu, 213200, China
| | - Jinzhou Zhu
- Department of Gastroenterology, The First Affiliated Hospital of Soochow University, # 899 Pinghai St., Suzhou, Jiangsu, 215006, China.
- Suzhou Clinical Center of Digestive Disease, Suzhou, Jiangsu, 215006, China.
- Key Laboratory of Hepatosplenic Surgery, Ministry of Education, The First Affiliated Hospital of Harbin Medical University, Harbin, 150001, China.
| | - Rui Li
- Department of Gastroenterology, The First Affiliated Hospital of Soochow University, # 899 Pinghai St., Suzhou, Jiangsu, 215006, China.
- Suzhou Clinical Center of Digestive Disease, Suzhou, Jiangsu, 215006, China.
| |
Collapse
|
23
|
VanBerlo B, Hoey J, Wong A. A survey of the impact of self-supervised pretraining for diagnostic tasks in medical X-ray, CT, MRI, and ultrasound. BMC Med Imaging 2024; 24:79. [PMID: 38580932 PMCID: PMC10998380 DOI: 10.1186/s12880-024-01253-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2023] [Accepted: 03/18/2024] [Indexed: 04/07/2024] Open
Abstract
Self-supervised pretraining has been observed to be effective at improving feature representations for transfer learning, leveraging large amounts of unlabelled data. This review summarizes recent research into its usage in X-ray, computed tomography, magnetic resonance, and ultrasound imaging, concentrating on studies that compare self-supervised pretraining to fully supervised learning for diagnostic tasks such as classification and segmentation. The most pertinent finding is that self-supervised pretraining generally improves downstream task performance compared to full supervision, most prominently when unlabelled examples greatly outnumber labelled examples. Based on the aggregate evidence, recommendations are provided for practitioners considering using self-supervised learning. Motivated by limitations identified in current research, directions and practices for future study are suggested, such as integrating clinical knowledge with theoretically justified self-supervised learning methods, evaluating on public datasets, growing the modest body of evidence for ultrasound, and characterizing the impact of self-supervised pretraining on generalization.
Collapse
Affiliation(s)
- Blake VanBerlo
- Cheriton School of Computer Science, 200 University Ave W, N2L 3G1, Waterloo, Canada.
| | - Jesse Hoey
- Cheriton School of Computer Science, 200 University Ave W, N2L 3G1, Waterloo, Canada
| | - Alexander Wong
- Department of Systems Design Engineering, 200 University Ave W, N2L 3G1, Waterloo, Canada
| |
Collapse
|
24
|
Huang J, Galal G, Mukhin V, Etemadi M, Tanna AP. Prediction and Detection of Glaucomatous Visual Field Progression Using Deep Learning on Macular Optical Coherence Tomography. J Glaucoma 2024; 33:246-253. [PMID: 38245813 DOI: 10.1097/ijg.0000000000002359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2023] [Accepted: 11/26/2023] [Indexed: 01/22/2024]
Abstract
PRÉCIS A deep learning model trained on macular OCT imaging studies detected clinically significant functional glaucoma progression and was also able to predict future progression. OBJECTIVE To use deep learning on macular optical coherence tomography (OCT) imaging to predict future visual field progression and to detect concurrent visual field progression. DESIGN A retrospective cohort study. SUBJECTS A pretraining data set comprised 7,702,201 B-scan images from 151,389 macular OCT studies. The progression detection task included 3902 macular OCT imaging studies from 1534 eyes of 828 patients with glaucoma, and the progression prediction task included 1346 macular OCT studies from 1205 eyes of 784 patients. METHODS A novel deep learning method was developed to detect glaucoma progression and predict future progression using macular OCT, based on self-supervised pretraining of a vision transformer (ViT) model on a large, unlabeled data set of OCT images. Glaucoma progression was defined as a mean deviation (MD) rate of change of ≤ -0.5 dB/year over 5 consecutive Humphrey visual field tests, and rapid progression was defined as MD change ≤ -1 dB/year. MAIN OUTCOME MEASURES Diagnostic performance of the ViT model for prediction of future visual field progression and detection of concurrent visual field progression, using area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. RESULTS The model distinguished stable eyes from progressing eyes, achieving an AUC of 0.90 (95% CI, 0.88-0.91). Rapid progression was detected with an AUC of 0.92 (95% CI, 0.91-0.93). The model also demonstrated high predictive ability for forecasting future glaucoma progression, with an AUC of 0.85 (95% CI, 0.83-0.87). Rapid progression was predicted with an AUC of 0.84 (95% CI, 0.81-0.86). CONCLUSIONS A deep learning model detected clinically significant functional glaucoma progression using macular OCT imaging studies and was also able to predict future progression. Early identification of patients undergoing glaucoma progression or at high risk for future progression may aid in clinical decision-making.
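Because the progression labels are defined directly from visual field trends, they can be computed with a simple linear fit. The sketch below applies the abstract's thresholds (≤ -0.5 dB/year for progression, ≤ -1 dB/year for rapid progression) to five MD values; the example numbers are invented.

```python
# Sketch of the progression labels as defined in the abstract: fit a linear
# trend to mean deviation (MD) over five consecutive visual field tests and
# flag slopes at or below the stated thresholds. Example data are invented.
import numpy as np

def label_progression(years: np.ndarray, md_values: np.ndarray) -> dict:
    slope = np.polyfit(years, md_values, deg=1)[0]   # dB per year
    return {"progressing": slope <= -0.5, "rapid": slope <= -1.0}

print(label_progression(np.array([0, 1, 2, 3, 4]),
                        np.array([-2.0, -2.6, -3.1, -3.8, -4.3])))
# slope ≈ -0.58 dB/year -> progressing, not rapid
```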
Collapse
Affiliation(s)
| | - Galal Galal
- Research and Development, Northwestern Medicine Information Services, Chicago
| | - Vladislav Mukhin
- Research and Development, Northwestern Medicine Information Services, Chicago
| | - Mozziyar Etemadi
- Department of Anesthesiology, Northwestern University Feinberg School of Medicine
- Research and Development, Northwestern Medicine Information Services, Chicago
- Department of Biomedical Engineering, Northwestern University, Evanston, IL
| | | |
Collapse
|
25
|
Carter D, Bykhovsky D, Hasky A, Mamistvalov I, Zimmer Y, Ram E, Hoffer O. Convolutional neural network deep learning model accurately detects rectal cancer in endoanal ultrasounds. Tech Coloproctol 2024; 28:44. [PMID: 38561492 PMCID: PMC10984882 DOI: 10.1007/s10151-024-02917-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/22/2023] [Accepted: 03/06/2024] [Indexed: 04/04/2024]
Abstract
BACKGROUND Imaging is vital for assessing rectal cancer, with endoanal ultrasound (EAUS) being highly accurate in large tertiary medical centers. However, EAUS accuracy drops outside such settings, possibly due to varied examiner experience and fewer examinations. This underscores the need for an AI-based system to enhance accuracy in non-specialized centers. This study aimed to develop and validate deep learning (DL) models to differentiate rectal cancer in standard EAUS images. METHODS A transfer learning approach with fine-tuned DL architectures was employed, utilizing a dataset of 294 images. The performance of DL models was assessed through a tenfold cross-validation. RESULTS The DL diagnostics model exhibited a sensitivity and accuracy of 0.78 each. In the identification phase, the automatic diagnostic platform achieved an area under the curve performance of 0.85 for diagnosing rectal cancer. CONCLUSIONS This research demonstrates the potential of DL models in enhancing rectal cancer detection during EAUS, especially in settings with lower examiner experience. The achieved sensitivity and accuracy suggest the viability of incorporating AI support for improved diagnostic outcomes in non-specialized medical centers.
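The tenfold cross-validation protocol used for evaluation can be sketched as follows; the features and the trivial majority-class stand-in for the fine-tuned CNN are placeholders for illustration, not the study's 294-image EAUS pipeline.

```python
# Sketch of a tenfold cross-validation protocol like the one described in the
# abstract. Features, labels, and the majority-class "model" are synthetic
# placeholders; in practice each fold would train and score the CNN.
import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.random.rand(294, 64)            # placeholder features
y = np.random.randint(0, 2, 294)       # cancer vs. non-cancer labels

accuracies = []
for train_idx, test_idx in StratifiedKFold(n_splits=10, shuffle=True,
                                           random_state=0).split(X, y):
    majority = np.bincount(y[train_idx]).argmax()    # trivial baseline model
    accuracies.append((y[test_idx] == majority).mean())

print("mean accuracy over 10 folds:", np.mean(accuracies))
```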
Affiliation(s)
- D Carter: Department of Gastroenterology, Chaim Sheba Medical Center, Ramat Gan, Israel; Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- D Bykhovsky: Electrical and Electronics Engineering Department, Shamoon College of Engineering, Beer-Sheba, Israel
- A Hasky: School of Electrical Engineering, Afeka College of Engineering, Tel Aviv, Israel
- I Mamistvalov: School of Electrical Engineering, Afeka College of Engineering, Tel Aviv, Israel
- Y Zimmer: School of Medical Engineering, Afeka College of Engineering, Tel Aviv, Israel
- E Ram: Department of Gastroenterology, Chaim Sheba Medical Center, Ramat Gan, Israel; Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- O Hoffer: School of Electrical Engineering, Afeka College of Engineering, Tel Aviv, Israel
26
Pai S, Bontempi D, Hadzic I, Prudente V, Sokač M, Chaunzwa TL, Bernatz S, Hosny A, Mak RH, Birkbak NJ, Aerts HJWL. Foundation model for cancer imaging biomarkers. Nat Mach Intell 2024; 6:354-367. [PMID: 38523679] [PMCID: PMC10957482] [DOI: 10.1038/s42256-024-00807-9]
Abstract
Foundation models in deep learning are characterized by a single large-scale model trained on vast amounts of data serving as the foundation for various downstream tasks. Foundation models are generally trained using self-supervised learning and excel in reducing the demand for training samples in downstream applications. This is especially important in medicine, where large labelled datasets are often scarce. Here, we developed a foundation model for cancer imaging biomarker discovery by training a convolutional encoder through self-supervised learning using a comprehensive dataset of 11,467 radiographic lesions. The foundation model was evaluated in distinct and clinically relevant applications of cancer imaging-based biomarkers. We found that it facilitated better and more efficient learning of imaging biomarkers and yielded task-specific models that significantly outperformed conventional supervised and other state-of-the-art pretrained implementations on downstream tasks, especially when training dataset sizes were very limited. Furthermore, the foundation model was more stable to input variations and showed strong associations with underlying biology. Our results demonstrate the tremendous potential of foundation models in discovering new imaging biomarkers that may extend to other clinical use cases and can accelerate the widespread translation of imaging biomarkers into clinical settings.
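The abstract does not spell out the pretraining objective, so as a generic illustration only, here is a minimal SimCLR-style NT-Xent contrastive loss of the kind commonly used to pretrain such self-supervised encoders (PyTorch assumed; this is not the authors' implementation).

import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.1):
    """z1, z2: (N, D) embeddings of two augmented views of the same lesions."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D)
    sim = z @ z.t() / temperature                        # scaled cosine sims
    sim.masked_fill_(torch.eye(len(z), dtype=torch.bool), float('-inf'))
    n = z1.shape[0]                                      # positives sit n apart
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])
    return F.cross_entropy(sim, targets)

loss = nt_xent_loss(torch.randn(8, 128), torch.randn(8, 128))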
Affiliation(s)
- Suraj Pai: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands; Department of Radiation Oncology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Dennis Bontempi: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands; Department of Radiation Oncology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Ibrahim Hadzic: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands; Department of Radiation Oncology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Vasco Prudente: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands; Department of Radiation Oncology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Mateo Sokač: Department of Molecular Medicine, Aarhus University Hospital, Aarhus, Denmark; Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Tafadzwa L. Chaunzwa: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Simon Bernatz: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Ahmed Hosny: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Raymond H. Mak: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands
- Nicolai J. Birkbak: Department of Molecular Medicine, Aarhus University Hospital, Aarhus, Denmark; Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Hugo J. W. L. Aerts: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands; Department of Radiation Oncology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA; Department of Radiology, Brigham and Women’s Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
27
Guo H, Somayajula SA, Hosseini R, Xie P. Improving image classification of gastrointestinal endoscopy using curriculum self-supervised learning. Sci Rep 2024; 14:6100. [PMID: 38480815] [PMCID: PMC10937990] [DOI: 10.1038/s41598-024-53955-8]
Abstract
Endoscopy, a widely used medical procedure for examining the gastrointestinal (GI) tract to detect potential disorders, poses challenges for manual diagnosis due to non-specific symptoms and difficulty in accessing affected areas. While supervised machine learning models have proven effective in assisting clinical diagnosis of GI disorders, the scarcity of expert-annotated image-label pairs limits their use. To address these limitations, we propose a curriculum self-supervised learning framework inspired by human curriculum learning. Our approach leverages the HyperKvasir dataset, which comprises 100k unlabeled GI images for pre-training and 10k labeled GI images for fine-tuning. By adopting our proposed method, we achieved an impressive top-1 accuracy of 88.92% and an F1 score of 73.39%, a 2.1% increase over vanilla SimSiam in top-1 accuracy and a 1.9% increase in F1 score. The combination of self-supervised learning and a curriculum-based approach demonstrates the efficacy of our framework in advancing the diagnosis of GI disorders. Our study highlights the potential of curriculum self-supervised learning to exploit unlabeled GI tract images and improve the diagnosis of GI disorders, paving the way for more accurate and efficient diagnosis in GI endoscopy.
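A minimal sketch of the SimSiam objective the abstract uses as its baseline, plus a placeholder for the curriculum ordering of unlabeled images; the difficulty measure is an assumption, and this does not reproduce the paper's framework.

import torch
import torch.nn.functional as F

def simsiam_loss(p1, p2, z1, z2):
    """p*: predictor outputs; z*: encoder outputs of two augmented views."""
    def neg_cos(p, z):
        # stop-gradient on the target branch is the core SimSiam trick
        return -F.cosine_similarity(p, z.detach(), dim=1).mean()
    return 0.5 * neg_cos(p1, z2) + 0.5 * neg_cos(p2, z1)

def curriculum_order(unlabeled_images, difficulty_fn):
    """Present easier samples first; difficulty_fn is a hypothetical scorer."""
    return sorted(unlabeled_images, key=difficulty_fn)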
Affiliation(s)
- Han Guo: Department of Electrical and Computer Engineering, University of California, San Diego, San Diego, 92093, USA
- Sai Ashish Somayajula: Department of Electrical and Computer Engineering, University of California, San Diego, San Diego, 92093, USA
- Ramtin Hosseini: Department of Electrical and Computer Engineering, University of California, San Diego, San Diego, 92093, USA
- Pengtao Xie: Department of Electrical and Computer Engineering, University of California, San Diego, San Diego, 92093, USA
28
Gholami S, Scheppke L, Kshirsagar M, Wu Y, Dodhia R, Bonelli R, Leung I, Sallo FB, Muldrew A, Jamison C, Peto T, Lavista Ferres J, Weeks WB, Friedlander M, Lee AY. Self-Supervised Learning for Improved Optical Coherence Tomography Detection of Macular Telangiectasia Type 2. JAMA Ophthalmol 2024; 142:226-233. [PMID: 38329740] [PMCID: PMC10853868] [DOI: 10.1001/jamaophthalmol.2023.6454]
Abstract
Importance Deep learning image analysis often depends on large, labeled datasets, which are difficult to obtain for rare diseases. Objective To develop a self-supervised approach for automated classification of macular telangiectasia type 2 (MacTel) on optical coherence tomography (OCT) with limited labeled data. Design, Setting, and Participants This was a retrospective comparative study. OCT images were collected by the Lowy Medical Research Institute, La Jolla, California (May 2014 to May 2019), and the University of Washington, Seattle (January 2016 to October 2022). Clinical diagnoses of patients with and without MacTel were confirmed by retina specialists. Data were analyzed from January to September 2023. Exposures Two convolutional neural networks were pretrained using the Bootstrap Your Own Latent algorithm on unlabeled training data and fine-tuned with labeled training data to predict MacTel (self-supervised method). ResNet18 and ResNet50 models were also trained using all labeled data (supervised method). Main Outcomes and Measures The ground truth yes vs no MacTel diagnosis was determined by retina specialists based on spectral-domain OCT. The models' predictions were compared against human graders using accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), area under the precision-recall curve (AUPRC), and area under the receiver operating characteristic curve (AUROC). Uniform manifold approximation and projection was performed for dimension reduction, and GradCAM visualizations were generated for the supervised and self-supervised methods. Results A total of 2636 OCT scans from 780 patients with MacTel and 131 patients without MacTel were included from the MacTel Project (mean [SD] age, 60.8 [11.7] years; 63.8% female), and another 2564 from 1769 patients without MacTel from the University of Washington (mean [SD] age, 61.2 [18.1] years; 53.4% female). The self-supervised approach fine-tuned on 100% of the labeled training data with ResNet50 as the feature extractor performed best, achieving an AUPRC of 0.971 (95% CI, 0.969-0.972), an AUROC of 0.970 (95% CI, 0.970-0.973), accuracy of 0.898, sensitivity of 0.898, specificity of 0.949, PPV of 0.935, and NPV of 0.919. With only 419 OCT volumes (185 MacTel patients, 10% of the labeled training dataset), the ResNet18 self-supervised model achieved comparable performance, with an AUPRC of 0.958 (95% CI, 0.957-0.960), an AUROC of 0.966 (95% CI, 0.964-0.967), and accuracy, sensitivity, specificity, PPV, and NPV of 0.902, 0.884, 0.916, 0.896, and 0.906, respectively. The self-supervised models showed better agreement with the more experienced human expert graders. Conclusions and Relevance The findings suggest that self-supervised learning may improve the accuracy of automated MacTel vs non-MacTel binary classification on OCT with limited labeled training data, and these approaches may be applicable to other rare diseases, although further research is warranted.
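For readers unfamiliar with Bootstrap Your Own Latent (BYOL), the pretraining algorithm named above, this sketch shows its characteristic ingredient: a target network maintained as an exponential moving average (EMA) of the online network, so no negative pairs are needed. PyTorch is assumed, and the momentum value is a common default rather than a figure from the paper.

import copy
import torch

@torch.no_grad()
def ema_update(online_net, target_net, momentum=0.996):
    """Move each target parameter a small step toward its online twin."""
    for p_online, p_target in zip(online_net.parameters(),
                                  target_net.parameters()):
        p_target.mul_(momentum).add_(p_online, alpha=1.0 - momentum)

online = torch.nn.Linear(4, 4)
target = copy.deepcopy(online)  # the target starts as a copy of the online net
ema_update(online, target)      # called once per training step in BYOL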
Affiliation(s)
- Lea Scheppke: The Lowy Medical Research Institute, La Jolla, California
- Yue Wu: Department of Ophthalmology, University of Washington, Seattle; Roger and Angie Karalis Johnson Retina Center, Seattle, Washington
- Rahul Dodhia: AI for Good Lab, Microsoft Research, Redmond, Washington
- Irene Leung: Moorfields Eye Hospital, London, United Kingdom
- Ferenc B. Sallo: Hôpital Ophtalmique Jules-Gonin, Fondation Asile des Aveugles, University of Lausanne, Lausanne, Switzerland
- Tunde Peto: Queen’s University Belfast, Belfast, Northern Ireland
- Martin Friedlander: The Lowy Medical Research Institute, La Jolla, California; The Scripps Research Institute, La Jolla, California
- Aaron Y. Lee: Department of Ophthalmology, University of Washington, Seattle; Roger and Angie Karalis Johnson Retina Center, Seattle, Washington
29
Alkojak Almansi A, Sugarova S, Alsanosi A, Almuhawas F, Hofmeyr L, Wagner F, Kedves E, Sriperumbudur K, Dhanasingh A, Kedves A. A novel radiological software prototype for automatically detecting the inner ear and classifying normal from malformed anatomy. Comput Biol Med 2024; 171:108168. [PMID: 38432006] [DOI: 10.1016/j.compbiomed.2024.108168]
Abstract
BACKGROUND To develop an effective radiological software prototype that could read Digital Imaging and Communications in Medicine (DICOM) files, automatically crop the inner ear based on head computed tomography (CT), and classify normal anatomy versus inner ear malformation (IEM). METHODS A retrospective analysis was conducted on 2053 patients from 3 hospitals. We extracted 1200 inner ear CTs for importing and cropping, and for training, testing, and validating an artificial intelligence (AI) model. Automated cropping algorithms were developed to precisely isolate the inner ear volume, and a simple graphical user interface (GUI) was implemented for user interaction. Using cropped CTs as input, a deep learning convolutional neural network (DL CNN) with 5-fold cross-validation was used to classify inner ear anatomy as normal or abnormal. Five specific IEM types (cochlear hypoplasia, ossification, incomplete partition types I and III, and common cavity) were included, with data equally distributed between classes. Both the cropping tool and the AI model were extensively validated. RESULTS The newly developed DICOM viewer/software successfully achieved its objectives: reading CT files, automatically cropping inner ear volumes, and classifying them as normal or malformed. The cropping tool demonstrated an average accuracy of 92.25%. The DL CNN model achieved an area under the curve (AUC) of 0.86 (95% confidence interval: 0.81-0.91). Performance metrics for the AI model were: accuracy (0.812), precision (0.791), recall (0.8), and F1-score (0.766). CONCLUSION This study developed and validated a fully automated workflow for classifying normal versus abnormal inner ear anatomy using a combination of advanced image processing and deep learning techniques. The tool exhibited good diagnostic accuracy, suggesting its potential application in risk stratification. However, supervision by qualified medical professionals remains crucial when using this tool for clinical decision-making.
Affiliation(s)
- Abdulrahman Alkojak Almansi: University of Pecs, Faculty of Engineering and Information Technology, Institute of Information and Electrical Technology, Pecs, Hungary
- Sima Sugarova: St. Petersburg ENT and Speech Research Institute, St. Petersburg, Russia
- Abdulrahman Alsanosi: King Saud University, King Abdullah Ear Specialist Center (KAESC), Department of Otolaryngology, Riyadh, Saudi Arabia
- Fida Almuhawas: King Saud University, King Abdullah Ear Specialist Center (KAESC), Department of Otolaryngology, Riyadh, Saudi Arabia
- Louis Hofmeyr: Division of Otorhinolaryngology, Stellenbosch University, Stellenbosch, South Africa
- Franca Wagner: University Hospital Bern, University Institute for Diagnostic and Interventional Neuroradiology, Switzerland
- Emerencia Kedves: University of Sopron, Doctoral School of Wood Sciences and Technologies, Sopron, Hungary
- Kiran Sriperumbudur: MED-EL Medical Electronics GmbH, Department of Research and Development, Innsbruck, Austria
- Anandhan Dhanasingh: MED-EL Medical Electronics GmbH, Department of Research and Development, Innsbruck, Austria
- Andras Kedves: MED-EL Medical Electronics GmbH, Department of Research and Development, Innsbruck, Austria; University of Pecs, Faculty of Engineering and Information Technology, Institute of Information and Electrical Technology, Pecs, Hungary
30
Chen L, Zhang Z. The self-distillation trained multitask dense-attention network for diagnosing lung cancers based on CT scans. Med Phys 2024; 51:1738-1753. [PMID: 37715993] [DOI: 10.1002/mp.16736]
Abstract
BACKGROUND The latest international multidisciplinary histopathological classification of lung cancer indicates that deeper study of lung adenocarcinoma requires a comprehensive multidisciplinary platform. However, neither traditional pathological examination nor previous computer-vision-based research considers the entire lung in a comprehensive manner. PURPOSE The study aims to develop a deep learning model for diagnosing lung adenocarcinoma histopathologically based on CT scans. Instead of merely classifying lung adenocarcinoma, the pathological report should be inferred from both the invasiveness and the growth pattern of the tumors. METHODS A self-distillation trained multitask dense-attention network (SD-MdaNet) is proposed and validated based on 2412 labeled CT scans from 476 patients and 845 unlabeled scans. Inferring the pathological report is divided into two tasks: predicting the invasiveness of the lung tumor, and inferring growth patterns of tumor cells in a comprehensive histopathological subtyping manner with excellent accuracy. In the proposed method, a dense-attention module is introduced to better extract features from a small dataset in the main branch of the MdaNet. Next, task-specific attention modules are utilized in different branches and finally integrated as a multitask model. Because the second task blends classification and regression, a specialized loss function is developed. In the proposed knowledge distillation (KD) process, the MdaNet as well as its main branch, trained for solving the two single tasks respectively, are treated as multiple teachers to produce a student model. A novel KD loss function is developed to take advantage of all the models as well as of both labeled and unlabeled data. RESULTS SD-MdaNet achieves an AUC of 98.7 ± 0.4% on invasiveness prediction and 91.6 ± 1.0% on predominant growth pattern prediction on our dataset. Moreover, the average mean squared error in inferring growth pattern proportion reaches 0.0217 ± 0.0019, and the AUC for predominant growth pattern proportion reaches 91.6 ± 1.0%. The proposed SD-MdaNet is significantly better than all other benchmarking methods (FDR < 0.05). CONCLUSIONS Experimental results demonstrate that the proposed SD-MdaNet can significantly improve the performance of lung adenocarcinoma pathological diagnosis using only CT scans. Analyses and discussions are conducted to interpret the advantages of the SD-MdaNet.
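As a rough illustration of the multi-teacher knowledge distillation idea described, a student matching softened predictions from several teachers on unlabeled scans and the ground truth on labeled ones, here is a generic sketch in PyTorch; the temperature, weighting, and structure of the paper's specialized KD loss are not reproduced.

import torch
import torch.nn.functional as F

def multi_teacher_kd_loss(student_logits, teacher_logits_list,
                          labels=None, T=2.0, alpha=0.5):
    """Average KL to each teacher's softened output; add CE when labels exist."""
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    kd = sum(F.kl_div(log_p_student, F.softmax(t / T, dim=1),
                      reduction='batchmean')
             for t in teacher_logits_list) / len(teacher_logits_list)
    kd = kd * (T * T)       # conventional temperature rescaling
    if labels is None:      # unlabeled scans contribute distillation only
        return kd
    return alpha * kd + (1.0 - alpha) * F.cross_entropy(student_logits, labels)

# toy check: a student and two hypothetical teachers on a 4-class problem
s = torch.randn(8, 4); t1 = torch.randn(8, 4); t2 = torch.randn(8, 4)
print(multi_teacher_kd_loss(s, [t1, t2]))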
Affiliation(s)
- Liuyin Chen: School of Data Science, City University of Hong Kong, Hong Kong SAR, China
- Zijun Zhang: School of Data Science, City University of Hong Kong, Hong Kong SAR, China
31
Fu H, Novak A, Robert D, Kumar S, Tanamala S, Oke J, Bhatia K, Shah R, Romsauerova A, Das T, Espinosa A, Grzeda MT, Narbone M, Dharmadhikari R, Harrison M, Vimalesvaran K, Gooch J, Woznitza N, Salik N, Campbell A, Khan F, Lowe DJ, Shuaib H, Ather S. AI assisted reader evaluation in acute CT head interpretation (AI-REACT): protocol for a multireader multicase study. BMJ Open 2024; 14:e079824. [PMID: 38346874] [PMCID: PMC10862304] [DOI: 10.1136/bmjopen-2023-079824]
Abstract
INTRODUCTION A non-contrast CT head scan (NCCTH) is the most common cross-sectional imaging investigation requested in the emergency department. Advances in computer vision have led to the development of several artificial intelligence (AI) tools to detect abnormalities on NCCTH. These tools are intended to provide clinical decision support for clinicians, rather than stand-alone diagnostic devices. However, validation studies mostly compare AI performance against radiologists, and there is a relative paucity of evidence on the impact of AI assistance on other healthcare staff who review NCCTH in their daily clinical practice. METHODS AND ANALYSIS A retrospective data set of 150 NCCTH will be compiled, to include 60 control cases and 90 cases with intracranial haemorrhage, hypodensities suggestive of infarct, midline shift, mass effect or skull fracture. The intracranial haemorrhage cases will be subclassified into extradural, subdural, subarachnoid, intraparenchymal and intraventricular. 30 readers will be recruited across four National Health Service (NHS) trusts, including 10 general radiologists, 15 emergency medicine clinicians and 5 CT radiographers of varying experience. Readers will interpret each scan first without, then with, the assistance of the qER EU 2.0 AI tool, with an intervening 2-week washout period. With a panel of neuroradiologists providing the ground truth, the stand-alone performance of qER will be assessed, and its impact on the readers' performance will be analysed as change in accuracy (area under the curve), median review time per scan and self-reported diagnostic confidence. Subgroup analyses will be performed by reader professional group, reader seniority, pathological finding, and neuroradiologist-rated difficulty. ETHICS AND DISSEMINATION The study has been approved by the UK Healthcare Research Authority (IRAS 310995, approved 13 December 2022). The use of anonymised retrospective NCCTH has been authorised by Oxford University Hospitals. The results will be presented at relevant conferences and published in a peer-reviewed journal. TRIAL REGISTRATION NUMBER NCT06018545.
Affiliation(s)
- Howell Fu: Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Alex Novak: Emergency Medicine Research Oxford, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Jason Oke: Department of Primary Care Health Sciences, University of Oxford, Oxford, UK
- Kanika Bhatia: Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Ruchir Shah: Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Tilak Das: Department of Clinical Radiology, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Abdalá Espinosa: Emergency Medicine Research Oxford, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Mark Harrison: Emergency Department, Northumbria Specialist Emergency Care Hospital, Cramlington, UK
- Kavitha Vimalesvaran: Clinical Scientific Computing, Guy's and St Thomas' NHS Foundation Trust, London, UK
- Jane Gooch: College of Health, Psychology & Social Care, University of Derby, Derby, UK
- Nicholas Woznitza: Radiology Department, University College London Hospitals NHS Foundation Trust, London, UK; School of Allied and Public Health Professions, Canterbury Christ Church University, Canterbury, UK
- Alan Campbell: Radiology Department, University College London Hospitals NHS Foundation Trust, London, UK
- Farhaan Khan: Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Haris Shuaib: Clinical Scientific Computing, Guy's and St Thomas' NHS Foundation Trust, London, UK
- Sarim Ather: Oxford University Hospitals NHS Foundation Trust, Oxford, UK
32
Shen Y, Li H, Sun C, Ji H, Zhang D, Hu K, Tang Y, Chen Y, Wei Z, Lv J. Optimizing skin disease diagnosis: harnessing online community data with contrastive learning and clustering techniques. NPJ Digit Med 2024; 7:28. [PMID: 38332257] [PMCID: PMC10853166] [DOI: 10.1038/s41746-024-01014-x]
Abstract
Skin diseases pose significant challenges in China. Internet health forums offer a platform for millions of users to discuss skin diseases and share images for early intervention, leaving a large volume of valuable dermatology images. However, data quality and annotation challenges limit the potential of these resources for developing diagnostic models. In this study, we proposed a deep-learning model that utilized unannotated dermatology images from diverse online sources. We adopted a contrastive learning approach to learn general representations from unlabeled images and fine-tuned the model on coarsely annotated images from Internet forums. Our model classified 22 common skin diseases. To improve annotation quality, we used a clustering method with a small set of standardized validation images. We tested the model on images collected by 33 experienced dermatologists from 15 tertiary hospitals and achieved a 45.05% top-1 accuracy, outperforming the published baseline model by 3%. Accuracy increased with additional validation images, reaching 49.64% with 50 images per category. Our model also demonstrated transferability to new tasks, such as detecting monkeypox, reaching a 61.76% top-1 accuracy with only 50 additional images in the training process. We also tested our model on benchmark datasets to demonstrate its generalization ability. Our findings highlight the potential of unannotated images from online forums for future dermatology applications and demonstrate the effectiveness of our model for early diagnosis and potential outbreak mitigation.
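A hypothetical sketch of the clustering-based relabeling step described above: forum images are embedded with the pretrained encoder, clustered, and each cluster takes the majority label of the standardized validation images assigned to it. Scikit-learn k-means is assumed, and the paper's exact procedure may differ.

import numpy as np
from sklearn.cluster import KMeans

def relabel_by_clusters(forum_embeddings, val_embeddings, val_labels, k=22):
    """Majority-vote cluster labels from a few trusted validation images."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(forum_embeddings)
    val_clusters = km.predict(val_embeddings)
    val_labels = np.asarray(val_labels)
    cluster_label = {}
    for c in range(k):
        members = val_labels[val_clusters == c]
        if members.size:  # vote of the validation anchors in cluster c
            cluster_label[c] = int(np.bincount(members).argmax())
    # forum images inherit their cluster's voted label (None if no anchor)
    return [cluster_label.get(int(c)) for c in km.labels_]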
Affiliation(s)
- Yue Shen: Simulation of Complex Systems Lab, Department of Human and Engineered Environmental Studies, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
- Huanyu Li: Shanghai Beforteen AI Lab, Shanghai, China
- Can Sun: Institution of Aix-Marseille, Wuhan University of Technology (WHUT), Wuhan, China
- Hongtao Ji: Shanghai Business School, No. 6333 Oriental Meigu Avenue, Shanghai, China
- Daojun Zhang: The Third Affiliated Hospital of CQMU, Chongqing, China
- Kun Hu: Shanghai Beforteen AI Lab, Shanghai, China
- Yiqi Tang: Shanghai Beforteen AI Lab, Shanghai, China
- Yu Chen: Simulation of Complex Systems Lab, Department of Human and Engineered Environmental Studies, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
- Zikun Wei: Shanghai Beforteen AI Lab, Shanghai, China
- Junwei Lv: Shanghai Beforteen AI Lab, Shanghai, China
33
Ma D, Stocks J, Rosen H, Kantarci K, Lockhart SN, Bateman JR, Craft S, Gurcan MN, Popuri K, Beg MF, Wang L. Differential diagnosis of frontotemporal dementia subtypes with explainable deep learning on structural MRI. Front Neurosci 2024; 18:1331677. [PMID: 38384484] [PMCID: PMC10879283] [DOI: 10.3389/fnins.2024.1331677]
Abstract
Background Frontotemporal dementia (FTD) represents a collection of neurobehavioral and neurocognitive syndromes that are associated with a significant degree of clinical, pathological, and genetic heterogeneity. Such heterogeneity hinders the identification of effective biomarkers, preventing effective targeted recruitment of participants in clinical trials for developing potential interventions and treatments. In the present study, we aim to automatically differentiate patients with three clinical phenotypes of FTD, behavioral-variant FTD (bvFTD), semantic variant primary progressive aphasia (svPPA), and nonfluent variant PPA (nfvPPA), based on their structural MRI by training a deep neural network (DNN). Methods Data were collected from 277 FTD patients (173 bvFTD, 63 nfvPPA, and 41 svPPA) recruited from two multi-site neuroimaging datasets: the Frontotemporal Lobar Degeneration Neuroimaging Initiative and the ARTFL-LEFFTDS Longitudinal Frontotemporal Lobar Degeneration databases. Raw T1-weighted MRI data were preprocessed and parcellated into patch-based ROIs, with cortical thickness and volume features extracted and harmonized to control for the confounding effects of sex, age, total intracranial volume, cohort, and scanner differences. A multi-type parallel feature embedding framework was trained to classify the three FTD subtypes, with a weighted cross-entropy loss function used to account for unbalanced sample sizes. Feature visualization was achieved through post hoc analysis using an integrated gradient approach. Results The proposed differential diagnosis framework achieved a mean balanced accuracy of 0.80 for bvFTD, 0.82 for nfvPPA, 0.89 for svPPA, and an overall balanced accuracy of 0.84. Feature importance maps showed more localized differential patterns among the FTD subtypes than groupwise statistical mapping. Conclusion In this study, we demonstrated the efficiency and effectiveness of using an explainable deep-learning-based parallel feature embedding and visualization framework on MRI-derived multi-type structural patterns to differentiate three clinically defined subphenotypes of FTD: bvFTD, nfvPPA, and svPPA. This could help with the identification of at-risk populations for early and precise diagnosis and intervention planning.
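The weighted cross-entropy mentioned for the unbalanced subtype counts can be sketched in a few lines of PyTorch; the inverse-frequency normalization shown here is a common convention and an assumption, not necessarily the authors' exact weighting.

import torch
import torch.nn as nn

counts = torch.tensor([173.0, 63.0, 41.0])       # bvFTD, nfvPPA, svPPA
weights = counts.sum() / (len(counts) * counts)  # inverse-frequency weighting
criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(8, 3)           # toy batch of subtype scores
labels = torch.randint(0, 3, (8,))
loss = criterion(logits, labels)     # errors on rare svPPA cost the most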
Affiliation(s)
- Da Ma: Department of Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, NC, United States
- Jane Stocks: Department of Psychiatry and Behavioral Health, Northwestern University Feinberg School of Medicine, Chicago, IL, United States
- Howard Rosen: Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, United States
- Kejal Kantarci: Department of Radiology, Mayo Clinic, Rochester, MN, United States
- Samuel N. Lockhart: Department of Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, NC, United States
- James R. Bateman: Department of Neurology, Wake Forest University School of Medicine, Winston-Salem, NC, United States
- Suzanne Craft: Department of Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, NC, United States
- Metin N. Gurcan: Department of Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, NC, United States
- Karteek Popuri: Department of Computer Science, Memorial University of Newfoundland, St. John's, NL, Canada
- Mirza Faisal Beg: School of Engineering Science, Simon Fraser University, Burnaby, BC, Canada
- Lei Wang: Department of Psychiatry and Behavioral Health, Northwestern University Feinberg School of Medicine, Chicago, IL, United States; Department of Psychiatry and Behavioral Health, Ohio State University Wexner Medical Center, Columbus, OH, United States
34
Mohammad-Rahimi H, Dianat O, Abbasi R, Zahedrozegar S, Ashkan A, Motamedian SR, Rohban MH, Nosrat A. Artificial Intelligence for Detection of External Cervical Resorption Using Label-Efficient Self-Supervised Learning Method. J Endod 2024; 50:144-153.e2. [PMID: 37977219] [DOI: 10.1016/j.joen.2023.11.004]
Abstract
INTRODUCTION The aim of this study was to leverage label-efficient self-supervised learning (SSL) to train a model that can detect external cervical resorption (ECR) and differentiate it from caries. METHODS Periapical (PA) radiographs of teeth with ECR defects were collected. Two board-certified endodontists reviewed the PA radiographs and cone beam computed tomographic (CBCT) images independently to determine the presence of ECR (ground truth). Radiographic data were divided into 3 regions of interest (ROIs): healthy teeth, teeth with ECR, and teeth with caries. Nine contrastive SSL models (SimCLR v2, MoCo v2, BYOL, DINO, NNCLR, SwAV, MSN, Barlow Twins, and SimSiam) were assessed alongside 7 baseline deep learning models (ResNet-18, ResNet-50, VGG16, DenseNet, MobileNetV2, ResNeXt-50, and InceptionV3). A 10-fold cross-validation strategy and a hold-out test set were employed for model evaluation. Model performance was assessed via various metrics including classification accuracy, precision, recall, and F1-score. RESULTS Included were 190 PA radiographs, comprising 470 ROIs. Results from 10-fold cross-validation demonstrated that most SSL models outperformed the transfer learning baseline models, with DINO achieving the highest mean accuracy (85.64 ± 4.56), significantly outperforming 13 other models (P < .05). DINO reached the highest test set (ie, 3 ROIs) accuracy (84.09%), while MoCo v2 exhibited the highest recall and F1-score (77.37% and 82.93%, respectively). CONCLUSIONS This study showed that AI can assist clinicians in detecting ECR and differentiating it from caries. Additionally, it introduced the application of SSL to ECR detection, emphasizing that SSL-based models can outperform transfer learning baselines and reduce reliance on large, labeled datasets.
Affiliation(s)
- Hossein Mohammad-Rahimi: Topic Group Dental Diagnostics and Digital Dentistry, ITU/WHO Focus Group AI on Health, Berlin, Germany; Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
- Omid Dianat: Division of Endodontics, Department of Advanced Oral Sciences and Therapeutics, University of Maryland School of Dentistry, Baltimore, Maryland; Private Practice, Centreville Endodontics, Centreville, Virginia
- Reza Abbasi: Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
- Samira Zahedrozegar: Dentofacial Deformities Research Center, Research Institute of Dental Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Ali Ashkan: Department of Orthodontics, School of Dentistry, Hamadan University of Medical Sciences, Hamadan, Iran
- Saeed Reza Motamedian: Topic Group Dental Diagnostics and Digital Dentistry, ITU/WHO Focus Group AI on Health, Berlin, Germany; Dentofacial Deformities Research Center, Research Institute of Dental Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran; Department of Orthodontics, School of Dentistry, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Ali Nosrat: Division of Endodontics, Department of Advanced Oral Sciences and Therapeutics, University of Maryland School of Dentistry, Baltimore, Maryland; Private Practice, Centreville Endodontics, Centreville, Virginia
35
Azad R, Kazerouni A, Heidari M, Aghdam EK, Molaei A, Jia Y, Jose A, Roy R, Merhof D. Advances in medical image analysis with vision Transformers: A comprehensive review. Med Image Anal 2024; 91:103000. [PMID: 37883822] [DOI: 10.1016/j.media.2023.103000]
Abstract
The remarkable performance of the Transformer architecture in natural language processing has recently also triggered broad interest in computer vision. Among other merits, Transformers have been shown to be capable of learning long-range dependencies and spatial correlations, a clear advantage over convolutional neural networks (CNNs), which have been the de facto standard in computer vision so far. Thus, Transformers have become an integral part of modern medical image analysis. In this review, we provide an encyclopedic overview of the applications of Transformers in medical imaging. Specifically, we present a systematic and thorough review of recent Transformer literature for different medical image analysis tasks, including classification, segmentation, detection, registration, synthesis, and clinical report generation. For each of these applications, we investigate the novelty, strengths, and weaknesses of the different proposed strategies and develop taxonomies highlighting key properties and contributions. Further, where applicable, we outline current benchmarks on different datasets. Finally, we summarize key challenges and discuss different future research directions. In addition, we have provided the cited papers with their corresponding implementations at https://github.com/mindflow-institue/Awesome-Transformer.
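As background for the review, a minimal Vision Transformer front end in PyTorch: the image is split into fixed-size patches, each linearly embedded, and the token sequence is processed by a standard Transformer encoder whose self-attention can relate any two patches regardless of distance. Dimensions are illustrative only.

import torch
import torch.nn as nn

patch, dim = 16, 256
to_patches = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)  # patchify + embed
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True),
    num_layers=4)

x = torch.randn(2, 3, 224, 224)                    # a toy image batch
tokens = to_patches(x).flatten(2).transpose(1, 2)  # (2, 196 patches, 256)
features = encoder(tokens)                         # every patch attends to all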
Affiliation(s)
- Reza Azad: Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Aachen, Germany
- Amirhossein Kazerouni: School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
- Moein Heidari: School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
- Amirali Molaei: School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
- Yiwei Jia: Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Aachen, Germany
- Abin Jose: Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Aachen, Germany
- Rijo Roy: Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Aachen, Germany
- Dorit Merhof: Faculty of Informatics and Data Science, University of Regensburg, Regensburg, Germany; Fraunhofer Institute for Digital Medicine MEVIS, Bremen, Germany
36
Wolf D, Payer T, Lisson CS, Lisson CG, Beer M, Götz M, Ropinski T. Self-supervised pre-training with contrastive and masked autoencoder methods for dealing with small datasets in deep learning for medical imaging. Sci Rep 2023; 13:20260. [PMID: 37985685] [PMCID: PMC10662445] [DOI: 10.1038/s41598-023-46433-0]
Abstract
Deep learning in medical imaging has the potential to minimize the risk of diagnostic errors, reduce radiologist workload, and accelerate diagnosis. Training such deep learning models requires large and accurate datasets, with annotations for all training samples. However, in the medical imaging domain, annotated datasets for specific tasks are often small due to the high complexity of annotations, limited access, or the rarity of diseases. To address this challenge, deep learning models can be pre-trained on large image datasets without annotations using methods from the field of self-supervised learning. After pre-training, small annotated datasets are sufficient to fine-tune the models for a specific task. The most popular self-supervised pre-training approaches in medical imaging are based on contrastive learning. However, recent studies in natural image processing indicate a strong potential for masked autoencoder approaches. Our work compares state-of-the-art contrastive learning methods with the recently introduced masked autoencoder approach "SparK" for convolutional neural networks (CNNs) on medical images. To this end, we pre-train on a large unannotated CT image dataset and fine-tune on several CT classification tasks. Due to the challenge of obtaining sufficient annotated training data in medical imaging, it is of particular interest to evaluate how the self-supervised pre-training methods perform when fine-tuning on small datasets. By experimenting with gradually reducing the training dataset size for fine-tuning, we find that the reduction has different effects depending on the type of pre-training chosen. The SparK pre-training method is more robust to the training dataset size than the contrastive methods. Based on our results, we propose the SparK pre-training for medical imaging tasks with only small annotated datasets.
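A toy masked-image-modeling objective of the general kind SparK adapts to CNNs, shown only for intuition: random patches are hidden and the network is scored on reconstructing the masked region. PyTorch is assumed; SparK's sparse-convolution machinery is not reproduced here.

import torch
import torch.nn.functional as F

def masked_reconstruction_loss(model, images, patch=16, mask_ratio=0.6):
    """Hide random patches; score reconstruction on the hidden region only."""
    B, C, H, W = images.shape
    keep = torch.rand(B, 1, H // patch, W // patch) > mask_ratio
    mask = keep.repeat_interleave(patch, dim=2).repeat_interleave(patch, dim=3)
    recon = model(images * mask)          # the network sees masked input
    hidden = (~mask).float()
    err = F.mse_loss(recon, images, reduction='none')
    return (err * hidden).sum() / (hidden.sum() * C + 1e-8)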
Affiliation(s)
- Daniel Wolf: Visual Computing Research Group, Institute of Media Informatics, Ulm University, Ulm, Germany; Experimental Radiology Research Group, Department for Diagnostic and Interventional Radiology, Ulm University Medical Center, Ulm, Germany
- Tristan Payer: Visual Computing Research Group, Institute of Media Informatics, Ulm University, Ulm, Germany
- Catharina Silvia Lisson: Experimental Radiology Research Group, Department for Diagnostic and Interventional Radiology, Ulm University Medical Center, Ulm, Germany
- Christoph Gerhard Lisson: Experimental Radiology Research Group, Department for Diagnostic and Interventional Radiology, Ulm University Medical Center, Ulm, Germany
- Meinrad Beer: Experimental Radiology Research Group, Department for Diagnostic and Interventional Radiology, Ulm University Medical Center, Ulm, Germany
- Michael Götz: Experimental Radiology Research Group, Department for Diagnostic and Interventional Radiology, Ulm University Medical Center, Ulm, Germany
- Timo Ropinski: Visual Computing Research Group, Institute of Media Informatics, Ulm University, Ulm, Germany
37
Feng R, Deb B, Ganesan P, Tjong FVY, Rogers AJ, Ruipérez-Campillo S, Somani S, Clopton P, Baykaner T, Rodrigo M, Zou J, Haddad F, Zahari M, Narayan SM. Segmenting computed tomograms for cardiac ablation using machine learning leveraged by domain knowledge encoding. Front Cardiovasc Med 2023; 10:1189293. [PMID: 37849936] [PMCID: PMC10577270] [DOI: 10.3389/fcvm.2023.1189293]
Abstract
Background Segmentation of computed tomography (CT) is important for many clinical procedures, including personalized cardiac ablation for the management of cardiac arrhythmias. While segmentation can be automated by machine learning (ML), it is limited by the need for large, labeled training data that may be difficult to obtain. We set out to combine ML of cardiac CT with domain knowledge, which reduces the need for large training datasets by encoding cardiac geometry, and tested this approach in independent datasets and in a prospective study of atrial fibrillation (AF) ablation. Methods We mathematically represented atrial anatomy with simple geometric shapes and derived a model to parse cardiac structures in a small set of N = 6 digital hearts. The model, termed "virtual dissection," was used to train ML to segment cardiac CT in N = 20 patients, then tested in independent datasets and in a prospective study. Results In independent test cohorts (N = 160) from 2 institutions with different CT scanners, atrial structures were accurately segmented with Dice scores of 96.7% in internal (IQR: 95.3%-97.7%) and 93.5% in external (IQR: 91.9%-94.7%) test data, with good agreement with experts (r = 0.99; p < 0.0001). In a prospective study of 42 patients at ablation, this approach reduced segmentation time by 85% (2.3 ± 0.8 vs. 15.0 ± 6.9 min, p < 0.0001), yet provided Dice scores similar to experts (93.9% (IQR: 93.0%-94.6%) vs. 94.4% (IQR: 92.8%-95.7%), p = NS). Conclusions Encoding cardiac geometry using mathematical models greatly accelerated the training of ML to segment CT, reducing the need for large training sets while retaining accuracy in independent test data. Combining ML with domain knowledge may have broad applications.
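The Dice scores quoted above follow the standard overlap definition, sketched here for binary label volumes (NumPy assumed; the example arrays are made up).

import numpy as np

def dice_score(pred, truth):
    """pred, truth: boolean arrays of identical shape."""
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    return 2.0 * intersection / denom if denom else 1.0

a = np.zeros((4, 4), bool); a[:2] = True   # 8 voxels
b = np.zeros((4, 4), bool); b[:3] = True   # 12 voxels, 8 overlapping
print(dice_score(a, b))                    # 2*8 / (8+12) = 0.8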
Affiliation(s)
- Ruibin Feng: Department of Medicine and Cardiovascular Institute, Stanford University, Stanford, CA, United States
- Brototo Deb: Department of Medicine and Cardiovascular Institute, Stanford University, Stanford, CA, United States
- Prasanth Ganesan: Department of Medicine and Cardiovascular Institute, Stanford University, Stanford, CA, United States
- Fleur V. Y. Tjong: Department of Medicine and Cardiovascular Institute, Stanford University, Stanford, CA, United States; Heart Center, Department of Clinical and Experimental Cardiology, Amsterdam UMC, University of Amsterdam, Amsterdam, Netherlands
- Albert J. Rogers: Department of Medicine and Cardiovascular Institute, Stanford University, Stanford, CA, United States
- Samuel Ruipérez-Campillo: Department of Medicine and Cardiovascular Institute, Stanford University, Stanford, CA, United States; Bioengineering Department, University of California, Berkeley, Berkeley, CA, United States
- Sulaiman Somani: Department of Medicine and Cardiovascular Institute, Stanford University, Stanford, CA, United States
- Paul Clopton: Department of Medicine and Cardiovascular Institute, Stanford University, Stanford, CA, United States
- Tina Baykaner: Department of Medicine and Cardiovascular Institute, Stanford University, Stanford, CA, United States
- Miguel Rodrigo: Department of Medicine and Cardiovascular Institute, Stanford University, Stanford, CA, United States; CoMMLab, Universitat Politècnica de València, Valencia, Spain
- James Zou: Department of Biomedical Data Science, Stanford University, Stanford, CA, United States
- Francois Haddad: Department of Medicine and Cardiovascular Institute, Stanford University, Stanford, CA, United States
- Matei Zahari: Department of Computer Science, Stanford University, Stanford, CA, United States
- Sanjiv M. Narayan: Department of Medicine and Cardiovascular Institute, Stanford University, Stanford, CA, United States
38
Langlotz CP. The Future of AI and Informatics in Radiology: 10 Predictions. Radiology 2023; 309:e231114. [PMID: 37874234] [PMCID: PMC10623186] [DOI: 10.1148/radiol.231114]
Affiliation(s)
- Curtis P. Langlotz: From the Departments of Radiology, Medicine, and Biomedical Data Science, Stanford University School of Medicine, 300 Pasteur Dr, Stanford, CA 94305
39
Khanal B, Bhattarai B, Khanal B, Linte CA. Improving Medical Image Classification in Noisy Labels Using only Self-supervised Pretraining. In: Data Engineering in Medical Imaging: First MICCAI Workshop, DEMI 2023, Held in Conjunction with MICCAI 2023, Vancouver, BC, Canada, October 8, 2023, Proceedings. 2023; 14314:78-90. [PMID: 39144367] [PMCID: PMC11321236] [DOI: 10.1007/978-3-031-44992-5_8]
Abstract
Noisy labels hurt deep learning-based supervised image classification performance, as models may overfit the noise and learn corrupted feature extractors. For natural image classification training with noisy labeled data, model initialization with contrastive self-supervised pretrained weights has been shown to reduce feature corruption and improve classification performance. However, no works have explored: i) how other self-supervised approaches, such as pretext task-based pretraining, impact learning with noisy labels, and ii) whether any self-supervised pretraining method alone helps for medical images in noisy label settings. Medical images often feature smaller datasets and subtle inter-class variations, requiring human expertise to ensure correct classification. Thus, it is not clear whether the methods improving learning with noisy labels in natural image datasets such as CIFAR would also help with medical images. In this work, we explore contrastive and pretext task-based self-supervised pretraining to initialize the weights of a deep learning classification model for two medical datasets with self-induced noisy labels: NCT-CRC-HE-100K tissue histological images and COVID-QU-Ex chest X-ray images. Our results show that models initialized with pretrained weights obtained from self-supervised learning can effectively learn better features and improve robustness against noisy labels.
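A minimal example of the pretext-task style of self-supervised pretraining the study examines, using the classic four-way rotation-prediction task as a stand-in; PyTorch is assumed, and the paper's exact pretext task and backbone are not specified here.

import torch
import torch.nn.functional as F

def rotation_pretext_batch(images):
    """Rotate each image by 0/90/180/270 degrees; the angle index is the label."""
    k = torch.randint(0, 4, (images.shape[0],))
    rotated = torch.stack([torch.rot90(img, int(r), dims=(1, 2))
                           for img, r in zip(images, k)])
    return rotated, k

def pretext_loss(backbone_with_4way_head, images):
    # backbone_with_4way_head is a hypothetical model ending in 4 logits
    rotated, k = rotation_pretext_batch(images)
    return F.cross_entropy(backbone_with_4way_head(rotated), k)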
Affiliation(s)
- Bidur Khanal: Center for Imaging Science, RIT, Rochester, NY, USA
- Bishesh Khanal: NepAl Applied Mathematics and Informatics Institute for Research (NAAMII), Patan, Nepal
- Cristian A Linte: Center for Imaging Science, RIT, Rochester, NY, USA; Biomedical Engineering, RIT, Rochester, NY, USA
40
Oikonomou EK, Khera R. Machine learning in precision diabetes care and cardiovascular risk prediction. Cardiovasc Diabetol 2023; 22:259. [PMID: 37749579] [PMCID: PMC10521578] [DOI: 10.1186/s12933-023-01985-3]
Abstract
Artificial intelligence and machine learning are driving a paradigm shift in medicine, promising data-driven, personalized solutions for managing diabetes and the excess cardiovascular risk it poses. In this comprehensive review of machine learning applications in the care of patients with diabetes at increased cardiovascular risk, we offer a broad overview of various data-driven methods and how they may be leveraged in developing predictive models for personalized care. We review existing as well as expected artificial intelligence solutions in the context of diagnosis, prognostication, phenotyping, and treatment of diabetes and its cardiovascular complications. In addition to discussing the key properties of such models that enable their successful application in complex risk prediction, we define challenges that arise from their misuse and the role of methodological standards in overcoming these limitations. We also identify key issues in equity and bias mitigation in healthcare and discuss how the current regulatory framework should ensure the efficacy and safety of medical artificial intelligence products in transforming cardiovascular care and outcomes in diabetes.
Affiliation(s)
- Evangelos K Oikonomou: Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA
- Rohan Khera: Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA; Section of Health Informatics, Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA; Section of Biomedical Informatics and Data Science, Yale School of Medicine, New Haven, CT, USA; Center for Outcomes Research and Evaluation, Yale-New Haven Hospital, 195 Church St, 6th Floor, New Haven, CT, 06510, USA
41
Pai S, Bontempi D, Prudente V, Hadzic I, Sokač M, Chaunzwa TL, Bernatz S, Hosny A, Mak RH, Birkbak NJ, Aerts HJ. Foundation Models for Quantitative Biomarker Discovery in Cancer Imaging. medRxiv [Preprint] 2023:2023.09.04.23294952. [PMID: 37732237] [PMCID: PMC10508804] [DOI: 10.1101/2023.09.04.23294952]
Abstract
Foundation models represent a recent paradigm shift in deep learning, where a single large-scale model trained on vast amounts of data can serve as the foundation for various downstream tasks. Foundation models are generally trained using self-supervised learning and excel in reducing the demand for training samples in downstream applications. This is especially important in medicine, where large labeled datasets are often scarce. Here, we developed a foundation model for imaging biomarker discovery by training a convolutional encoder through self-supervised learning using a comprehensive dataset of 11,467 radiographic lesions. The foundation model was evaluated in distinct and clinically relevant applications of imaging-based biomarkers. We found that it facilitated better and more efficient learning of imaging biomarkers and yielded task-specific models that significantly outperformed their conventional supervised counterparts on downstream tasks. The performance gain was most prominent when training dataset sizes were very limited. Furthermore, the foundation model was more stable to input and inter-reader variations and showed stronger associations with underlying biology. Our results demonstrate the tremendous potential of foundation models in discovering novel imaging biomarkers that may extend to other clinical use cases and can accelerate the widespread translation of imaging biomarkers into clinical settings.
Collapse
Affiliation(s)
- Suraj Pai
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, 77 Avenue Louis Pasteur, Boston, MA 02115, United States of America
- Radiology and Nuclear Medicine, CARIM & GROW, Maastricht University, Universiteitssingel 40, 6229 ER Maastricht, The Netherlands
- Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, 75 Francis Street and 450 Brookline Avenue, Boston, MA 02115, USA
| | - Dennis Bontempi
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, 77 Avenue Louis Pasteur, Boston, MA 02115, United States of America
- Radiology and Nuclear Medicine, CARIM & GROW, Maastricht University, Universiteitssingel 40, 6229 ER Maastricht, The Netherlands
- Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, 75 Francis Street and 450 Brookline Avenue, Boston, MA 02115, USA
| | - Vasco Prudente
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, 77 Avenue Louis Pasteur, Boston, MA 02115, United States of America
- Radiology and Nuclear Medicine, CARIM & GROW, Maastricht University, Universiteitssingel 40, 6229 ER Maastricht, The Netherlands
- Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, 75 Francis Street and 450 Brookline Avenue, Boston, MA 02115, USA
| | - Ibrahim Hadzic
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, 77 Avenue Louis Pasteur, Boston, MA 02115, United States of America
- Radiology and Nuclear Medicine, CARIM & GROW, Maastricht University, Universiteitssingel 40, 6229 ER Maastricht, The Netherlands
- Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, 75 Francis Street and 450 Brookline Avenue, Boston, MA 02115, USA
| | - Mateo Sokač
- Department of Molecular Medicine, Aarhus University Hospital, 8200 Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, 8200 Aarhus, Denmark
| | - Tafadzwa L Chaunzwa
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, 77 Avenue Louis Pasteur, Boston, MA 02115, United States of America
- Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, 75 Francis Street and 450 Brookline Avenue, Boston, MA 02115, USA
| | - Simon Bernatz
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, 77 Avenue Louis Pasteur, Boston, MA 02115, United States of America
- Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, 75 Francis Street and 450 Brookline Avenue, Boston, MA 02115, USA
| | - Ahmed Hosny
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, 77 Avenue Louis Pasteur, Boston, MA 02115, United States of America
- Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, 75 Francis Street and 450 Brookline Avenue, Boston, MA 02115, USA
| | - Raymond H Mak
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, 77 Avenue Louis Pasteur, Boston, MA 02115, United States of America
- Radiology and Nuclear Medicine, CARIM & GROW, Maastricht University, Universiteitssingel 40, 6229 ER Maastricht, The Netherlands
| | - Nicolai J Birkbak
- Department of Molecular Medicine, Aarhus University Hospital, 8200 Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, 8200 Aarhus, Denmark
| | - Hugo JWL Aerts
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, 77 Avenue Louis Pasteur, Boston, MA 02115, United States of America
- Radiology and Nuclear Medicine, CARIM & GROW, Maastricht University, Universiteitssingel 40, 6229 ER Maastricht, The Netherlands
- Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, 75 Francis Street and 450 Brookline Avenue, Boston, MA 02115, USA
- Department of Radiology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, 75 Francis Street and 450 Brookline Avenue, Boston, MA 02115, USA
|
42
|
Nielsen M, Wenderoth L, Sentker T, Werner R. Self-Supervision for Medical Image Classification: State-of-the-Art Performance with ~100 Labeled Training Samples per Class. Bioengineering (Basel) 2023; 10:895. [PMID: 37627780 PMCID: PMC10451977 DOI: 10.3390/bioengineering10080895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2023] [Revised: 07/21/2023] [Accepted: 07/24/2023] [Indexed: 08/27/2023]
Abstract
Is self-supervised deep learning (DL) for medical image analysis already a serious alternative to the de facto standard of end-to-end trained supervised DL? We tackle this question for medical image classification, with a particular focus on one of the field's currently most limiting factors: the (non-)availability of labeled data. Based on three common medical imaging modalities (bone marrow microscopy, gastrointestinal endoscopy, dermoscopy) and publicly available datasets, we analyze the performance of self-supervised DL within the self-distillation with no labels (DINO) framework. After learning an image representation without the use of image labels, conventional machine learning classifiers are applied. The classifiers are fitted using a systematically varied number of labeled samples (1-1000 per class). Exploiting the learned image representation, we achieve state-of-the-art classification performance for all three imaging modalities and datasets with only a fraction (between 1% and 10%) of the available labeled data, i.e., about 100 labeled samples per class.
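As an illustration of the two-stage recipe the abstract describes (frozen self-supervised features, then a conventional classifier fitted on few labels), here is a minimal Python sketch. The embed function is a hypothetical placeholder for a frozen DINO backbone; random vectors stand in for real features, and the 384-dimensional output mirrors a ViT-S/16 DINO encoder. None of this is the authors' code.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def embed(images):
    """Placeholder for a frozen DINO encoder mapping images to feature vectors."""
    return rng.normal(size=(len(images), 384))

# Low-label regime: ~100 labeled samples per class, as in the paper.
train_images, train_labels = np.zeros((200, 224, 224)), np.repeat([0, 1], 100)
test_images, test_labels = np.zeros((50, 224, 224)), np.repeat([0, 1], 25)

clf = LogisticRegression(max_iter=1000).fit(embed(train_images), train_labels)
print("accuracy:", accuracy_score(test_labels, clf.predict(embed(test_images))))

Because representation learning is decoupled from classification in this setup, varying the labeled-set size (the paper's 1-1000 samples per class) only requires refitting the cheap final classifier, not the backbone.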
Affiliation(s)
- Maximilian Nielsen
- Department of Computational Neuroscience, Institute for Applied Medical Informatics, University Medical Center Hamburg-Eppendorf, Martinistraße 52, 20251 Hamburg, Germany; (L.W.); (T.S.); (R.W.)
|