1. Guo LL, Fries J, Steinberg E, Fleming SL, Morse K, Aftandilian C, Posada J, Shah N, Sung L. A multi-center study on the adaptability of a shared foundation model for electronic health records. NPJ Digit Med 2024;7:171. PMID: 38937550; PMCID: PMC11211479; DOI: 10.1038/s41746-024-01166-w.
Abstract
Foundation models are transforming artificial intelligence (AI) in healthcare by providing modular components adaptable to various downstream tasks, making AI development more scalable and cost-effective. Foundation models for structured electronic health records (EHR), trained on coded medical records from millions of patients, have demonstrated benefits including increased performance with fewer training labels and improved robustness to distribution shifts. However, questions remain about the feasibility of sharing these models across hospitals and their performance on local tasks. This multi-center study examined the adaptability of a publicly accessible structured EHR foundation model (FMSM), trained on 2.57 million patient records from Stanford Medicine. Experiments used EHR data from The Hospital for Sick Children (SickKids) and the Medical Information Mart for Intensive Care (MIMIC-IV). We assessed adaptability via continued pretraining on local data, and task adaptability relative to baselines trained locally from scratch, including a local foundation model. Evaluations on 8 clinical prediction tasks showed that adapting the off-the-shelf FMSM matched the performance of gradient boosting machines (GBM) locally trained on all data, while providing a 13% improvement in settings with few task-specific training labels. With continued pretraining on local data, FMSM required fewer than 1% of training examples to match the fully trained GBM's performance, and was 60 to 90% more sample-efficient than training local foundation models from scratch. Our findings demonstrate that adapting EHR foundation models across hospitals provides improved prediction performance at lower cost, underscoring the utility of base foundation models as modular components that streamline the development of healthcare AI.
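The adaptation recipe evaluated here follows a general two-stage pattern: continue pretraining a shared encoder on local records, then fine-tune a small task head on a handful of labels. A minimal PyTorch sketch of that pattern, with a toy recurrent encoder over synthetic coded-event sequences standing in for the actual FMSM architecture and data:

```python
import torch
import torch.nn as nn

VOCAB, DIM = 1000, 64

class EHREncoder(nn.Module):
    """Toy stand-in for a structured-EHR foundation model."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.rnn = nn.GRU(DIM, DIM, batch_first=True)

    def forward(self, codes):                     # codes: (batch, seq)
        out, _ = self.rnn(self.embed(codes))
        return out                                # (batch, seq, DIM)

encoder = EHREncoder()                            # a shared model's weights would be loaded here
lm_head = nn.Linear(DIM, VOCAB)                   # self-supervised next-code head

# Stage 1: continued pretraining on local records (next-code prediction).
opt = torch.optim.Adam(list(encoder.parameters()) + list(lm_head.parameters()), lr=1e-3)
codes = torch.randint(0, VOCAB, (32, 20))         # synthetic local patient timelines
for _ in range(5):
    logits = lm_head(encoder(codes[:, :-1]))
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB), codes[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: task adaptation with few labels (e.g., a binary clinical outcome).
clf_head = nn.Linear(DIM, 1)
few_x = torch.randint(0, VOCAB, (16, 20))         # few task-specific examples
few_y = torch.randint(0, 2, (16, 1)).float()
opt = torch.optim.Adam(list(encoder.parameters()) + list(clf_head.parameters()), lr=1e-4)
for _ in range(5):
    pooled = encoder(few_x).mean(dim=1)           # patient-level representation
    loss = nn.functional.binary_cross_entropy_with_logits(clf_head(pooled), few_y)
    opt.zero_grad(); loss.backward(); opt.step()
```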
Affiliation(s)
- Lin Lawrence Guo: Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada
- Jason Fries: Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Ethan Steinberg: Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Scott Lanyon Fleming: Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Keith Morse: Division of Pediatric Hospital Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, USA
- Catherine Aftandilian: Division of Hematology/Oncology, Department of Pediatrics, Stanford University, Palo Alto, CA, USA
- Jose Posada: Universidad del Norte, Barranquilla, Colombia
- Nigam Shah: Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Lillian Sung: Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada; Division of Haematology/Oncology, The Hospital for Sick Children, Toronto, ON, Canada
2. Chia MA, Antaki F, Zhou Y, Turner AW, Lee AY, Keane PA. Foundation models in ophthalmology. Br J Ophthalmol 2024. PMID: 38834291; DOI: 10.1136/bjo-2024-325459.
Abstract
Foundation models represent a paradigm shift in artificial intelligence (AI), evolving from narrow models designed for specific tasks to versatile, generalisable models adaptable to a myriad of diverse applications. Ophthalmology as a specialty has the potential to act as an exemplar for other medical specialties, offering a blueprint for integrating foundation models broadly into clinical practice. This review hopes to serve as a roadmap for eyecare professionals seeking to better understand foundation models, while equipping readers with the tools to explore the use of foundation models in their own research and practice. We begin by outlining the key concepts and technological advances which have enabled the development of these models, providing an overview of novel training approaches and modern AI architectures. Next, we summarise existing literature on the topic of foundation models in ophthalmology, encompassing progress in vision foundation models, large language models and large multimodal models. Finally, we outline major challenges relating to privacy, bias and clinical validation, and propose key steps forward to maximise the benefit of this powerful technology.
Affiliation(s)
- Mark A Chia: Institute of Ophthalmology, University College London, London, UK; NIHR Biomedical Research Centre, Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Fares Antaki: Institute of Ophthalmology, University College London, London, UK; NIHR Biomedical Research Centre, Moorfields Eye Hospital NHS Foundation Trust, London, UK; The CHUM School of Artificial Intelligence in Healthcare, Montreal, Quebec, Canada
- Yukun Zhou: Institute of Ophthalmology, University College London, London, UK; NIHR Biomedical Research Centre, Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Angus W Turner: Lions Outback Vision, Lions Eye Institute, Nedlands, Western Australia, Australia; University of Western Australia, Perth, Western Australia, Australia
- Aaron Y Lee: Department of Ophthalmology, University of Washington, Seattle, Washington, USA; Roger and Angie Karalis Johnson Retina Center, University of Washington, Seattle, Washington, USA
- Pearse A Keane: Institute of Ophthalmology, University College London, London, UK; NIHR Biomedical Research Centre, Moorfields Eye Hospital NHS Foundation Trust, London, UK
3. Tseng SC, Lien CE, Lee CH, Tu KC, Lin CH, Hsiao AY, Teng S, Chiang HH, Ke LY, Han CL, Lee YC, Huang AC, Yang DJ, Tsai CW, Chen KH. Clinical Validation of a Deep Learning-Based Software for Lumbar Bone Mineral Density and T-Score Prediction from Chest X-ray Images. Diagnostics (Basel) 2024;14:1208. PMID: 38928624; PMCID: PMC11202681; DOI: 10.3390/diagnostics14121208.
Abstract
Screening for osteoporosis is crucial for early detection and prevention, yet it faces challenges due to the low accuracy of calcaneal quantitative ultrasound (QUS) and limited access to dual-energy X-ray absorptiometry (DXA) scans. Recent advances in AI offer a promising solution through opportunistic screening using existing medical images. This study aimed to use deep learning techniques to develop a model that analyzes chest X-ray (CXR) images for osteoporosis screening. The study comprised an AI model development stage and a clinical validation stage. In the development stage, a combined dataset of 5122 paired CXR images and DXA reports from patients aged 20 to 98 years at a medical center was collected. The images were enhanced and filtered to exclude cases with retained hardware, such as pedicle screws, bone cement, or artificial intervertebral discs, or with severe deformity at the target levels of T12 and L1. The dataset was then separated into training, validation, and testing sets for model training and performance evaluation. In the clinical validation stage, we collected 440 paired CXR images and DXA reports from Taichung Veterans General Hospital (TCVGH) and Joy Clinic: 304 pairs from TCVGH and 136 pairs from Joy Clinic. The pre-clinical test yielded an area under the curve (AUC) of 0.940, while the clinical validation showed an AUC of 0.946. Pearson's correlation coefficient was 0.88. The model demonstrated an overall accuracy, sensitivity, and specificity of 89.0%, 88.7%, and 89.4%, respectively. This study proposes an AI model for opportunistic osteoporosis screening through CXR, demonstrating good performance and suggesting its potential for broad adoption in preliminary screening among high-risk populations.
Affiliation(s)
- Sheng-Chieh Tseng: Department of Orthopedic Surgery, Taichung Veterans General Hospital, Taichung 40705, Taiwan; Rong Hsing Research Center for Translational Medicine, National Chung Hsing University, Taichung 402202, Taiwan; PhD Program in Translational Medicine, National Chung Hsing University, Taichung 402202, Taiwan
- Chia-En Lien: Acer Medical Inc., 7F, No. 86, Sec. 1, Xintai 5th Rd., Xizhi, New Taipei City 221421, Taiwan
- Cheng-Hung Lee: Department of Orthopedic Surgery, Taichung Veterans General Hospital, Taichung 40705, Taiwan; Department of Post-Baccalaureate Medicine, College of Medicine, National Chung Hsing University, Taichung 402202, Taiwan
- Kao-Chang Tu: Department of Orthopedic Surgery, Taichung Veterans General Hospital, Taichung 40705, Taiwan; Graduate Institute of Biomedical Engineering, National Chung Hsing University, Taichung 402202, Taiwan
- Chia-Hui Lin: Department of Computer Science and Engineering, National Chung Hsing University, Taichung 402202, Taiwan
- Amy Y. Hsiao: Acer Medical Inc., 7F, No. 86, Sec. 1, Xintai 5th Rd., Xizhi, New Taipei City 221421, Taiwan
- Shin Teng: Acer Medical Inc., 7F, No. 86, Sec. 1, Xintai 5th Rd., Xizhi, New Taipei City 221421, Taiwan
- Hsiao-Hung Chiang: Acer Medical Inc., 7F, No. 86, Sec. 1, Xintai 5th Rd., Xizhi, New Taipei City 221421, Taiwan
- Liang-Yu Ke: Acer Inc., 7F-5, No. 369, Fuxing N. Rd., Songshan Dist., Taipei City 10541, Taiwan
- Chun-Lin Han: Acer Inc., 7F-5, No. 369, Fuxing N. Rd., Songshan Dist., Taipei City 10541, Taiwan
- Yen-Cheng Lee: Acer Inc., 7F-5, No. 369, Fuxing N. Rd., Songshan Dist., Taipei City 10541, Taiwan
- An-Chih Huang: Acer Inc., 7F-5, No. 369, Fuxing N. Rd., Songshan Dist., Taipei City 10541, Taiwan
- Dun-Jhu Yang: Acer Inc., 7F-5, No. 369, Fuxing N. Rd., Songshan Dist., Taipei City 10541, Taiwan
- Chung-Wen Tsai: Joy Clinic, No. 37 Jilin Rd., Luzhu Dist., Taoyuan City 338120, Taiwan
- Kun-Hui Chen: Department of Orthopedic Surgery, Taichung Veterans General Hospital, Taichung 40705, Taiwan; Department of Post-Baccalaureate Medicine, College of Medicine, National Chung Hsing University, Taichung 402202, Taiwan; Department of Computer Science and Information Engineering, Providence University, Taichung 40301, Taiwan
4. Gholami S, Scheppke L, Lee AY. Enhancing Self-Supervised Learning for Rare Diseases in OCT-Reply. JAMA Ophthalmol 2024. PMID: 38842855; DOI: 10.1001/jamaophthalmol.2024.1873.
Affiliation(s)
- Lea Scheppke: Lowy Medical Research Institute, La Jolla, California
- Aaron Y Lee: Department of Ophthalmology, University of Washington, Seattle; Roger and Angie Karalis Johnson Retina Center, Seattle, Washington
5. Xu H, Usuyama N, Bagga J, Zhang S, Rao R, Naumann T, Wong C, Gero Z, González J, Gu Y, Xu Y, Wei M, Wang W, Ma S, Wei F, Yang J, Li C, Gao J, Rosemon J, Bower T, Lee S, Weerasinghe R, Wright BJ, Robicsek A, Piening B, Bifulco C, Wang S, Poon H. A whole-slide foundation model for digital pathology from real-world data. Nature 2024;630:181-188. PMID: 38778098; PMCID: PMC11153137; DOI: 10.1038/s41586-024-07441-w.
Abstract
Digital pathology poses unique computational challenges, as a standard gigapixel slide may comprise tens of thousands of image tiles. Prior models have often resorted to subsampling a small portion of tiles for each slide, thus missing the important slide-level context. Here we present Prov-GigaPath, a whole-slide pathology foundation model pretrained on 1.3 billion 256 × 256 pathology image tiles in 171,189 whole slides from Providence, a large US health network comprising 28 cancer centres. The slides originated from more than 30,000 patients covering 31 major tissue types. To pretrain Prov-GigaPath, we propose GigaPath, a novel vision transformer architecture for pretraining gigapixel pathology slides. To scale GigaPath for slide-level learning with tens of thousands of image tiles, GigaPath adapts the newly developed LongNet method to digital pathology. To evaluate Prov-GigaPath, we construct a digital pathology benchmark comprising 9 cancer subtyping tasks and 17 pathomics tasks, using both Providence and TCGA data. With large-scale pretraining and ultra-large-context modelling, Prov-GigaPath attains state-of-the-art performance on 25 out of 26 tasks, with significant improvement over the second-best method on 18 tasks. We further demonstrate the potential of Prov-GigaPath on vision-language pretraining for pathology by incorporating the pathology reports. In sum, Prov-GigaPath is an open-weight foundation model that achieves state-of-the-art performance on various digital pathology tasks, demonstrating the importance of real-world data and whole-slide modelling.
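Setting the specific architecture aside, the whole-slide pattern is a two-stage pipeline: embed each tile, then aggregate the tile embeddings with a long-context sequence model into a slide-level representation. A minimal sketch in which a small CNN and a standard transformer stand in for the pretrained tile encoder and LongNet:

```python
import torch
import torch.nn as nn

TILE_DIM = 128
tile_encoder = nn.Sequential(                     # stand-in for a pretrained tile-level ViT
    nn.Conv2d(3, 16, kernel_size=8, stride=8), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, TILE_DIM))

layer = nn.TransformerEncoderLayer(d_model=TILE_DIM, nhead=4, batch_first=True)
slide_encoder = nn.TransformerEncoder(layer, num_layers=2)
# (LongNet-style dilated attention is what lets the real model scale to ~10^4 tiles.)

tiles = torch.randn(1, 100, 3, 256, 256)          # one slide as a bag of 100 tiles
emb = tile_encoder(tiles.flatten(0, 1)).view(1, 100, TILE_DIM)  # per-tile embeddings
slide_repr = slide_encoder(emb).mean(dim=1)       # slide-level context vector
logits = nn.Linear(TILE_DIM, 9)(slide_repr)       # e.g., a 9-way subtyping head
```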
Affiliation(s)
- Hanwen Xu: Microsoft Research, Redmond, WA, USA; Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
- Yu Gu: Microsoft Research, Redmond, WA, USA
- Yanbo Xu: Microsoft Research, Redmond, WA, USA
- Mu Wei: Microsoft Research, Redmond, WA, USA
- Furu Wei: Microsoft Research, Redmond, WA, USA
- Soohee Lee: Providence Research Network, Renton, WA, USA
- Brian Piening: Providence Genomics, Portland, OR, USA; Earle A. Chiles Research Institute, Providence Cancer Institute, Portland, OR, USA
- Carlo Bifulco: Providence Genomics, Portland, OR, USA; Earle A. Chiles Research Institute, Providence Cancer Institute, Portland, OR, USA
- Sheng Wang: Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA; Department of Surgery, University of Washington, Seattle, WA, USA
6. Setzer FC, Li J, Khan AA. The Use of Artificial Intelligence in Endodontics. J Dent Res 2024. PMID: 38822561; DOI: 10.1177/00220345241255593.
Abstract
Endodontics is the dental specialty foremost concerned with diseases of the pulp and periradicular tissues. Clinicians often face patients with varying symptoms, must critically assess radiographic images in 2 and 3 dimensions, engage in complex diagnosis and decision making, and deliver sophisticated treatment. Paired with low intra- and interobserver agreement for radiographic interpretation and variations in treatment outcome resulting from nonstandardized clinical techniques, there exists an unmet need for support in the form of artificial intelligence (AI), providing automated biomedical image analysis, decision support, and assistance during treatment. In the past decade, there has been a steady increase in AI studies in endodontics but limited clinical application. This review focuses on critically assessing the recent advancements in endodontic AI research for clinical applications, including the detection and diagnosis of endodontic pathologies such as periapical lesions, fractures and resorptions, as well as clinical treatment outcome predictions. It discusses the benefits of AI-assisted diagnosis, treatment planning and execution, and future directions, including augmented reality and robotics. It critically reviews the limitations and challenges imposed by the nature of endodontic data sets, AI transparency and generalization, and potential ethical dilemmas. In the near future, AI will significantly affect the everyday endodontic workflow, education, and continuous learning.
Affiliation(s)
- F C Setzer: Department of Endodontics, University of Pennsylvania, Philadelphia, PA, USA
- J Li: School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA, USA
- A A Khan: Department of Endodontics, University of Texas Health, San Antonio, TX, USA
7. Lv G, Wang Y. Machine learning-based antibiotic resistance prediction models: An updated systematic review and meta-analysis. Technol Health Care 2024. PMID: 38875058; DOI: 10.3233/thc-240119.
Abstract
BACKGROUND: The widespread use of antibiotics has led to a gradual adaptation of bacteria to these drugs, diminishing the effectiveness of treatments.
OBJECTIVE: To comprehensively assess the research progress of antibiotic resistance prediction models based on machine learning (ML) algorithms, providing the latest quantitative analysis and methodological evaluation.
METHODS: Relevant literature was systematically retrieved from databases, including PubMed, Embase and the Cochrane Library, from inception up to December 2023. Studies meeting predefined criteria were selected for inclusion. The prediction model risk of bias assessment tool was employed for methodological quality assessment, and a random-effects model was utilised for meta-analysis.
RESULTS: The systematic review included a total of 22 studies with a combined sample size of 43,628; 10 studies were ultimately included in the meta-analysis. Commonly used ML algorithms included random forest, decision trees and neural networks. Frequently utilised predictive variables encompassed demographics, drug use history and underlying diseases. The overall sensitivity was 0.57 (95% CI: 0.42-0.70; p < 0.001; I² = 99.7%), the specificity was 0.95 (95% CI: 0.79-0.99; p < 0.001; I² = 99.9%), the positive likelihood ratio was 10.7 (95% CI: 2.9-39.5), the negative likelihood ratio was 0.46 (95% CI: 0.34-0.61), the diagnostic odds ratio was 23 (95% CI: 7-81) and the area under the receiver operating characteristic curve was 0.78 (95% CI: 0.74-0.81; p < 0.001), indicating a good discriminative ability of ML models for antibiotic resistance. However, methodological assessment and funnel plots suggested a high risk of bias and publication bias in the included studies.
CONCLUSION: This meta-analysis provides a current and comprehensive evaluation of ML models for predicting antibiotic resistance, emphasising their potential application in clinical practice. Nevertheless, stringent research design and reporting are warranted to enhance the quality and credibility of future studies. Future research should focus on methodological innovation and incorporate more high-quality studies to further advance this field.
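For readers checking the pooled numbers, the likelihood ratios and diagnostic odds ratio follow from sensitivity and specificity by standard identities; the sketch below uses the review's point estimates, and the small differences from the reported values arise because each quantity was pooled separately:

```python
sens, spec = 0.57, 0.95                 # pooled sensitivity and specificity

lr_pos = sens / (1 - spec)              # positive likelihood ratio
lr_neg = (1 - sens) / spec              # negative likelihood ratio
dor = lr_pos / lr_neg                   # diagnostic odds ratio

print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}, DOR = {dor:.0f}")
# -> LR+ = 11.4, LR- = 0.45, DOR = 25; close to the reported 10.7, 0.46 and 23,
#    which were each meta-analysed separately rather than derived from one table.
```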
Affiliation(s)
- Guodong Lv: Department of STD and AIDS Prevention and Control, Langfang Center for Disease Prevention and Control, Langfang, Hebei, China
- Yuntao Wang: Department of Pharmacy, Langfang Health Vocational College, Langfang, Hebei, China
8. Hresko DJ, Drotar P. BucketAugment: Reinforced Domain Generalisation in Abdominal CT Segmentation. IEEE Open J Eng Med Biol 2024;5:353-361. PMID: 38899027; PMCID: PMC11186658; DOI: 10.1109/ojemb.2024.3397623.
Abstract
Goal: In recent years, deep neural networks have consistently outperformed previously proposed methods in the domain of medical segmentation. However, due to their nature, these networks often struggle to delineate desired structures in data that fall outside their training distribution. The goal of this study is to address the challenges associated with domain generalization in CT segmentation by introducing a novel method called BucketAugment for deep neural networks.
Methods: BucketAugment leverages principles from the Q-learning algorithm and employs validation loss to search for an optimal policy within a search space comprised of distributed stacks of 3D volumetric augmentations, termed 'buckets.' These buckets have tunable parameters and can be seamlessly integrated into existing neural network architectures, offering flexibility for customization.
Results: In our experiments, we focus on segmenting kidney and liver structures across three distinct medical datasets, each containing CT scans of the abdominal region collected from various clinical institutions and scanner vendors. Our results indicate that BucketAugment significantly enhances domain generalization across diverse medical datasets, requiring only minimal modifications to existing network architectures.
Conclusions: The introduction of BucketAugment provides a promising solution to the challenges of domain generalization in CT segmentation. By leveraging Q-learning principles and distributed stacks of 3D augmentations, this method improves the performance of deep neural networks on medical segmentation tasks, demonstrating its potential to enhance the applicability of such models across different datasets and clinical scenarios.
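As a rough illustration of the search principle only (the paper's method operates over tunable stacks of 3D volumetric augmentations rather than this stateless toy), a Q-learning-style loop that scores augmentation "buckets" by validation loss:

```python
import random

buckets = ["flip", "rotate", "noise", "elastic"]   # candidate augmentation stacks
Q = {b: 0.0 for b in buckets}
alpha, eps = 0.3, 0.2                              # learning rate, exploration rate

def validation_loss(bucket):
    # placeholder: train a segmenter with this bucket and measure validation loss
    base = {"flip": 0.40, "rotate": 0.35, "noise": 0.45, "elastic": 0.30}[bucket]
    return base + random.gauss(0, 0.02)

for episode in range(100):
    b = random.choice(buckets) if random.random() < eps else max(Q, key=Q.get)
    reward = -validation_loss(b)                   # lower validation loss => higher reward
    Q[b] += alpha * (reward - Q[b])                # one-step value update

print("selected bucket:", max(Q, key=Q.get))
```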
Affiliation(s)
- Peter Drotar: Technical University of Kosice, 040 01 Kosice, Slovakia
9. Wang Y, Ni H, Zhou J, Liu L, Lin J, Yin M, Gao J, Zhu S, Yin Q, Zhu J, Li R. A Semi-Supervised Learning Framework for Classifying Colorectal Neoplasia Based on the NICE Classification. J Imaging Inform Med 2024. PMID: 38653910; DOI: 10.1007/s10278-024-01123-9.
Abstract
Labelling medical images is an arduous and costly task that necessitates clinical expertise and large numbers of qualified images. Insufficient samples can lead to underfitting during training and poor performance of supervised learning models. In this study, we aim to develop a SimCLR-based semi-supervised learning framework to classify colorectal neoplasia based on the NICE classification. First, the proposed framework was trained under self-supervised learning using a large unlabelled dataset; subsequently, it was fine-tuned on a limited labelled dataset based on the NICE classification. The model was evaluated on an independent dataset and compared with models based on supervised transfer learning and with endoscopists, using accuracy, the Matthews correlation coefficient (MCC), and Cohen's kappa. Finally, Grad-CAM and t-SNE were applied to visualize the models' interpretations. A ResNet-backboned SimCLR model (accuracy of 0.908, MCC of 0.862, and Cohen's kappa of 0.896) outperformed supervised transfer learning-based models (means: 0.803, 0.698, and 0.742) and junior endoscopists (0.816, 0.724, and 0.863), while performing only slightly worse than senior endoscopists (0.916, 0.875, and 0.944). Moreover, t-SNE showed better clustering of ternary samples through self-supervised learning in SimCLR than through supervised transfer learning. Compared with traditional supervised learning, semi-supervised learning enables deep learning models to achieve improved performance with limited labelled endoscopic images.
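The self-supervised stage rests on SimCLR's NT-Xent objective: two augmented views of each image are pulled together while all other images in the batch are pushed apart. A minimal sketch with a toy encoder and toy augmentations in place of the paper's ResNet pipeline:

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """SimCLR's contrastive loss: each view's positive is its paired view."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)            # 2N projected embeddings
    sim = z @ z.t() / tau
    sim = sim.masked_fill(torch.eye(len(z), dtype=torch.bool), float("-inf"))
    n = z1.size(0)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 64 * 64, 128))
proj = torch.nn.Linear(128, 64)                            # projection head
opt = torch.optim.Adam(list(encoder.parameters()) + list(proj.parameters()), lr=1e-3)

images = torch.rand(32, 3, 64, 64)                         # unlabelled endoscopic frames
for _ in range(5):
    v1 = images + 0.05 * torch.randn_like(images)          # toy augmentation 1: noise
    v2 = images.flip(-1)                                   # toy augmentation 2: h-flip
    loss = nt_xent(proj(encoder(v1)), proj(encoder(v2)))
    opt.zero_grad(); loss.backward(); opt.step()
```

Fine-tuning on the small NICE-labelled set would then reuse the pretrained encoder with a classification head.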
Affiliation(s)
- Yu Wang: Department of Hepatobiliary Surgery, Jintan Affiliated Hospital of Jiangsu University, Changzhou, Jiangsu, 213200, China
- Haoxiang Ni: Department of Gastroenterology, The First Affiliated Hospital of Soochow University, #899 Pinghai St., Suzhou, Jiangsu, 215006, China; Suzhou Clinical Center of Digestive Disease, Suzhou, Jiangsu, 215006, China
- Jielu Zhou: Department of Gastroenterology, The First Affiliated Hospital of Soochow University, #899 Pinghai St., Suzhou, Jiangsu, 215006, China; Department of Geriatrics, Kowloon Affiliated Hospital of Shanghai Jiao Tong University, Suzhou, Jiangsu, 215006, China
- Lihe Liu: Department of Gastroenterology, The First Affiliated Hospital of Soochow University, #899 Pinghai St., Suzhou, Jiangsu, 215006, China; Suzhou Clinical Center of Digestive Disease, Suzhou, Jiangsu, 215006, China
- Jiaxi Lin: Department of Gastroenterology, The First Affiliated Hospital of Soochow University, #899 Pinghai St., Suzhou, Jiangsu, 215006, China; Suzhou Clinical Center of Digestive Disease, Suzhou, Jiangsu, 215006, China
- Minyue Yin: Department of Gastroenterology, Beijing Friendship Hospital, Capital Medical University, Beijing, 100050, China; National Clinical Research Center for Digestive Disease, Beijing Digestive Disease Center, State Key Laboratory of Digestive Health, Beijing, 100050, China
- Jingwen Gao: Department of Gastroenterology, The First Affiliated Hospital of Soochow University, #899 Pinghai St., Suzhou, Jiangsu, 215006, China; Suzhou Clinical Center of Digestive Disease, Suzhou, Jiangsu, 215006, China
- Shiqi Zhu: Department of Gastroenterology, The First Affiliated Hospital of Soochow University, #899 Pinghai St., Suzhou, Jiangsu, 215006, China; Suzhou Clinical Center of Digestive Disease, Suzhou, Jiangsu, 215006, China
- Qi Yin: Department of Anesthesiology, Jintan Affiliated Hospital of Jiangsu University, Changzhou, Jiangsu, 213200, China
- Jinzhou Zhu: Department of Gastroenterology, The First Affiliated Hospital of Soochow University, #899 Pinghai St., Suzhou, Jiangsu, 215006, China; Suzhou Clinical Center of Digestive Disease, Suzhou, Jiangsu, 215006, China; Key Laboratory of Hepatosplenic Surgery, Ministry of Education, The First Affiliated Hospital of Harbin Medical University, Harbin, 150001, China
- Rui Li: Department of Gastroenterology, The First Affiliated Hospital of Soochow University, #899 Pinghai St., Suzhou, Jiangsu, 215006, China; Suzhou Clinical Center of Digestive Disease, Suzhou, Jiangsu, 215006, China
10. Sangha V, Khunte A, Holste G, Mortazavi BJ, Wang Z, Oikonomou EK, Khera R. Biometric contrastive learning for data-efficient deep learning from electrocardiographic images. J Am Med Inform Assoc 2024;31:855-865. PMID: 38269618; PMCID: PMC10990541; DOI: 10.1093/jamia/ocae002.
Abstract
OBJECTIVE: Artificial intelligence (AI) can detect heart disease from images of electrocardiograms (ECGs). However, traditional supervised learning is limited by the need for large amounts of labeled data. We report the development of Biometric Contrastive Learning (BCL), a self-supervised pretraining approach for label-efficient deep learning on ECG images.
MATERIALS AND METHODS: Using pairs of ECGs from 78,288 individuals from Yale (2000-2015), we trained a convolutional neural network to identify temporally separated ECG pairs that varied in layout from the same patient. We fine-tuned BCL-pretrained models to detect atrial fibrillation (AF), gender, and LVEF < 40%, using ECGs from 2015 to 2021. We externally tested the models in cohorts from Germany and the United States, and compared BCL with ImageNet initialization and general-purpose self-supervised contrastive learning for images (SimCLR).
RESULTS: With 100% of labeled training data, BCL performed similarly to the other approaches for detecting AF/gender/LVEF < 40%, with AUROCs of 0.98/0.90/0.90 in the held-out test sets, but it consistently outperformed the other methods at smaller proportions of labeled data, reaching equivalent performance with 50% of the data. With 0.1% of the data, BCL achieved AUROCs of 0.88/0.79/0.75, compared with 0.51/0.52/0.60 (ImageNet) and 0.61/0.53/0.49 (SimCLR). In external validation, BCL outperformed the other methods even with 100% of labeled training data, with AUROCs of 0.88/0.88 for gender and LVEF < 40%, compared with 0.83/0.83 (ImageNet) and 0.84/0.83 (SimCLR).
DISCUSSION AND CONCLUSION: A pretraining strategy that leverages biometric signatures of different ECGs from the same patient enhances the efficiency of developing AI models for ECG images. This represents a major advance in detecting disorders from ECG images with limited labeled data.
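The pretraining signal is simple to state: two temporally separated ECG images of the same patient form a positive pair, and the other patients in the batch serve as negatives. A minimal InfoNCE-style sketch with random tensors standing in for ECG images:

```python
import torch
import torch.nn.functional as F

encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(1 * 64 * 64, 128))
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

ecg_t1 = torch.rand(16, 1, 64, 64)          # an earlier ECG image per patient
ecg_t2 = torch.rand(16, 1, 64, 64)          # a later ECG of the same 16 patients
for _ in range(5):
    z1 = F.normalize(encoder(ecg_t1), dim=1)
    z2 = F.normalize(encoder(ecg_t2), dim=1)
    logits = z1 @ z2.t() / 0.1              # patient-to-patient similarity matrix
    labels = torch.arange(16)               # row i should match column i
    loss = F.cross_entropy(logits, labels)  # same patient = positive pair
    opt.zero_grad(); loss.backward(); opt.step()
```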
Affiliation(s)
- Veer Sangha: Section of Cardiovascular Medicine, Department of Internal Medicine, Yale University, New Haven, CT, 06510, United States; Department of Engineering Science, Oxford University, Oxford, OX1 3PJ, United Kingdom
- Akshay Khunte: Department of Computer Science, Yale University, New Haven, CT, 06511, United States
- Gregory Holste: Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, 78712, United States
- Bobak J Mortazavi: Department of Computer Science & Engineering, Texas A&M University, College Station, TX, 77843, United States; Center for Outcomes Research and Evaluation (CORE), Yale New Haven Hospital, New Haven, CT, 06510, United States
- Zhangyang Wang: Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, 78712, United States
- Evangelos K Oikonomou: Section of Cardiovascular Medicine, Department of Internal Medicine, Yale University, New Haven, CT, 06510, United States
- Rohan Khera: Section of Cardiovascular Medicine, Department of Internal Medicine, Yale University, New Haven, CT, 06510, United States; Center for Outcomes Research and Evaluation (CORE), Yale New Haven Hospital, New Haven, CT, 06510, United States; Section of Health Informatics, Department of Biostatistics, Yale School of Public Health, New Haven, CT, 06510, United States
11. Ktena I, Wiles O, Albuquerque I, Rebuffi SA, Tanno R, Roy AG, Azizi S, Belgrave D, Kohli P, Cemgil T, Karthikesalingam A, Gowal S. Generative models improve fairness of medical classifiers under distribution shifts. Nat Med 2024;30:1166-1173. PMID: 38600282; PMCID: PMC11031395; DOI: 10.1038/s41591-024-02838-6.
Abstract
Domain generalization is a ubiquitous challenge for machine learning in healthcare. Model performance in real-world conditions might be lower than expected because of discrepancies between the data encountered during deployment and development. Underrepresentation of some groups or conditions during model development is a common cause of this phenomenon. This challenge is often not readily addressed by targeted data acquisition and 'labeling' by expert clinicians, which can be prohibitively expensive or practically impossible because of the rarity of conditions or the available clinical expertise. We hypothesize that advances in generative artificial intelligence can help mitigate this unmet need in a steerable fashion, enriching our training dataset with synthetic examples that address shortfalls of underrepresented conditions or subgroups. We show that diffusion models can automatically learn realistic augmentations from data in a label-efficient manner. We demonstrate that learned augmentations make models more robust and statistically fair in-distribution and out of distribution. To evaluate the generality of our approach, we studied three distinct medical imaging contexts of varying difficulty: (1) histopathology, (2) chest X-ray and (3) dermatology images. Complementing real samples with synthetic ones improved the robustness of models in all three medical tasks and increased fairness by improving the accuracy of clinical diagnosis within underrepresented groups, especially out of distribution.
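The enrichment step itself is straightforward once a conditional generative model is trained: sample synthetic examples for the underrepresented label or subgroup and add them to the training pool. In the sketch below, diffusion_sample is a hypothetical stand-in for such a model:

```python
import torch

def diffusion_sample(label, n):
    # placeholder: a trained label- and attribute-conditioned diffusion model
    # would generate images for the requested underrepresented subgroup here
    return torch.rand(n, 3, 64, 64), torch.full((n,), label)

real_x = torch.rand(1000, 3, 64, 64)              # real training images
real_y = torch.randint(0, 2, (1000,))
syn_x, syn_y = diffusion_sample(label=1, n=300)   # enrich the underrepresented condition

train_x = torch.cat([real_x, syn_x])              # enriched training set
train_y = torch.cat([real_y, syn_y])
```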
12. Wei ML, Tada M, So A, Torres R. Artificial intelligence and skin cancer. Front Med (Lausanne) 2024;11:1331895. PMID: 38566925; PMCID: PMC10985205; DOI: 10.3389/fmed.2024.1331895.
Abstract
Artificial intelligence is poised to rapidly reshape many fields, including that of skin cancer screening and diagnosis, both as a disruptive and assistive technology. Together with the collection and availability of large medical data sets, artificial intelligence will become a powerful tool that can be leveraged by physicians in their diagnoses and treatment plans for patients. This comprehensive review focuses on current progress toward AI applications for patients, primary care providers, dermatologists, and dermatopathologists, explores the diverse applications of image and molecular processing for skin cancer, and highlights AI's potential for patient self-screening and improving diagnostic accuracy for non-dermatologists. We additionally delve into the challenges and barriers to clinical implementation, paths forward, and areas of active research.
Affiliation(s)
- Maria L. Wei: Department of Dermatology, University of California, San Francisco, San Francisco, CA, United States; Dermatology Service, San Francisco VA Health Care System, San Francisco, CA, United States
- Mikio Tada: Institute for Neurodegenerative Diseases, University of California, San Francisco, San Francisco, CA, United States
- Alexandra So: School of Medicine, University of California, San Francisco, San Francisco, CA, United States
- Rodrigo Torres: Dermatology Service, San Francisco VA Health Care System, San Francisco, CA, United States
13. Pai S, Bontempi D, Hadzic I, Prudente V, Sokač M, Chaunzwa TL, Bernatz S, Hosny A, Mak RH, Birkbak NJ, Aerts HJWL. Foundation model for cancer imaging biomarkers. Nat Mach Intell 2024;6:354-367. PMID: 38523679; PMCID: PMC10957482; DOI: 10.1038/s42256-024-00807-9.
Abstract
Foundation models in deep learning are characterized by a single large-scale model trained on vast amounts of data serving as the foundation for various downstream tasks. Foundation models are generally trained using self-supervised learning and excel in reducing the demand for training samples in downstream applications. This is especially important in medicine, where large labelled datasets are often scarce. Here, we developed a foundation model for cancer imaging biomarker discovery by training a convolutional encoder through self-supervised learning using a comprehensive dataset of 11,467 radiographic lesions. The foundation model was evaluated in distinct and clinically relevant applications of cancer imaging-based biomarkers. We found that it facilitated better and more efficient learning of imaging biomarkers and yielded task-specific models that significantly outperformed conventional supervised and other state-of-the-art pretrained implementations on downstream tasks, especially when training dataset sizes were very limited. Furthermore, the foundation model was more stable to input variations and showed strong associations with underlying biology. Our results demonstrate the tremendous potential of foundation models in discovering new imaging biomarkers that may extend to other clinical use cases and can accelerate the widespread translation of imaging biomarkers into clinical settings.
Affiliation(s)
- Suraj Pai: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands; Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Dennis Bontempi: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands; Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Ibrahim Hadzic: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands; Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Vasco Prudente: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands; Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Mateo Sokač: Department of Molecular Medicine, Aarhus University Hospital, Aarhus, Denmark; Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Tafadzwa L. Chaunzwa: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Simon Bernatz: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Ahmed Hosny: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Raymond H. Mak: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands
- Nicolai J. Birkbak: Department of Molecular Medicine, Aarhus University Hospital, Aarhus, Denmark; Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Hugo J. W. L. Aerts: Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA, USA; Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands; Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA; Department of Radiology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
14. Chen RJ, Ding T, Lu MY, Williamson DFK, Jaume G, Song AH, Chen B, Zhang A, Shao D, Shaban M, Williams M, Oldenburg L, Weishaupt LL, Wang JJ, Vaidya A, Le LP, Gerber G, Sahai S, Williams W, Mahmood F. Towards a general-purpose foundation model for computational pathology. Nat Med 2024;30:850-862. PMID: 38504018; DOI: 10.1038/s41591-024-02857-3.
Abstract
Quantitative evaluation of tissue images is crucial for computational pathology (CPath) tasks, requiring the objective characterization of histopathological entities from whole-slide images (WSIs). The high resolution of WSIs and the variability of morphological features present significant challenges, complicating the large-scale annotation of data for high-performance applications. To address this challenge, current efforts have proposed the use of pretrained image encoders through transfer learning from natural image datasets or self-supervised learning on publicly available histopathology datasets, but have not been extensively developed and evaluated across diverse tissue types at scale. We introduce UNI, a general-purpose self-supervised model for pathology, pretrained using more than 100 million images from over 100,000 diagnostic H&E-stained WSIs (>77 TB of data) across 20 major tissue types. The model was evaluated on 34 representative CPath tasks of varying diagnostic difficulty. In addition to outperforming previous state-of-the-art models, we demonstrate new modeling capabilities in CPath such as resolution-agnostic tissue classification, slide classification using few-shot class prototypes, and disease subtyping generalization in classifying up to 108 cancer types in the OncoTree classification system. UNI advances unsupervised representation learning at scale in CPath in terms of both pretraining data and downstream evaluation, enabling data-efficient artificial intelligence models that can generalize and transfer to a wide range of diagnostically challenging tasks and clinical workflows in anatomic pathology.
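Few-shot classification with class prototypes, one of the capabilities reported here, amounts to nearest-class-mean classification in embedding space: average the embeddings of a few labelled slides per class and assign each new slide to the closest prototype. A minimal sketch with random vectors standing in for UNI features:

```python
import torch
import torch.nn.functional as F

n_classes, shots, dim = 3, 5, 256
support = F.normalize(torch.randn(n_classes, shots, dim), dim=-1)  # few labelled slides per class
prototypes = F.normalize(support.mean(dim=1), dim=-1)              # one mean embedding per class

queries = F.normalize(torch.randn(10, dim), dim=-1)                # new, unlabelled slides
pred = (queries @ prototypes.t()).argmax(dim=1)                    # nearest-prototype class
print(pred)
```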
Affiliation(s)
- Richard J Chen: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Tong Ding: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Harvard John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
- Ming Y Lu: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA; Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
- Drew F K Williamson: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Guillaume Jaume: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Andrew H Song: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Bowen Chen: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Andrew Zhang: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA; Health Sciences and Technology, Harvard-MIT, Cambridge, MA, USA
- Daniel Shao: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA; Health Sciences and Technology, Harvard-MIT, Cambridge, MA, USA
- Muhammad Shaban: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Mane Williams: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Lukas Oldenburg: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Luca L Weishaupt: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA; Health Sciences and Technology, Harvard-MIT, Cambridge, MA, USA
- Judy J Wang: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Anurag Vaidya: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA; Health Sciences and Technology, Harvard-MIT, Cambridge, MA, USA
- Long Phi Le: Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Health Sciences and Technology, Harvard-MIT, Cambridge, MA, USA
- Georg Gerber: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Sharifa Sahai: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA; Department of Systems Biology, Harvard University, Cambridge, MA, USA
- Walt Williams: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Harvard John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
- Faisal Mahmood: Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA; Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA
15. Ektefaie Y, Shen A, Bykova D, Marin M, Zitnik M, Farhat M. Evaluating generalizability of artificial intelligence models for molecular datasets. bioRxiv [Preprint] 2024. PMID: 38464295; PMCID: PMC10925170; DOI: 10.1101/2024.02.25.581982.
Abstract
Deep learning has made rapid advances in modeling molecular sequencing data. Despite achieving high performance on benchmarks, it remains unclear to what extent deep learning models learn general principles and generalize to previously unseen sequences. Benchmarks traditionally interrogate model generalizability by generating metadata-based (MB) or sequence-similarity-based (SB) train and test splits of input data before assessing model performance. Here, we show that this approach mischaracterizes model generalizability by failing to consider the full spectrum of cross-split overlap, i.e., similarity between train and test splits. We introduce Spectra, a spectral framework for comprehensive model evaluation. For a given model and input data, Spectra plots model performance as a function of decreasing cross-split overlap and reports the area under this curve as a measure of generalizability. We apply Spectra to 18 sequencing datasets with associated phenotypes ranging from antibiotic resistance in tuberculosis to protein-ligand binding to evaluate the generalizability of 19 state-of-the-art deep learning models, including large language models, graph neural networks, diffusion models, and convolutional neural networks. We show that SB and MB splits provide an incomplete assessment of model generalizability. With Spectra, we find that as cross-split overlap decreases, deep learning models consistently exhibit a reduction in performance in a task- and model-dependent manner. Although no model consistently achieved the highest performance across all tasks, we show that deep learning models can generalize to previously unseen sequences on specific tasks. Spectra paves the way toward a better understanding of how foundation models generalize in biology.
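The Spectra procedure can be pictured as a curve: score the model on test splits binned by decreasing cross-split overlap, then integrate. A toy sketch with synthetic overlap values and scores:

```python
import numpy as np

overlap = np.array([1.0, 0.8, 0.6, 0.4, 0.2, 0.0])      # decreasing cross-split overlap
score = np.array([0.95, 0.90, 0.82, 0.70, 0.61, 0.55])  # model performance per overlap bin

# Integrate performance over overlap in [0, 1]; a flatter curve (larger area)
# indicates a model whose performance survives lower train-test similarity.
area = np.trapz(score[::-1], overlap[::-1])
print(f"area under the performance-vs-overlap curve: {area:.3f}")   # ~0.756 here
```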
Affiliation(s)
- Yasha Ektefaie: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Andrew Shen: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Department of Computer Science, Northwestern University, Evanston, IL, USA
- Daria Bykova: Department of Biological Sciences, Columbia University, New York, NY, USA
- Maximillian Marin: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Marinka Zitnik: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard Data Science Initiative, Cambridge, MA, USA
- Maha Farhat: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Division of Pulmonary and Critical Care, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
16. Korot E, Gonçalves MB, Huemer J, Beqiri S, Khalid H, Kelly M, Chia M, Mathijs E, Struyven R, Moussa M, Keane PA. Clinician-Driven AI: Code-Free Self-Training on Public Data for Diabetic Retinopathy Referral. JAMA Ophthalmol 2023;141:1029-1036. PMID: 37856110; PMCID: PMC10587830; DOI: 10.1001/jamaophthalmol.2023.4508.
Abstract
Importance: Democratizing artificial intelligence (AI) enables model development by clinicians with a lack of coding expertise, powerful computing resources, and large, well-labeled data sets.
Objective: To determine whether resource-constrained clinicians can use self-training via automated machine learning (ML) and public data sets to design high-performing diabetic retinopathy classification models.
Design, Setting, and Participants: This diagnostic quality improvement study was conducted from January 1, 2021, to December 31, 2021. A self-training method without coding was used on 2 public data sets with retinal images from patients in France (Messidor-2 [n = 1748]) and the UK and US (EyePACS [n = 58,689]) and externally validated on 1 data set with retinal images from patients of a private Egyptian medical retina clinic (Egypt [n = 210]). An AI model was trained to classify referable diabetic retinopathy as an exemplar use case. Messidor-2 images were assigned adjudicated labels available on Kaggle; 4 images were deemed ungradable and excluded, leaving 1744 images. A total of 300 images randomly selected from the EyePACS data set were independently relabeled by 3 blinded retina specialists using the International Classification of Diabetic Retinopathy protocol for diabetic retinopathy grade and diabetic macular edema presence; 19 images were deemed ungradable, leaving 281 images. Data analysis was performed from February 1 to February 28, 2021.
Exposures: Using public data sets, a teacher model was trained with labeled images using supervised learning. Next, the resulting predictions, termed pseudolabels, were used on an unlabeled public data set. Finally, a student model was trained with the existing labeled images and the additional pseudolabeled images.
Main Outcomes and Measures: The analyzed metrics for the models included the area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, specificity, and F1 score. The Fisher exact test was performed, and 2-tailed P values were calculated for failure case analysis.
Results: For the internal validation data sets, AUROC values for performance ranged from 0.886 to 0.939 for the teacher model and from 0.916 to 0.951 for the student model. For external validation of automated ML model performance, AUROC values and accuracy were 0.964 and 93.3% for the teacher model, 0.950 and 96.7% for the student model, and 0.890 and 94.3% for the manually coded bespoke model, respectively.
Conclusions and Relevance: These findings suggest that self-training using automated ML is an effective method to increase both model performance and generalizability while decreasing the need for costly expert labeling. This approach advances the democratization of AI by enabling clinicians without coding expertise or access to large, well-labeled private data sets to develop their own AI models.
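The Exposures section describes a standard self-training loop: fit a teacher on labelled data, pseudolabel an unlabelled pool, then fit a student on the union. A minimal sketch with synthetic features, logistic regression standing in for the automated ML models, and an assumed confidence threshold:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_lab = rng.normal(size=(200, 10))
y_lab = (X_lab[:, 0] > 0).astype(int)                  # synthetic labelled set
X_unlab = rng.normal(size=(1000, 10))                  # e.g., EyePACS without labels

teacher = LogisticRegression().fit(X_lab, y_lab)       # stage 1: teacher
proba = teacher.predict_proba(X_unlab)[:, 1]
keep = (proba < 0.1) | (proba > 0.9)                   # confidence filter (threshold assumed)
pseudo_y = (proba[keep] > 0.5).astype(int)             # stage 2: pseudolabels

student = LogisticRegression().fit(                    # stage 3: student on the union
    np.vstack([X_lab, X_unlab[keep]]),
    np.concatenate([y_lab, pseudo_y]))
```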
Affiliation(s)
- Edward Korot: Retina Specialists of Michigan, Grand Rapids; Moorfields Eye Hospital, London, United Kingdom; Stanford University Byers Eye Institute, Palo Alto, California
- Mariana Batista Gonçalves: Moorfields Eye Hospital, London, United Kingdom; Federal University of Sao Paulo, Sao Paulo, Brazil; Instituto da Visão, Sao Paulo, Brazil
- Sara Beqiri: Moorfields Eye Hospital, London, United Kingdom; University College London Medical School, London, United Kingdom
- Hagar Khalid: Moorfields Eye Hospital, London, United Kingdom; Ophthalmology Department, Faculty of Medicine, Tanta University Hospital, Tanta, Gharbia, Egypt
- Madeline Kelly: Moorfields Eye Hospital, London, United Kingdom; University College London Medical School, London, United Kingdom; UCL Centre for Medical Image Computing, London, United Kingdom
- Mark Chia: Moorfields Eye Hospital, London, United Kingdom
- Emily Mathijs: Michigan State University College of Osteopathic Medicine, East Lansing
- Magdy Moussa: Ophthalmology Department, Faculty of Medicine, Tanta University Hospital, Tanta, Gharbia, Egypt
|
17
|
Zhou Y, Chia MA, Wagner SK, Ayhan MS, Williamson DJ, Struyven RR, Liu T, Xu M, Lozano MG, Woodward-Court P, Kihara Y, Altmann A, Lee AY, Topol EJ, Denniston AK, Alexander DC, Keane PA. A foundation model for generalizable disease detection from retinal images. Nature 2023; 622:156-163. [PMID: 37704728 PMCID: PMC10550819 DOI: 10.1038/s41586-023-06555-x] [Citation(s) in RCA: 52] [Impact Index Per Article: 52.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2022] [Accepted: 08/18/2023] [Indexed: 09/15/2023]
Abstract
Medical artificial intelligence (AI) offers great potential for recognizing signs of health conditions in retinal images and expediting the diagnosis of eye diseases and systemic disorders. However, the development of AI models requires substantial annotation and models are usually task-specific with limited generalizability to different clinical applications. Here, we present RETFound, a foundation model for retinal images that learns generalizable representations from unlabelled retinal images and provides a basis for label-efficient model adaptation in several applications. Specifically, RETFound is trained on 1.6 million unlabelled retinal images by means of self-supervised learning and then adapted to disease detection tasks with explicit labels. We show that adapted RETFound consistently outperforms several comparison models in the diagnosis and prognosis of sight-threatening eye diseases, as well as incident prediction of complex systemic disorders such as heart failure and myocardial infarction, with fewer labelled data. RETFound provides a generalizable solution to improve model performance and alleviate the annotation workload of experts to enable broad clinical AI applications from retinal imaging.
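As a concrete illustration of the label-efficient adaptation that RETFound enables, the sketch below freezes a generically pretrained encoder and fits only a small classification head on a handful of labelled examples. Note the assumptions: RETFound itself is a vision transformer pretrained on retinal images with masked autoencoding, whereas this sketch substitutes a torchvision ResNet-18 backbone and random tensors purely for illustration.

```python
# A minimal sketch of adapting a pretrained encoder with few labels:
# freeze the backbone and train only a small task-specific head.
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

# Pretrained encoder with its original classification head removed.
backbone = resnet18(weights=ResNet18_Weights.DEFAULT)
backbone.fc = nn.Identity()
for p in backbone.parameters():          # freeze: only the head is trained
    p.requires_grad = False
backbone.eval()

head = nn.Linear(512, 2)                 # e.g., disease vs. no disease

# Tiny labelled set standing in for fundus photographs.
images = torch.randn(16, 3, 224, 224)
labels = torch.randint(0, 2, (16,))

opt = torch.optim.AdamW(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):
    with torch.no_grad():
        feats = backbone(images)         # (16, 512) frozen representations
    loss = loss_fn(head(feats), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```

Freezing the backbone (a linear probe) is the cheapest adaptation route; full fine-tuning, as evaluated in the paper, simply leaves the encoder parameters trainable as well.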
Affiliation(s)
- Yukun Zhou
- Centre for Medical Image Computing, University College London, London, UK.
- NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK.
- Department of Medical Physics and Biomedical Engineering, University College London, London, UK.
| | - Mark A Chia
- NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Institute of Ophthalmology, University College London, London, UK
| | - Siegfried K Wagner
- NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Institute of Ophthalmology, University College London, London, UK
| | - Murat S Ayhan
- Centre for Medical Image Computing, University College London, London, UK
- NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Institute of Ophthalmology, University College London, London, UK
| | - Dominic J Williamson
- Centre for Medical Image Computing, University College London, London, UK
- NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Institute of Ophthalmology, University College London, London, UK
| | - Robbert R Struyven
- Centre for Medical Image Computing, University College London, London, UK
- NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Institute of Ophthalmology, University College London, London, UK
| | - Timing Liu
- NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK
| | - Moucheng Xu
- Centre for Medical Image Computing, University College London, London, UK
- Department of Medical Physics and Biomedical Engineering, University College London, London, UK
| | - Mateo G Lozano
- NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Department of Computer Science, University of Coruña, A Coruña, Spain
| | - Peter Woodward-Court
- Centre for Medical Image Computing, University College London, London, UK
- NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Institute of Health Informatics, University College London, London, UK
| | - Yuka Kihara
- Department of Ophthalmology, University of Washington, Seattle, WA, USA
- Roger and Angie Karalis Johnson Retina Center, University of Washington, Seattle, WA, USA
| | - Andre Altmann
- Centre for Medical Image Computing, University College London, London, UK
- Department of Medical Physics and Biomedical Engineering, University College London, London, UK
| | - Aaron Y Lee
- Department of Ophthalmology, University of Washington, Seattle, WA, USA
- Roger and Angie Karalis Johnson Retina Center, University of Washington, Seattle, WA, USA
| | - Eric J Topol
- Department of Molecular Medicine, Scripps Research, La Jolla, CA, USA
| | - Alastair K Denniston
- Academic Unit of Ophthalmology, University of Birmingham, Birmingham, UK
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
| | - Daniel C Alexander
- Centre for Medical Image Computing, University College London, London, UK
- Department of Computer Science, University College London, London, UK
| | - Pearse A Keane
- NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK.
- Institute of Ophthalmology, University College London, London, UK.
| |
|
18
|
Lenharo M. An AI revolution is brewing in medicine. What will it look like? Nature 2023; 622:686-688. [PMID: 37875622 DOI: 10.1038/d41586-023-03302-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2023]
|
19
|
Alammar Z, Alzubaidi L, Zhang J, Li Y, Lafta W, Gu Y. Deep Transfer Learning with Enhanced Feature Fusion for Detection of Abnormalities in X-ray Images. Cancers (Basel) 2023; 15:4007. [PMID: 37568821 PMCID: PMC10417687 DOI: 10.3390/cancers15154007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2023] [Revised: 07/29/2023] [Accepted: 08/05/2023] [Indexed: 08/13/2023] Open
Abstract
Medical image classification poses significant challenges in real-world scenarios. One major obstacle is the scarcity of labelled training data, which hampers both the performance and the generalisation of image-classification algorithms. Deep learning (DL) has shown remarkable performance in this domain, but it typically requires a large amount of labelled data to achieve optimal results, and gathering sufficient labelled data is often difficult and time-consuming in medicine. Transfer learning (TL) has played a pivotal role in reducing the time, cost, and volume of labelled images required. This paper presents a novel TL approach that aims to overcome the limitations of conventional TL from ImageNet, a dataset that belongs to a different domain than medical imaging. The proposed approach involves pretraining DL models on numerous medical images similar to the target dataset, then fine-tuning these models on a small set of annotated medical images to leverage the knowledge gained during pretraining. We focused specifically on medical X-ray imaging scenarios involving the humerus and wrist from the musculoskeletal radiographs (MURA) dataset, both of which pose significant classification challenges. The models trained with the proposed TL were used to extract features, which were subsequently fused to train several machine learning (ML) classifiers, combining diverse features into a comprehensive representation of the relevant characteristics. Through extensive evaluation, the proposed TL and feature-fusion approach with ML classifiers achieved strong results: for humerus classification, an accuracy of 87.85%, an F1-score of 87.63%, and a Cohen's kappa coefficient of 75.69%; for wrist classification, an accuracy of 85.58%, an F1-score of 82.70%, and a Cohen's kappa coefficient of 70.46%. Models trained with the proposed TL approach outperformed those trained with ImageNet TL, a finding further supported by visualisation with gradient-weighted class activation mapping (Grad-CAM) and local interpretable model-agnostic explanations (LIME). The proposed TL approach also exhibited greater robustness than ImageNet TL across a range of experiments. Importantly, neither the proposed TL approach nor the feature-fusion technique is limited to specific tasks; both can be applied to various medical image applications, extending their utility and potential impact. To demonstrate reusability, the approach was also applied to a computed tomography (CT) case, where it again yielded improved results.
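The fusion step described above — extract features from several pretrained models, concatenate them, and train a classical ML classifier on the fused representation — can be sketched as follows. The PCA "extractors" and synthetic data are stand-ins for the fine-tuned deep networks and MURA radiographs used in the paper; only the pipeline shape is faithful.

```python
# Minimal sketch of feature fusion for a downstream ML classifier:
# two extractors produce features that are concatenated before training.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score, cohen_kappa_score

X, y = make_classification(n_samples=1000, n_features=256, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Two "feature extractors" fit on different views of the data; in the paper
# these would be two differently pretrained, fine-tuned deep networks.
ext_a = PCA(n_components=32).fit(X_tr[:, :128])
ext_b = PCA(n_components=32).fit(X_tr[:, 128:])

def fuse(X):
    # Concatenate both feature sets into one fused representation.
    return np.hstack([ext_a.transform(X[:, :128]), ext_b.transform(X[:, 128:])])

clf = SVC(kernel="rbf").fit(fuse(X_tr), y_tr)
pred = clf.predict(fuse(X_te))
print("accuracy:", accuracy_score(y_te, pred))
print("F1:", f1_score(y_te, pred))
print("Cohen's kappa:", cohen_kappa_score(y_te, pred))
```

Swapping the PCA stand-ins for the penultimate-layer activations of two fine-tuned CNNs recovers the shape of the paper's actual pipeline, with accuracy, F1, and Cohen's kappa as the reported metrics.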
Affiliation(s)
- Zaenab Alammar
- School of Computer Science, Queensland University of Technology, Brisbane, QLD 4000, Australia; (J.Z.); (Y.L.)
- Centre for Data Science, Queensland University of Technology, Brisbane, QLD 4000, Australia
| | - Laith Alzubaidi
- Centre for Data Science, Queensland University of Technology, Brisbane, QLD 4000, Australia
- School of Mechanical, Medical and Process Engineering, Queensland University of Technology, Brisbane, QLD 4000, Australia;
- ARC Industrial Transformation Training Centre-Joint Biomechanics, Queensland University of Technology, Brisbane, QLD 4000, Australia
| | - Jinglan Zhang
- School of Computer Science, Queensland University of Technology, Brisbane, QLD 4000, Australia; (J.Z.); (Y.L.)
- Centre for Data Science, Queensland University of Technology, Brisbane, QLD 4000, Australia
| | - Yuefeng Li
- School of Computer Science, Queensland University of Technology, Brisbane, QLD 4000, Australia; (J.Z.); (Y.L.)
| | | | - Yuantong Gu
- School of Mechanical, Medical and Process Engineering, Queensland University of Technology, Brisbane, QLD 4000, Australia;
- ARC Industrial Transformation Training Centre-Joint Biomechanics, Queensland University of Technology, Brisbane, QLD 4000, Australia
| |
|