1
Zheng S, Cui X, Sun Y, Li J, Li H, Zhang Y, Chen P, Jing X, Ye Z, Yang L. Benchmarking PathCLIP for Pathology Image Analysis. J Imaging Inform Med 2024. [PMID: 38980627] [DOI: 10.1007/s10278-024-01128-4]
Abstract
Accurate image classification and retrieval are important for clinical diagnosis and treatment decision-making. The recent contrastive language-image pre-training (CLIP) model has shown remarkable proficiency in understanding natural images. Drawing inspiration from CLIP, the pathology-dedicated PathCLIP has been developed, trained on over 200,000 image-text pairs. While the performance of PathCLIP is impressive, its robustness under a wide range of image corruptions remains unknown. We therefore conduct an extensive evaluation of PathCLIP on corrupted images from the osteosarcoma and WSSS4LUAD datasets. In our experiments, we introduce eleven corruption types (brightness, contrast, defocus, resolution, saturation, hue, markup, deformation, incompleteness, rotation, and flipping) at various settings. We find that PathCLIP surpasses OpenAI-CLIP and the pathology language-image pre-training (PLIP) model in zero-shot classification, and that it is relatively robust to corruptions of contrast, saturation, incompleteness, and orientation. Among the eleven corruptions, hue, markup, deformation, defocus, and resolution can cause relatively severe performance fluctuations in PathCLIP, indicating that image quality should be ensured before clinical use. Additionally, we assess the robustness of PathCLIP in image-to-image retrieval, revealing that PathCLIP performs less effectively than PLIP on osteosarcoma but better on WSSS4LUAD under diverse corruptions. Overall, PathCLIP presents impressive zero-shot classification and retrieval performance on pathology images, but appropriate care is needed when using it.
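The corruption battery described in this abstract can be approximated with standard image libraries. Below is a minimal sketch, assuming Pillow, of how several of the listed corruption types might be generated before scoring a CLIP-style model; the `corrupt` helper and its severity scales are illustrative, not the authors' released code.

```python
# Sketch: generating several of the corruption types named above with Pillow.
# Severity scales are illustrative; the paper's exact settings may differ.
from PIL import Image, ImageEnhance, ImageFilter

def corrupt(img: Image.Image, kind: str, severity: float) -> Image.Image:
    """Apply one corruption type at a given severity to a pathology patch."""
    if kind == "brightness":
        return ImageEnhance.Brightness(img).enhance(severity)
    if kind == "contrast":
        return ImageEnhance.Contrast(img).enhance(severity)
    if kind == "saturation":
        return ImageEnhance.Color(img).enhance(severity)
    if kind == "defocus":
        return img.filter(ImageFilter.GaussianBlur(radius=severity))
    if kind == "resolution":
        # Downsample then upsample to simulate a lower scanning resolution.
        w, h = img.size
        small = img.resize((max(1, int(w / severity)), max(1, int(h / severity))))
        return small.resize((w, h))
    if kind == "rotation":
        return img.rotate(severity)
    raise ValueError(f"unknown corruption: {kind}")
```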
Affiliation(s)
- Sunyi Zheng
  - Tianjin's Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Department of Radiology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin, China
  - School of Engineering, Westlake University, Hangzhou, China
- Xiaonan Cui
  - Tianjin's Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Department of Radiology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin, China
- Yuxuan Sun
  - College of Computer Science and Technology, Zhejiang University, Hangzhou, China
- Jingxiong Li
  - College of Computer Science and Technology, Zhejiang University, Hangzhou, China
- Honglin Li
  - College of Computer Science and Technology, Zhejiang University, Hangzhou, China
- Yunlong Zhang
  - College of Computer Science and Technology, Zhejiang University, Hangzhou, China
- Pingyi Chen
  - College of Computer Science and Technology, Zhejiang University, Hangzhou, China
- Xueping Jing
  - Department of Radiation Oncology, University Medical Center of Groningen, Groningen, The Netherlands
- Zhaoxiang Ye
  - Tianjin's Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Department of Radiology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin, China
- Lin Yang
  - School of Engineering, Westlake University, Hangzhou, China
2
Li Y, El Habib Daho M, Conze PH, Zeghlache R, Le Boité H, Tadayoni R, Cochener B, Lamard M, Quellec G. A review of deep learning-based information fusion techniques for multimodal medical image classification. Comput Biol Med 2024; 177:108635. [PMID: 38796881] [DOI: 10.1016/j.compbiomed.2024.108635]
Abstract
Multimodal medical imaging plays a pivotal role in clinical diagnosis and research, as it combines information from various imaging modalities to provide a more comprehensive understanding of the underlying pathology. Recently, deep learning-based multimodal fusion techniques have emerged as powerful tools for improving medical image classification. This review offers a thorough analysis of developments in deep learning-based multimodal fusion for medical classification tasks. We explore the complementary relationships among prevalent clinical modalities and outline three main fusion schemes for multimodal classification networks: input fusion, intermediate fusion (encompassing single-level fusion, hierarchical fusion, and attention-based fusion), and output fusion. By evaluating the performance of these fusion techniques, we provide insight into the suitability of different network architectures for various multimodal fusion scenarios and application domains. Furthermore, we delve into challenges related to network architecture selection, the handling of incomplete multimodal data, and the potential limitations of multimodal fusion. Finally, we spotlight the promising future of Transformer-based multimodal fusion techniques and give recommendations for future research in this rapidly evolving field.
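As a toy illustration of the fusion taxonomy described above, the sketch below implements a single-level intermediate-fusion classifier for two modalities in PyTorch; the encoders and dimensions are placeholders, not an architecture from the review.

```python
# Toy intermediate fusion: encode each modality separately, concatenate the
# features, then classify. Input fusion would concatenate raw inputs before a
# single encoder; output fusion would combine per-modality predictions instead.
import torch
import torch.nn as nn

class IntermediateFusion(nn.Module):
    def __init__(self, dim_a: int, dim_b: int, n_classes: int, hidden: int = 64):
        super().__init__()
        self.enc_a = nn.Sequential(nn.Linear(dim_a, hidden), nn.ReLU())
        self.enc_b = nn.Sequential(nn.Linear(dim_b, hidden), nn.ReLU())
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, xa: torch.Tensor, xb: torch.Tensor) -> torch.Tensor:
        return self.head(torch.cat([self.enc_a(xa), self.enc_b(xb)], dim=-1))

logits = IntermediateFusion(32, 16, 3)(torch.randn(4, 32), torch.randn(4, 16))
```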
Affiliation(s)
- Yihao Li
  - LaTIM UMR 1101, Inserm, Brest, France; University of Western Brittany, Brest, France
- Mostafa El Habib Daho
  - LaTIM UMR 1101, Inserm, Brest, France; University of Western Brittany, Brest, France
- Rachid Zeghlache
  - LaTIM UMR 1101, Inserm, Brest, France; University of Western Brittany, Brest, France
- Hugo Le Boité
  - Sorbonne University, Paris, France; Ophthalmology Department, Lariboisière Hospital, AP-HP, Paris, France
- Ramin Tadayoni
  - Ophthalmology Department, Lariboisière Hospital, AP-HP, Paris, France; Paris Cité University, Paris, France
- Béatrice Cochener
  - LaTIM UMR 1101, Inserm, Brest, France; University of Western Brittany, Brest, France; Ophthalmology Department, CHRU Brest, Brest, France
- Mathieu Lamard
  - LaTIM UMR 1101, Inserm, Brest, France; University of Western Brittany, Brest, France
3
Gui H, Omiye JA, Chang CT, Daneshjou R. The Promises and Perils of Foundation Models in Dermatology. J Invest Dermatol 2024; 144:1440-1448. [PMID: 38441507] [DOI: 10.1016/j.jid.2023.12.019]
Abstract
Foundation models (FMs), which are large-scale artificial intelligence (AI) models that can complete a range of tasks, represent a paradigm shift in AI. These versatile models encompass large language models, vision-language models, and multimodal models. Although these models are often trained for broad tasks, they have been applied, either out of the box or after additional fine-tuning, to tasks in medicine, including dermatology. From addressing administrative tasks to answering dermatology questions, these models are poised to have an impact on dermatology care delivery. As FMs become more ubiquitous in health care, it is important for clinicians and dermatologists to have a basic understanding of how these models are developed, what they are capable of, and what pitfalls exist. In this paper, we present a comprehensive yet accessible overview of the current state of FMs, summarize their current applications in dermatology, highlight their limitations, and discuss future developments in the field.
Affiliation(s)
- Haiwen Gui
  - Department of Dermatology, Stanford University, Stanford, California, USA
- Jesutofunmi A Omiye
  - Department of Dermatology, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, California, USA
- Crystal T Chang
  - Department of Dermatology, Stanford University, Stanford, California, USA
  - Clinical Excellence Research Center, School of Medicine, Stanford University, Palo Alto, California, USA
- Roxana Daneshjou
  - Department of Dermatology, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, California, USA
4
Tak D, Garomsa BA, Zapaishchykova A, Ye Z, Vajapeyam S, Mahootiha M, Climent Pardo JC, Smith C, Familiar AM, Chaunzwa T, Liu KX, Prabhu S, Bandopadhayay P, Nabavizadeh A, Mueller S, Aerts HJ, Haas-Kogan D, Poussaint TY, Kann BH. Longitudinal risk prediction for pediatric glioma with temporal deep learning. medRxiv [Preprint] 2024. [PMID: 38978642] [PMCID: PMC11230342] [DOI: 10.1101/2024.06.04.24308434]
Abstract
Pediatric glioma recurrence can cause morbidity and mortality; however, recurrence pattern and severity are heterogeneous and challenging to predict with established clinical and genomic markers. As a result, almost all children undergo frequent, long-term magnetic resonance (MR) brain surveillance regardless of individual recurrence risk. Deep learning analysis of longitudinal MR may be an effective approach for improving individualized recurrence prediction in gliomas and other cancers but has thus far been infeasible with current frameworks. Here, we propose temporal learning, a self-supervised, deep learning approach to longitudinal medical imaging analysis that models the spatiotemporal information from a patient's current and prior brain MRs to predict future recurrence. We apply temporal learning to pediatric glioma surveillance imaging for 715 patients (3,994 scans) from four distinct clinical settings. We find that longitudinal imaging analysis with temporal learning improves recurrence prediction performance by up to 41% compared with traditional approaches, with improvements in both low- and high-grade glioma. We also find that recurrence prediction accuracy increases incrementally with the number of historical scans available per patient. Temporal deep learning may enable point-of-care decision support for pediatric brain tumors and be adaptable more broadly to patients with other cancers and chronic diseases undergoing surveillance imaging.
5
Huang Z, Yang E, Shen J, Gratzinger D, Eyerer F, Liang B, Nirschl J, Bingham D, Dussaq AM, Kunder C, Rojansky R, Gilbert A, Chang-Graham AL, Howitt BE, Liu Y, Ryan EE, Tenney TB, Zhang X, Folkins A, Fox EJ, Montine KS, Montine TJ, Zou J. A pathologist-AI collaboration framework for enhancing diagnostic accuracies and efficiencies. Nat Biomed Eng 2024. [PMID: 38898173] [DOI: 10.1038/s41551-024-01223-5]
Abstract
In pathology, the deployment of artificial intelligence (AI) in clinical settings is constrained by limitations in data collection and in model transparency and interpretability. Here we describe a digital pathology framework, nuclei.io, that incorporates active learning and human-in-the-loop real-time feedback for the rapid creation of diverse datasets and models. We validate the effectiveness of the framework via two crossover user studies that leveraged collaboration between the AI and the pathologist, including the identification of plasma cells in endometrial biopsies and the detection of colorectal cancer metastasis in lymph nodes. In both studies, nuclei.io yielded considerable diagnostic performance improvements. Collaboration between clinicians and AI will aid digital pathology by enhancing accuracies and efficiencies.
Affiliation(s)
- Zhi Huang
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
- Eric Yang
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Jeanne Shen
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Dita Gratzinger
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Frederick Eyerer
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Brooke Liang
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Jeffrey Nirschl
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- David Bingham
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Alex M Dussaq
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Christian Kunder
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Rebecca Rojansky
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Aubre Gilbert
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Brooke E Howitt
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Ying Liu
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Emily E Ryan
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Troy B Tenney
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Xiaoming Zhang
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Ann Folkins
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Edward J Fox
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Kathleen S Montine
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Thomas J Montine
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- James Zou
  - Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
6
Wang FA, Zhuang Z, Gao F, He R, Zhang S, Wang L, Liu J, Li Y. TMO-Net: an explainable pretrained multi-omics model for multi-task learning in oncology. Genome Biol 2024; 25:149. [PMID: 38845006] [PMCID: PMC11157742] [DOI: 10.1186/s13059-024-03293-9]
Abstract
Cancer is a complex disease involving systemic alterations at multiple scales. In this study, we develop the Tumor Multi-Omics pre-trained Network (TMO-Net), which integrates multi-omics pan-cancer datasets for model pre-training, facilitating cross-omics interactions and enabling joint representation learning and incomplete-omics inference. The model enhances multi-omics sample representation and empowers various downstream oncology tasks with incomplete multi-omics datasets. By employing interpretable learning, we characterize the contributions of distinct omics features to clinical outcomes. TMO-Net thus serves as a versatile framework for cross-modal multi-omics learning in oncology, paving the way for tumor omics-specific foundation models.
Affiliation(s)
- Feng-Ao Wang
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310024, China
  - Guangzhou National Laboratory, Guangzhou, 510005, China
- Zhenfeng Zhuang
  - Department of Computer Science at the School of Informatics, Xiamen University, Xiamen, 361005, China
- Feng Gao
  - Department of Colorectal Surgery, The Sixth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, 510655, China
  - Shanghai Artificial Intelligence Laboratory, Shanghai, 200433, China
  - Biomedical Innovation Center, The Sixth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, 510655, China
- Ruikun He
  - BYHEALTH Institute of Nutrition & Health, Guangzhou, 510000, China
- Shaoting Zhang
  - Shanghai Artificial Intelligence Laboratory, Shanghai, 200433, China
- Liansheng Wang
  - Department of Computer Science at the School of Informatics, Xiamen University, Xiamen, 361005, China
- Junwei Liu
  - Guangzhou National Laboratory, Guangzhou, 510005, China
- Yixue Li
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310024, China
  - Guangzhou National Laboratory, Guangzhou, 510005, China
  - Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai, 200030, China
  - GZMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Medical University, Guangzhou, 511436, China
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
  - Collaborative Innovation Center for Genetics and Development, Fudan University, Shanghai, 200433, China
  - Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai, 200032, China
7
Liu J, Zhang Y, Wang K, Yavuz MC, Chen X, Yuan Y, Li H, Yang Y, Yuille A, Tang Y, Zhou Z. Universal and extensible language-vision models for organ segmentation and tumor detection from abdominal computed tomography. Med Image Anal 2024; 97:103226. [PMID: 38852215] [DOI: 10.1016/j.media.2024.103226]
Abstract
The advancement of artificial intelligence (AI) for organ segmentation and tumor detection is propelled by the growing availability of computed tomography (CT) datasets with detailed, per-voxel annotations. However, these AI models often struggle with flexibility for partially annotated datasets and extensibility for new classes due to limitations in the one-hot encoding, architectural design, and learning scheme. To overcome these limitations, we propose a universal, extensible framework enabling a single model, termed Universal Model, to deal with multiple public datasets and adapt to new classes (e.g., organs/tumors). Firstly, we introduce a novel language-driven parameter generator that leverages language embeddings from large language models, enriching semantic encoding compared with one-hot encoding. Secondly, the conventional output layers are replaced with lightweight, class-specific heads, allowing Universal Model to simultaneously segment 25 organs and six types of tumors and ease the addition of new classes. We train our Universal Model on 3410 CT volumes assembled from 14 publicly available datasets and then test it on 6173 CT volumes from four external datasets. Universal Model achieves first place on six CT tasks in the Medical Segmentation Decathlon (MSD) public leaderboard and leading performance on the Beyond The Cranial Vault (BTCV) dataset. In summary, Universal Model exhibits remarkable computational efficiency (6× faster than other dataset-specific models), demonstrates strong generalization across different hospitals, transfers well to numerous downstream tasks, and more importantly, facilitates the extensibility to new classes while alleviating the catastrophic forgetting of previously learned classes. Codes, models, and datasets are available at https://github.com/ljwztc/CLIP-Driven-Universal-Model.
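The language-driven parameter generator described above can be pictured as a text embedding being mapped to the weights of a per-class segmentation head. The following is a schematic sketch with illustrative dimensions, not the released implementation (which is available at the linked repository).

```python
# Schematic sketch: a class's text embedding generates the weights and bias of
# a lightweight 1x1-conv head that produces that class's segmentation mask.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClassHeadGenerator(nn.Module):
    def __init__(self, text_dim: int = 512, feat_dim: int = 48):
        super().__init__()
        self.feat_dim = feat_dim
        # Maps a text embedding to feat_dim conv weights plus one bias term.
        self.mlp = nn.Sequential(nn.Linear(text_dim, 256), nn.ReLU(),
                                 nn.Linear(256, feat_dim + 1))

    def forward(self, feats: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        # feats: (B, feat_dim, H, W) decoder features; text_emb: (text_dim,).
        params = self.mlp(text_emb)
        weight = params[: self.feat_dim].view(1, self.feat_dim, 1, 1)
        bias = params[self.feat_dim:]
        return torch.sigmoid(F.conv2d(feats, weight, bias))  # (B, 1, H, W)
```

Under this scheme, adding a new organ or tumor class amounts to encoding its name and generating another head, which is the extensibility argument made above.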
Affiliation(s)
- Jie Liu
  - City University of Hong Kong, Hong Kong
- Yixiao Zhang
  - Johns Hopkins University, United States of America
- Kang Wang
  - University of California, San Francisco, United States of America
- Mehmet Can Yavuz
  - University of California, San Francisco, United States of America
- Xiaoxi Chen
  - University of Illinois Urbana-Champaign, United States of America
- Yang Yang
  - University of California, San Francisco, United States of America
- Alan Yuille
  - Johns Hopkins University, United States of America
- Zongwei Zhou
  - Johns Hopkins University, United States of America
8
Carrillo-Larco RM, Bravo-Rocca G, Castillo-Cara M, Xu X, Bernabe-Ortiz A. A multimodal approach using fundus images and text meta-data in a machine learning classifier with embeddings to predict years with self-reported diabetes - An exploratory analysis. Prim Care Diabetes 2024; 18:327-332. [PMID: 38616442] [DOI: 10.1016/j.pcd.2024.04.002]
Abstract
AIMS: Machine learning models can use image and text data to predict the number of years since diabetes diagnosis; such a model could be applied to new patients to estimate how long a patient may have lived with diabetes unknowingly. We aimed to develop a model to predict self-reported diabetes duration.
METHODS: We used the Brazilian Multilabel Ophthalmological Dataset. The unit of analysis was the fundus image and its meta-data, regardless of the patient. We included people aged 40+ years and fundus images without diabetic retinopathy. Fundus images and meta-data (sex, age, comorbidities and insulin use) were passed to the MedCLIP model to extract embedding representations, which were passed to an Extra Trees classifier to predict 0-4, 5-9, 10-14 and 15+ years with self-reported diabetes.
RESULTS: There were 988 images from 563 people (mean age = 67 years; 64% were women). Overall, the F1 score was 57%. The group with 15+ years of self-reported diabetes had the highest precision (64%) and F1 score (63%), while the highest recall (69%) was observed in the group with 0-4 years. The proportion of correctly classified observations was 55% for 0-4 years, 51% for 5-9 years, 58% for 10-14 years, and 64% for 15+ years with self-reported diabetes.
CONCLUSIONS: The machine learning model had acceptable accuracy and F1 score, and correctly classified more than half of the patients according to diabetes duration. Using large foundation models to extract image and text embeddings appears to be a feasible and efficient approach to predicting years living with self-reported diabetes.
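The classification step described in METHODS maps directly onto scikit-learn. Below is a minimal sketch that assumes the MedCLIP embeddings have already been extracted into a feature matrix; the function name and split strategy are illustrative, not the authors' code.

```python
# Sketch: an Extra Trees classifier over precomputed image/meta-data embeddings,
# predicting binned self-reported diabetes duration (0-4, 5-9, 10-14, 15+ years).
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def fit_duration_classifier(X: np.ndarray, y: np.ndarray) -> ExtraTreesClassifier:
    """X: (n_images, embed_dim) embeddings; y: integer duration bins."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=0)
    clf = ExtraTreesClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
    print("weighted F1:", f1_score(y_te, clf.predict(X_te), average="weighted"))
    return clf
```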
Affiliation(s)
- Rodrigo M Carrillo-Larco
  - Hubert Department of Global Health, Rollins School of Public Health, Emory University, Atlanta, GA, USA
  - Emory Global Diabetes Research Center, Emory University, Atlanta, GA, USA
- Xiaolin Xu
  - School of Public Health, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - The Key Laboratory of Intelligent Preventive Medicine of Zhejiang Province, Hangzhou, China
  - School of Public Health, Faculty of Medicine, The University of Queensland, Brisbane, Australia
9
Xu H, Usuyama N, Bagga J, Zhang S, Rao R, Naumann T, Wong C, Gero Z, González J, Gu Y, Xu Y, Wei M, Wang W, Ma S, Wei F, Yang J, Li C, Gao J, Rosemon J, Bower T, Lee S, Weerasinghe R, Wright BJ, Robicsek A, Piening B, Bifulco C, Wang S, Poon H. A whole-slide foundation model for digital pathology from real-world data. Nature 2024; 630:181-188. [PMID: 38778098] [PMCID: PMC11153137] [DOI: 10.1038/s41586-024-07441-w]
Abstract
Digital pathology poses unique computational challenges, as a standard gigapixel slide may comprise tens of thousands of image tiles [1-3]. Prior models have often resorted to subsampling a small portion of tiles for each slide, thus missing the important slide-level context [4]. Here we present Prov-GigaPath, a whole-slide pathology foundation model pretrained on 1.3 billion 256 × 256 pathology image tiles in 171,189 whole slides from Providence, a large US health network comprising 28 cancer centres. The slides originated from more than 30,000 patients covering 31 major tissue types. To pretrain Prov-GigaPath, we propose GigaPath, a novel vision transformer architecture for pretraining gigapixel pathology slides. To scale GigaPath for slide-level learning with tens of thousands of image tiles, GigaPath adapts the newly developed LongNet [5] method to digital pathology. To evaluate Prov-GigaPath, we construct a digital pathology benchmark comprising 9 cancer subtyping tasks and 17 pathomics tasks, using both Providence and TCGA data [6]. With large-scale pretraining and ultra-large-context modelling, Prov-GigaPath attains state-of-the-art performance on 25 out of 26 tasks, with significant improvement over the second-best method on 18 tasks. We further demonstrate the potential of Prov-GigaPath on vision-language pretraining for pathology [7,8] by incorporating the pathology reports. In sum, Prov-GigaPath is an open-weight foundation model that achieves state-of-the-art performance on various digital pathology tasks, demonstrating the importance of real-world data and whole-slide modelling.
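The whole-slide tiling implied above is straightforward to sketch. The snippet below, assuming openslide-python, cuts a slide into 256 × 256 tiles at full resolution; tissue filtering and the LongNet-based encoder are omitted, and nothing here comes from the released GigaPath code.

```python
# Sketch: iterate over 256 x 256 tiles of a whole-slide image at level 0.
import numpy as np
from openslide import OpenSlide

def iter_tiles(slide_path: str, tile: int = 256):
    slide = OpenSlide(slide_path)
    width, height = slide.dimensions  # level-0 size in pixels
    for y in range(0, height - tile + 1, tile):
        for x in range(0, width - tile + 1, tile):
            region = slide.read_region((x, y), 0, (tile, tile)).convert("RGB")
            yield (x, y), np.asarray(region)
```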
Affiliation(s)
- Hanwen Xu
  - Microsoft Research, Redmond, WA, USA
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
- Yu Gu
  - Microsoft Research, Redmond, WA, USA
- Yanbo Xu
  - Microsoft Research, Redmond, WA, USA
- Mu Wei
  - Microsoft Research, Redmond, WA, USA
- Furu Wei
  - Microsoft Research, Redmond, WA, USA
- Soohee Lee
  - Providence Research Network, Renton, WA, USA
- Brian Piening
  - Providence Genomics, Portland, OR, USA
  - Earle A. Chiles Research Institute, Providence Cancer Institute, Portland, OR, USA
- Carlo Bifulco
  - Providence Genomics, Portland, OR, USA
  - Earle A. Chiles Research Institute, Providence Cancer Institute, Portland, OR, USA
- Sheng Wang
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
  - Department of Surgery, University of Washington, Seattle, WA, USA
10
Pais C, Liu J, Voigt R, Gupta V, Wade E, Bayati M. Large language models for preventing medication direction errors in online pharmacies. Nat Med 2024; 30:1574-1582. [PMID: 38664535] [PMCID: PMC11186789] [DOI: 10.1038/s41591-024-02933-8]
Abstract
Errors in pharmacy medication directions, such as incorrect instructions for dosage or frequency, can increase patient safety risk substantially by raising the chances of adverse drug events. This study explores how integrating domain knowledge with large language models (LLMs)-capable of sophisticated text interpretation and generation-can reduce these errors. We introduce MEDIC (medication direction copilot), a system that emulates the reasoning of pharmacists by prioritizing precise communication of core clinical components of a prescription, such as dosage and frequency. It fine-tunes a first-generation LLM using 1,000 expert-annotated and augmented directions from Amazon Pharmacy to extract the core components and assembles them into complete directions using pharmacy logic and safety guardrails. We compared MEDIC against two LLM-based benchmarks: one leveraging 1.5 million medication directions and the other using state-of-the-art LLMs. On 1,200 expert-reviewed prescriptions, the two benchmarks respectively recorded 1.51 (confidence interval (CI) 1.03, 2.31) and 4.38 (CI 3.13, 6.64) times more near-miss events-errors caught and corrected before reaching the patient-than MEDIC. Additionally, we tested MEDIC by deploying within the production system of an online pharmacy, and during this experimental period, it reduced near-miss events by 33% (CI 26%, 40%). This study shows that LLMs, with domain expertise and safeguards, improve the accuracy and efficiency of pharmacy operations.
Affiliation(s)
- Vin Gupta
  - Amazon, Seattle, WA, USA
  - Department of Health Metrics Sciences, University of Washington, Seattle, WA, USA
- Mohsen Bayati
  - Amazon, Seattle, WA, USA
  - Operations, Information and Technology at Graduate School of Business, Stanford University, Stanford, CA, USA
11
Christensen M, Vukadinovic M, Yuan N, Ouyang D. Vision-language foundation model for echocardiogram interpretation. Nat Med 2024; 30:1481-1488. [PMID: 38689062] [PMCID: PMC11108770] [DOI: 10.1038/s41591-024-02959-y]
Abstract
The development of robust artificial intelligence models for echocardiography has been limited by the availability of annotated clinical data. Here, to address this challenge and improve the performance of cardiac imaging models, we developed EchoCLIP, a vision-language foundation model for echocardiography that learns the relationship between cardiac ultrasound images and the interpretations of expert cardiologists across a wide range of patients and indications for imaging. After training on 1,032,975 cardiac ultrasound videos and corresponding expert text, EchoCLIP performs well on a diverse range of benchmarks for cardiac image interpretation, despite not having been explicitly trained for individual interpretation tasks. EchoCLIP can assess cardiac function (mean absolute error of 7.1% when predicting left ventricular ejection fraction in an external validation dataset) and identify implanted intracardiac devices (area under the curve (AUC) of 0.84, 0.92 and 0.97 for pacemakers, percutaneous mitral valve repair and artificial aortic valves, respectively). We also developed a long-context variant (EchoCLIP-R) using a custom tokenizer based on common echocardiography concepts. EchoCLIP-R accurately identified unique patients across multiple videos (AUC of 0.86), identified clinical transitions such as heart transplants (AUC of 0.79) and cardiac surgery (AUC of 0.77), and enabled robust image-to-text search (mean cross-modal retrieval rank in the top 1% of candidate text reports). These capabilities represent a substantial step toward understanding and applying foundation models in cardiovascular imaging for preliminary interpretation of echocardiographic findings.
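The cross-modal retrieval metric quoted above (rank of the true report among all candidates) reduces to a similarity sort. A minimal sketch, assuming paired, L2-normalized video and report embeddings:

```python
# Sketch: for each video embedding, rank all candidate report embeddings by
# cosine similarity and record where the paired (true) report lands.
import numpy as np

def mean_retrieval_rank(video_emb: np.ndarray, text_emb: np.ndarray) -> float:
    """video_emb, text_emb: (n, d) paired, L2-normalized embeddings."""
    sims = video_emb @ text_emb.T                      # (n, n) cosine similarities
    order = np.argsort(-sims, axis=1)                  # best-matching report first
    ranks = np.argmax(order == np.arange(len(sims))[:, None], axis=1)
    return float(ranks.mean() + 1)                     # 1-indexed mean rank
```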
Affiliation(s)
- Matthew Christensen
  - Department of Cardiology, Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Milos Vukadinovic
  - Department of Cardiology, Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
  - Department of Bioengineering, University of California Los Angeles, Los Angeles, CA, USA
- Neal Yuan
  - Department of Medicine, University of California San Francisco, San Francisco, CA, USA
  - Division of Cardiology, San Francisco Veterans Affairs Medical Center, San Francisco, CA, USA
- David Ouyang
  - Department of Cardiology, Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
  - Division of Artificial Intelligence in Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
12
Lotter W, Hassett MJ, Schultz N, Kehl KL, Van Allen EM, Cerami E. Artificial Intelligence in Oncology: Current Landscape, Challenges, and Future Directions. Cancer Discov 2024; 14:711-726. [PMID: 38597966] [PMCID: PMC11131133] [DOI: 10.1158/2159-8290.cd-23-1199]
Abstract
Artificial intelligence (AI) in oncology is advancing beyond algorithm development to integration into clinical practice. This review describes the current state of the field, with a specific focus on clinical integration. AI applications are structured according to cancer type and clinical domain, focusing on the four most common cancers and tasks of detection, diagnosis, and treatment. These applications encompass various data modalities, including imaging, genomics, and medical records. We conclude with a summary of existing challenges, evolving solutions, and potential future directions for the field.
SIGNIFICANCE: AI is increasingly being applied to all aspects of oncology, where several applications are maturing beyond research and development to direct clinical integration. This review summarizes the current state of the field through the lens of clinical translation along the clinical care continuum. Emerging areas are also highlighted, along with common challenges, evolving solutions, and potential future directions for the field.
Affiliation(s)
- William Lotter
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Pathology, Brigham and Women's Hospital, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
- Michael J. Hassett
  - Harvard Medical School, Boston, MA, USA
  - Division of Population Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Nikolaus Schultz
  - Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Kenneth L. Kehl
  - Harvard Medical School, Boston, MA, USA
  - Division of Population Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Eliezer M. Van Allen
  - Harvard Medical School, Boston, MA, USA
  - Division of Population Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Ethan Cerami
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
13
Varghese C, Harrison EM, O'Grady G, Topol EJ. Artificial intelligence in surgery. Nat Med 2024; 30:1257-1268. [PMID: 38740998] [DOI: 10.1038/s41591-024-02970-3]
Abstract
Artificial intelligence (AI) is rapidly emerging in healthcare, yet applications in surgery remain relatively nascent. Here we review the integration of AI in the field of surgery, centering our discussion on multifaceted improvements in surgical care in the preoperative, intraoperative and postoperative space. The emergence of foundation model architectures, wearable technologies and improving surgical data infrastructures is enabling rapid advances in AI interventions and utility. We discuss how maturing AI methods hold the potential to improve patient outcomes, facilitate surgical education and optimize surgical care. We review the current applications of deep learning approaches and outline a vision for future advances through multimodal foundation models.
Affiliation(s)
- Chris Varghese
  - Department of Surgery, University of Auckland, Auckland, New Zealand
- Ewen M Harrison
  - Centre for Medical Informatics, Usher Institute, University of Edinburgh, Edinburgh, UK
- Greg O'Grady
  - Department of Surgery, University of Auckland, Auckland, New Zealand
  - Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- Eric J Topol
  - Scripps Research Translational Institute, La Jolla, CA, USA
14
Lam K, Qiu J. Foundation models: the future of surgical artificial intelligence? Br J Surg 2024; 111:znae090. [PMID: 38650580] [DOI: 10.1093/bjs/znae090]
Affiliation(s)
- Kyle Lam
  - Department of Surgery and Cancer, Imperial College London, London, UK
- Jianing Qiu
  - Department of Biomedical Engineering, Chinese University of Hong Kong, Hong Kong SAR
15
Kim C, Gadgil SU, DeGrave AJ, Omiye JA, Cai ZR, Daneshjou R, Lee SI. Transparent medical image AI via an image-text foundation model grounded in medical literature. Nat Med 2024; 30:1154-1165. [PMID: 38627560] [DOI: 10.1038/s41591-024-02887-x]
Abstract
Building trustworthy and transparent image-based medical artificial intelligence (AI) systems requires the ability to interrogate data and models at all stages of the development pipeline, from training models to post-deployment monitoring. Ideally, the data and associated AI systems could be described using terms already familiar to physicians, but this requires medical datasets densely annotated with semantically meaningful concepts. Here, we present a foundation model approach, MONET (medical concept retriever), which learns to connect medical images with text and densely scores images on concept presence, enabling important tasks in medical AI development and deployment such as data auditing, model auditing and model interpretation. Dermatology provides a demanding use case for the versatility of MONET, owing to the heterogeneity in diseases, skin tones and imaging modalities. We trained MONET on 105,550 dermatological images paired with natural language descriptions from a large collection of medical literature. MONET can accurately annotate concepts across dermatology images, as verified by board-certified dermatologists, competitively with supervised models built on previously concept-annotated dermatology datasets of clinical images. We demonstrate how MONET enables AI transparency across the entire AI system development pipeline, from building inherently interpretable models to dataset and model auditing, including a case study dissecting the results of an AI clinical trial.
Affiliation(s)
- Chanwoo Kim
  - Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
- Soham U Gadgil
  - Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
- Alex J DeGrave
  - Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
  - Medical Scientist Training Program, University of Washington, Seattle, WA, USA
- Jesutofunmi A Omiye
  - Department of Dermatology, Stanford School of Medicine, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford School of Medicine, Stanford, CA, USA
- Zhuo Ran Cai
  - Program for Clinical Research and Technology, Stanford University, Stanford, CA, USA
- Roxana Daneshjou
  - Department of Dermatology, Stanford School of Medicine, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford School of Medicine, Stanford, CA, USA
- Su-In Lee
  - Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
16
Chen RJ, Ding T, Lu MY, Williamson DFK, Jaume G, Song AH, Chen B, Zhang A, Shao D, Shaban M, Williams M, Oldenburg L, Weishaupt LL, Wang JJ, Vaidya A, Le LP, Gerber G, Sahai S, Williams W, Mahmood F. Towards a general-purpose foundation model for computational pathology. Nat Med 2024; 30:850-862. [PMID: 38504018] [DOI: 10.1038/s41591-024-02857-3]
Abstract
Quantitative evaluation of tissue images is crucial for computational pathology (CPath) tasks, requiring the objective characterization of histopathological entities from whole-slide images (WSIs). The high resolution of WSIs and the variability of morphological features present significant challenges, complicating the large-scale annotation of data for high-performance applications. To address this challenge, current efforts have proposed the use of pretrained image encoders through transfer learning from natural image datasets or self-supervised learning on publicly available histopathology datasets, but have not been extensively developed and evaluated across diverse tissue types at scale. We introduce UNI, a general-purpose self-supervised model for pathology, pretrained using more than 100 million images from over 100,000 diagnostic H&E-stained WSIs (>77 TB of data) across 20 major tissue types. The model was evaluated on 34 representative CPath tasks of varying diagnostic difficulty. In addition to outperforming previous state-of-the-art models, we demonstrate new modeling capabilities in CPath such as resolution-agnostic tissue classification, slide classification using few-shot class prototypes, and disease subtyping generalization in classifying up to 108 cancer types in the OncoTree classification system. UNI advances unsupervised representation learning at scale in CPath in terms of both pretraining data and downstream evaluation, enabling data-efficient artificial intelligence models that can generalize and transfer to a wide range of diagnostically challenging tasks and clinical workflows in anatomic pathology.
Affiliation(s)
- Richard J Chen
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Tong Ding
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Harvard John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
- Ming Y Lu
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
  - Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
- Drew F K Williamson
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Guillaume Jaume
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Andrew H Song
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Bowen Chen
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Andrew Zhang
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
  - Health Sciences and Technology, Harvard-MIT, Cambridge, MA, USA
- Daniel Shao
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
  - Health Sciences and Technology, Harvard-MIT, Cambridge, MA, USA
- Muhammad Shaban
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Mane Williams
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Lukas Oldenburg
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Luca L Weishaupt
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
  - Health Sciences and Technology, Harvard-MIT, Cambridge, MA, USA
- Judy J Wang
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Anurag Vaidya
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
  - Health Sciences and Technology, Harvard-MIT, Cambridge, MA, USA
- Long Phi Le
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Health Sciences and Technology, Harvard-MIT, Cambridge, MA, USA
- Georg Gerber
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Sharifa Sahai
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Systems Biology, Harvard University, Cambridge, MA, USA
- Walt Williams
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Harvard John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
- Faisal Mahmood
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
  - Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA
17
Lu MY, Chen B, Williamson DFK, Chen RJ, Liang I, Ding T, Jaume G, Odintsov I, Le LP, Gerber G, Parwani AV, Zhang A, Mahmood F. A visual-language foundation model for computational pathology. Nat Med 2024; 30:863-874. [PMID: 38504017] [DOI: 10.1038/s41591-024-02856-4]
Abstract
The accelerated adoption of digital pathology and advances in deep learning have enabled the development of robust models for various pathology tasks across a diverse array of diseases and patient cohorts. However, model training is often difficult due to label scarcity in the medical domain, and a model's usage is limited by the specific task and disease for which it is trained. Additionally, most models in histopathology leverage only image data, a stark contrast to how humans teach each other and reason about histopathologic entities. We introduce CONtrastive learning from Captions for Histopathology (CONCH), a visual-language foundation model developed using diverse sources of histopathology images, biomedical text and, notably, over 1.17 million image-caption pairs through task-agnostic pretraining. Evaluated on a suite of 14 diverse benchmarks, CONCH can be transferred to a wide range of downstream tasks involving histopathology images and/or text, achieving state-of-the-art performance on histology image classification, segmentation, captioning, and text-to-image and image-to-text retrieval. CONCH represents a substantial leap over concurrent visual-language pretrained systems for histopathology, with the potential to directly facilitate a wide array of machine learning-based workflows requiring minimal or no further supervised fine-tuning.
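Zero-shot transfer in CLIP-style models such as this one amounts to comparing an image embedding against prompt embeddings for each class. The sketch below shows the generic recipe; `encode_text` is a hypothetical stand-in for a model's text encoder, and the prompt template is illustrative rather than the released CONCH API.

```python
# Generic CLIP-style zero-shot classification: the image goes to the class
# whose text-prompt embedding is most similar.
import numpy as np

def zero_shot_classify(image_emb: np.ndarray, class_names: list[str],
                       encode_text) -> str:
    """image_emb: (d,) L2-normalized; encode_text: str -> (d,) embedding."""
    prompts = [f"an H&E image of {name}" for name in class_names]
    text_embs = np.stack([encode_text(p) for p in prompts])  # (n_classes, d)
    text_embs /= np.linalg.norm(text_embs, axis=1, keepdims=True)
    return class_names[int(np.argmax(text_embs @ image_emb))]
```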
Affiliation(s)
- Ming Y Lu
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
  - Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
- Bowen Chen
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Drew F K Williamson
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Richard J Chen
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Ivy Liang
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Harvard John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
- Tong Ding
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Harvard John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
- Guillaume Jaume
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Igor Odintsov
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Long Phi Le
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Georg Gerber
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Anil V Parwani
  - Department of Pathology, Wexner Medical Center, Ohio State University, Columbus, OH, USA
- Andrew Zhang
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
  - Health Sciences and Technology, Harvard-MIT, Cambridge, MA, USA
- Faisal Mahmood
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
  - Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA
18
Maetzler W, Mirelman A, Pilotto A, Bhidayasiri R. Identifying Subtle Motor Deficits Before Parkinson's Disease is Diagnosed: What to Look for? J Parkinsons Dis 2024. [PMID: 38363620] [DOI: 10.3233/jpd-230350]
Abstract
Motor deficits typical of Parkinson's disease (PD), such as gait and balance disturbances, tremor, reduced arm swing and finger movement, and voice and breathing changes, are believed to manifest several years before clinical diagnosis. Here we describe the evidence for the presence and progression of motor deficits in this pre-diagnostic phase and offer suggestions for the design of future observational studies that can investigate them effectively and quantitatively. On the one hand, these future studies must detect such motor deficits with high sensitivity and specificity in cohorts that are as large as possible (potentially population-based). On the other hand, they must describe the progression of these motor deficits in the pre-diagnostic phase as accurately as possible, to support testing the effect of pharmacological and non-pharmacological interventions. Digital technologies and artificial intelligence can substantially accelerate this process.
Affiliation(s)
- Walter Maetzler
  - Department of Neurology, University Hospital Schleswig-Holstein and Kiel University, Kiel, Germany
- Anat Mirelman
  - Laboratory for Early Markers of Neurodegeneration, Neurological Institute, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
  - Sagol School of Neuroscience and Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Andrea Pilotto
  - Neurology Unit, Department of Clinical and Experimental Sciences, University of Brescia, Brescia, Italy
  - Laboratory of Digital Neurology and Biosensors, University of Brescia, Brescia, Italy
  - Neurology Unit, Department of Continuity of Care and Frailty, ASST Spedali Civili Brescia Hospital, Brescia, Italy
- Roongroj Bhidayasiri
  - Chulalongkorn Centre of Excellence for Parkinson's Disease & Related Disorders, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
  - The Academy of Science, The Royal Society of Thailand, Bangkok, Thailand
19
Jia Y, Liu J, Chen L, Zhao T, Wang Y. THItoGene: a deep learning method for predicting spatial transcriptomics from histological images. Brief Bioinform 2023;25:bbad464. PMID: 38145948. PMCID: PMC10749789. DOI: 10.1093/bib/bbad464.
Abstract
Spatial transcriptomics unveils the complex dynamics of cell regulation and transcriptomes, but it is typically cost-prohibitive. Predicting spatial gene expression from histological images with artificial intelligence offers a more affordable option, yet existing methods fall short in extracting deep-level information from pathology images. In this paper, we present THItoGene, a hybrid neural network that uses dynamic convolutional and capsule networks to adaptively sense potential molecular signals in histological images and thereby explore the relationship between high-resolution pathology image phenotypes and the regulation of gene expression. A comprehensive benchmark evaluation on datasets from human breast cancer and cutaneous squamous cell carcinoma demonstrates the superior performance of THItoGene in spatial gene expression prediction. Moreover, THItoGene can decipher both the spatial context and enrichment signals within specific tissue regions. THItoGene is freely available at https://github.com/yrjia1015/THItoGene.
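For readers unfamiliar with the prediction task, the following is a minimal, hypothetical PyTorch sketch of its input/output contract: spot-centered H&E patches in, per-spot gene expression vectors out, trained with a regression loss. It is not the published THItoGene architecture (which combines dynamic convolutions, capsule networks, and attention); the patch size and gene count below are illustrative assumptions.

```python
# Minimal sketch of histology-to-expression regression (NOT the THItoGene model).
import torch
import torch.nn as nn

class PatchToExpression(nn.Module):
    """Maps an H&E patch (3 x 112 x 112) to an n_genes expression vector."""
    def __init__(self, n_genes: int = 785):  # 785 target genes is an assumption
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),          # global average pooling
        )
        self.head = nn.Linear(64, n_genes)    # per-spot expression prediction

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.encoder(x).flatten(1)        # (batch, 64)
        return self.head(z)                   # (batch, n_genes)

model = PatchToExpression()
patches = torch.randn(8, 3, 112, 112)         # batch of spot-centered patches
expression = torch.randn(8, 785)              # placeholder log-normalized counts
loss = nn.functional.mse_loss(model(patches), expression)
loss.backward()                               # one regression training step
```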
Affiliation(s)
- Yuran Jia
  - Institute for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150040, China
- Junliang Liu
  - Institute for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150040, China
- Li Chen
  - School of Life Sciences, Westlake University, Hangzhou, Zhejiang 310024, China
- Tianyi Zhao
  - School of Medicine and Health, Harbin Institute of Technology, Harbin 150040, China
- Yadong Wang
  - School of Medicine and Health, Harbin Institute of Technology, Harbin 150040, China
20
Strang AR, Backley S, Wade K, Easter SR, Samuel A, Parchem JG. What's trending? Reach and content of the Society for Maternal-Fetal Medicine on social media. Am J Obstet Gynecol MFM 2023;5:101159. PMID: 37709050. DOI: 10.1016/j.ajogmf.2023.101159.
Abstract
BACKGROUND: The Society for Maternal-Fetal Medicine uses social media to increase awareness of the Society and its key programs and to foster community and discussion around perinatal health, especially on Twitter. The influence and role of the Society for Maternal-Fetal Medicine Twitter account in public discourse around issues relevant to pregnancy have not been studied.
OBJECTIVE: This study aimed to evaluate trends in engagement with the Society for Maternal-Fetal Medicine on Twitter by analyzing follower growth and discussion topics on Twitter compared with Facebook, and by quantifying public engagement during the Society for Maternal-Fetal Medicine Annual Pregnancy Meeting.
STUDY DESIGN: This retrospective study analyzed follower growth data from August 2019 to July 2022 for the Society for Maternal-Fetal Medicine Twitter (@MySMFM) and Facebook (@SocietyforMaternalFetalMedicine) accounts. We identified the top 10 tweets and Facebook posts during the study period using Twitter Analytics and Facebook data. The popularity of tweets and Facebook posts was determined by "impressions" and "reach," respectively; both metrics reflect the number of times a post was viewed. To evaluate annual trends in Twitter engagement, we analyzed data associated with the Annual Pregnancy Meeting, including the number of tweets using the meeting hashtag (#SMFM(Year)) and overall impressions for the Society for Maternal-Fetal Medicine Twitter account for each meeting from 2016 to 2023.
RESULTS: The absolute numbers of new followers for the Twitter and Facebook accounts were similar, but the relative increase and rate of follower growth were higher for Twitter than for Facebook. The Twitter account consistently gained followers, whereas the Facebook account experienced intermittent periods of stagnancy or follower loss. More than half of the top-ranked posts on Twitter and Facebook mentioned the COVID-19 vaccine; other popular topics included COVID-19 and abortion. During the Annual Pregnancy Meeting, the number of tweets using the meeting hashtag consistently peaked on meeting day 4, coincident with the opening plenary session (mean, 1270±499 tweets). An upward trend in Annual Pregnancy Meeting tweets was observed each year until 2021, the first virtual Society for Maternal-Fetal Medicine meeting.
CONCLUSION: The trends in Society for Maternal-Fetal Medicine Twitter engagement suggest increasing use and popularity of the platform for timely dissemination of pregnancy-related news, guidelines, and research. The reduction in Annual Pregnancy Meeting tweets and impressions in 2021 and 2022 suggests a potential negative effect of virtual meetings on member engagement around annual meeting content.
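The distinction in the RESULTS between similar absolute gains but different relative growth can be made concrete with a short sketch. The following is a minimal, hypothetical pandas example; the column names and follower counts are invented for illustration and are not the study's data.

```python
# Hypothetical sketch: comparing absolute vs. relative follower growth.
import pandas as pd

# Monthly follower counts per platform (illustrative numbers only).
counts = pd.DataFrame(
    {"twitter": [10000, 10800, 11700], "facebook": [30000, 30750, 31500]},
    index=pd.period_range("2019-08", periods=3, freq="M"),
)

absolute_gain = counts.iloc[-1] - counts.iloc[0]        # new followers gained
relative_growth = absolute_gain / counts.iloc[0] * 100  # percent growth overall
monthly_rate = counts.pct_change().mean() * 100         # mean monthly % change

print(absolute_gain)     # similar absolute gains (1700 vs. 1500)...
print(relative_growth)   # ...but a larger relative increase for Twitter (17% vs. 5%)
```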
Affiliation(s)
- Amanda R Strang
  - McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX (Ms Strang)
- Sami Backley
  - Department of Obstetrics, Gynecology, and Reproductive Sciences, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX (Drs Backley and Parchem)
- Kerri Wade
  - Society for Maternal-Fetal Medicine, Washington, DC (Ms Wade)
- Sarah Rae Easter
  - Brigham and Women's Hospital, Harvard Medical School, Boston, MA (Dr Easter)
- Amber Samuel
  - Obstetrix Maternal-Fetal Medicine Specialists, Houston, TX (Dr Samuel)
- Jacqueline G Parchem
  - Department of Obstetrics, Gynecology, and Reproductive Sciences, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX (Drs Backley and Parchem)
21
Lu MY, Chen B, Mahmood F. Harnessing medical twitter data for pathology AI. Nat Med 2023;29:2181-2182. PMID: 37704865. DOI: 10.1038/s41591-023-02530-1.
Affiliation(s)
- Ming Y Lu
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
  - Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Bowen Chen
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Faisal Mahmood
  - Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA