1. Shi J, Sun D, Jiang Z, Du J, Wang W, Zheng Y, Wu H. Weakly supervised multi-modal contrastive learning framework for predicting the HER2 scores in breast cancer. Comput Med Imaging Graph 2025; 121:102502. PMID: 39919535. DOI: 10.1016/j.compmedimag.2025.102502.
Abstract
Human epidermal growth factor receptor 2 (HER2) is an important biomarker for prognosis and prediction of treatment response in breast cancer (BC). HER2 scoring is typically performed by pathologists through microscopic observation of immunohistochemistry (IHC) images, which is labor-intensive and subject to observational bias among pathologists. Most existing methods use hand-crafted features or deep learning models on a single modality (hematoxylin and eosin (H&E) or IHC) to predict HER2 scores through supervised or weakly supervised learning. Consequently, information from different modalities is not effectively integrated into feature learning, even though such integration could improve HER2 scoring performance. In this paper, we propose a novel weakly supervised multi-modal contrastive learning (WSMCL) framework to predict HER2 scores in BC at the whole slide image (WSI) level. It leverages multi-modal (H&E and IHC) joint learning under the weak supervision of the WSI label to achieve HER2 score prediction. Specifically, patch features within the H&E and IHC WSIs are respectively extracted, and multi-head self-attention (MHSA) is then used to explore the global dependencies of the patches within each modality. The patch features corresponding to the top-k and bottom-k attention scores generated by MHSA in each modality are selected as candidates for multi-modal joint learning. In particular, a multi-modal attentive contrastive learning (MACL) module is designed to guarantee the semantic alignment of the candidate features from different modalities. Extensive experiments demonstrate that the proposed WSMCL achieves better HER2 scoring performance and outperforms state-of-the-art methods. The code is available at https://github.com/HFUT-miaLab/WSMCL.
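The candidate-selection step described in this abstract can be sketched in a few lines. The sketch below is illustrative only, not the authors' released code; the value of k, the tensor dimensions, and the data are placeholder assumptions:

```python
import torch
import torch.nn as nn

# Illustrative sketch (not the authors' implementation): score patches of one
# modality with multi-head self-attention, then keep the top-k and bottom-k
# patches as candidates for cross-modal contrastive alignment.
def select_candidates(patch_feats: torch.Tensor, mhsa: nn.MultiheadAttention, k: int = 8):
    # patch_feats: (num_patches, dim) for a single WSI and modality
    x = patch_feats.unsqueeze(0)                      # (1, N, dim), batch_first
    _, attn = mhsa(x, x, x, need_weights=True)        # attn: (1, N, N)
    scores = attn.mean(dim=1).squeeze(0)              # mean attention received per patch
    top = patch_feats[scores.topk(k).indices]         # most-attended patches
    bottom = patch_feats[(-scores).topk(k).indices]   # least-attended patches
    return top, bottom

he_mhsa = nn.MultiheadAttention(embed_dim=512, num_heads=8, batch_first=True)
he_feats = torch.randn(1000, 512)                     # stand-in H&E patch embeddings
he_top, he_bottom = select_candidates(he_feats, he_mhsa)
```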
Affiliation(s)
- Jun Shi
- School of Software, Hefei University of Technology, Hefei, 230601, Anhui Province, China
- Dongdong Sun
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, 230601, Anhui Province, China
- Zhiguo Jiang
- Image Processing Center, School of Astronautics, Beihang University, Beijing, 102206, China
- Jun Du
- Department of Pathology, the First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230036, Anhui Province, China; Intelligent Pathology Institute, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230036, Anhui Province, China
- Wei Wang
- Department of Pathology, the First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230036, Anhui Province, China; Intelligent Pathology Institute, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230036, Anhui Province, China
- Yushan Zheng
- School of Engineering Medicine, Beijing Advanced Innovation Center for Biomedical Engineering, Beihang University, Beijing, 100191, China.
- Haibo Wu
- Department of Pathology, the First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230036, Anhui Province, China; Intelligent Pathology Institute, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230036, Anhui Province, China; Department of Pathology, Centre for Leading Medicine and Advanced Technologies of IHM, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230001, Anhui Province, China.
2. Xu H, Wang M, Shi D, Qin H, Zhang Y, Liu Z, Madabhushi A, Gao P, Cong F, Lu C. When multiple instance learning meets foundation models: Advancing histological whole slide image analysis. Med Image Anal 2025; 101:103456. PMID: 39842326. DOI: 10.1016/j.media.2025.103456.
Abstract
Deep multiple instance learning (MIL) pipelines are the mainstream weakly supervised learning methodologies for whole slide image (WSI) classification. However, it remains unclear how these widely used approaches compare to each other, given the recent proliferation of foundation models (FMs) for patch-level embedding and the diversity of slide-level aggregations. This paper implemented and systematically compared six FMs and six recent MIL methods, combining different feature extractors and aggregators across seven clinically relevant end-to-end prediction tasks using WSIs from 4044 patients with four different cancer types. We tested state-of-the-art (SOTA) FMs in computational pathology, including CTransPath, PathoDuet, PLIP, CONCH, and UNI, as patch-level feature extractors. Feature aggregators, such as attention-based pooling, transformers, and dynamic graphs, were thoroughly tested. Our experiments on cancer grading, biomarker status prediction, and microsatellite instability (MSI) prediction suggest that (1) FMs like UNI, trained with more diverse histological images, outperform generic models with smaller training datasets in patch embeddings, significantly enhancing downstream MIL classification accuracy and model training convergence speed; (2) instance feature fine-tuning, known as online feature re-embedding, to capture both fine-grained details and spatial interactions can often further improve WSI classification performance; (3) FMs advance MIL models by enabling promising grading, biomarker status, and MSI predictions without requiring pixel- or patch-level annotations. These findings encourage the development of advanced, domain-specific FMs aimed at more universally applicable diagnostic tasks, aligning with the evolving needs of clinical AI in pathology.
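For readers unfamiliar with the "attention-based pooling" aggregator family benchmarked here, a minimal sketch follows; the dimensions and class count are illustrative assumptions, not any paper's exact configuration:

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    """Minimal attention-based MIL aggregator over patch embeddings from a
    frozen foundation model; a sketch, not any specific paper's setup."""
    def __init__(self, in_dim: int = 1024, hidden: int = 256, n_classes: int = 2):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))
        self.head = nn.Linear(in_dim, n_classes)

    def forward(self, bag: torch.Tensor):             # bag: (num_patches, in_dim)
        a = torch.softmax(self.attn(bag), dim=0)      # (num_patches, 1) attention weights
        slide_feat = (a * bag).sum(dim=0)             # weighted sum -> slide embedding
        return self.head(slide_feat), a               # slide logits + patch attention

model = AttentionMIL()
logits, attention = model(torch.randn(500, 1024))     # e.g. UNI-style 1024-d embeddings
```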
Affiliation(s)
- Hongming Xu
- Cancer Hospital of Dalian University of Technology, Dalian, China; School of Biomedical Engineering, Faculty of Medicine, Dalian University of Technology, Dalian, China; Key Laboratory of Integrated Circuit and Biomedical Electronic System, Liaoning Province, Dalian University of Technology, Dalian, China; Dalian Key Laboratory of Digital Medicine for Critical Diseases, Dalian University of Technology, Dalian, China
- Mingkang Wang
- School of Biomedical Engineering, Faculty of Medicine, Dalian University of Technology, Dalian, China
- Duanbo Shi
- Department of Pathology, Qilu Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China
- Huamin Qin
- Department of Pathology, Beijing Shijitan Hospital, Capital Medical University, Beijing, China
- Yunpeng Zhang
- Department of Anesthesiology, The Affiliated Hospital of Chengde Medical University, Chengde, China
- Zaiyi Liu
- Department of Radiology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, China; Medical Research Institute, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, China; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Southern Medical University, Guangzhou, China
- Anant Madabhushi
- The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, USA
- Peng Gao
- Department of Pathology, Qilu Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China; Key Laboratory for Experimental Teratology of Ministry of Education, Department of Pathology, School of Basic Medical Sciences, Shandong University, Jinan, China.
- Fengyu Cong
- Cancer Hospital of Dalian University of Technology, Dalian, China; School of Biomedical Engineering, Faculty of Medicine, Dalian University of Technology, Dalian, China; Key Laboratory of Integrated Circuit and Biomedical Electronic System, Liaoning Province, Dalian University of Technology, Dalian, China; Faculty of Information Technology, University of Jyväskylä, Jyväskylä, Finland.
- Cheng Lu
- Department of Radiology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, China; Medical Research Institute, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, China; Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Southern Medical University, Guangzhou, China.
3. Brussee S, Buzzanca G, Schrader AMR, Kers J. Graph neural networks in histopathology: Emerging trends and future directions. Med Image Anal 2025; 101:103444. PMID: 39793218. DOI: 10.1016/j.media.2024.103444.
Abstract
Histopathological analysis of whole slide images (WSIs) has seen a surge in the utilization of deep learning methods, particularly Convolutional Neural Networks (CNNs). However, CNNs often fail to capture the intricate spatial dependencies inherent in WSIs. Graph Neural Networks (GNNs) present a promising alternative, adept at directly modeling pairwise interactions and effectively discerning the topological tissue and cellular structures within WSIs. Given the pressing need for deep learning techniques that harness the topological structure of WSIs, the application of GNNs in histopathology has grown rapidly. In this comprehensive review, we survey GNNs in histopathology, discuss their applications, and explore emerging trends that pave the way for future advancements in the field. We begin by elucidating the fundamentals of GNNs and their potential applications in histopathology. Leveraging quantitative literature analysis, we explore four emerging trends: Hierarchical GNNs, Adaptive Graph Structure Learning, Multimodal GNNs, and Higher-order GNNs. Through an in-depth exploration of these trends, we offer insights into the evolving landscape of GNNs in histopathological analysis. Based on our findings, we propose future directions to propel the field forward. Our analysis serves to guide researchers and practitioners towards innovative approaches and methodologies, fostering advancements in histopathological analysis through the lens of graph neural networks.
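To make the core GNN-on-WSI idea concrete, here is a minimal sketch that builds a k-nearest-neighbour patch graph and runs one GCN-style propagation step; the coordinates, features, and sizes are synthetic placeholders, not any surveyed method:

```python
import torch

# Sketch: connect each patch to its spatial k-nearest neighbours, then
# propagate features with one normalized-adjacency step (A_hat @ X @ W).
def knn_adjacency(coords: torch.Tensor, k: int = 6) -> torch.Tensor:
    d = torch.cdist(coords, coords)                    # pairwise patch distances
    idx = d.topk(k + 1, largest=False).indices[:, 1:]  # k nearest, excluding self
    A = torch.zeros(len(coords), len(coords))
    A.scatter_(1, idx, 1.0)
    A = ((A + A.T) > 0).float() + torch.eye(len(coords))  # symmetrize + self-loops
    return A

def gcn_step(A: torch.Tensor, X: torch.Tensor, W: torch.Tensor) -> torch.Tensor:
    deg_inv_sqrt = A.sum(1).clamp(min=1).rsqrt()
    A_hat = deg_inv_sqrt[:, None] * A * deg_inv_sqrt[None, :]  # symmetric normalization
    return torch.relu(A_hat @ X @ W)

coords, feats = torch.rand(200, 2), torch.randn(200, 384)  # synthetic patch graph
W = torch.randn(384, 128) * 0.05
node_repr = gcn_step(knn_adjacency(coords), feats, W)      # (200, 128) patch features
```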
Affiliation(s)
- Siemen Brussee
- Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands.
- Giorgio Buzzanca
- Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands
- Anne M R Schrader
- Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands
- Jesper Kers
- Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands; Amsterdam University Medical Center, Meibergdreef 9, 1105 AZ, Amsterdam, The Netherlands
4. Boehm KM, El Nahhas OSM, Marra A, Waters M, Jee J, Braunstein L, Schultz N, Selenica P, Wen HY, Weigelt B, Paul ED, Cekan P, Erber R, Loeffler CML, Guerini-Rocco E, Fusco N, Frascarelli C, Mane E, Munzone E, Dellapasqua S, Zagami P, Curigliano G, Razavi P, Reis-Filho JS, Pareja F, Chandarlapaty S, Shah SP, Kather JN. Multimodal histopathologic models stratify hormone receptor-positive early breast cancer. Nat Commun 2025; 16:2106. PMID: 40025017. PMCID: PMC11873197. DOI: 10.1038/s41467-025-57283-x.
Abstract
The Oncotype DX® Recurrence Score (RS) is an assay for hormone receptor-positive early breast cancer with extensively validated predictive and prognostic value. However, its cost and lag time have limited global adoption, and previous attempts to estimate it using clinicopathologic variables have had limited success. To address this, we assembled 6172 cases across three institutions and developed Orpheus, a multimodal deep learning tool to infer the RS from H&E whole-slide images. Our model identifies TAILORx high-risk cases (RS > 25) with an area under the curve (AUC) of 0.89, compared to a leading clinicopathologic nomogram with 0.73. Furthermore, in patients with RS ≤ 25, Orpheus ascertains risk of metastatic recurrence more accurately than the RS itself (0.75 vs 0.49 mean time-dependent AUC). These findings have the potential to guide adjuvant therapy for high-risk cases and tailor surveillance for patients at elevated metastatic recurrence risk.
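The headline metric can be reproduced in form (not in value) with a small sketch: binarize the Recurrence Score at the TAILORx high-risk cutoff (RS > 25) and score a model's predicted probabilities with AUC. All numbers below are synthetic stand-ins, not the study's data:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Sketch of the evaluation form only: RS values and model outputs are made up.
rng = np.random.default_rng(0)
rs = rng.integers(0, 51, size=200)            # stand-in Oncotype DX RS values
high_risk = (rs > 25).astype(int)             # TAILORx high-risk label
pred_prob = np.clip(rs / 50 + rng.normal(0, 0.15, 200), 0, 1)  # stand-in model output
print(f"AUC for RS > 25: {roc_auc_score(high_risk, pred_prob):.2f}")
```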
Affiliation(s)
- Kevin M Boehm
- Computational Oncology Service, Memorial Sloan Kettering Cancer Center, 323 E 61 St, New York, NY, USA
- Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, USA
- Omar S M El Nahhas
- Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Fetscherstraße 74, 01307, Dresden, Germany
- StratifAI GmbH, Suite 14500 Großenhainer Str. 98, 01127, Dresden, Germany
- Antonio Marra
- Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, USA
- Early Drug Development for Innovative Therapies, European Institute of Oncology IRCCS, Via Giuseppe Ripamonti 435, 20141, Milan, Italy
- Michele Waters
- Computational Oncology Service, Memorial Sloan Kettering Cancer Center, 323 E 61 St, New York, NY, USA
- Justin Jee
- Computational Oncology Service, Memorial Sloan Kettering Cancer Center, 323 E 61 St, New York, NY, USA
- Department of Medicine, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, USA
- Lior Braunstein
- Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, USA
- Nikolaus Schultz
- Computational Oncology Service, Memorial Sloan Kettering Cancer Center, 323 E 61 St, New York, NY, USA
- Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, USA
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, USA
- Pier Selenica
- Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, USA
- Hannah Y Wen
- Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, USA
- Britta Weigelt
- Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, USA
- Evan D Paul
- MultiplexDX, s.r.o., Ilkovičova 8, 841 04 Karlova Ves, Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., One Research Court Suite 450, Rockville, MD, 20850, USA
- Pavol Cekan
- MultiplexDX, s.r.o., Ilkovičova 8, 841 04 Karlova Ves, Comenius University Science Park, Bratislava, Slovakia
- MultiplexDX, Inc., One Research Court Suite 450, Rockville, MD, 20850, USA
- Ramona Erber
- Institute of Pathology, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Comprehensive Cancer Center Erlangen-EMN (CCC ER-EMN), Krankenhausstraße 8-10, 91054, Erlangen, Germany
- Chiara M L Loeffler
- Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Fetscherstraße 74, 01307, Dresden, Germany
- Elena Guerini-Rocco
- Department of Pathology, European Institute of Oncology IRCCS, Via Giuseppe Ripamonti 435, 20141, Milan, Italy
- Department of Oncology and Haemato-Oncology, University of Milano, Via Festa del Perdono 7, 20122, Milan, Italy
- Nicola Fusco
- Department of Pathology, European Institute of Oncology IRCCS, Via Giuseppe Ripamonti 435, 20141, Milan, Italy
- Department of Oncology and Haemato-Oncology, University of Milano, Via Festa del Perdono 7, 20122, Milan, Italy
- Chiara Frascarelli
- Department of Pathology, European Institute of Oncology IRCCS, Via Giuseppe Ripamonti 435, 20141, Milan, Italy
- Department of Oncology and Haemato-Oncology, University of Milano, Via Festa del Perdono 7, 20122, Milan, Italy
- Eltjona Mane
- Department of Pathology, European Institute of Oncology IRCCS, Via Giuseppe Ripamonti 435, 20141, Milan, Italy
- Elisabetta Munzone
- Division of Medical Senology, European Institute of Oncology IRCCS, Via Giuseppe Ripamonti 435, 20141, Milan, Italy
- Silvia Dellapasqua
- Division of Medical Senology, European Institute of Oncology IRCCS, Via Giuseppe Ripamonti 435, 20141, Milan, Italy
- Paola Zagami
- Early Drug Development for Innovative Therapies, European Institute of Oncology IRCCS, Via Giuseppe Ripamonti 435, 20141, Milan, Italy
- Department of Oncology and Haemato-Oncology, University of Milano, Via Festa del Perdono 7, 20122, Milan, Italy
- Giuseppe Curigliano
- Early Drug Development for Innovative Therapies, European Institute of Oncology IRCCS, Via Giuseppe Ripamonti 435, 20141, Milan, Italy
- Department of Oncology and Haemato-Oncology, University of Milano, Via Festa del Perdono 7, 20122, Milan, Italy
- Pedram Razavi
- Department of Medicine, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, USA
- Jorge S Reis-Filho
- Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, USA
- AstraZeneca, 1 MedImmune Way, Gaithersburg, MD, 20878, USA
- Fresia Pareja
- Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, USA
- Sarat Chandarlapaty
- Department of Medicine, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, USA.
- Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, USA.
- Sohrab P Shah
- Computational Oncology Service, Memorial Sloan Kettering Cancer Center, 323 E 61 St, New York, NY, USA.
- Jakob Nikolas Kather
- Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Fetscherstraße 74, 01307, Dresden, Germany.
- Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Im Neuenheimer Feld 460, 69120, Heidelberg, Germany.
5. Lu Y, Wang A. Integrating language into medical visual recognition and reasoning: A survey. Med Image Anal 2025; 102:103514. PMID: 40023891. DOI: 10.1016/j.media.2025.103514.
Abstract
Vision-Language Models (VLMs) are regarded as efficient paradigms that build a bridge between visual perception and textual interpretation. For medical visual tasks, they can benefit from expert observations and physician knowledge extracted from textual context, thereby improving the visual understanding of models. Motivated by the fact that extensive medical reports are commonly attached to medical imaging, medical VLMs have attracted increasing interest, serving not only as self-supervised learning in the pretraining stage but also as a means to introduce auxiliary information into medical visual perception. To strengthen the understanding of such a promising direction, this survey aims to provide an in-depth exploration and review of medical VLMs for various visual recognition and reasoning tasks. Firstly, we present an introduction to medical VLMs. Then, we provide preliminaries and delve into how to exploit language in medical visual tasks from diverse perspectives. Further, we investigate publicly available VLM datasets and discuss the challenges and future perspectives. We expect that this comprehensive discussion of state-of-the-art medical VLMs will help researchers recognize their significant potential.
Affiliation(s)
- Yinbin Lu
- The School of Computer Science and Technology, East China Normal University, Shanghai 200062, China
- Alan Wang
- Auckland Bioengineering Institute, The University of Auckland, New Zealand; Medical Imaging Research Center, Faculty of Medical and Health Sciences, The University of Auckland, New Zealand; Centre for Co-Created Ageing Research, The University of Auckland, New Zealand; Centre for Brain Research, The University of Auckland, New Zealand.
6. Hu D, Jiang Z, Shi J, Xie F, Wu K, Tang K, Cao M, Huai J, Zheng Y. Pathology report generation from whole slide images with knowledge retrieval and multi-level regional feature selection. Comput Methods Programs Biomed 2025; 263:108677. PMID: 40023962. DOI: 10.1016/j.cmpb.2025.108677.
Abstract
BACKGROUND AND OBJECTIVES: With the development of deep learning techniques, computer-assisted pathology diagnosis plays a crucial role in clinical diagnosis. An important task within this field is report generation, which provides doctors with text descriptions of whole slide images (WSIs). Report generation from WSIs presents significant challenges due to the structural complexity and pathological diversity of tissues, as well as the large size and high information density of WSIs. The objective of this study is to design a histopathology report generation method that can efficiently generate reports from WSIs and is suitable for clinical practice. METHODS: In this paper, we propose a novel approach for generating pathology reports from WSIs, leveraging knowledge retrieval and multi-level regional feature selection. To deal with the uneven distribution of pathological information in WSIs, we introduce a multi-level regional feature encoding network and a feature selection module that extracts multi-level region representations and filters out region features irrelevant to the diagnosis, enabling more efficient report generation. Moreover, we design a knowledge retrieval module that leverages diagnostic information from historical cases to improve report generation performance. Additionally, we propose an out-of-domain application mode based on a large language model (LLM). The use of an LLM enhances the scalability of the generation model and improves its adaptability to data from different sources. RESULTS: The proposed method was evaluated on one public dataset and one in-house dataset. On the public GastricADC dataset (991 WSIs), our method outperformed state-of-the-art text generation methods, achieving Rouge-L and Bleu-4 scores of 0.568 and 0.345, respectively. On the in-house Gastric-3300 dataset (3309 WSIs), our method achieved significantly better performance, with a Rouge-L of 0.690, surpassing the second-best state-of-the-art method, Wcap, by 6.3%. CONCLUSIONS: We present an advanced method for pathology report generation from WSIs, addressing the key challenges associated with the large size and complex pathological structures of these images. In particular, the multi-level regional feature selection module effectively captures diagnostically significant regions of varying sizes. The knowledge retrieval-based decoder leverages historical diagnostic data to enhance report accuracy. Our method not only improves the informativeness and relevance of the generated pathology reports but also outperforms state-of-the-art techniques.
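A minimal sketch of the knowledge-retrieval idea described here: embed the query WSI, retrieve the most similar historical cases by cosine similarity, and pass their report text to the decoder as extra context. Names and data are hypothetical, not the authors' module:

```python
import numpy as np

# Illustrative retrieval step; embeddings and reports are placeholders.
def retrieve_reports(query_emb, case_embs, case_reports, top_k=3):
    q = query_emb / np.linalg.norm(query_emb)
    C = case_embs / np.linalg.norm(case_embs, axis=1, keepdims=True)
    sims = C @ q                                   # cosine similarity to each case
    best = np.argsort(-sims)[:top_k]               # indices of most similar cases
    return [case_reports[i] for i in best], sims[best]

bank = np.random.randn(1000, 768)                  # historical case embeddings
reports = [f"report_{i}" for i in range(1000)]     # paired diagnostic texts
texts, scores = retrieve_reports(np.random.randn(768), bank, reports)
```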
Affiliation(s)
- Dingyi Hu
- Image Processing Center, School of Astronautics, Beihang University, Beijing, 100191, China
- Zhiguo Jiang
- Image Processing Center, School of Astronautics, Beihang University, Beijing, 100191, China; Tianmushan Laboratory, Hangzhou, 311115, Zhejiang Province, China
- Jun Shi
- School of Software, Hefei University of Technology, Hefei, 230601, Anhui Province, China
- Fengying Xie
- Image Processing Center, School of Astronautics, Beihang University, Beijing, 100191, China; Tianmushan Laboratory, Hangzhou, 311115, Zhejiang Province, China
- Kun Wu
- Image Processing Center, School of Astronautics, Beihang University, Beijing, 100191, China
- Kunming Tang
- Image Processing Center, School of Astronautics, Beihang University, Beijing, 100191, China
- Ming Cao
- Department of Pathology, the First People's Hospital of Wuhu, Wuhu, 241000, Anhui Province, China
- Jianguo Huai
- Department of Pathology, the First People's Hospital of Wuhu, Wuhu, 241000, Anhui Province, China
- Yushan Zheng
- School of Engineering Medicine, Beijing Advanced Innovation Center for Biomedical Engineering, Beihang University, Beijing, 100191, China.
7. Brieghel C, Werling M, Frederiksen CM, Parviz M, Lacoppidan T, Faitova T, Teglgaard RS, Vainer N, da Cunha-Bang C, Rotbain EC, Agius R, Niemann CU. The Danish Lymphoid Cancer Research (DALY-CARE) Data Resource: The Basis for Developing Data-Driven Hematology. Clin Epidemiol 2025; 17:131-145. PMID: 39996155. PMCID: PMC11849980. DOI: 10.2147/clep.s479672.
Abstract
Background: Lymphoid-lineage cancers (LC; International Classification of Diseases, 10th edition [ICD10] C81.x-C90.x, C91.1-C91.9, C95.1, C95.7, C95.9, D47.2, D47.9B, and E85.8A) share many epidemiological and clinical features, which favors meta-learning when developing medical artificial intelligence (mAI). However, access to large, shared datasets is largely missing, which limits mAI research. Aim: To create a large-scale data repository for patients with LC to develop data-driven hematology. Methods: We gathered electronic health data and created open-source processing pipelines to build a comprehensive data resource for Danish LC Research (DALY-CARE), approved for epidemiological, molecular, and data-driven research. Results: We included all Danish adults registered with LC diagnoses since 2002 (n=65,774) and combined 10 nationwide registers, electronic health records (EHR), and laboratory data on a high-powered cloud computer to develop a secure research environment. Among others, the data include treatments (i.e., 21,750 cytoreductive treatment plans, 21.3M outpatient prescriptions, and 12.7M in-hospital administrations), biochemical analyses (77.3M), comorbidities (14.8M ICD10 codes), pathology codes (4.5M), treatment procedures (8.3M), surgical procedures (1.0M), radiological examinations (3.3M), vital signs (18.3M values), and survival data. We herein describe the data infrastructure and exemplify how DALY-CARE has been used for molecular studies, real-world evidence to evaluate the efficacy of care, and mAI deployed directly into EHR systems. Conclusion: The DALY-CARE data resource allows for the development of near real-time decision-support tools and the extrapolation of clinical trial results to clinical practice, thereby improving care for patients with LC while facilitating the streamlining of health data infrastructure across cohorts and medical specialties.
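A small sketch of cohort selection under the LC ICD-10 definition quoted above, using simple prefix matching (the finer D47.9B/E85.8A subcodes are omitted for brevity); the table and column names are hypothetical, not the DALY-CARE schema:

```python
import pandas as pd

# Hypothetical diagnosis register with dot-stripped ICD-10 codes.
LC_PREFIXES = tuple(
    [f"C{i}" for i in range(81, 91)]          # C81.x-C90.x
    + [f"C91{d}" for d in range(1, 10)]       # C91.1-C91.9 (excludes C91.0)
    + ["C951", "C957", "C959", "D472"]
)

diagnoses = pd.DataFrame({
    "patient_id": [1, 2, 3, 4],
    "icd10": ["C833", "C911", "I10", "D472"],
})
mask = diagnoses["icd10"].apply(lambda code: code.startswith(LC_PREFIXES))
lc_cohort = diagnoses[mask]
print(lc_cohort["patient_id"].tolist())        # -> [1, 2, 4]
```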
Affiliation(s)
- Christian Brieghel
- Department of Hematology, Copenhagen University Hospital – Rigshospitalet, Copenhagen, Denmark
- Danish Cancer Institute, Copenhagen, Denmark
- Mikkel Werling
- Department of Hematology, Copenhagen University Hospital – Rigshospitalet, Copenhagen, Denmark
- Danish Cancer Institute, Copenhagen, Denmark
- Casper Møller Frederiksen
- Department of Hematology, Copenhagen University Hospital – Rigshospitalet, Copenhagen, Denmark
- Danish Cancer Institute, Copenhagen, Denmark
- Mehdi Parviz
- Department of Hematology, Copenhagen University Hospital – Rigshospitalet, Copenhagen, Denmark
- Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
- Thomas Lacoppidan
- Department of Hematology, Copenhagen University Hospital – Rigshospitalet, Copenhagen, Denmark
- Tereza Faitova
- Department of Hematology, Copenhagen University Hospital – Rigshospitalet, Copenhagen, Denmark
- Danish Cancer Institute, Copenhagen, Denmark
- Rebecca Svanberg Teglgaard
- Department of Hematology, Copenhagen University Hospital – Rigshospitalet, Copenhagen, Denmark
- Department of Clinical Immunology, Copenhagen University Hospital – Rigshospitalet, Copenhagen, Denmark
- Noomi Vainer
- Department of Hematology, Copenhagen University Hospital – Rigshospitalet, Copenhagen, Denmark
- Danish Cancer Institute, Copenhagen, Denmark
- Caspar da Cunha-Bang
- Department of Hematology, Copenhagen University Hospital – Rigshospitalet, Copenhagen, Denmark
- Danish Cancer Institute, Copenhagen, Denmark
- Emelie Curovic Rotbain
- Department of Hematology, Copenhagen University Hospital – Rigshospitalet, Copenhagen, Denmark
- Danish Cancer Institute, Copenhagen, Denmark
- Rudi Agius
- Department of Hematology, Copenhagen University Hospital – Rigshospitalet, Copenhagen, Denmark
- Carsten Utoft Niemann
- Department of Hematology, Copenhagen University Hospital – Rigshospitalet, Copenhagen, Denmark
- Danish Cancer Institute, Copenhagen, Denmark
- Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
8. Launet L, Colomer A, Mosquera-Zamudio A, Monteagudo C, Naranjo V. The puzzling Spitz tumours: is artificial intelligence the key to their understanding? Histopathology 2025. PMID: 39976082. DOI: 10.1111/his.15428.
Abstract
Since their first description in 1948, Spitz tumours remain one of the most challenging diagnostic entities in dermatopathology due to their complex histological features and ambiguous clinical behaviour. In recent years, artificial intelligence (AI) solutions have demonstrated significant potential across a wide range of medical applications, including computational pathology, for decision-making in diagnosis, along with promising advances in prognosis and tumour classification. However, the application of AI to Spitz tumours remains relatively underexplored, with few studies addressing this field. Yet in this evolving technological landscape, could AI provide the insights needed to help resolve the diagnostic uncertainties surrounding Spitz tumours? How could this technology be leveraged to bridge the gap between histopathological uncertainty and clinical accuracy? This review aims to provide an overview of the current state of AI applications in Spitz tumour analysis, identify existing research gaps, and propose future directions to optimize the use of AI in understanding and diagnosing these complex tumours.
Affiliation(s)
- Laëtitia Launet
- Instituto Universitario de Investigación en Tecnología Centrada en el Ser Humano, HUMAN-Tech, Universitat Politècnica de València, Valencia, Spain
- Adrián Colomer
- Instituto Universitario de Investigación en Tecnología Centrada en el Ser Humano, HUMAN-Tech, Universitat Politècnica de València, Valencia, Spain
- Andrés Mosquera-Zamudio
- Universitat de València, Valencia, Spain
- INCLIVA, Instituto de Investigación Sanitaria, Valencia, Spain
- Carlos Monteagudo
- Universitat de València, Valencia, Spain
- INCLIVA, Instituto de Investigación Sanitaria, Valencia, Spain
- Valery Naranjo
- Instituto Universitario de Investigación en Tecnología Centrada en el Ser Humano, HUMAN-Tech, Universitat Politècnica de València, Valencia, Spain
9. Mallya M, Mirabadi AK, Farnell D, Farahani H, Bashashati A. Benchmarking histopathology foundation models for ovarian cancer bevacizumab treatment response prediction from whole slide images. Discov Oncol 2025; 16:196. PMID: 39961889. PMCID: PMC11832855. DOI: 10.1007/s12672-025-01973-x.
Abstract
PURPOSE: Bevacizumab is a widely studied targeted therapeutic drug used in conjunction with standard chemotherapy for the treatment of recurrent ovarian cancer. While its administration has been shown to increase progression-free survival (PFS) in patients with advanced-stage ovarian cancer, the lack of identifiable biomarkers for predicting patient response has been a major roadblock to its effective adoption in personalized medicine. METHODS: In this work, we leverage the latest histopathology foundation models trained on large-scale whole slide image (WSI) datasets to extract ovarian tumor tissue features for predicting bevacizumab response from WSIs. RESULTS: Our extensive experiments across a combination of different histopathology foundation models and multiple instance learning (MIL) strategies demonstrate the capability of these large models to predict bevacizumab response in ovarian cancer patients, with the best models achieving a patient-level balanced accuracy close to 70%. Furthermore, these models can effectively stratify high- and low-risk patients (p < 0.05) during the first year of bevacizumab treatment. CONCLUSION: This work highlights the utility of histopathology foundation models for predicting response to bevacizumab treatment from WSIs. The high-attention regions of the WSIs highlighted by these models not only aid model explainability but also serve as promising imaging biomarkers for treatment prognosis.
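The risk-stratification result (p < 0.05) corresponds to a standard log-rank comparison between model-defined risk groups; a sketch with synthetic data, assuming the lifelines library, follows:

```python
import numpy as np
from lifelines.statistics import logrank_test

# Sketch: split patients at the median predicted risk score and compare
# progression-free survival between groups. All values are synthetic.
rng = np.random.default_rng(1)
risk_score = rng.random(80)
pfs_months = rng.exponential(12, 80) + 6 * (risk_score < 0.5)  # low risk lives longer
event = rng.integers(0, 2, 80).astype(bool)                    # progression observed?

high = risk_score >= np.median(risk_score)
result = logrank_test(pfs_months[high], pfs_months[~high],
                      event_observed_A=event[high], event_observed_B=event[~high])
print(f"log-rank p = {result.p_value:.3f}")   # p < 0.05 => significant separation
```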
Affiliation(s)
- Mayur Mallya
- Faculty of Science, University of British Columbia, 2207 Main Mall, Vancouver, V6T 1Z4, British Columbia, Canada
- Ali Khajegili Mirabadi
- Faculty of Science, University of British Columbia, 2207 Main Mall, Vancouver, V6T 1Z4, British Columbia, Canada
- David Farnell
- Department of Pathology and Laboratory Medicine, University of British Columbia, 2211 Wesbrook Mall, Vancouver, V6T 1Z7, British Columbia, Canada
- Vancouver General Hospital, 855 W 12Th Ave, Vancouver, V5Z 1M9, British Columbia, Canada
- Hossein Farahani
- School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, V6T 2B9, British Columbia, Canada
- Ali Bashashati
- School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, V6T 2B9, British Columbia, Canada.
- Department of Pathology and Laboratory Medicine, University of British Columbia, 2211 Wesbrook Mall, Vancouver, V6T 1Z7, British Columbia, Canada.
10. Wang X, Wang D, Li X, Rittscher J, Metaxas D, Zhang S. Editorial for Special Issue on Foundation Models for Medical Image Analysis. Med Image Anal 2025; 100:103389. PMID: 39603969. DOI: 10.1016/j.media.2024.103389.
Affiliation(s)
- Dequan Wang
- Shanghai AI Laboratory, Shanghai 200232, China; Qing Yuan Research Institute, Shanghai Jiao Tong University, Shanghai 200240, China
- Xiaoxiao Li
- School of Electrical and Computer Engineering, University of British Columbia, Vancouver BC V6T 1Z4, Canada
- Jens Rittscher
- Department of Engineering Science, University of Oxford, Oxford OX3 7DQ, UK
- Dimitris Metaxas
- Department of Computer Science, Rutgers University, Piscataway, NJ 08854, USA
11. Zheng S, Cui X, Sun Y, Li J, Li H, Zhang Y, Chen P, Jing X, Ye Z, Yang L. Benchmarking PathCLIP for Pathology Image Analysis. J Imaging Inform Med 2025; 38:422-438. PMID: 38980627. PMCID: PMC11811322. DOI: 10.1007/s10278-024-01128-4.
Abstract
Accurate image classification and retrieval are important for clinical diagnosis and treatment decision-making. The recent contrastive language-image pre-training (CLIP) model has shown remarkable proficiency in understanding natural images. Drawing inspiration from CLIP, a pathology-dedicated CLIP (PathCLIP) has been developed, trained on over 200,000 image-text pairs. While the performance of PathCLIP is impressive, its robustness under a wide range of image corruptions remains unknown. Therefore, we conduct an extensive evaluation to analyze the performance of PathCLIP on various corrupted images from the osteosarcoma and WSSS4LUAD datasets. In our experiments, we introduce eleven corruption types, including brightness, contrast, defocus, resolution, saturation, hue, markup, deformation, incompleteness, rotation, and flipping, at various settings. Through experiments, we find that PathCLIP surpasses OpenAI-CLIP and the pathology language-image pre-training (PLIP) model in zero-shot classification. It is relatively robust to image corruptions involving contrast, saturation, incompleteness, and orientation factors. Among the eleven corruptions, hue, markup, deformation, defocus, and resolution can cause relatively severe performance fluctuations in PathCLIP. This indicates that ensuring image quality is crucial before conducting a clinical test. Additionally, we assess the robustness of PathCLIP in the task of image-to-image retrieval, revealing that PathCLIP performs less effectively than PLIP on osteosarcoma but better on WSSS4LUAD under diverse corruptions. Overall, PathCLIP presents impressive zero-shot classification and retrieval performance for pathology images, but appropriate care needs to be taken when using it.
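Several of the corruption types listed here can be reproduced with standard image operations; the sketch below shows a few of them, with severity settings that are illustrative rather than the paper's:

```python
from PIL import Image, ImageEnhance

# Illustrative corruption generator for a subset of the eleven types.
def corrupt(img: Image.Image, kind: str, severity: float) -> Image.Image:
    if kind == "brightness":
        return ImageEnhance.Brightness(img).enhance(severity)
    if kind == "contrast":
        return ImageEnhance.Contrast(img).enhance(severity)
    if kind == "saturation":
        return ImageEnhance.Color(img).enhance(severity)
    if kind == "rotation":
        return img.rotate(severity, expand=True)
    if kind == "resolution":                       # down- then up-sample
        w, h = img.size
        small = img.resize((max(1, int(w * severity)), max(1, int(h * severity))))
        return small.resize((w, h))
    raise ValueError(kind)

img = Image.new("RGB", (224, 224), "white")         # stand-in for a pathology patch
variants = [corrupt(img, k, s) for k, s in
            [("brightness", 0.5), ("contrast", 1.5), ("rotation", 90), ("resolution", 0.25)]]
```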
Affiliation(s)
- Sunyi Zheng
- Tianjin's Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Department of Radiology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin, China
- School of Engineering, Westlake University, Hangzhou, China
- Xiaonan Cui
- Tianjin's Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Department of Radiology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin, China
- Yuxuan Sun
- College of Computer Science and Technology, Zhejiang University, Hangzhou, China
- Jingxiong Li
- College of Computer Science and Technology, Zhejiang University, Hangzhou, China
- Honglin Li
- College of Computer Science and Technology, Zhejiang University, Hangzhou, China
- Yunlong Zhang
- College of Computer Science and Technology, Zhejiang University, Hangzhou, China
- Pingyi Chen
- College of Computer Science and Technology, Zhejiang University, Hangzhou, China
- Xueping Jing
- Department of Radiation Oncology, University Medical Center of Groningen, Groningen, The Netherlands
- Zhaoxiang Ye
- Tianjin's Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Department of Radiology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin, China
- Lin Yang
- School of Engineering, Westlake University, Hangzhou, China.
12. Xiang J, Wang X, Zhang X, Xi Y, Eweje F, Chen Y, Li Y, Bergstrom C, Gopaulchan M, Kim T, Yu KH, Willens S, Olguin FM, Nirschl JJ, Neal J, Diehn M, Yang S, Li R. A vision-language foundation model for precision oncology. Nature 2025; 638:769-778. PMID: 39779851. DOI: 10.1038/s41586-024-08378-w.
Abstract
Clinical decision-making is driven by multimodal data, including clinical notes and pathological characteristics. Artificial intelligence approaches that can effectively integrate multimodal data hold significant promise in advancing clinical care [1,2]. However, the scarcity of well-annotated multimodal datasets in clinical settings has hindered the development of useful models. In this study, we developed the Multimodal transformer with Unified maSKed modeling (MUSK), a vision-language foundation model designed to leverage large-scale, unlabelled, unpaired image and text data. MUSK was pretrained on 50 million pathology images from 11,577 patients and one billion pathology-related text tokens using unified masked modelling. It was further pretrained on one million pathology image-text pairs to efficiently align the vision and language features. With minimal or no further training, MUSK was tested in a wide range of applications and demonstrated superior performance across 23 patch-level and slide-level benchmarks, including image-to-text and text-to-image retrieval, visual question answering, image classification and molecular biomarker prediction. Furthermore, MUSK showed strong performance in outcome prediction, including melanoma relapse prediction, pan-cancer prognosis prediction and immunotherapy response prediction in lung and gastro-oesophageal cancers. MUSK effectively combined complementary information from pathology images and clinical reports and could potentially improve diagnosis and precision in cancer therapy.
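The image-to-text retrieval benchmark mentioned here reduces to ranking by cosine similarity between aligned embeddings; a minimal sketch with random stand-in embeddings follows (any aligned vision-language encoder, such as MUSK or CLIP, is assumed to produce them):

```python
import torch
import torch.nn.functional as F

# Stand-in embeddings; in practice these come from the paired encoders.
image_embs = F.normalize(torch.randn(4, 768), dim=-1)   # encoded pathology images
text_embs = F.normalize(torch.randn(100, 768), dim=-1)  # encoded report snippets

sim = image_embs @ text_embs.T                          # (4, 100) cosine similarities
top5 = sim.topk(5, dim=-1).indices                      # best-matching texts per image
print(top5)
```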
Affiliation(s)
- Jinxi Xiang
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA
- Xiyue Wang
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA
- Xiaoming Zhang
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Yinghua Xi
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA
- Feyisope Eweje
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA
- Yijiang Chen
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA
- Yuchen Li
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA
- Colin Bergstrom
- Department of Medicine (Oncology), Stanford University School of Medicine, Stanford, CA, USA
- Matthew Gopaulchan
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA
- Ted Kim
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA
- Kun-Hsing Yu
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Sierra Willens
- Department of Medicine (Oncology), Stanford University School of Medicine, Stanford, CA, USA
- Francesca Maria Olguin
- Department of Medicine (Oncology), Stanford University School of Medicine, Stanford, CA, USA
- Jeffrey J Nirschl
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Joel Neal
- Department of Medicine (Oncology), Stanford University School of Medicine, Stanford, CA, USA
- Maximilian Diehn
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA
- Sen Yang
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA.
- Ruijiang Li
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA.
- Stanford Institute for Human-Centered Artificial Intelligence, Stanford, CA, USA.
13. Zhou H, Zhou F, Chen H. Cohort-Individual Cooperative Learning for Multimodal Cancer Survival Analysis. IEEE Trans Med Imaging 2025; 44:656-667. PMID: 39240739. DOI: 10.1109/tmi.2024.3455931.
Abstract
Recently, we have witnessed impressive achievements in cancer survival analysis through the integration of multimodal data, e.g., pathology images and genomic profiles. However, the heterogeneity and high dimensionality of these modalities pose significant challenges in extracting discriminative representations while maintaining good generalization. In this paper, we propose a Cohort-individual Cooperative Learning (CCL) framework to advance cancer survival analysis by combining knowledge decomposition and cohort guidance. Specifically, first, we propose a Multimodal Knowledge Decomposition (MKD) module to explicitly decompose multimodal knowledge into four distinct components: redundancy, synergy, and the uniqueness of each of the two modalities. Such a comprehensive decomposition can prompt the model to perceive easily overlooked yet important information, facilitating effective multimodal fusion. Second, we propose a Cohort Guidance Modeling (CGM) module to mitigate the risk of overfitting task-irrelevant information. It promotes a more comprehensive and robust understanding of the underlying multimodal data while avoiding the pitfalls of overfitting and enhancing the generalization ability of the model. By combining the knowledge decomposition and cohort guidance methods, we develop a robust multimodal survival analysis model with enhanced discrimination and generalization abilities. Extensive experimental results on five cancer datasets demonstrate the effectiveness of our model in integrating multimodal data for survival analysis. Our code is available at https://github.com/moothes/CCL-survival.
14. He Y, Huang F, Jiang X, Nie Y, Wang M, Wang J, Chen H. Foundation Model for Advancing Healthcare: Challenges, Opportunities and Future Directions. IEEE Rev Biomed Eng 2025; 18:172-191. PMID: 39531565. DOI: 10.1109/rbme.2024.3496744.
Abstract
Foundation models, trained on diverse data and adaptable to a myriad of tasks, are advancing healthcare. They foster the development of healthcare artificial intelligence (AI) models tailored to the intricacies of the medical field, bridging the gap between limited AI models and the varied nature of healthcare practices. The advancement of healthcare foundation models (HFMs) brings forth tremendous potential to augment intelligent healthcare services across a broad spectrum of scenarios. However, despite the imminent widespread deployment of HFMs, there is currently a lack of clear understanding regarding their operation in the healthcare field, their existing challenges, and their future trajectory. To answer these critical inquiries, we present a comprehensive and in-depth examination of the landscape of HFMs. It begins with a comprehensive overview of HFMs, encompassing their methods, data, and applications, to provide a quick understanding of current progress. Subsequently, it delves into a thorough exploration of the challenges associated with data, algorithms, and computing infrastructures in constructing and widely applying foundation models in healthcare. Furthermore, this survey identifies promising directions for future development in this field. We believe that this survey will enhance the community's understanding of the current progress of HFMs and serve as a valuable source of guidance for future advancements in this domain.
15. Ding L, Fan L, Shen M, Wang Y, Sheng K, Zou Z, An H, Jiang Z. Evaluating ChatGPT's diagnostic potential for pathology images. Front Med (Lausanne) 2025; 11:1507203. PMID: 39917264. PMCID: PMC11798939. DOI: 10.3389/fmed.2024.1507203.
Abstract
Background: Chat Generative Pretrained Transformer (ChatGPT) is a type of large language model (LLM) developed by OpenAI, known for its extensive knowledge base and interactive capabilities. These attributes make it a valuable tool in the medical field, particularly for tasks such as answering medical questions, drafting clinical notes, and optimizing the generation of radiology reports. However, maintaining accuracy in medical contexts is the biggest challenge to employing GPT-4 in a clinical setting. This study aims to investigate the accuracy of GPT-4, which can process both text and image inputs, in generating diagnoses from pathological images. Methods: This study analyzed 44 histopathological images from 16 organs and 100 colorectal biopsy photomicrographs. The initial evaluation was conducted using the standard GPT-4 model in January 2024, with a subsequent re-evaluation performed in July 2024. The diagnostic accuracy of GPT-4 was assessed by comparing its outputs to a reference standard using statistical measures. Additionally, four pathologists independently reviewed the same images to compare their diagnoses with the model's outputs. Both scanned and photographed images were tested to evaluate GPT-4's ability to generalize across different image types. Results: GPT-4 achieved an overall accuracy of 0.64 in identifying tumor images and tissue origins. For colon polyp classification, accuracy varied from 0.57 to 0.75 across subtypes. The model achieved an accuracy of 0.88 in distinguishing low-grade from high-grade dysplasia and 0.75 in distinguishing high-grade dysplasia from adenocarcinoma, with high sensitivity in detecting adenocarcinoma. Consistency between the initial and follow-up evaluations showed slight to moderate agreement, with Kappa values ranging from 0.204 to 0.375. Conclusion: GPT-4 demonstrates the ability to diagnose pathological images, showing improved performance over earlier versions. Its diagnostic accuracy in cancer is comparable to that of pathology residents. These findings suggest that GPT-4 holds promise as a supportive tool in pathology diagnostics, offering the potential to assist pathologists in routine diagnostic workflows.
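The consistency analysis here uses Cohen's kappa between the model's initial and repeat diagnoses of the same images; a minimal sketch with made-up labels:

```python
from sklearn.metrics import cohen_kappa_score

# Made-up diagnosis labels for the same five images at two time points.
initial = ["adenoma", "adenoma", "hyperplastic", "adenocarcinoma", "adenoma"]
repeat = ["adenoma", "hyperplastic", "hyperplastic", "adenocarcinoma", "normal"]

kappa = cohen_kappa_score(initial, repeat)
print(f"kappa = {kappa:.3f}")   # values around 0.2-0.4 read as slight-to-moderate agreement
```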
Affiliation(s)
- Liya Ding
- Department of Pathology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Lei Fan
- Department of Pathology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Department of Pathology, Ninghai County Traditional Chinese Medicine Hospital, Ningbo, China
- Miao Shen
- Department of Pathology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Department of Pathology, Deqing People’s Hospital, Hangzhou, China
- Yawen Wang
- College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China
- Kaiqin Sheng
- Department of Pathology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Zijuan Zou
- Department of Pathology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Huimin An
- Department of Pathology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Zhinong Jiang
- Department of Pathology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
16. Liu X, Liu H, Yang G, Jiang Z, Cui S, Zhang Z, Wang H, Tao L, Sun Y, Song Z, Hong T, Yang J, Gao T, Zhang J, Li X, Zhang J, Sang Y, Yang Z, Xue K, Wu S, Zhang P, Yang J, Song C, Wang G. A generalist medical language model for disease diagnosis assistance. Nat Med 2025. PMID: 39779927. DOI: 10.1038/s41591-024-03416-6.
Abstract
The delivery of accurate diagnoses is crucial in healthcare and represents the gateway to appropriate and timely treatment. Although recent large language models (LLMs) have demonstrated impressive capabilities in few-shot or zero-shot learning, their effectiveness in clinical diagnosis remains unproven. Here we present MedFound, a generalist medical language model with 176 billion parameters, pre-trained on a large-scale corpus derived from diverse medical texts and real-world clinical records. We further fine-tuned MedFound to learn physicians' inferential diagnosis using a self-bootstrapping chain-of-thought strategy and introduced a unified preference alignment framework to align it with standard clinical practice. Extensive experiments demonstrate that our medical LLM outperforms other baseline LLMs and specialized models in in-distribution (common diseases), out-of-distribution (external validation) and long-tailed distribution (rare diseases) scenarios across eight specialties. Further ablation studies indicate the effectiveness of key components in our medical LLM training approach. We conducted a comprehensive evaluation of the clinical applicability of LLMs for diagnosis, comprising an artificial intelligence (AI)-versus-physician comparison, an AI-assistance study and a human evaluation framework. Our proposed framework incorporates eight clinical evaluation metrics, covering capabilities such as medical record summarization, diagnostic reasoning and risk management. Our findings demonstrate the model's feasibility in assisting physicians with disease diagnosis as part of the clinical workflow.
Affiliation(s)
- Xiaohong Liu
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
- Hao Liu
- Department of Orthopedics, Peking University Third Hospital & Beijing Key Laboratory of Spinal Disease & Engineering Research Center of Bone and Joint Precision Medicine, Beijing, China
- Guoxing Yang
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
- Zeyu Jiang
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
- Shuguang Cui
- School of Science and Engineering (SSE), Future Network of Intelligence Institute (FNii) and Guangdong Provincial Key Laboratory of Future Networks of Intelligence, Chinese University of Hong Kong, Shenzhen, China
- Zhaoze Zhang
- Department of Orthopedics, Peking University Third Hospital & Beijing Key Laboratory of Spinal Disease & Engineering Research Center of Bone and Joint Precision Medicine, Beijing, China
- Huan Wang
- Department of Orthopedics, Peking University Third Hospital & Beijing Key Laboratory of Spinal Disease & Engineering Research Center of Bone and Joint Precision Medicine, Beijing, China
- Liyuan Tao
- Research Center of Clinical Epidemiology, Peking University Third Hospital, Beijing, China
| | - Yongchang Sun
- Department of Respiratory and Critical Care Medicine, Peking University Third Hospital and Research Center for Chronic Airway Diseases, Peking University Health Science Center, Beijing, China
| | - Zhu Song
- Department of Respiratory and Critical Care Medicine, Peking University Third Hospital and Research Center for Chronic Airway Diseases, Peking University Health Science Center, Beijing, China
| | - Tianpei Hong
- Department of Endocrinology and Metabolism, Peking University Third Hospital, Beijing, China
| | - Jin Yang
- Department of Endocrinology and Metabolism, Peking University Third Hospital, Beijing, China
| | - Tianrun Gao
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
| | - Jiangjiang Zhang
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
| | - Xiaohu Li
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
| | - Jing Zhang
- Department of Cardiology, The First College of Clinical Medical Science, China Three Gorges University and Yichang Central People's Hospital, Yichang, China
| | - Ye Sang
- Department of Cardiology, The First College of Clinical Medical Science, China Three Gorges University and Yichang Central People's Hospital, Yichang, China
| | - Zhao Yang
- Peking University First Hospital and Research Center of Public Policy, Peking University, Beijing, China
| | - Kanmin Xue
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK
| | - Song Wu
- South China Hospital, Medical School, Shenzhen University, Shenzhen, China
| | - Ping Zhang
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
| | - Jian Yang
- Department of Cardiology, The First College of Clinical Medical Science, China Three Gorges University and Yichang Central People's Hospital, Yichang, China.
| | - Chunli Song
- Department of Orthopedics, Peking University Third Hospital & Beijing Key Laboratory of Spinal Disease & Engineering Research Center of Bone and Joint Precision Medicine, Beijing, China.
| | - Guangyu Wang
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China.
| |
Collapse
|
17
|
Huang Y, Dou H, Ni D. A foundation model unlocks unified biomedical image analysis. Nat Methods 2025; 22:18-19. [PMID: 39558097 DOI: 10.1038/s41592-024-02519-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2024]
Affiliation(s)
- Yuhao Huang
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Medical School, Shenzhen University, Shenzhen, China
- Medical Ultrasound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China
| | - Haoran Dou
- Department of Computer Science, School of Engineering, University of Manchester, Manchester, UK
| | - Dong Ni
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, School of Biomedical Engineering, Medical School, Shenzhen University, Shenzhen, China.
- Medical Ultrasound Image Computing (MUSIC) Lab, Shenzhen University, Shenzhen, China.
| |
Collapse
|
18
|
Liu Y, Zhang L, Gu M, Xiao Y, Yu T, Tao X, Zhang Q, Wang Y, Shen D, Li Q. Inspect quantitative signals in placental histopathology: Computer-assisted multiple functional tissues identification through multi-model fusion and distillation framework. Comput Med Imaging Graph 2025; 119:102482. [PMID: 39746224 DOI: 10.1016/j.compmedimag.2024.102482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2024] [Revised: 12/09/2024] [Accepted: 12/20/2024] [Indexed: 01/04/2025]
Abstract
Pathological analysis of the placenta is currently a valuable tool for gaining insights into pregnancy outcomes. In placental histopathology, multiple functional tissues can be inspected as potential signals reflecting the transfer functionality between fetal and maternal circulations. However, the identification of multiple functional tissues is challenging due to (1) severe heterogeneity in texture, size and shape, (2) distribution across different scales and (3) the need for comprehensive assessment at the whole slide image (WSI) level. To solve the aforementioned problems, we establish a new dataset and propose a computer-aided segmentation framework based on multi-model fusion and distillation to identify multiple functional tissues in placental histopathologic images, including villi, capillaries, fibrin deposits and trophoblast aggregations. Specifically, we propose a two-stage Multi-model Fusion and Distillation (MMFD) framework. Considering the multi-scale distribution and heterogeneity of multiple functional tissues, in the first stage we enhance the visual representation by fusing features from multiple models to boost the effectiveness of the network. However, the multi-model fusion stage introduces extra parameters and a significant computational burden, which is impractical for processing gigapixel WSIs in clinical practice. In the second stage, we propose a straightforward plug-in feature distillation method that transfers knowledge from the large fused model to a compact student model. On our self-collected placental dataset, the proposed MMFD framework demonstrates an improvement of 4.3% in mean Intersection over Union (mIoU) while achieving an approximately 50% increase in inference speed and using only 10% of the parameters and computational resources, compared to the parameter-efficient fine-tuned Segment Anything Model (SAM) baseline. Visualization of segmentation results across entire WSIs on unseen cases demonstrates the generalizability of the proposed MMFD framework. In addition, experimental results on a public dataset further demonstrate the effectiveness of the MMFD framework on other tasks. Our work presents a fundamental method to expedite quantitative analysis of placental histopathology.
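To make the two-stage idea concrete, here is a minimal PyTorch sketch of the second-stage feature distillation: a compact student is trained to match frozen, fused teacher features through an MSE objective. The stand-in encoders, feature dimensions and 1x1 projection head are assumptions for illustration, not the paper's architecture.

```python
# Minimal sketch of second-stage feature distillation: a compact student is
# trained to match fused teacher features via an MSE loss.
import torch
import torch.nn as nn

teacher_a = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU())  # stand-in encoder 1
teacher_b = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU())  # stand-in encoder 2
student = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
project = nn.Conv2d(32, 128, 1)  # plug-in head mapping student features to teacher space

opt = torch.optim.Adam(list(student.parameters()) + list(project.parameters()), lr=1e-3)
x = torch.randn(2, 3, 64, 64)  # a toy batch of histology patches

with torch.no_grad():  # the large fused teacher is frozen
    fused = torch.cat([teacher_a(x), teacher_b(x)], dim=1)  # multi-model fusion
loss = nn.functional.mse_loss(project(student(x)), fused)   # distillation objective
opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```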
Collapse
Affiliation(s)
- Yiming Liu
- Shanghai Key Laboratory of Multidimensional Information Processing, School of Communication and Electronic Engineering, East China Normal University, Shanghai 200241, China
| | - Ling Zhang
- Shanghai Key Laboratory of Multidimensional Information Processing, School of Communication and Electronic Engineering, East China Normal University, Shanghai 200241, China
| | - Mingxue Gu
- Shanghai Key Laboratory of Multidimensional Information Processing, School of Communication and Electronic Engineering, East China Normal University, Shanghai 200241, China
| | - Yaoxing Xiao
- Department of Pathology, Obstetrics and Gynecology, Hospital of Fudan University, Shanghai 200090, China
| | - Ting Yu
- Department of Pathology, Obstetrics and Gynecology, Hospital of Fudan University, Shanghai 200090, China
| | - Xiang Tao
- Department of Pathology, Obstetrics and Gynecology, Hospital of Fudan University, Shanghai 200090, China
| | - Qing Zhang
- Shanghai Key Laboratory of Multidimensional Information Processing, School of Communication and Electronic Engineering, East China Normal University, Shanghai 200241, China
| | - Yan Wang
- Shanghai Key Laboratory of Multidimensional Information Processing, School of Communication and Electronic Engineering, East China Normal University, Shanghai 200241, China
| | - Dinggang Shen
- School of Biomedical Engineering, ShanghaiTech University, Shanghai 200000, China; Shanghai United Imaging Healthcare Co., Ltd., Shanghai 200233, China
| | - Qingli Li
- Shanghai Key Laboratory of Multidimensional Information Processing, School of Communication and Electronic Engineering, East China Normal University, Shanghai 200241, China.
| |
Collapse
|
19
|
Zheng Y, Wu K, Li J, Tang K, Shi J, Wu H, Jiang Z, Wang W. Partial-Label Contrastive Representation Learning for Fine-Grained Biomarkers Prediction From Histopathology Whole Slide Images. IEEE J Biomed Health Inform 2025; 29:396-408. [PMID: 39012745 DOI: 10.1109/jbhi.2024.3429188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/18/2024]
Abstract
In the domain of histopathology analysis, existing representation learning methods for biomarker prediction from whole slide images (WSIs) face challenges due to the complexity of tissue subtypes and label noise. This paper proposes a novel partial-label contrastive representation learning approach to enhance the discrimination of histopathology image representations for fine-grained biomarker prediction. We designed a partial-label contrastive clustering (PLCC) module for partial-label disambiguation and a dynamic clustering algorithm that samples the most representative features of each category into the clustering queue during the contrastive learning process. We conducted comprehensive experiments on three gene mutation prediction datasets, including USTC-EGFR, BRCA-HER2, and TCGA-EGFR. The results show that our method outperforms nine existing methods in terms of accuracy, AUC, and F1 score. Specifically, our method achieved an AUC of 0.950 in EGFR mutation subtyping on TCGA-EGFR and an AUC of 0.853 in HER2 0/1+/2+/3+ grading on BRCA-HER2, demonstrating its superiority in fine-grained biomarker prediction from histopathology whole slide images.
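The core of partial-label disambiguation can be sketched in a few lines: model confidence is masked to the candidate label set and renormalized into soft targets. The toy PyTorch example below illustrates only that step under assumed shapes; the paper's queue-based contrastive clustering is not reproduced.

```python
# Minimal sketch of partial-label disambiguation: confidence over a candidate
# label set is renormalized from model predictions, then used as soft targets.
import torch
import torch.nn.functional as F

logits = torch.randn(4, 5)                    # model outputs for 4 patches, 5 classes
candidates = torch.tensor([[1, 1, 0, 0, 0],   # per-sample candidate label masks
                           [0, 1, 1, 0, 0],
                           [1, 0, 0, 1, 0],
                           [0, 0, 1, 1, 1]]).float()

probs = F.softmax(logits, dim=1) * candidates      # zero out non-candidate classes
targets = probs / probs.sum(dim=1, keepdim=True)   # renormalize within the candidate set
loss = -(targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
print(float(loss))
```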
Collapse
|
20
|
Rahman T, Baras AS, Chellappa R. Evaluation of a Task-Specific Self-Supervised Learning Framework in Digital Pathology Relative to Transfer Learning Approaches and Existing Foundation Models. Mod Pathol 2025; 38:100636. [PMID: 39455029 DOI: 10.1016/j.modpat.2024.100636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2024] [Revised: 09/06/2024] [Accepted: 10/15/2024] [Indexed: 10/28/2024]
Abstract
An integral stage in typical digital pathology workflows involves deriving specific features from tiles extracted from a tessellated whole-slide image. Notably, various computer vision neural network architectures, particularly ImageNet-pretrained ones, have been extensively used in this domain. This study critically analyzes multiple strategies for encoding tiles to understand the extent of transfer learning and identify the most effective approach. The study categorizes neural network performance by three weight-initialization methods: random, ImageNet-based, and self-supervised learning. Additionally, we propose a framework based on task-specific self-supervised learning, which introduces a shallow feature extraction method employing a spatial-channel attention block to glean distinctive features optimized for histopathology intricacies. Across two downstream classification tasks (patch classification and weakly supervised whole-slide image classification) with diverse datasets, including colorectal cancer histology, Patch Camelyon, prostate cancer detection, The Cancer Genome Atlas, and CIFAR-10, our task-specific self-supervised encoding approach consistently outperforms other convolutional neural network-based encoders. The improved performance highlights the potential of task-specific, attention-based self-supervised training in tailoring feature extraction for histopathology, indicating a shift away from pretrained models originating outside the histopathology domain. Our study supports the idea that task-specific self-supervised learning allows domain-specific feature extraction, encouraging a more focused analysis.
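For readers unfamiliar with the building block, here is a minimal PyTorch sketch of a spatial-channel attention module of the kind described; the layer sizes, reduction ratio and kernel size are illustrative assumptions rather than the paper's exact design.

```python
# Minimal sketch of a spatial-channel attention block attached to a shallow
# feature extractor; all hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class SpatialChannelAttention(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        # Channel attention: squeeze spatial dims, excite per-channel weights.
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(), nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())
        # Spatial attention: a 1-channel map over locations.
        self.spatial = nn.Sequential(nn.Conv2d(channels, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, x):
        x = x * self.channel(x)     # reweight channels
        return x * self.spatial(x)  # reweight locations

feats = torch.randn(2, 64, 56, 56)  # toy tile features
print(SpatialChannelAttention(64)(feats).shape)
```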
Collapse
Affiliation(s)
- Tawsifur Rahman
- Department of Biomedical Engineering, Johns Hopkins School of Medicine, Baltimore, Maryland.
| | - Alexander S Baras
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
| | - Rama Chellappa
- Department of Biomedical Engineering, Johns Hopkins School of Medicine, Baltimore, Maryland; Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, Maryland
| |
Collapse
|
21
|
Zhao Q, Nooner KB, Tapert SF, Adeli E, Pohl KM, Kuceyeski A, Sabuncu MR. The Transition From Homogeneous to Heterogeneous Machine Learning in Neuropsychiatric Research. Biol Psychiatry Glob Open Sci 2025; 5:100397. [PMID: 39526023 PMCID: PMC11546160 DOI: 10.1016/j.bpsgos.2024.100397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2024] [Revised: 09/17/2024] [Accepted: 09/18/2024] [Indexed: 11/16/2024] Open
Abstract
Although neuroimaging-based machine learning (ML) models are pivotal tools for investigating brain-behavior relationships in neuropsychiatric studies, these data-driven predictive approaches have yet to yield substantial, clinically actionable insights for mental health care. A notable impediment lies in the inadequate accommodation of most ML research to the natural heterogeneity within large samples. Although commonly thought of as individual-level analyses, many ML algorithms are unimodal and homogeneous and thus incapable of capturing the potentially heterogeneous relationships between biology and psychopathology. We review the current landscape of computational research targeting population heterogeneity and argue that there is a need to expand from brain subtyping and behavioral phenotyping to analyses that focus on heterogeneity at the relational level. To this end, we review and suggest several existing ML models with the capacity to discern how external environmental and sociodemographic factors moderate the brain-behavior mapping function in a data-driven fashion. These heterogeneous ML models hold promise for enhancing the discovery of individualized brain-behavior associations and advancing precision psychiatry.
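One simple way to picture a "relational-level" heterogeneous model is a varying-coefficient network in which moderators generate the weights of the brain-to-behavior regression. The toy PyTorch sketch below illustrates that idea only; it is an assumption for exposition, not one of the specific models the review covers.

```python
# Minimal sketch of a moderated brain-behavior mapping: a small network maps
# moderators (e.g., sociodemographic variables) to regression weights, so the
# brain-to-behavior mapping varies across individuals. Purely illustrative.
import torch
import torch.nn as nn

n_brain, n_mod = 10, 3
weight_net = nn.Sequential(nn.Linear(n_mod, 32), nn.ReLU(), nn.Linear(32, n_brain))

brain = torch.randn(8, n_brain)         # toy imaging-derived features
mods = torch.randn(8, n_mod)            # toy moderator variables
w = weight_net(mods)                    # individualized regression weights
behavior_hat = (w * brain).sum(dim=1)   # per-subject moderated prediction
print(behavior_hat.shape)
```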
Collapse
Affiliation(s)
- Qingyu Zhao
- Department of Radiology, Weill Cornell Medicine, New York, New York
| | - Kate B. Nooner
- Department of Psychology, University of North Carolina Wilmington, Wilmington, North Carolina
| | - Susan F. Tapert
- Department of Psychiatry, University of California San Diego, La Jolla, California
| | - Ehsan Adeli
- Department of Psychiatry & Behavioral Sciences, Stanford University, Stanford, California
- Department of Computer Science, Stanford University, Stanford, California
| | - Kilian M. Pohl
- Department of Psychiatry & Behavioral Sciences, Stanford University, Stanford, California
- Department of Electrical Engineering, Stanford University, Stanford, California
| | - Amy Kuceyeski
- Department of Radiology, Weill Cornell Medicine, New York, New York
| | - Mert R. Sabuncu
- Department of Radiology, Weill Cornell Medicine, New York, New York
- School of Electrical and Computer Engineering, Cornell University and Cornell Tech, New York, New York
| |
Collapse
|
22
|
Nunes JD, Montezuma D, Oliveira D, Pereira T, Cardoso JS. A survey on cell nuclei instance segmentation and classification: Leveraging context and attention. Med Image Anal 2025; 99:103360. [PMID: 39383642 DOI: 10.1016/j.media.2024.103360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2023] [Revised: 08/26/2024] [Accepted: 09/27/2024] [Indexed: 10/11/2024]
Abstract
Nuclear-derived morphological features and biomarkers provide relevant insights regarding the tumour microenvironment, while also allowing diagnosis and prognosis in specific cancer types. However, manually annotating nuclei in gigapixel Haematoxylin and Eosin (H&E)-stained Whole Slide Images (WSIs) is a laborious and costly task, so automated algorithms for cell nuclei instance segmentation and classification could alleviate the workload of pathologists and clinical researchers while facilitating the automatic extraction of clinically interpretable features for artificial intelligence (AI) tools. However, owing to the high intra- and inter-class variability of nuclear morphological and chromatic features, as well as the susceptibility of H&E stains to artefacts, state-of-the-art algorithms cannot yet detect and classify instances with the necessary performance. In this work, we hypothesize that context and attention inductive biases in artificial neural networks (ANNs) could increase the performance and generalization of algorithms for cell nuclei instance segmentation and classification. To understand the advantages, use cases, and limitations of context- and attention-based mechanisms in instance segmentation and classification, we start by reviewing works in computer vision and medical imaging. We then conduct a thorough survey on context and attention methods for cell nuclei instance segmentation and classification from H&E-stained microscopy imaging, providing a comprehensive discussion of the challenges being tackled with context and attention. Besides, we illustrate some limitations of current approaches and present ideas for future research. As a case study, we extend both a general (Mask-RCNN) and a customized (HoVer-Net) instance segmentation and classification method with context- and attention-based mechanisms and perform a comparative analysis on a multicentre dataset for colon nuclei identification and counting. Although pathologists rely on context at multiple levels while paying attention to specific Regions of Interest (RoIs) when analysing and annotating WSIs, our findings suggest that translating this domain knowledge into algorithm design is no trivial task; to fully exploit these mechanisms in ANNs, the scientific understanding of the methods themselves should first be addressed.
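The generic mechanism the case study adds, global-context self-attention on top of local convolutional features, can be sketched in a few lines of PyTorch. The placement, sizes and residual wiring below are illustrative assumptions, not the survey's exact extensions of Mask-RCNN or HoVer-Net.

```python
# Minimal sketch of injecting a global-context self-attention layer on top of
# a CNN feature map; sizes and placement are illustrative assumptions.
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 32, 3, padding=1)
attn = nn.MultiheadAttention(embed_dim=32, num_heads=4, batch_first=True)

x = torch.randn(1, 3, 64, 64)            # a toy H&E tile
f = conv(x)                              # local CNN features
b, c, h, w = f.shape
tokens = f.flatten(2).transpose(1, 2)    # (B, H*W, C): one token per location
ctx, _ = attn(tokens, tokens, tokens)    # every location attends to all others
f = f + ctx.transpose(1, 2).reshape(b, c, h, w)  # residual context injection
print(f.shape)
```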
Collapse
Affiliation(s)
- João D Nunes
- INESC TEC - Institute for Systems and Computer Engineering, Technology and Science, R. Dr. Roberto Frias, Porto, 4200-465, Portugal; University of Porto - Faculty of Engineering, R. Dr. Roberto Frias, Porto, 4200-465, Portugal.
| | - Diana Montezuma
- IMP Diagnostics, Praça do Bom Sucesso, 4150-146 Porto, Portugal; Cancer Biology and Epigenetics Group, Research Center of IPO Porto (CI-IPOP)/[RISE@CI-IPOP], Portuguese Oncology Institute of Porto (IPO Porto)/Porto Comprehensive Cancer Center (Porto.CCC), R. Dr. António Bernardino de Almeida, 4200-072, Porto, Portugal; Doctoral Programme in Medical Sciences, School of Medicine and Biomedical Sciences - University of Porto (ICBAS-UP), Porto, Portugal
| | | | - Tania Pereira
- INESC TEC - Institute for Systems and Computer Engineering, Technology and Science, R. Dr. Roberto Frias, Porto, 4200-465, Portugal; FCTUC - Faculty of Science and Technology, University of Coimbra, Coimbra, 3004-516, Portugal
| | - Jaime S Cardoso
- INESC TEC - Institute for Systems and Computer Engineering, Technology and Science, R. Dr. Roberto Frias, Porto, 4200-465, Portugal; University of Porto - Faculty of Engineering, R. Dr. Roberto Frias, Porto, 4200-465, Portugal
| |
Collapse
|
23
|
Komura D, Ochi M, Ishikawa S. Machine learning methods for histopathological image analysis: Updates in 2024. Comput Struct Biotechnol J 2024; 27:383-400. [PMID: 39897057 PMCID: PMC11786909 DOI: 10.1016/j.csbj.2024.12.033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2024] [Revised: 12/23/2024] [Accepted: 12/26/2024] [Indexed: 02/04/2025] Open
Abstract
The combination of artificial intelligence and digital pathology has emerged as a transformative force in healthcare and biomedical research. As an update to our 2018 review, this review presents a comprehensive analysis of machine learning applications in histopathological image analysis, with a focus on developments since 2018. We highlight significant advances that have expanded the technical capabilities and practical applications of computational pathology. The review examines progress in addressing key challenges in the field: processing gigapixel whole slide images, learning from insufficient labeled data, multidimensional analysis, domain shifts across institutions, and the interpretability of machine learning models. We evaluate emerging trends, such as foundation models and multimodal integration, that are reshaping the field. Overall, our review highlights the potential of machine learning in enhancing both routine pathological analysis and scientific discovery in pathology. By providing this comprehensive overview, we aim to guide researchers and clinicians in understanding the current state of the pathology image analysis field and its future trajectory.
Collapse
Affiliation(s)
- Daisuke Komura
- Department of Preventive Medicine, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
| | - Mieko Ochi
- Department of Preventive Medicine, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
| | - Shumpei Ishikawa
- Department of Preventive Medicine, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
| |
Collapse
|
24
|
Zhao W, Guo Z, Fan Y, Jiang Y, Yeung MCF, Yu L. Aligning knowledge concepts to whole slide images for precise histopathology image analysis. NPJ Digit Med 2024; 7:383. [PMID: 39738468 DOI: 10.1038/s41746-024-01411-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2024] [Accepted: 12/22/2024] [Indexed: 01/02/2025] Open
Abstract
Due to their large size and lack of fine-grained annotations, the analysis of Whole Slide Images (WSIs) is commonly approached as a Multiple Instance Learning (MIL) problem. However, previous studies learn only from training data, in stark contrast to how human clinicians teach each other and reason about histopathologic entities and factors. Here, we present a novel knowledge concept-based MIL framework, named ConcepPath, to fill this gap. Specifically, ConcepPath utilizes GPT-4 to induce reliable disease-specific human expert concepts from the medical literature and combines them with a group of purely learnable concepts to extract complementary knowledge from training data. In ConcepPath, WSIs are aligned to these linguistic knowledge concepts by utilizing a pathology vision-language model as the basic building component. In lung cancer subtyping, breast cancer HER2 scoring, and gastric cancer immunotherapy-sensitivity subtyping tasks, ConcepPath significantly outperformed previous state-of-the-art (SOTA) methods, which lacked the guidance of human expert knowledge.
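The basic alignment operation, scoring every patch embedding against every concept embedding in a shared vision-language space, can be sketched as follows. The random embeddings, temperature and softmax-weighted aggregation are illustrative assumptions standing in for the pathology vision-language model.

```python
# Minimal sketch of aligning patch embeddings to linguistic concept embeddings
# by cosine similarity; the embeddings here are random stand-ins.
import torch
import torch.nn.functional as F

patches = F.normalize(torch.randn(100, 512), dim=1)   # toy patch embeddings of one WSI
concepts = F.normalize(torch.randn(6, 512), dim=1)    # toy expert + learnable concepts

sim = patches @ concepts.t()            # (patches, concepts) similarity
attn = F.softmax(sim / 0.07, dim=0)     # how strongly each patch supports a concept
slide_feat = attn.t() @ patches         # concept-guided slide-level features
print(slide_feat.shape)                 # one aggregated feature vector per concept
```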
Collapse
Affiliation(s)
- Weiqin Zhao
- School of Computing and Data Science, The University of Hong Kong, Hong Kong SAR, China
| | - Ziyu Guo
- School of Computing and Data Science, The University of Hong Kong, Hong Kong SAR, China
| | - Yinshuang Fan
- School of Computing and Data Science, The University of Hong Kong, Hong Kong SAR, China
| | - Yuming Jiang
- School of Medicine, Wake Forest University, Winston-Salem, NC, USA
| | - Maximus C F Yeung
- Department of Pathology, The University of Hong Kong, Hong Kong SAR, China.
| | - Lequan Yu
- School of Computing and Data Science, The University of Hong Kong, Hong Kong SAR, China.
| |
Collapse
|
25
|
Xia Y, Yu Z. Thorny but rosy: prosperities and difficulties in 'AI plus medicine' concerning data collection, model construction and clinical deployment. Gen Psychiatr 2024; 37:e101436. [PMID: 39717668 PMCID: PMC11664349 DOI: 10.1136/gpsych-2023-101436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/26/2023] [Accepted: 11/11/2024] [Indexed: 12/25/2024] Open
Affiliation(s)
- Yujia Xia
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- SJTU-Yale Joint Center for Biostatistics and Data Science, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Zhangsheng Yu
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- SJTU-Yale Joint Center for Biostatistics and Data Science, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Clinical Research Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| |
Collapse
|
26
|
Tian Y, Li Z, Jin Y, Wang M, Wei X, Zhao L, Liu Y, Liu J, Liu C. Foundation model of ECG diagnosis: Diagnostics and explanations of any form and rhythm on ECG. Cell Rep Med 2024; 5:101875. [PMID: 39694017 PMCID: PMC11722092 DOI: 10.1016/j.xcrm.2024.101875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2024] [Revised: 09/21/2024] [Accepted: 11/21/2024] [Indexed: 12/20/2024]
Abstract
We propose a knowledge-enhanced electrocardiogram (ECG) diagnosis foundation model (KED) that utilizes large language models to incorporate domain-specific knowledge of ECG signals. The model is trained on 800,000 ECGs from nearly 160,000 unique patients. Despite being trained on single-center data, KED demonstrates exceptional zero-shot diagnosis performance across various regions, including different locales in China, the United States, and other regions. This performance spans all age groups and a range of conditions, including morphological abnormalities, rhythm abnormalities, conduction blocks, hypertrophy, myocardial ischemia, and infarction. Moreover, KED exhibits robust performance on diseases it has not encountered during training. When compared with three experienced cardiologists on real clinical datasets, the model achieves comparable performance in zero-shot diagnosis of seven common clinical ECG types. We concentrate on the zero-shot diagnostic capability and generalization performance of the proposed ECG foundation model, particularly in the context of external multi-center data and previously unseen diseases.
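Zero-shot ECG diagnosis of this kind typically rests on contrastively aligning a signal encoder with a text encoder so that, at inference, an ECG can be scored against embeddings of candidate diagnosis descriptions. The sketch below shows the symmetric contrastive loss with toy embeddings; it is a generic pattern, not KED's exact training objective.

```python
# Minimal sketch of contrastive signal-text alignment underlying zero-shot ECG
# diagnosis: paired ECG and report embeddings are pulled together with a
# symmetric InfoNCE loss. Encoders are toy stand-ins.
import torch
import torch.nn.functional as F

ecg = F.normalize(torch.randn(8, 256), dim=1)    # toy ECG-encoder outputs
text = F.normalize(torch.randn(8, 256), dim=1)   # toy report-encoder outputs

logits = ecg @ text.t() / 0.07                   # similarity of every ECG-text pair
labels = torch.arange(8)                         # the i-th ECG matches the i-th report
loss = (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels)) / 2
print(float(loss))
# At inference, zero-shot diagnosis scores an ECG against embeddings of
# candidate diagnosis descriptions instead of paired reports.
```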
Collapse
Affiliation(s)
- Yuanyuan Tian
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai 200240, China.
| | - Zhiyuan Li
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yanrui Jin
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Mengxiao Wang
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Xiaoyang Wei
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Liqun Zhao
- Department of Cardiology, Shanghai First People's Hospital Affiliated to Shanghai Jiao Tong University, Shanghai 200080, China
| | - Yunqing Liu
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Jinlei Liu
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Chengliang Liu
- State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai 200240, China.
| |
Collapse
|
27
|
Yang Y, Shen H, Chen K, Li X. From pixels to patients: the evolution and future of deep learning in cancer diagnostics. Trends Mol Med 2024:S1471-4914(24)00310-1. [PMID: 39665958 DOI: 10.1016/j.molmed.2024.11.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2024] [Revised: 10/15/2024] [Accepted: 11/14/2024] [Indexed: 12/13/2024]
Abstract
Deep learning has revolutionized cancer diagnostics, shifting from pixel-based image analysis to more comprehensive, patient-centric care. This opinion article explores recent advancements in neural network architectures, highlighting their evolution in biomedical research and their impact on medical imaging interpretation and multimodal data integration. We emphasize the need for domain-specific artificial intelligence (AI) systems capable of handling complex clinical tasks, advocating for the development of multimodal large language models that can integrate diverse data sources. These models have the potential to significantly enhance the precision and efficiency of cancer diagnostics, transforming AI from a supplementary tool into a core component of clinical decision-making, ultimately improving patient outcomes and advancing cancer care.
Collapse
Affiliation(s)
- Yichen Yang
- Tianjin Cancer Institute, Tianjin's Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy of Tianjin, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Hongru Shen
- Tianjin Cancer Institute, Tianjin's Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy of Tianjin, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Kexin Chen
- Department of Epidemiology and Biostatistics, Key Laboratory of Molecular Cancer Epidemiology of Tianjin, Key Laboratory of Prevention and Control of Human Major Diseases in Ministry of Education, Tianjin's Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China.
| | - Xiangchun Li
- Tianjin Cancer Institute, Tianjin's Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy of Tianjin, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China.
| |
Collapse
|
28
|
Yu J, Ma T, Chen F, Zhang J, Xu Y. Task-driven framework using large models for digital pathology. Commun Biol 2024; 7:1619. [PMID: 39632974 PMCID: PMC11618297 DOI: 10.1038/s42003-024-07303-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2024] [Accepted: 11/22/2024] [Indexed: 12/07/2024] Open
Abstract
Microscopy is an indispensable tool for collecting biomedical information in pathological diagnosis, but manual annotation, measurement and interpretation are labor-intensive and costly. Here, we propose a task-driven framework powered by large models that excel in visual analysis and real-time control, paving the way for the next generation of microscopes. We achieve proof-of-concept success on clinical tasks, specifically in adaptive analysis of H&E-stained liver tissue slides. This work demonstrates the advanced capabilities for future digital pathology, setting a new standard for intelligent, efficient, and real-time analysis in clinical applications.
Collapse
Affiliation(s)
- Jiahui Yu
- Department of Biomedical Engineering, MOE Key Laboratory of Biomedical Engineering, State Key Laboratory of Extreme Photonics and Instrumentation, Zhejiang Provincial Key Laboratory of Cardio-Cerebral Vascular Detection Technology and Medicinal Effectiveness Appraisal, Zhejiang University, Hangzhou, China
- Innovation Center for Smart Medical Technologies and Devices, Binjiang Institute of Zhejiang University, Hangzhou, China
| | - Tianyu Ma
- Innovation Center for Smart Medical Technologies and Devices, Binjiang Institute of Zhejiang University, Hangzhou, China
| | - Feng Chen
- Department of Radiology, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
| | - Jing Zhang
- Department of Pathology, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- National Human Brain Bank for Health and Disease, Zhejiang University, Hangzhou, Zhejiang, China
| | - Yingke Xu
- Department of Biomedical Engineering, MOE Key Laboratory of Biomedical Engineering, State Key Laboratory of Extreme Photonics and Instrumentation, Zhejiang Provincial Key Laboratory of Cardio-Cerebral Vascular Detection Technology and Medicinal Effectiveness Appraisal, Zhejiang University, Hangzhou, China.
- Innovation Center for Smart Medical Technologies and Devices, Binjiang Institute of Zhejiang University, Hangzhou, China.
- Department of Endocrinology, Children's Hospital of Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Hangzhou, China.
| |
Collapse
|
29
|
Tak D, Garomsa BA, Chaunzwa TL, Zapaishchykova A, Climent Pardo JC, Ye Z, Zielke J, Ravipati Y, Vajapeyam S, Mahootiha M, Smith C, Familiar AM, Liu KX, Prabhu S, Bandopadhayay P, Nabavizadeh A, Mueller S, Aerts HJWL, Huang RY, Poussaint TY, Kann BH. A foundation model for generalized brain MRI analysis. medRxiv 2024:2024.12.02.24317992. [PMID: 39677480 PMCID: PMC11643205 DOI: 10.1101/2024.12.02.24317992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 12/17/2024]
Abstract
Artificial intelligence (AI) applied to brain magnetic resonance imaging (MRI) has the potential to improve disease diagnosis and management but requires algorithms with generalizable knowledge that can perform well in a variety of clinical scenarios. The field has been constrained, thus far, by limited training data and task-specific models that do not generalize well across patient populations and medical tasks. Foundation models, by leveraging self-supervised learning, pretraining, and targeted adaptation, present a promising paradigm to overcome these limitations. Here, we present Brain Imaging Adaptive Core (BrainIAC), a novel foundation model designed to learn generalized representations from unlabeled brain MRI data and serve as a core basis for diverse downstream application adaptation. Trained and validated on 48,519 brain MRIs across a broad spectrum of tasks, we demonstrate that BrainIAC outperforms localized supervised training and other pretrained models, particularly in low-data settings and high-difficulty tasks, allowing for application in scenarios otherwise infeasible. BrainIAC can be integrated into imaging pipelines and multimodal frameworks and may lead to improved biomarker discovery and AI clinical translation.
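The "targeted adaptation" pattern, freezing a pretrained core and training only a small task head, is easy to sketch. Below is a minimal PyTorch linear-probe example using torchvision's ResNet-18 with random weights as a stand-in backbone; BrainIAC's actual architecture and adaptation recipe are described in the paper.

```python
# Minimal sketch of adapting a pretrained core to a downstream task with a
# linear probe, the generic low-data adaptation pattern the paper targets.
import torch
import torch.nn as nn
from torchvision.models import resnet18

backbone = resnet18(weights=None)     # pretend these are foundation-model weights
backbone.fc = nn.Identity()           # expose 512-d representations
for p in backbone.parameters():
    p.requires_grad = False           # freeze the generalized representation

head = nn.Linear(512, 2)              # tiny task-specific classifier
opt = torch.optim.Adam(head.parameters(), lr=1e-3)

x, y = torch.randn(4, 3, 224, 224), torch.tensor([0, 1, 0, 1])  # toy MRI slices
with torch.no_grad():
    z = backbone(x)
loss = nn.functional.cross_entropy(head(z), y)
opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```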
Collapse
Affiliation(s)
- Divyanshu Tak
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, United States
- Department of Radiation Oncology, Dana-Farber Cancer Institute and Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| | - Biniam A. Garomsa
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, United States
- Department of Radiation Oncology, Dana-Farber Cancer Institute and Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| | - Tafadzwa L. Chaunzwa
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, United States
- Department of Radiation Oncology, Dana-Farber Cancer Institute and Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
- Memorial Sloan Kettering Cancer Center, New York, United States
| | - Anna Zapaishchykova
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, United States
- Department of Radiation Oncology, Dana-Farber Cancer Institute and Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| | - Juan Carlos Climent Pardo
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, United States
- Department of Radiation Oncology, Dana-Farber Cancer Institute and Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| | - Zezhong Ye
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, United States
- Department of Radiation Oncology, Dana-Farber Cancer Institute and Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| | - John Zielke
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, United States
- Department of Radiation Oncology, Dana-Farber Cancer Institute and Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| | - Yashwanth Ravipati
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, United States
- Department of Radiation Oncology, Dana-Farber Cancer Institute and Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| | - Sri Vajapeyam
- Boston Children’s Hospital, Boston, MA, United States
| | - Maryam Mahootiha
- Department of Radiation Oncology, Dana-Farber Cancer Institute and Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| | - Ceilidh Smith
- Department of Radiation Oncology, Dana-Farber Cancer Institute and Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
| | | | - Kevin X. Liu
- Boston Children’s Hospital, Boston, MA, United States
| | - Sanjay Prabhu
- Boston Children’s Hospital, Boston, MA, United States
| | - Pratiti Bandopadhayay
- Boston Children’s Hospital, Boston, MA, United States
- Dana-Farber Cancer Institute, Boston, MA, United States
| | - Ali Nabavizadeh
- Children’s Hospital of Philadelphia, Philadelphia, United States
- University of Pennsylvania, Pennsylvania, United States
| | - Sabine Mueller
- Department of Neurology, Neurosurgery and Pediatrics, University of California, San Francisco, United States
| | - Hugo JWL Aerts
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, United States
- Department of Radiation Oncology, Dana-Farber Cancer Institute and Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
- Radiology and Nuclear Medicine, CARIM & GROW, Maastricht University, Maastricht, the Netherlands
| | - Raymond Y. Huang
- Boston Children’s Hospital, Boston, MA, United States
- Dana-Farber Cancer Institute, Boston, MA, United States
- Department of Radiology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
| | - Tina Y. Poussaint
- Boston Children’s Hospital, Boston, MA, United States
- Dana-Farber Cancer Institute, Boston, MA, United States
| | - Benjamin H. Kann
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, United States
- Department of Radiation Oncology, Dana-Farber Cancer Institute and Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
- Dana-Farber Cancer Institute, Boston, MA, United States
| |
Collapse
|
30
|
Chen C, Miao J, Wu D, Zhong A, Yan Z, Kim S, Hu J, Liu Z, Sun L, Li X, Liu T, Heng PA, Li Q. MA-SAM: Modality-agnostic SAM adaptation for 3D medical image segmentation. Med Image Anal 2024; 98:103310. [PMID: 39182302 PMCID: PMC11381141 DOI: 10.1016/j.media.2024.103310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2023] [Revised: 08/13/2024] [Accepted: 08/16/2024] [Indexed: 08/27/2024]
Abstract
The Segment Anything Model (SAM), a foundation model for general image segmentation, has demonstrated impressive zero-shot performance across numerous natural image segmentation tasks. However, SAM's performance declines significantly when applied to medical images, primarily due to the substantial disparity between the natural and medical image domains. To effectively adapt SAM to medical images, it is important to incorporate critical third-dimensional information, i.e., volumetric or temporal knowledge, during fine-tuning. Simultaneously, we aim to harness SAM's pre-trained weights within its original 2D backbone to the fullest extent. In this paper, we introduce a modality-agnostic SAM adaptation framework, named MA-SAM, that is applicable to various volumetric and video medical data. Our method is rooted in a parameter-efficient fine-tuning strategy that updates only a small portion of weight increments while preserving the majority of SAM's pre-trained weights. By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract third-dimensional information from the input data. We comprehensively evaluate our method on five medical image segmentation tasks, using 11 public datasets across CT, MRI, and surgical video data. Remarkably, without using any prompt, our method consistently outperforms various state-of-the-art 3D approaches, surpassing nnU-Net by 0.9%, 2.6%, and 9.9% in Dice for CT multi-organ segmentation, MRI prostate segmentation, and surgical scene segmentation, respectively. Our model also demonstrates strong generalization and excels in challenging tumor segmentation when prompts are used. Our code is available at: https://github.com/cchen-cc/MA-SAM.
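A 3D adapter of the kind injected into the transformer blocks can be sketched as a residual bottleneck with a depth-wise 3D convolution that mixes information across slices. The dimensions and kernel size below are illustrative assumptions; consult the MA-SAM repository for the actual module.

```python
# Minimal sketch of a bottleneck adapter with a depth-wise 3D convolution,
# letting a 2D backbone mix information along the third dimension.
import torch
import torch.nn as nn

class Adapter3D(nn.Module):
    def __init__(self, dim=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        # Depth-wise conv over (depth, height, width) mixes adjacent slices.
        self.conv3d = nn.Conv3d(bottleneck, bottleneck, kernel_size=3,
                                padding=1, groups=bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):  # x: (batch, depth, H, W, dim) token grid
        h = self.down(x)
        h = h.permute(0, 4, 1, 2, 3)      # to (B, C, D, H, W) for Conv3d
        h = torch.relu(self.conv3d(h))
        h = h.permute(0, 2, 3, 4, 1)      # back to token layout
        return x + self.up(h)             # residual: pre-trained path preserved

tokens = torch.randn(1, 8, 14, 14, 768)   # toy tokens for 8 CT slices
print(Adapter3D()(tokens).shape)
```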
Collapse
Affiliation(s)
- Cheng Chen
- Center of Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
| | - Juzheng Miao
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
| | - Dufan Wu
- Center of Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
| | - Aoxiao Zhong
- Center of Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Harvard John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, USA
| | - Zhiling Yan
- Department of Computer Science and Engineering, Lehigh University, Bethlehem, PA 18015, USA
| | - Sekeun Kim
- Center of Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
| | - Jiang Hu
- Center of Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
| | - Zhengliang Liu
- Center of Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; School of Computing, The University of Georgia, Athens, GA 30602, USA
| | - Lichao Sun
- Department of Computer Science and Engineering, Lehigh University, Bethlehem, PA 18015, USA
| | - Xiang Li
- Center of Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA.
| | - Tianming Liu
- School of Computing, The University of Georgia, Athens, GA 30602, USA
| | - Pheng-Ann Heng
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
| | - Quanzheng Li
- Center of Advanced Medical Computing and Analysis, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
| |
Collapse
|
31
|
Gao J, Wang D. Quantifying the use and potential benefits of artificial intelligence in scientific research. Nat Hum Behav 2024; 8:2281-2292. [PMID: 39394445 DOI: 10.1038/s41562-024-02020-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2023] [Accepted: 09/12/2024] [Indexed: 10/13/2024]
Abstract
The rapid advancement of artificial intelligence (AI) is poised to reshape almost every line of work. Despite enormous efforts devoted to understanding AI's economic impacts, we lack a systematic understanding of the benefits to scientific research associated with the use of AI. Here we develop a measurement framework to estimate the direct use of AI and associated benefits in science. We find that the use and benefits of AI appear widespread throughout the sciences, growing especially rapidly since 2015. However, there is a substantial gap between AI education and its application in research, highlighting a misalignment between AI expertise supply and demand. Our analysis also reveals demographic disparities, with disciplines with higher proportions of women or Black scientists reaping fewer benefits from AI, potentially exacerbating existing inequalities in science. These findings have implications for the equity and sustainability of the research enterprise, especially as the integration of AI with science continues to deepen.
Collapse
Affiliation(s)
- Jian Gao
- Center for Science of Science and Innovation, Northwestern University, Evanston, IL, USA
- Kellogg School of Management, Northwestern University, Evanston, IL, USA
- Ryan Institute on Complexity, Northwestern University, Evanston, IL, USA
- Faculty of Social Sciences, The University of Hong Kong, Hong Kong SAR, China
| | - Dashun Wang
- Center for Science of Science and Innovation, Northwestern University, Evanston, IL, USA.
- Kellogg School of Management, Northwestern University, Evanston, IL, USA.
- Ryan Institute on Complexity, Northwestern University, Evanston, IL, USA.
- McCormick School of Engineering, Northwestern University, Evanston, IL, USA.
| |
Collapse
|
32
|
Jia S, Bit S, Searls E, Lauber MV, Claus LA, Fan P, Jasodanand VH, Veerapaneni D, Wang WM, Au R, Kolachalama VB. PodGPT: An audio-augmented large language model for research and education. medRxiv 2024:2024.07.11.24310304. [PMID: 39040167 PMCID: PMC11261953 DOI: 10.1101/2024.07.11.24310304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/24/2024]
Abstract
The proliferation of scientific podcasts has generated an extensive repository of audio content, rich in specialized terminology, diverse topics, and expert dialogues. Here, we introduce a computational framework designed to enhance large language models (LLMs) by leveraging this informational content from publicly accessible podcast data across science, technology, engineering, mathematics, and medicine (STEMM) disciplines. This dataset, comprising over 3,700 hours of audio content, was transcribed to generate over 42 million text tokens. Our model, PodGPT, integrates this wealth of complex dialogue found in audio podcasts to improve understanding of natural language nuances, cultural contexts, as well as scientific and medical knowledge. PodGPT also employs retrieval augmented generation (RAG) on a vector database built from articles in Creative Commons PubMed Central and The New England Journal of Medicine, enhancing STEMM research and education by providing real-time access to emerging scientific literature. Evaluated across multiple benchmarks, PodGPT demonstrated an average improvement of 3.51 percentage points over standard open-source benchmarks and 3.81 percentage points when augmented with evidence from the RAG pipeline. Moreover, it showcased an average improvement of 4.06 percentage points in its zero-shot multilingual transfer ability, effectively generalizing to different linguistic contexts. By harnessing the untapped potential of podcast content, PodGPT advances natural language processing and conversational AI, offering enhanced capabilities for STEMM research and education.
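The retrieval step of a RAG pipeline can be sketched end to end in a few lines: chunks are embedded into a vector index, the query is embedded the same way, and the nearest chunk is prepended to the prompt. The hashing "embedder" below is a deliberately toy stand-in for a real sentence-embedding model.

```python
# Minimal sketch of RAG retrieval: embed a corpus, find the nearest chunk to a
# query, and build an augmented prompt for the LLM.
import numpy as np

def embed(texts, dim=64):
    # Toy deterministic embedding: bag-of-words hashed into a fixed-size vector.
    out = np.zeros((len(texts), dim))
    for i, t in enumerate(texts):
        for tok in t.lower().split():
            out[i, hash(tok) % dim] += 1.0
    return out / (np.linalg.norm(out, axis=1, keepdims=True) + 1e-8)

corpus = ["aspirin reduces platelet aggregation",
          "statins lower LDL cholesterol",
          "beta blockers reduce heart rate"]
index = embed(corpus)                      # the "vector database"

query = "which drug lowers cholesterol"
scores = index @ embed([query])[0]         # cosine similarity on unit vectors
best = corpus[int(np.argmax(scores))]
prompt = f"Context: {best}\n\nQuestion: {query}\nAnswer:"
print(prompt)                              # the augmented prompt an LLM would receive
```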
Collapse
|
33
|
Ferber D, Wölflein G, Wiest IC, Ligero M, Sainath S, Ghaffari Laleh N, El Nahhas OSM, Müller-Franzes G, Jäger D, Truhn D, Kather JN. In-context learning enables multimodal large language models to classify cancer pathology images. Nat Commun 2024; 15:10104. [PMID: 39572531 PMCID: PMC11582649 DOI: 10.1038/s41467-024-51465-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2024] [Accepted: 08/05/2024] [Indexed: 11/24/2024] Open
Abstract
Medical image classification requires labeled, task-specific datasets, which are used to train deep learning networks de novo or to fine-tune foundation models. However, this process is computationally and technically demanding. In language processing, in-context learning provides an alternative, where models learn from within prompts, bypassing the need for parameter updates. Yet in-context learning remains underexplored in medical image analysis. Here, we systematically evaluate the model Generative Pretrained Transformer 4 with Vision capabilities (GPT-4V) on cancer image processing with in-context learning on three cancer histopathology tasks of high importance: classification of tissue subtypes in colorectal cancer, colon polyp subtyping, and breast tumor detection in lymph node sections. Our results show that in-context learning is sufficient to match or even outperform specialized neural networks trained for particular tasks, while requiring only a minimal number of samples. In summary, this study demonstrates that large vision-language models trained on non-domain-specific data can be applied out of the box to solve medical image-processing tasks in histopathology. This democratizes access to generalist AI models for medical experts without a technical background, especially in areas where annotated data are scarce.
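In-context learning here amounts to prompt construction: labeled example images precede the query image, and no weights are updated. The sketch below builds such a few-shot multimodal message list; the image URLs and model name are placeholders, and the content-part schema mirrors the OpenAI chat format at the time of writing, so verify it against current documentation.

```python
# Minimal sketch of a few-shot, in-context prompt for a vision-capable chat
# model: labeled example images precede the query image in one message list.
def image_part(url):
    # One image content part in the chat message schema (placeholder URL).
    return {"type": "image_url", "image_url": {"url": url}}

few_shot = [("https://example.org/tumor_patch.png", "Tumor"),
            ("https://example.org/normal_patch.png", "Normal")]

content = [{"type": "text",
            "text": "Classify each lymph node patch as Tumor or Normal."}]
for url, label in few_shot:  # in-context examples: no parameter updates occur
    content += [image_part(url), {"type": "text", "text": f"Label: {label}"}]
content += [image_part("https://example.org/query_patch.png"),
            {"type": "text", "text": "Label:"}]

messages = [{"role": "user", "content": content}]
# A hypothetical client call would then be, e.g.:
#   client.chat.completions.create(model="gpt-4o", messages=messages)
print(len(content), "content parts in the in-context prompt")
```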
Collapse
Affiliation(s)
- Dyke Ferber
- National Center for Tumor Diseases (NCT), Heidelberg University Hospital, Heidelberg, Germany
- Department of Medical Oncology, Heidelberg University Hospital, Heidelberg, Germany
- Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
| | - Georg Wölflein
- School of Computer Science, University of St Andrews, St Andrews, UK
| | - Isabella C Wiest
- Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
- Department of Medicine II, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
| | - Marta Ligero
- Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
| | - Srividhya Sainath
- Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
| | - Narmin Ghaffari Laleh
- Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
| | - Omar S M El Nahhas
- Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
| | - Gustav Müller-Franzes
- Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
| | - Dirk Jäger
- National Center for Tumor Diseases (NCT), Heidelberg University Hospital, Heidelberg, Germany
- Department of Medical Oncology, Heidelberg University Hospital, Heidelberg, Germany
| | - Daniel Truhn
- Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
| | - Jakob Nikolas Kather
- National Center for Tumor Diseases (NCT), Heidelberg University Hospital, Heidelberg, Germany.
- Department of Medical Oncology, Heidelberg University Hospital, Heidelberg, Germany.
- Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany.
- Department of Medicine I, University Hospital Dresden, Dresden, Germany.
| |
Collapse
|
34
|
Wu E, Bieniosek M, Wu Z, Thakkar N, Charville GW, Makky A, Schürch C, Huyghe JR, Peters U, Li CI, Li L, Giba H, Behera V, Raman A, Trevino AE, Mayer AT, Zou J. ROSIE: AI generation of multiplex immunofluorescence staining from histopathology images. bioRxiv 2024:2024.11.10.622859. [PMID: 39605711 PMCID: PMC11601356 DOI: 10.1101/2024.11.10.622859] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 11/29/2024]
Abstract
Hematoxylin and eosin (H&E) is a common and inexpensive histopathology assay. Though widely used and information-rich, it cannot directly report specific molecular markers, which require additional experiments to assess. To address this gap, we present ROSIE, a deep-learning framework that computationally imputes the expression and localization of dozens of proteins from H&E images. Our model is trained on a dataset of over 1000 paired and aligned H&E and multiplex immunofluorescence (mIF) samples from 20 tissues and disease conditions, spanning over 16 million cells. Validation of our in silico mIF staining method on held-out H&E samples demonstrates that the predicted biomarkers are effective in identifying cell phenotypes, particularly distinguishing lymphocytes such as B cells and T cells, which are not readily discernible with H&E staining alone. Additionally, ROSIE facilitates the robust identification of stromal and epithelial microenvironments and immune cell subtypes such as tumor-infiltrating lymphocytes (TILs), which are important for understanding tumor-immune interactions and can help inform treatment strategies in cancer research.
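At its core, in silico staining is an image-to-image regression problem: predict aligned marker-intensity maps from an H&E patch. The toy PyTorch sketch below illustrates that framing with an assumed 3-marker output and a deliberately tiny network; it is not ROSIE's architecture.

```python
# Minimal sketch of in-silico mIF staining as per-pixel marker regression from
# an H&E patch; the tiny CNN and 3-marker output are illustrative assumptions.
import torch
import torch.nn as nn

n_markers = 3
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, n_markers, 1), nn.Sigmoid())   # predicted marker maps in [0, 1]

he = torch.randn(2, 3, 128, 128)                 # toy H&E patches
mif = torch.rand(2, n_markers, 128, 128)         # aligned mIF targets (toy)
loss = nn.functional.mse_loss(model(he), mif)    # train to impute stains
loss.backward()
print(float(loss))
```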
Collapse
Affiliation(s)
- Eric Wu
- Enable Medicine, Menlo Park, CA, USA
- Department of Electrical Engineering, Stanford University, Stanford, CA, USA
| | | | | | - Nitya Thakkar
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | | | - Ahmad Makky
- Institute for Pathology, University of Tübingen, Tübingen, Germany
| | | | - Jeroen R Huyghe
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
| | - Ulrike Peters
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Christopher I Li
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Li Li
- Ochsner Health, New Orleans, LA, USA
| | - Hannah Giba
- Duchossois Family Institute, University of Chicago, Chicago, IL, 60637
- Department of Pathology, University of Chicago, Chicago, IL, 60637
| | - Vivek Behera
- Duchossois Family Institute, University of Chicago, Chicago, IL, 60637
- Department of Medicine, Section of Hematology/Oncology, University of Chicago, Chicago, IL, 60637
| | - Arjun Raman
- Duchossois Family Institute, University of Chicago, Chicago, IL 60637, USA
- Department of Pathology, University of Chicago, Chicago, IL 60637, USA
- Center for the Physics of Evolving Systems, University of Chicago, Chicago, IL 60637, USA
| | | | | | - James Zou
- Enable Medicine, Menlo Park, CA, USA
- Department of Electrical Engineering, Stanford University, Stanford, CA, USA
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
| |
Collapse
|
35
|
Shi J, Sun D, Wu K, Jiang Z, Kong X, Wang W, Wu H, Zheng Y. Positional encoding-guided transformer-based multiple instance learning for histopathology whole slide images classification. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 258:108491. [PMID: 39549395 DOI: 10.1016/j.cmpb.2024.108491] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/03/2024] [Revised: 10/30/2024] [Accepted: 11/02/2024] [Indexed: 11/18/2024]
Abstract
BACKGROUND AND OBJECTIVES Whole slide image (WSI) classification is of great clinical significance in computer-aided pathological diagnosis. Due to the high cost of manual annotation, weakly supervised WSI classification methods have gained increasing attention. As the most representative approach, multiple instance learning (MIL) generally aggregates the predictions or features of the patches within a WSI to achieve slide-level classification under the weak supervision of WSI labels. However, most existing MIL methods ignore the spatial position relationships of the patches, even though such relationships could strengthen the discriminative ability of WSI-level features. METHODS In this paper, we propose a novel positional encoding-guided transformer-based multiple instance learning (PEGTB-MIL) method for histopathology WSI classification. It aims to encode the spatial positional properties of each patch into its semantic features and to explore the potential correlations among patches to improve WSI classification performance. Concretely, the deep features of the patches in a WSI are first extracted, and a position encoder simultaneously encodes the 2D spatial information of the patches into spatial-aware features. After the semantic features and spatial embeddings are fused, multi-head self-attention (MHSA) is applied to explore the contextual and spatial dependencies of the fused features. In particular, we introduce an auxiliary reconstruction task to enhance the spatial-semantic consistency and generalization ability of the features. RESULTS The proposed method is evaluated on two public benchmark TCGA datasets (TCGA-LUNG and TCGA-BRCA) and two in-house clinical datasets (USTC-EGFR and USTC-GIST). Experimental results validate that it is effective for cancer subtyping and gene mutation status prediction. In the test stage, PEGTB-MIL outperforms the other state-of-the-art methods, achieving areas under the receiver operating characteristic (ROC) curve (AUC) of 97.13±0.34%, 86.74±2.64%, 83.25±1.65%, and 72.52±1.63% on the four datasets, respectively. CONCLUSION PEGTB-MIL utilizes positional encoding to effectively guide and reinforce MIL, leading to enhanced performance on downstream WSI classification tasks. In particular, the introduced auxiliary reconstruction module preserves the spatial-semantic consistency of patch features. More significantly, this study investigates the relationship between position information and disease diagnosis and presents a promising avenue for further research.
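The positional-encoding-guided aggregation described above can be illustrated with a short PyTorch sketch: a 2D sinusoidal position embedding is added to the patch features, multi-head self-attention mixes them, and attention pooling yields slide-level logits. All dimensions and module choices here are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch of positional-encoding-guided MIL (dimensions are illustrative).
import math
import torch
import torch.nn as nn

def sincos_2d(coords: torch.Tensor, dim: int) -> torch.Tensor:
    """coords: (N, 2) patch grid positions -> (N, dim) embedding (dim % 4 == 0)."""
    d = dim // 4
    freqs = torch.exp(torch.arange(d) * (-math.log(10000.0) / d))
    parts = []
    for axis in range(2):
        angles = coords[:, axis:axis + 1] * freqs          # (N, d)
        parts += [torch.sin(angles), torch.cos(angles)]
    return torch.cat(parts, dim=-1)                         # (N, dim)

class PosMIL(nn.Module):
    def __init__(self, dim=256, heads=4, n_classes=2):
        super().__init__()
        self.mhsa = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.attn_pool = nn.Linear(dim, 1)
        self.cls = nn.Linear(dim, n_classes)

    def forward(self, feats, coords):                       # feats: (1, N, dim)
        x = feats + sincos_2d(coords, feats.size(-1)).unsqueeze(0)
        x, _ = self.mhsa(x, x, x)                           # contextual/spatial mixing
        w = torch.softmax(self.attn_pool(x), dim=1)         # per-patch attention
        return self.cls((w * x).sum(dim=1))                 # slide-level logits

bag = torch.randn(1, 500, 256)                              # 500 patch features
xy = torch.randint(0, 100, (500, 2)).float()                # patch grid coordinates
logits = PosMIL()(bag, xy)
```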
Collapse
Affiliation(s)
- Jun Shi
- School of Software, Hefei University of Technology, Hefei, 230601, Anhui Province, China
| | - Dongdong Sun
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, 230601, Anhui Province, China
| | - Kun Wu
- Image Processing Center, School of Astronautics, Beihang University, Beijing, 102206, China
| | - Zhiguo Jiang
- Image Processing Center, School of Astronautics, Beihang University, Beijing, 102206, China
| | - Xue Kong
- Department of Pathology, the First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230036, Anhui Province, China; Intelligent Pathology Institute, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230036, Anhui Province, China
| | - Wei Wang
- Department of Pathology, the First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230036, Anhui Province, China; Intelligent Pathology Institute, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230036, Anhui Province, China
| | - Haibo Wu
- Department of Pathology, the First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230036, Anhui Province, China; Intelligent Pathology Institute, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230036, Anhui Province, China
| | - Yushan Zheng
- School of Engineering Medicine, Beijing Advanced Innovation Center for Biomedical Engineering, Beihang University, Beijing, 100191, China.
| |
Collapse
|
36
|
Sükei E, Rumetshofer E, Schmidinger N, Mayr A, Schmidt-Erfurth U, Klambauer G, Bogunović H. Multi-modal representation learning in retinal imaging using self-supervised learning for enhanced clinical predictions. Sci Rep 2024; 14:26802. [PMID: 39500979 PMCID: PMC11538269 DOI: 10.1038/s41598-024-78515-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2024] [Accepted: 10/31/2024] [Indexed: 11/08/2024] Open
Abstract
Self-supervised learning has become the cornerstone of building generalizable and transferable artificial intelligence systems in medical imaging. In particular, contrastive representation learning techniques trained on large multi-modal datasets have demonstrated impressive capabilities of producing highly transferable representations for different downstream tasks. In ophthalmology, large multi-modal datasets are abundantly available and conveniently accessible as modern retinal imaging scanners acquire both 2D fundus images and 3D optical coherence tomography (OCT) scans to assess the eye. In this context, we introduce a novel multi-modal contrastive learning-based pipeline to facilitate learning joint representations for the two retinal imaging modalities. After self-supervised pre-training on 153,306 scan pairs, we show that such a pre-training framework can provide both a retrieval system and encoders that produce comprehensive OCT and fundus image representations that generalize well for various downstream tasks on three independent external datasets, explicitly focusing on clinically pertinent prediction tasks. In addition, we show that interchanging OCT with lower-cost fundus imaging can preserve the predictive power of the trained models.
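The joint-representation objective is, at its core, a symmetric contrastive loss over paired modality embeddings. A minimal sketch follows, assuming placeholder encoders have already produced the fundus and OCT embeddings; none of this is the paper's actual code.

```python
# CLIP-style symmetric InfoNCE between paired fundus and OCT embeddings.
import torch
import torch.nn.functional as F

def contrastive_loss(fundus_z: torch.Tensor, oct_z: torch.Tensor, tau: float = 0.07):
    """Symmetric InfoNCE over a batch of paired modality embeddings."""
    f = F.normalize(fundus_z, dim=-1)
    o = F.normalize(oct_z, dim=-1)
    logits = f @ o.t() / tau                      # (B, B) pairwise similarities
    labels = torch.arange(f.size(0))              # matched pairs lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))

loss = contrastive_loss(torch.randn(32, 512), torch.randn(32, 512))
```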
Collapse
Affiliation(s)
- Emese Sükei
- OPTIMA Lab, Department of Ophthalmology and Optometry, Medical University of Vienna, Vienna, Austria.
| | - Elisabeth Rumetshofer
- LIT AI Lab, Institute for Machine Learning, Johannes Kepler University, Linz, Austria
| | - Niklas Schmidinger
- LIT AI Lab, Institute for Machine Learning, Johannes Kepler University, Linz, Austria
| | - Andreas Mayr
- LIT AI Lab, Institute for Machine Learning, Johannes Kepler University, Linz, Austria
| | - Ursula Schmidt-Erfurth
- OPTIMA Lab, Department of Ophthalmology and Optometry, Medical University of Vienna, Vienna, Austria
| | - Günter Klambauer
- LIT AI Lab, Institute for Machine Learning, Johannes Kepler University, Linz, Austria
| | - Hrvoje Bogunović
- OPTIMA Lab, Department of Ophthalmology and Optometry, Medical University of Vienna, Vienna, Austria.
- Institute of Artificial Intelligence, Center for Medical Data Science, Medical University of Vienna, Vienna, Austria.
| |
Collapse
|
37
|
Wong IN, Monteiro O, Baptista-Hon DT, Wang K, Lu W, Sun Z, Nie S, Yin Y. Leveraging foundation and large language models in medical artificial intelligence. Chin Med J (Engl) 2024; 137:2529-2539. [PMID: 39497256 PMCID: PMC11556979 DOI: 10.1097/cm9.0000000000003302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2024] [Indexed: 11/14/2024] Open
Abstract
ABSTRACT Recent advancements in the field of medical artificial intelligence (AI) have led to the widespread adoption of foundational and large language models. This review paper explores their applications within medical AI, introducing a novel classification framework that categorizes them as disease-specific, general-domain, and multi-modal models. The paper also addresses key challenges such as data acquisition and augmentation, including issues related to data volume, annotation, multi-modal fusion, and privacy concerns. Additionally, it discusses the evaluation, validation, limitations, and regulation of medical AI models, emphasizing their transformative potential in healthcare. The importance of continuous improvement, data security, standardized evaluations, and collaborative approaches is highlighted to ensure the responsible and effective integration of AI into clinical applications.
Collapse
Affiliation(s)
- Io Nam Wong
- Institute for AI in Medicine, Faculty of Medicine, Macau University of Science and Technology, Macau Special Administrative Region 999078, China
| | - Olivia Monteiro
- Institute for AI in Medicine, Faculty of Medicine, Macau University of Science and Technology, Macau Special Administrative Region 999078, China
| | - Daniel T. Baptista-Hon
- Institute for AI in Medicine, Faculty of Medicine, Macau University of Science and Technology, Macau Special Administrative Region 999078, China
| | - Kai Wang
- Department of Big Data and Biomedical AI, College of Future Technology, Peking University, Beijing 100871, China
| | - Wenyang Lu
- Institute for Advanced Study on Eye Health and Diseases, Wenzhou Medical University, Wenzhou, Zhejiang 325027, China
| | - Zhuo Sun
- Department of Ophthalmology, The Third People’s Hospital of Changzhou, Changzhou, Jiangsu 203001, China
- Institute for Advanced Study on Eye Health and Diseases, Wenzhou Medical University, Wenzhou, Zhejiang 325027, China
| | - Sheng Nie
- Division of Nephrology, National Clinical Research Center for Kidney Disease, State Key Laboratory of Organ Failure Research, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong 510515, China
| | - Yun Yin
- Faculty of Health and Wellness, Faculty of Business, City University of Macau, Macau Special Administrative Region 999078, China
| |
Collapse
|
38
|
Yang Y, Ye C, Su G, Zhang Z, Chang Z, Chen H, Chan P, Yu Y, Ma T. BrainMass: Advancing Brain Network Analysis for Diagnosis With Large-Scale Self-Supervised Learning. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:4004-4016. [PMID: 38875087 DOI: 10.1109/tmi.2024.3414476] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2024]
Abstract
Foundation models pretrained on large-scale datasets via self-supervised learning demonstrate exceptional versatility across various tasks. Because medical data are heterogeneous and hard to collect, this approach is especially beneficial for medical image analysis and neuroscience research, as it streamlines broad downstream tasks without the need for numerous costly annotations. However, there has been limited investigation into brain network foundation models, limiting their adaptability and generalizability for broad neuroscience studies. In this study, we aim to bridge this gap. In particular, 1) we curated a comprehensive dataset by collating images from 30 datasets, comprising 70,781 samples from 46,686 participants, and introduce pseudo-functional connectivity (pFC) to further generate millions of augmented brain networks by randomly dropping certain timepoints of the BOLD signal; 2) we propose the BrainMass framework for brain network self-supervised learning via mask modeling and feature alignment. BrainMass employs Mask-ROI Modeling (MRM) to bolster intra-network dependencies and regional specificity. Furthermore, a Latent Representation Alignment (LRA) module regularizes augmented brain networks of the same participant with similar topological properties to yield similar latent representations by aligning their latent embeddings. Extensive experiments on eight internal tasks and seven external brain disorder diagnosis tasks show BrainMass's superior performance, highlighting its significant generalizability and adaptability. Moreover, BrainMass demonstrates powerful few/zero-shot learning abilities and exhibits meaningful interpretations for various diseases, showcasing its potential for clinical applications.
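The pFC augmentation lends itself to a compact illustration: drop a random subset of BOLD timepoints and recompute the ROI-by-ROI correlation matrix. The drop rate below is an assumed hyperparameter, not the paper's value.

```python
# Illustrative pseudo-functional-connectivity (pFC) augmentation.
import torch

def pseudo_fc(bold: torch.Tensor, drop_rate: float = 0.1) -> torch.Tensor:
    """bold: (n_rois, n_timepoints) -> (n_rois, n_rois) correlation matrix."""
    t = bold.size(1)
    keep = torch.randperm(t)[: int(t * (1 - drop_rate))].sort().values
    return torch.corrcoef(bold[:, keep])          # augmented brain network

bold_signal = torch.randn(100, 200)               # 100 ROIs x 200 timepoints
net_a, net_b = pseudo_fc(bold_signal), pseudo_fc(bold_signal)  # two augmented views
```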
Collapse
|
39
|
Yang Y, Yu J, Fu Z, Zhang K, Yu T, Wang X, Jiang H, Lv J, Huang Q, Han W. Token-Mixer: Bind Image and Text in One Embedding Space for Medical Image Reporting. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:4017-4028. [PMID: 38861436 DOI: 10.1109/tmi.2024.3412402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2024]
Abstract
Medical image reporting, which focuses on automatically generating diagnostic reports from medical images, has garnered growing research attention. In this task, learning cross-modal alignment between images and reports is crucial. However, the exposure bias problem in autoregressive text generation poses a notable challenge, as the model is optimized with a word-level loss function using the teacher-forcing strategy. To this end, we propose a novel Token-Mixer framework that learns to bind image and text in one embedding space for medical image reporting. Concretely, Token-Mixer enhances cross-modal alignment by matching image-to-text generation with text-to-text generation, which suffers less from exposure bias. The framework contains an image encoder, a text encoder and a text decoder. In training, images and paired reports are first encoded into image tokens and text tokens, and these tokens are randomly mixed to form mixed tokens. Then, the text decoder accepts image tokens, text tokens or mixed tokens as prompt tokens and conducts text generation for network optimization. Furthermore, we introduce a tailored text decoder and an alternative training strategy that integrate well with our Token-Mixer framework. Extensive experiments across three publicly available datasets demonstrate that Token-Mixer successfully enhances image-text alignment and thereby attains state-of-the-art performance. Related codes are available at https://github.com/yangyan22/Token-Mixer.
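The core token-mixing step can be sketched in a few lines, assuming (for illustration only) that the image and text token sequences have been brought to the same length and dimension; the mixing ratio is likewise an assumed hyperparameter.

```python
# Minimal sketch of the token-mixing idea: randomly interleave image and text
# tokens from a paired sample to form prompt tokens for the decoder.
import torch

def mix_tokens(img_tokens: torch.Tensor, txt_tokens: torch.Tensor, p_img: float = 0.5):
    """img_tokens, txt_tokens: (L, D) aligned token sequences of equal length."""
    take_img = torch.rand(img_tokens.size(0), 1) < p_img
    return torch.where(take_img, img_tokens, txt_tokens)   # (L, D) mixed prompt

img = torch.randn(64, 768)     # encoded image tokens
txt = torch.randn(64, 768)     # encoded report tokens
prompt = mix_tokens(img, txt)  # decoder prompt drawn from a shared embedding space
```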
Collapse
|
40
|
Gao S, Fang A, Huang Y, Giunchiglia V, Noori A, Schwarz JR, Ektefaie Y, Kondic J, Zitnik M. Empowering biomedical discovery with AI agents. Cell 2024; 187:6125-6151. [PMID: 39486399 DOI: 10.1016/j.cell.2024.09.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2024] [Revised: 07/16/2024] [Accepted: 09/12/2024] [Indexed: 11/04/2024]
Abstract
We envision "AI scientists" as systems capable of skeptical learning and reasoning that empower biomedical research through collaborative agents that integrate AI models and biomedical tools with experimental platforms. Rather than taking humans out of the discovery process, biomedical AI agents combine human creativity and expertise with AI's ability to analyze large datasets, navigate hypothesis spaces, and execute repetitive tasks. AI agents are poised to be proficient in various tasks, planning discovery workflows and performing self-assessment to identify and mitigate gaps in their knowledge. These agents use large language models and generative models with structured memory for continual learning, and use machine learning tools to incorporate scientific knowledge, biological principles, and theories. AI agents can impact areas ranging from virtual cell simulation, programmable control of phenotypes, and the design of cellular circuits to the development of new therapies.
Collapse
Affiliation(s)
- Shanghua Gao
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Ada Fang
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA; Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Allston, MA, USA
| | - Yepeng Huang
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA, USA
| | - Valentina Giunchiglia
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Department of Brain Sciences, Imperial College London, London, UK
| | - Ayush Noori
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Harvard College, Cambridge, MA, USA
| | | | - Yasha Ektefaie
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Program in Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Jovana Kondic
- Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA, USA
| | - Marinka Zitnik
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Allston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard Data Science Initiative, Cambridge, MA, USA.
| |
Collapse
|
41
|
Zhang WY, Chang YJ, Shi RH. Artificial intelligence enhances the management of esophageal squamous cell carcinoma in the precision oncology era. World J Gastroenterol 2024; 30:4267-4280. [PMID: 39492825 PMCID: PMC11525855 DOI: 10.3748/wjg.v30.i39.4267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/28/2024] [Revised: 08/31/2024] [Accepted: 09/19/2024] [Indexed: 10/12/2024] Open
Abstract
Esophageal squamous cell carcinoma (ESCC) is the most common histological type of esophageal cancer and carries a poor prognosis. Early diagnosis and prognosis assessment are crucial for improving the survival rate of ESCC patients. With the advancement of artificial intelligence (AI) technology and the proliferation of digital medical information, AI has demonstrated promising sensitivity and accuracy in assisting with the precise detection, treatment decision-making, and prognosis assessment of ESCC. This presents a unique opportunity to enhance the comprehensive clinical management of ESCC in the era of precision oncology. This review examines how AI is applied to the diagnosis, treatment, and prognosis assessment of ESCC in the era of precision oncology, and analyzes the challenges and potential opportunities AI faces in clinical translation. Through insights into future prospects, we hope this review will contribute to the real-world application of AI in future clinical settings, ultimately alleviating the disease burden caused by ESCC.
Collapse
Affiliation(s)
- Wan-Yue Zhang
- School of Medicine, Southeast University, Nanjing 221000, Jiangsu Province, China
| | - Yong-Jian Chang
- School of Cyber Science and Engineering, Southeast University, Nanjing 210009, Jiangsu Province, China
| | - Rui-Hua Shi
- Department of Gastroenterology, Zhongda Hospital, Southeast University, Nanjing 210009, Jiangsu Province, China
| |
Collapse
|
42
|
Zheng K, Duan J, Wang R, Chen H, He H, Zheng X, Zhao Z, Jing B, Zhang Y, Liu S, Xie D, Lin Y, Sun Y, Zhang N, Cai M. Deep learning model with pathological knowledge for detection of colorectal neuroendocrine tumor. Cell Rep Med 2024; 5:101785. [PMID: 39413732 PMCID: PMC11513840 DOI: 10.1016/j.xcrm.2024.101785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2024] [Revised: 08/19/2024] [Accepted: 09/19/2024] [Indexed: 10/18/2024]
Abstract
Colorectal neuroendocrine tumors (NETs) differ significantly from colorectal carcinoma (CRC) in terms of treatment strategy and prognosis, necessitating a cost-effective approach for accurate discrimination. Here, we propose an approach for distinguishing between colorectal NET and CRC based on pathological images, utilizing pathological prior information to facilitate the generation of robust slide-level features. By calculating the similarity between morphological descriptions and patches, our approach selects only the 2% of patches that are most diagnostically relevant for both training and inference, achieving an area under the receiver operating characteristic curve (AUROC) of 0.9974 on the internal dataset, and AUROCs of 0.9724 and 0.9513 on two external datasets. Our model effectively distinguishes NETs from CRCs, reducing unnecessary immunohistochemical tests and enhancing precise treatment for patients with colorectal tumors. Our approach also enables researchers to investigate methods with high accuracy and low computational complexity, thereby advancing the application of artificial intelligence in clinical settings.
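The description-guided patch selection can be illustrated as a cosine-similarity ranking between patch embeddings and embedded morphological phrases, keeping the top 2%. The encoders and phrase set below are placeholders, not the paper's assets.

```python
# Sketch of prior-guided patch selection for MIL training and inference.
import torch
import torch.nn.functional as F

def select_patches(patch_emb: torch.Tensor, text_emb: torch.Tensor, frac: float = 0.02):
    """patch_emb: (N, D), text_emb: (K, D) -> indices of the most relevant patches."""
    sim = F.normalize(patch_emb, dim=-1) @ F.normalize(text_emb, dim=-1).t()
    score = sim.max(dim=1).values             # best match over the K descriptions
    k = max(1, int(frac * patch_emb.size(0)))
    return score.topk(k).indices

patches = torch.randn(10000, 512)             # all patch embeddings of a slide
descriptions = torch.randn(12, 512)           # embedded morphological phrases
keep = select_patches(patches, descriptions)  # ~200 diagnostically relevant patches
```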
Collapse
Affiliation(s)
- Ke Zheng
- Department of Pathology, State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Jinling Duan
- Department of Pathology, State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Ruixuan Wang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
| | - Haohua Chen
- Artificial Intelligence Laboratory, State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Haiyang He
- Department of Pathology, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou 510080, China
| | - Xueyi Zheng
- Department of Pathology, State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Zihan Zhao
- Department of Pathology, State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Bingzhong Jing
- Artificial Intelligence Laboratory, State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Yuqian Zhang
- Electrical Engineering & Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Shasha Liu
- Department of Pathology, Tianjin Medical University Cancer Institute and Hospital, Tianjin 300000, China
| | - Dan Xie
- Department of Pathology, State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| | - Yuan Lin
- Department of Pathology, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou 510080, China.
| | - Yan Sun
- Department of Pathology, Tianjin Medical University Cancer Institute and Hospital, Tianjin 300000, China.
| | - Ning Zhang
- Department of Gastroenterology and Hepatology, The First Affiliated Hospital, Sun Yat-Sen University, Guangzhou 510060, China.
| | - Muyan Cai
- Department of Pathology, State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou 510060, China.
| |
Collapse
|
43
|
Bani Baker Q, Hammad M, Al-Smadi M, Al-Jarrah H, Al-Hamouri R, Al-Zboon SA. Enhanced COVID-19 Detection from X-ray Images with Convolutional Neural Network and Transfer Learning. J Imaging 2024; 10:250. [PMID: 39452413 PMCID: PMC11508642 DOI: 10.3390/jimaging10100250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2024] [Revised: 09/22/2024] [Accepted: 09/27/2024] [Indexed: 10/26/2024] Open
Abstract
The global spread of coronavirus disease (COVID-19) prompted urgent research into scalable and effective detection methods to curb its outbreak. The early diagnosis of COVID-19 patients emerged as a pivotal strategy for mitigating the spread of the disease. Automated COVID-19 detection using chest X-ray (CXR) imaging has significant potential for facilitating large-scale screening and epidemic control efforts. This paper introduces a novel approach that employs state-of-the-art convolutional neural network (CNN) models for accurate COVID-19 detection. The employed datasets each comprised 15,000 X-ray images. We addressed both binary (Normal vs. Abnormal) and multi-class (Normal, COVID-19, Pneumonia) classification tasks. Comprehensive evaluations were performed using six distinct CNN-based models (Xception, Inception-V3, ResNet50, VGG19, DenseNet201, and InceptionResNet-V2) for both tasks. The Xception model demonstrated exceptional performance, achieving 98.13% accuracy, 98.14% precision, 97.65% recall, and a 97.89% F1-score in binary classification, while in multi-class classification it yielded 87.73% accuracy, 90.20% precision, 87.73% recall, and an 87.49% F1-score. Moreover, the other models, such as ResNet50, demonstrated performance competitive with many recent works.
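A transfer-learning setup of this general kind is straightforward to reproduce in Keras; the sketch below uses an ImageNet-pretrained Xception backbone with a new three-class head. The head size, dropout, and learning rate are illustrative guesses, not the paper's exact configuration.

```python
# Hedged sketch of CXR transfer learning with a pretrained Xception backbone.
from tensorflow import keras

base = keras.applications.Xception(weights="imagenet", include_top=False,
                                   input_shape=(299, 299, 3), pooling="avg")
base.trainable = False                            # freeze backbone for warm-up

model = keras.Sequential([
    base,
    keras.layers.Dropout(0.3),
    keras.layers.Dense(3, activation="softmax"),  # Normal / COVID-19 / Pneumonia
])
model.compile(optimizer=keras.optimizers.Adam(1e-4),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # datasets omitted
```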
Collapse
Affiliation(s)
- Qanita Bani Baker
- Faculty of Computer and Information Technology, Jordan University of Science and Technology, P.O. Box 3030, Irbid 22110, Jordan; (M.H.); (H.A.-J.); (R.A.-H.); (S.A.A.-Z.)
| | - Mahmoud Hammad
- Faculty of Computer and Information Technology, Jordan University of Science and Technology, P.O. Box 3030, Irbid 22110, Jordan; (M.H.); (H.A.-J.); (R.A.-H.); (S.A.A.-Z.)
| | - Mohammed Al-Smadi
- Digital Learning and Online Education Office (DLOE), Qatar University, Doha 2713, Qatar;
| | - Heba Al-Jarrah
- Faculty of Computer and Information Technology, Jordan University of Science and Technology, P.O. Box 3030, Irbid 22110, Jordan; (M.H.); (H.A.-J.); (R.A.-H.); (S.A.A.-Z.)
| | - Rahaf Al-Hamouri
- Faculty of Computer and Information Technology, Jordan University of Science and Technology, P.O. Box 3030, Irbid 22110, Jordan; (M.H.); (H.A.-J.); (R.A.-H.); (S.A.A.-Z.)
| | - Sa’ad A. Al-Zboon
- Faculty of Computer and Information Technology, Jordan University of Science and Technology, P.O. Box 3030, Irbid 22110, Jordan; (M.H.); (H.A.-J.); (R.A.-H.); (S.A.A.-Z.)
| |
Collapse
|
44
|
Wang J. Deep Learning in Hematology: From Molecules to Patients. Clin Hematol Int 2024; 6:19-42. [PMID: 39417017 PMCID: PMC11477942 DOI: 10.46989/001c.124131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2024] [Accepted: 06/29/2024] [Indexed: 10/19/2024] Open
Abstract
Deep learning (DL), a subfield of machine learning, has made remarkable strides across various aspects of medicine. This review examines DL's applications in hematology, spanning from molecular insights to patient care. The review begins by providing a straightforward introduction to the basics of DL tailored for those without prior knowledge, touching on essential concepts, principal architectures, and prevalent training methods. It then discusses the applications of DL in hematology, concentrating on elucidating the models' architecture, their applications, performance metrics, and inherent limitations. For example, at the molecular level, DL has improved the analysis of multi-omics data and protein structure prediction. For cells and tissues, DL enables the automation of cytomorphology analysis, interpretation of flow cytometry data, and diagnosis from whole slide images. At the patient level, DL's utility extends to analyzing curated clinical data, electronic health records, and clinical notes through large language models. While DL has shown promising results in various hematology applications, challenges remain in model generalizability and explainability. Moreover, the integration of novel DL architectures into hematology has been relatively slow in comparison to that in other medical fields.
Collapse
Affiliation(s)
- Jiasheng Wang
- Division of Hematology, Department of Medicine, The Ohio State University Comprehensive Cancer Center
| |
Collapse
|
45
|
Lu MY, Chen B, Williamson DFK, Chen RJ, Zhao M, Chow AK, Ikemura K, Kim A, Pouli D, Patel A, Soliman A, Chen C, Ding T, Wang JJ, Gerber G, Liang I, Le LP, Parwani AV, Weishaupt LL, Mahmood F. A multimodal generative AI copilot for human pathology. Nature 2024; 634:466-473. [PMID: 38866050 PMCID: PMC11464372 DOI: 10.1038/s41586-024-07618-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2023] [Accepted: 05/28/2024] [Indexed: 06/14/2024]
Abstract
Computational pathology1,2 has witnessed considerable progress in the development of both task-specific predictive models and task-agnostic self-supervised vision encoders3,4. However, despite the explosive growth of generative artificial intelligence (AI), there have been few studies on building general-purpose multimodal AI assistants and copilots5 tailored to pathology. Here we present PathChat, a vision-language generalist AI assistant for human pathology. We built PathChat by adapting a foundational vision encoder for pathology, combining it with a pretrained large language model and fine-tuning the whole system on over 456,000 diverse visual-language instructions consisting of 999,202 question and answer turns. We compare PathChat with several multimodal vision-language AI assistants and GPT-4V, which powers the commercially available multimodal general-purpose AI assistant ChatGPT-4 (ref. 6). PathChat achieved state-of-the-art performance on multiple-choice diagnostic questions from cases with diverse tissue origins and disease models. Furthermore, using open-ended questions and human expert evaluation, we found that overall PathChat produced more accurate and pathologist-preferable responses to diverse queries related to pathology. As an interactive vision-language AI copilot that can flexibly handle both visual and natural language inputs, PathChat may potentially find impactful applications in pathology education, research and human-in-the-loop clinical decision-making.
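One common way to build such a vision-language assistant is to project the vision encoder's outputs into the language model's token-embedding space and prepend them to the text tokens. The sketch below illustrates that glue under assumed dimensions; it is not PathChat's actual architecture code.

```python
# Illustrative bridge between a pathology vision encoder and an LLM: image
# features become prefix tokens in the LLM embedding space. All names and
# dimensions are stand-ins.
import torch
import torch.nn as nn

class VisionToLLM(nn.Module):
    def __init__(self, vis_dim=1024, llm_dim=4096):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(vis_dim, llm_dim), nn.GELU(),
                                  nn.Linear(llm_dim, llm_dim))

    def forward(self, vis_feats, text_embeds):
        # vis_feats: (B, n_prefix, vis_dim) patch tokens from the vision encoder
        prefix = self.proj(vis_feats)                       # (B, n_prefix, llm_dim)
        return torch.cat([prefix, text_embeds], dim=1)      # multimodal LLM input

bridge = VisionToLLM()
inputs = bridge(torch.randn(1, 32, 1024), torch.randn(1, 100, 4096))
```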
Collapse
Affiliation(s)
- Ming Y Lu
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
| | - Bowen Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Drew F K Williamson
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Richard J Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Melissa Zhao
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Aaron K Chow
- Department of Pathology, Wexner Medical Center, Ohio State University, Columbus, OH, USA
| | - Kenji Ikemura
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Ahrong Kim
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Pusan National University, Busan, South Korea
| | - Dimitra Pouli
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Ankush Patel
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
| | - Amr Soliman
- Department of Pathology, Wexner Medical Center, Ohio State University, Columbus, OH, USA
| | - Chengkuan Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Tong Ding
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Harvard John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
| | - Judy J Wang
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Georg Gerber
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Ivy Liang
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Harvard John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
| | - Long Phi Le
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Anil V Parwani
- Department of Pathology, Wexner Medical Center, Ohio State University, Columbus, OH, USA
| | - Luca L Weishaupt
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Health Sciences and Technology, Harvard-MIT, Cambridge, MA, USA
| | - Faisal Mahmood
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
- Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA.
| |
Collapse
|
46
|
Avram O, Durmus B, Rakocz N, Corradetti G, An U, Nittala MG, Terway P, Rudas A, Chen ZJ, Wakatsuki Y, Hirabayashi K, Velaga S, Tiosano L, Corvi F, Verma A, Karamat A, Lindenberg S, Oncel D, Almidani L, Hull V, Fasih-Ahmad S, Esmaeilkhanian H, Cannesson M, Wykoff CC, Rahmani E, Arnold CW, Zhou B, Zaitlen N, Gronau I, Sankararaman S, Chiang JN, Sadda SR, Halperin E. Accurate prediction of disease-risk factors from volumetric medical scans by a deep vision model pre-trained with 2D scans. Nat Biomed Eng 2024:10.1038/s41551-024-01257-9. [PMID: 39354052 DOI: 10.1038/s41551-024-01257-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2023] [Accepted: 08/23/2024] [Indexed: 10/03/2024]
Abstract
The application of machine learning to tasks involving volumetric biomedical imaging is constrained by the limited availability of annotated datasets of three-dimensional (3D) scans for model training. Here we report a deep-learning model pre-trained on 2D scans (for which annotated data are relatively abundant) that accurately predicts disease-risk factors from 3D medical-scan modalities. The model, which we named SLIViT (for 'slice integration by vision transformer'), preprocesses a given volumetric scan into 2D images, extracts their feature map and integrates it into a single prediction. We evaluated the model in eight different learning tasks, including classification and regression for six datasets involving four volumetric imaging modalities (computed tomography, magnetic resonance imaging, optical coherence tomography and ultrasound). SLIViT consistently outperformed domain-specific state-of-the-art models and was typically as accurate as clinical specialists who had spent considerable time manually annotating the analysed scans. Automating diagnosis tasks involving volumetric scans may save valuable clinician hours, reduce data acquisition costs and duration, and help expedite medical research and clinical applications.
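The slice-integration idea can be sketched as a shared 2D encoder applied slice by slice, followed by a small transformer that fuses the slice features into one prediction. The backbone and sizes below are stand-ins, not SLIViT's actual configuration.

```python
# Hedged sketch of 2D-pretrained slice encoding plus transformer fusion.
import torch
import torch.nn as nn
import torchvision.models as models

class SliceIntegrator(nn.Module):
    def __init__(self, dim=512, depth=2, heads=8):
        super().__init__()
        backbone = models.resnet18(weights=None)   # stands in for a 2D-pretrained encoder
        backbone.fc = nn.Identity()
        self.encoder2d = backbone
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.fuse = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, 1)              # single risk-factor output

    def forward(self, volume):                     # volume: (B, S, 3, H, W)
        b, s = volume.shape[:2]
        feats = self.encoder2d(volume.flatten(0, 1)).view(b, s, -1)
        fused = self.fuse(feats).mean(dim=1)       # integrate across slices
        return self.head(fused)

pred = SliceIntegrator()(torch.randn(2, 16, 3, 224, 224))  # 16-slice volumes
```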
Collapse
Affiliation(s)
- Oren Avram
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, USA.
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA.
- Department of Anesthesiology and Perioperative Medicine, University of California, Los Angeles, Los Angeles, CA, USA.
| | - Berkin Durmus
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
| | - Nadav Rakocz
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
| | - Giulia Corradetti
- Doheny Eye Institute, University of California, Los Angeles, Pasadena, CA, USA
- Department of Ophthalmology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Ulzee An
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
| | - Muneeswar G Nittala
- Doheny Eye Institute, University of California, Los Angeles, Pasadena, CA, USA
- Department of Ophthalmology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Prerit Terway
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
| | - Akos Rudas
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, USA
| | - Zeyuan Johnson Chen
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
| | - Yu Wakatsuki
- Doheny Eye Institute, University of California, Los Angeles, Pasadena, CA, USA
| | | | - Swetha Velaga
- Doheny Eye Institute, University of California, Los Angeles, Pasadena, CA, USA
| | - Liran Tiosano
- Doheny Eye Institute, University of California, Los Angeles, Pasadena, CA, USA
- Department of Ophthalmology, Hadassah-Hebrew University Medical Center, Jerusalem, Israel
| | - Federico Corvi
- Doheny Eye Institute, University of California, Los Angeles, Pasadena, CA, USA
| | - Aditya Verma
- Doheny Eye Institute, University of California, Los Angeles, Pasadena, CA, USA
- Department of Ophthalmology and Visual Sciences, University of Louisville, Louisville, KY, USA
| | - Ayesha Karamat
- Doheny Eye Institute, University of California, Los Angeles, Pasadena, CA, USA
| | - Sophiana Lindenberg
- Doheny Eye Institute, University of California, Los Angeles, Pasadena, CA, USA
| | - Deniz Oncel
- Doheny Eye Institute, University of California, Los Angeles, Pasadena, CA, USA
| | - Louay Almidani
- Doheny Eye Institute, University of California, Los Angeles, Pasadena, CA, USA
| | - Victoria Hull
- Doheny Eye Institute, University of California, Los Angeles, Pasadena, CA, USA
| | - Sohaib Fasih-Ahmad
- Doheny Eye Institute, University of California, Los Angeles, Pasadena, CA, USA
| | | | - Maxime Cannesson
- Department of Anesthesiology and Perioperative Medicine, University of California, Los Angeles, Los Angeles, CA, USA
| | - Charles C Wykoff
- Retina Consultants of Texas, Retina Consultants of America, Houston, TX, USA
- Blanton Eye Institute, Houston Methodist Hospital, Houston, TX, USA
| | - Elior Rahmani
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, USA
| | - Corey W Arnold
- Department of Radiology, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Pathology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Bolei Zhou
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
| | - Noah Zaitlen
- Department of Neurology, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
| | - Ilan Gronau
- School of Computer Science, Reichman University, Herzliya, Israel
| | - Sriram Sankararaman
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
| | - Jeffrey N Chiang
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Neurosurgery, University of California, Los Angeles, Los Angeles, CA, USA
| | - Srinivas R Sadda
- Doheny Eye Institute, University of California, Los Angeles, Pasadena, CA, USA.
- Department of Ophthalmology, University of California, Los Angeles, Los Angeles, CA, USA.
| | - Eran Halperin
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA.
| |
Collapse
|
47
|
Wang X, Zhao J, Marostica E, Yuan W, Jin J, Zhang J, Li R, Tang H, Wang K, Li Y, Wang F, Peng Y, Zhu J, Zhang J, Jackson CR, Zhang J, Dillon D, Lin NU, Sholl L, Denize T, Meredith D, Ligon KL, Signoretti S, Ogino S, Golden JA, Nasrallah MP, Han X, Yang S, Yu KH. A pathology foundation model for cancer diagnosis and prognosis prediction. Nature 2024; 634:970-978. [PMID: 39232164 DOI: 10.1038/s41586-024-07894-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2023] [Accepted: 08/01/2024] [Indexed: 09/06/2024]
Abstract
Histopathology image evaluation is indispensable for cancer diagnoses and subtype classification. Standard artificial intelligence methods for histopathology image analyses have focused on optimizing specialized models for each diagnostic task1,2. Although such methods have achieved some success, they often have limited generalizability to images generated by different digitization protocols or samples collected from different populations3. Here, to address this challenge, we devised the Clinical Histopathology Imaging Evaluation Foundation (CHIEF) model, a general-purpose weakly supervised machine learning framework to extract pathology imaging features for systematic cancer evaluation. CHIEF leverages two complementary pretraining methods to extract diverse pathology representations: unsupervised pretraining for tile-level feature identification and weakly supervised pretraining for whole-slide pattern recognition. We developed CHIEF using 60,530 whole-slide images spanning 19 anatomical sites. Through pretraining on 44 terabytes of high-resolution pathology imaging datasets, CHIEF extracted microscopic representations useful for cancer cell detection, tumour origin identification, molecular profile characterization and prognostic prediction. We successfully validated CHIEF using 19,491 whole-slide images from 32 independent slide sets collected from 24 hospitals and cohorts internationally. Overall, CHIEF outperformed the state-of-the-art deep learning methods by up to 36.1%, showing its ability to address domain shifts observed in samples from diverse populations and processed by different slide preparation methods. CHIEF provides a generalizable foundation for efficient digital pathology evaluation for patients with cancer.
Collapse
Affiliation(s)
- Xiyue Wang
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA
| | - Junhan Zhao
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Eliana Marostica
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Division of Health Sciences and Technology, Harvard-Massachusetts Institute of Technology, Boston, MA, USA
| | - Wei Yuan
- College of Biomedical Engineering, Sichuan University, Chengdu, China
| | - Jietian Jin
- Department of Pathology, Sun Yat-sen University Cancer Center, Guangzhou, China
| | - Jiayu Zhang
- College of Biomedical Engineering, Sichuan University, Chengdu, China
| | - Ruijiang Li
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA
| | - Hongping Tang
- Department of Pathology, Shenzhen Maternity & Child Healthcare Hospital, Shenzhen, China
| | - Kanran Wang
- Department of Radiation Oncology, Chongqing University Cancer Hospital, Chongqing, China
| | - Yu Li
- Department of Pathology, Chongqing University Cancer Hospital, Chongqing, China
| | - Fang Wang
- Department of Pathology, The Affiliated Yantai Yuhuangding Hospital of Qingdao University, Yantai, China
| | - Yulong Peng
- Department of Pathology, The First Affiliated Hospital of Jinan University, Guangzhou, China
| | - Junyou Zhu
- Department of Burn, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Jing Zhang
- College of Biomedical Engineering, Sichuan University, Chengdu, China
| | - Christopher R Jackson
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Department of Pathology and Laboratory Medicine, Pennsylvania State University, Hummelstown, PA, USA
- Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
| | | | - Deborah Dillon
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, USA
| | - Nancy U Lin
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Lynette Sholl
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, USA
- Department of Pathology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Thomas Denize
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, USA
- Department of Pathology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - David Meredith
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, USA
| | - Keith L Ligon
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, USA
- Department of Pathology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Sabina Signoretti
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, USA
- Department of Pathology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Shuji Ogino
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jeffrey A Golden
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, USA
- Department of Pathology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - MacLean P Nasrallah
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
| | | | - Sen Yang
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA.
| | - Kun-Hsing Yu
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, USA.
- Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA.
| |
Collapse
|
48
|
Liu J, Zhang Y, Wang K, Yavuz MC, Chen X, Yuan Y, Li H, Yang Y, Yuille A, Tang Y, Zhou Z. Universal and extensible language-vision models for organ segmentation and tumor detection from abdominal computed tomography. Med Image Anal 2024; 97:103226. [PMID: 38852215 DOI: 10.1016/j.media.2024.103226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2023] [Revised: 03/30/2024] [Accepted: 05/27/2024] [Indexed: 06/11/2024]
Abstract
The advancement of artificial intelligence (AI) for organ segmentation and tumor detection is propelled by the growing availability of computed tomography (CT) datasets with detailed, per-voxel annotations. However, these AI models often struggle with flexibility for partially annotated datasets and extensibility for new classes due to limitations in the one-hot encoding, architectural design, and learning scheme. To overcome these limitations, we propose a universal, extensible framework enabling a single model, termed Universal Model, to deal with multiple public datasets and adapt to new classes (e.g., organs/tumors). Firstly, we introduce a novel language-driven parameter generator that leverages language embeddings from large language models, enriching semantic encoding compared with one-hot encoding. Secondly, the conventional output layers are replaced with lightweight, class-specific heads, allowing Universal Model to simultaneously segment 25 organs and six types of tumors and ease the addition of new classes. We train our Universal Model on 3410 CT volumes assembled from 14 publicly available datasets and then test it on 6173 CT volumes from four external datasets. Universal Model achieves first place on six CT tasks in the Medical Segmentation Decathlon (MSD) public leaderboard and leading performance on the Beyond The Cranial Vault (BTCV) dataset. In summary, Universal Model exhibits remarkable computational efficiency (6× faster than other dataset-specific models), demonstrates strong generalization across different hospitals, transfers well to numerous downstream tasks, and more importantly, facilitates the extensibility to new classes while alleviating the catastrophic forgetting of previously learned classes. Codes, models, and datasets are available at https://github.com/ljwztc/CLIP-Driven-Universal-Model.
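The language-driven parameter generator can be illustrated compactly: a class-name embedding is mapped to the weights and bias of a lightweight per-class 1x1x1 convolution head over shared voxel features. The linear generator and all dimensions below are assumptions for illustration, not the released model.

```python
# Sketch of a language-driven parameter generator for class-specific heads.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LanguageDrivenHead(nn.Module):
    def __init__(self, text_dim=512, feat_dim=48):
        super().__init__()
        self.feat_dim = feat_dim
        self.gen = nn.Linear(text_dim, feat_dim + 1)   # conv kernel + bias per class

    def forward(self, voxel_feats, class_emb):
        # voxel_feats: (B, C, D, H, W); class_emb: (K, text_dim), one per organ/tumor
        params = self.gen(class_emb)                   # (K, C + 1)
        w = params[:, :-1].view(-1, self.feat_dim, 1, 1, 1)
        b = params[:, -1]
        return F.conv3d(voxel_feats, w, b)             # (B, K, D, H, W) class logits

head = LanguageDrivenHead()
logits = head(torch.randn(1, 48, 32, 64, 64), torch.randn(31, 512))  # 25 organs + 6 tumors
```

Because the heads are generated from language embeddings, adding a new class amounts to embedding its name rather than retraining an output layer, which is the extensibility property the abstract highlights.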
Collapse
Affiliation(s)
- Jie Liu
- City University of Hong Kong, Hong Kong
| | - Yixiao Zhang
- Johns Hopkins University, United States of America
| | - Kang Wang
- University of California, San Francisco, United States of America
| | - Mehmet Can Yavuz
- University of California, San Francisco, United States of America
| | - Xiaoxi Chen
- University of Illinois Urbana-Champaign, United States of America
| | | | | | - Yang Yang
- University of California, San Francisco, United States of America
| | - Alan Yuille
- Johns Hopkins University, United States of America
| | | | - Zongwei Zhou
- Johns Hopkins University, United States of America.
| |
Collapse
|
49
|
Marini N, Marchesin S, Wodzinski M, Caputo A, Podareanu D, Guevara BC, Boytcheva S, Vatrano S, Fraggetta F, Ciompi F, Silvello G, Müller H, Atzori M. Multimodal representations of biomedical knowledge from limited training whole slide images and reports using deep learning. Med Image Anal 2024; 97:103303. [PMID: 39154617 DOI: 10.1016/j.media.2024.103303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2023] [Revised: 08/08/2024] [Accepted: 08/09/2024] [Indexed: 08/20/2024]
Abstract
The increasing availability of biomedical data creates valuable resources for developing new deep learning algorithms to support experts, especially in domains where collecting large volumes of annotated data is not trivial. Biomedical data include several modalities containing complementary information, such as medical images and reports: images are often large and encode low-level information, while reports include a summarized high-level description of the findings identified within the data, often concerning only a small part of the image. However, only a few methods can effectively link the visual content of images with the textual content of reports, preventing medical specialists from properly benefitting from the recent opportunities offered by deep learning models. This paper introduces a multimodal architecture that creates a robust biomedical data representation by encoding fine-grained text representations within image embeddings. The architecture aims to tackle data scarcity (combining supervised and self-supervised learning) and to create multimodal biomedical ontologies. The architecture is trained on over 6,000 colon whole slide images (WSIs), paired with the corresponding reports, collected from two digital pathology workflows. The evaluation of the multimodal architecture involves three tasks: WSI classification (on data from the pathology workflows and from public repositories), multimodal data retrieval, and linking between textual and visual concepts. Notably, the latter two tasks are available by architectural design without further training, showing that the multimodal architecture can be adopted as a backbone to solve specific tasks. The multimodal data representation outperforms the unimodal one on the classification of colon WSIs and halves the data needed to reach accurate performance, reducing the computational power required and thus the carbon footprint. The combination of images and reports exploiting self-supervised algorithms makes it possible to mine databases and extract new information without requiring new expert annotations. In particular, the multimodal visual ontology, which links semantic concepts to images, may pave the way to advancements in medicine and biomedical analysis domains, not limited to histopathology.
Collapse
Affiliation(s)
- Niccolò Marini
- Information Systems Institute, University of Applied Sciences Western Switzerland (HES-SO Valais), Sierre, Switzerland.
| | - Stefano Marchesin
- Department of Information Engineering, University of Padua, Padua, Italy.
| | - Marek Wodzinski
- Information Systems Institute, University of Applied Sciences Western Switzerland (HES-SO Valais), Sierre, Switzerland; Department of Measurement and Electronics, AGH University of Kraków, Krakow, Poland
| | - Alessandro Caputo
- Department of Pathology, Ruggi University Hospital, Salerno, Italy; Pathology Unit, Gravina Hospital Caltagirone ASP, Catania, Italy
| | | | | | - Svetla Boytcheva
- Ontotext, Sofia, Bulgaria; Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Sofia, Bulgaria
| | - Simona Vatrano
- Pathology Unit, Gravina Hospital Caltagirone ASP, Catania, Italy
| | - Filippo Fraggetta
- Pathology Unit, Gravina Hospital Caltagirone ASP, Catania, Italy; Department of Pathology, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Francesco Ciompi
- Department of Pathology, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Gianmaria Silvello
- Department of Information Engineering, University of Padua, Padua, Italy
| | - Henning Müller
- Information Systems Institute, University of Applied Sciences Western Switzerland (HES-SO Valais), Sierre, Switzerland; Medical faculty, University of Geneva, 1211 Geneva, Switzerland
| | - Manfredo Atzori
- Information Systems Institute, University of Applied Sciences Western Switzerland (HES-SO Valais), Sierre, Switzerland; Department of Neurosciences, University of Padua, Padua, Italy
| |
Collapse
|
50
|
Li X, Shen B, Feng F, Li K, Tang Z, Ma L, Li H. Dual-view jointly learning improves personalized drug synergy prediction. Bioinformatics 2024; 40:btae604. [PMID: 39423102 PMCID: PMC11524890 DOI: 10.1093/bioinformatics/btae604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2024] [Revised: 08/23/2024] [Accepted: 10/17/2024] [Indexed: 10/21/2024] Open
Abstract
MOTIVATION Accurate and robust estimation of synergistic drug combinations is important for precision medicine. Although some computational methods have been developed, some predictions remain unreliable, especially cross-dataset predictions, owing to the complex mechanisms of drug combinations and the heterogeneity of cancer samples. RESULTS We propose JointSyn, which utilizes dual-view joint learning to predict sample-specific effects of drug combinations from drug and cell features. JointSyn outperforms existing state-of-the-art methods in predictive accuracy and robustness across various benchmarks. Each view of JointSyn captures drug synergy-related characteristics and makes complementary contributions to the final prediction of the drug combination. Moreover, JointSyn with fine-tuning improves its generalization ability to predict a novel drug combination or cancer sample using a small number of experimental measurements. We also used JointSyn to generate an estimated pan-cancer atlas of drug synergy and explored the differential patterns among cancers. These results demonstrate the potential of JointSyn to predict drug synergy, supporting the development of personalized combinatorial therapies. AVAILABILITY AND IMPLEMENTATION Source code and data are available at https://github.com/LiHongCSBLab/JointSyn.
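A dual-view predictor of this general shape can be sketched as two branches, one embedding each drug jointly with the cell line and one embedding the drug-drug pair, fused for the final synergy score. The feature sizes and fusion scheme below are assumptions, not JointSyn's published architecture.

```python
# Illustrative dual-view synergy predictor with assumed dimensions.
import torch
import torch.nn as nn

class DualViewSynergy(nn.Module):
    def __init__(self, drug_dim=1024, cell_dim=512, hid=256):
        super().__init__()
        self.view_dc = nn.Sequential(nn.Linear(drug_dim + cell_dim, hid), nn.ReLU())
        self.view_dd = nn.Sequential(nn.Linear(2 * drug_dim, hid), nn.ReLU())
        self.out = nn.Linear(3 * hid, 1)                   # fused views -> synergy score

    def forward(self, drug_a, drug_b, cell):
        va = self.view_dc(torch.cat([drug_a, cell], -1))   # drug A in this cell line
        vb = self.view_dc(torch.cat([drug_b, cell], -1))   # drug B in this cell line
        vdd = self.view_dd(torch.cat([drug_a, drug_b], -1))
        return self.out(torch.cat([va, vb, vdd], -1))

model = DualViewSynergy()
score = model(torch.randn(4, 1024), torch.randn(4, 1024), torch.randn(4, 512))
```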
Collapse
Affiliation(s)
- Xueliang Li
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Bihan Shen
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Fangyoumin Feng
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Kunshi Li
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Zhixuan Tang
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Liangxiao Ma
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai 200031, China
| | - Hong Li
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| |
Collapse
|