1. Jindal JA, Lungren MP, Shah NH. Ensuring useful adoption of generative artificial intelligence in healthcare. J Am Med Inform Assoc 2024;31:1441-1444. PMID: 38452298. PMCID: PMC11105148. DOI: 10.1093/jamia/ocae043.
Abstract
OBJECTIVES: This article examines how generative artificial intelligence (AI) can be adopted with the most value in health systems, in response to the Executive Order on AI. MATERIALS AND METHODS: We reviewed how technology has historically been deployed in healthcare and evaluated recent deployments of both traditional AI and generative AI (GenAI) through the lens of value. RESULTS: Traditional AI and GenAI differ in capability and in their current modes of deployment, which has implications for their value to health systems. DISCUSSION: Traditional AI can realize value in healthcare when applied top-down within a framework. GenAI applied top-down has unclear short-term value, but encouraging more bottom-up adoption has the potential to provide greater benefit to health systems and patients. CONCLUSION: GenAI in healthcare can provide the most value for patients when health systems adapt culturally to grow with this new technology and its adoption patterns.
Affiliation(s)
- Jenelle A Jindal
- Center for Biomedical Informatics Research, Stanford University, Stanford, CA 94305, United States
- Matthew P Lungren
- Health and Life Sciences, Microsoft Corporation, Redmond, WA 98052, United States
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, United States
- Department of Biomedical Imaging, University of California San Francisco, San Francisco, CA 94143, United States
- Nigam H Shah
- Department of Medicine, Stanford School of Medicine, Stanford, CA 94304, United States
- Clinical Excellence Research Center, Stanford School of Medicine, Stanford, CA 94304, United States
- Technology and Digital Solutions, Stanford Health Care, Palo Alto, CA 94304, United States
2.
Affiliation(s)
- Nigam H Shah
- Technology and Digital Solutions, Stanford Medicine, Palo Alto, California
3. Shah NH, Halamka JD, Saria S, Pencina M, Tazbaz T, Tripathi M, Callahan A, Hildahl H, Anderson B. A Nationwide Network of Health AI Assurance Laboratories. JAMA 2024;331:245-249. PMID: 38117493. DOI: 10.1001/jama.2023.26930.
Abstract
Importance: Given the need for rigorous development and evaluation standards for artificial intelligence (AI) models used in health care, nationally accepted procedures to provide assurance that the use of AI is fair, appropriate, valid, effective, and safe are urgently needed. Observations: While there are several efforts to develop standards and best practices for evaluating AI, there is a gap between having such guidance and applying it to both existing and newly developed AI models. There is currently no publicly available, nationwide mechanism that enables objective evaluation and ongoing assessment of the consequences of using health AI models in clinical care settings. Conclusions and Relevance: This article outlines the need for a public-private partnership to support a nationwide network of health AI assurance laboratories. In this network, community best practices could be applied to test health AI models and produce performance reports that can be widely shared for managing the lifecycle of AI models over time and across the populations and sites where the models are deployed.
Affiliation(s)
- Nigam H Shah
- Stanford Medicine, Palo Alto, California
- Coalition for Health AI, Dover, Delaware
- John D Halamka
- Coalition for Health AI, Dover, Delaware
- Mayo Clinic Platform, Mayo Clinic, Rochester, Minnesota
- Suchi Saria
- Coalition for Health AI, Dover, Delaware
- Bayesian Health, New York, New York
- Johns Hopkins University, Baltimore, Maryland
- Johns Hopkins Medicine, Baltimore, Maryland
- Michael Pencina
- Coalition for Health AI, Dover, Delaware
- Duke AI Health, Duke University School of Medicine, Durham, North Carolina
- Troy Tazbaz
- US Food and Drug Administration, Silver Spring, Maryland
- Micky Tripathi
- US Office of the National Coordinator for Health IT, Washington, DC
- Brian Anderson
- Coalition for Health AI, Dover, Delaware
- MITRE Corporation, Bedford, Massachusetts
4. Liu S, Wei S, Lehmann HP. Applicability Area: A novel utility-based approach for evaluating predictive models, beyond discrimination. AMIA Annu Symp Proc 2024;2023:494-503. PMID: 38222359. PMCID: PMC10785877.
Abstract
Translating prediction models into practice and supporting clinicians' decision-making demands a demonstration of clinical value. Existing approaches to evaluating machine learning models emphasize discriminatory power, which is only one part of the medical decision problem. We propose the Applicability Area (ApAr), a decision-analytic, utility-based approach to evaluating predictive models that communicates the range of prior probabilities and test cutoffs for which the model has positive utility; a larger ApAr suggests broader potential use of the model. We assess ApAr with simulated datasets and with three published medical datasets. ApAr adds value beyond the typical area under the receiver operating characteristic curve (AUROC) analysis: in the diabetes dataset, for example, the top model by ApAr was ranked only 23rd best by AUROC. Decision makers looking to adopt and implement models can use ApAr to assess whether their local range of priors and utilities falls within the model's applicability area.
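The decision-analytic idea behind ApAr can be sketched in a few lines. The sketch below is a hypothetical illustration, not the authors' implementation: the outcome utilities, the toy ROC curve, and the function name `applicability_area` are all assumptions chosen only to make the example runnable.

```python
# Illustrative sketch of the Applicability Area (ApAr) idea: for each
# (prior, cutoff) pair, compare the expected utility of acting on the
# model against the two default strategies, "treat all" and "treat none".
# ApAr is approximated as the fraction of the grid where the model wins.
# Utility values below are assumed, not taken from the cited paper.

U_TP, U_FP, U_TN, U_FN = 0.8, -0.2, 1.0, -1.0  # assumed outcome utilities

def applicability_area(sens_fn, spec_fn, steps=99):
    """Fraction of (prior, cutoff) pairs where using the model has the
    highest expected utility (a grid approximation of ApAr)."""
    grid = [i / (steps + 1) for i in range(1, steps + 1)]
    wins = 0
    for p in grid:  # prior probability of disease
        treat_all = p * U_TP + (1 - p) * U_FP
        treat_none = p * U_FN + (1 - p) * U_TN
        for c in grid:  # test cutoff
            se, sp = sens_fn(c), spec_fn(c)
            use_model = (p * (se * U_TP + (1 - se) * U_FN)
                         + (1 - p) * (sp * U_TN + (1 - sp) * U_FP))
            if use_model > max(treat_all, treat_none):
                wins += 1
    return wins / (steps * steps)

# Toy model: sensitivity falls and specificity rises as the cutoff grows.
apar = applicability_area(lambda c: 1 - c**2, lambda c: 1 - (1 - c)**2)
print(round(apar, 3))
```

A denser grid gives a finer approximation at quadratic cost; in practice the sensitivity and specificity functions would come from the model's empirical ROC curve rather than a closed form.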
Affiliation(s)
- Star Liu
- Johns Hopkins University School of Medicine, Baltimore, MD, United States
- Shixiong Wei
- Johns Hopkins University School of Medicine, Baltimore, MD, United States
- Harold P Lehmann
- Johns Hopkins University School of Medicine, Baltimore, MD, United States
5. Maynard S, Farrington J, Alimam S, Evans H, Li K, Wong WK, Stanworth SJ. Machine learning in transfusion medicine: A scoping review. Transfusion 2024;64:162-184. PMID: 37950535. DOI: 10.1111/trf.17582.
Affiliation(s)
- Suzanne Maynard
- Medical Sciences Division, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
- NIHR Blood and Transplant Research Unit in Data Driven Transfusion Practice, Nuffield Division of Clinical Laboratory Sciences, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
- NHSBT and Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Joseph Farrington
- Institute of Health Informatics, University College London, London, UK
- Samah Alimam
- Haematology Department, University College London Hospitals NHS Foundation Trust, London, UK
- Hayley Evans
- NIHR Blood and Transplant Research Unit in Data Driven Transfusion Practice, Nuffield Division of Clinical Laboratory Sciences, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
- Kezhi Li
- Institute of Health Informatics, University College London, London, UK
- Wai Keong Wong
- Director of Digital, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Simon J Stanworth
- Medical Sciences Division, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
- NIHR Blood and Transplant Research Unit in Data Driven Transfusion Practice, Nuffield Division of Clinical Laboratory Sciences, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
- NHSBT and Oxford University Hospitals NHS Foundation Trust, Oxford, UK
6. Wornow M, Xu Y, Thapa R, Patel B, Steinberg E, Fleming S, Pfeffer MA, Fries J, Shah NH. The shaky foundations of large language models and foundation models for electronic health records. NPJ Digit Med 2023;6:135. PMID: 37516790. PMCID: PMC10387101. DOI: 10.1038/s41746-023-00879-8.
Abstract
The success of foundation models such as ChatGPT and AlphaFold has spurred significant interest in building similar models for electronic medical records (EMRs) to improve patient care and hospital operations. However, recent hype has obscured critical gaps in our understanding of these models' capabilities. In this narrative review, we examine 84 foundation models trained on non-imaging EMR data (i.e., clinical text and/or structured data) and create a taxonomy delineating their architectures, training data, and potential use cases. We find that most models are trained on small, narrowly scoped clinical datasets (e.g., MIMIC-III) or broad, public biomedical corpora (e.g., PubMed) and are evaluated on tasks that do not provide meaningful insight into their usefulness to health systems. In light of these findings, we propose an improved evaluation framework for measuring the benefits of clinical foundation models, one more closely grounded in the metrics that matter in healthcare.
Affiliation(s)
- Michael Wornow
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Yizhe Xu
- Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA, USA
- Rahul Thapa
- Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA, USA
- Birju Patel
- Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA, USA
- Ethan Steinberg
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Scott Fleming
- Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA, USA
- Michael A Pfeffer
- Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA, USA
- Technology and Digital Services, Stanford Health Care, Palo Alto, CA, USA
- Jason Fries
- Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA, USA
- Nigam H Shah
- Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA, USA
- Technology and Digital Services, Stanford Health Care, Palo Alto, CA, USA
- Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Clinical Excellence Research Center, Stanford University School of Medicine, Stanford, CA, USA