Smith JC, Williamson BD, Cronkite DJ, Park D, Whitaker JM, McLemore MF, Osmanski JT, Winter R, Ramaprasan A, Kelley A, Shea M, Wittayanukorn S, Stojanovic D, Zhao Y, Toh S, Johnson KB, Aronoff DM, Carrell DS. Data-driven automated classification algorithms for acute health conditions: applying PheNorm to COVID-19 disease. J Am Med Inform Assoc 2024; 31:574-582. [PMID: 38109888 PMCID: PMC10873852 DOI: 10.1093/jamia/ocad241] [Received: 06/27/2023] [Revised: 10/19/2023] [Accepted: 11/27/2023] [Indexed: 12/20/2023]
Abstract
OBJECTIVES: Automated phenotyping algorithms can reduce development time and operator dependence compared to manually developed algorithms. One such approach, PheNorm, has performed well for identifying chronic health conditions, but its performance for acute conditions is largely unknown. Herein, we implement and evaluate PheNorm applied to symptomatic COVID-19 disease to investigate its feasibility for rapid phenotyping of acute health conditions.

MATERIALS AND METHODS: PheNorm is a general-purpose automated approach to creating computable phenotype algorithms based on natural language processing, machine learning, and low-cost silver-standard training labels. We applied PheNorm to cohorts of potential COVID-19 patients from 2 institutions. Using gold-standard manual chart review data, we investigated how performance was affected by alternative feature engineering options and by implementing externally trained models without local retraining.

RESULTS: At quantiles of model-predicted risk that maximize F1, models achieved AUC, sensitivity, and positive predictive value of 0.853, 0.879, and 0.851 at one institution and 0.804, 0.976, and 0.885 at the other. We report performance metrics for all combinations of silver labels, feature engineering options, and models trained internally versus externally.

DISCUSSION: Phenotyping algorithms developed using PheNorm performed well at both institutions. Performance varied with different silver-standard labels and feature engineering options. Models developed locally at one site also worked well when implemented externally at the other site.

CONCLUSION: PheNorm models successfully identified an acute health condition, symptomatic COVID-19. The simplicity of the PheNorm approach allows it to be applied at multiple study sites with substantially reduced overhead compared to traditional approaches.
Affiliation(s)
- Joshua C Smith
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Brian D Williamson
  - Kaiser Permanente Washington Health Research Institute, Seattle, WA 98101, United States
- David J Cronkite
  - Kaiser Permanente Washington Health Research Institute, Seattle, WA 98101, United States
- Daniel Park
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Jill M Whitaker
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Michael F McLemore
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Joshua T Osmanski
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Robert Winter
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Arvind Ramaprasan
  - Kaiser Permanente Washington Health Research Institute, Seattle, WA 98101, United States
- Ann Kelley
  - Kaiser Permanente Washington Health Research Institute, Seattle, WA 98101, United States
- Mary Shea
  - Kaiser Permanente Washington Health Research Institute, Seattle, WA 98101, United States
- Saranrat Wittayanukorn
  - Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD 20903, United States
- Danijela Stojanovic
  - Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD 20903, United States
- Yueqin Zhao
  - Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD 20903, United States
- Sengwee Toh
  - Harvard Pilgrim Health Care Institute, Boston, MA 02215, United States
- Kevin B Johnson
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA 19104, United States
- David M Aronoff
  - Department of Medicine, Indiana University School of Medicine, Indianapolis, IN 46202, United States
- David S Carrell
  - Kaiser Permanente Washington Health Research Institute, Seattle, WA 98101, United States