1
Ladin K, Cuddeback J, Duru OK, Goel S, Harvey W, Park JG, Paulus JK, Sackey J, Sharp R, Steyerberg E, Ustun B, van Klaveren D, Weingart SN, Kent DM. Guidance for unbiased predictive information for healthcare decision-making and equity (GUIDE): considerations when race may be a prognostic factor. NPJ Digit Med 2024; 7:290. PMID: 39427028; PMCID: PMC11490638; DOI: 10.1038/s41746-024-01245-y.
Abstract
Clinical prediction models (CPMs) are tools that compute the risk of an outcome given a set of patient characteristics and are routinely used to inform patients, guide treatment decisions, and allocate resources. Although much hope has been placed on CPMs to mitigate human biases, CPMs may themselves contribute to racial disparities in decision-making and resource allocation. While some policymakers, professional organizations, and scholars have called for eliminating race as a variable from CPMs, others raise concerns that excluding race may exacerbate healthcare disparities, and the controversy remains unresolved. The Guidance for Unbiased predictive Information for healthcare Decision-making and Equity (GUIDE) provides expert guidance for model developers and health system administrators on the transparent use of race in CPMs and the mitigation of algorithmic bias across contexts; it was developed through a five-round modified Delphi process with a diverse 14-person technical expert panel (TEP). Deliberations affirmed that race is a social construct and that the goals of prediction are distinct from those of causal inference, and emphasized: the importance of decisional context (e.g., shared decision-making versus healthcare rationing); the conflicting nature of different anti-discrimination principles (e.g., anticlassification versus antisubordination principles); and the importance of identifying and balancing trade-offs in achieving equity-related goals with race-aware versus race-unaware CPMs for conditions where racial identity is prognostically informative. The GUIDE, comprising 31 key items for the development and use of CPMs in healthcare, outlines foundational principles, distinguishes between bias and fairness, and offers guidance for examining subgroup invalidity and for using race as a variable in CPMs. The GUIDE is a living document that supports appraisal and reporting of bias in CPMs and promotes best practice in their development and use.
Affiliation(s)
- Keren Ladin
- Research on Ethics, Aging and Community Health (REACH Lab), Medford, MA, USA
- Departments of Occupational Therapy and Community Health, Tufts University, Medford, MA, USA
- O Kenrik Duru
- Department of Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Sharad Goel
- Harvard Kennedy School, Harvard University, Cambridge, MA, USA
- William Harvey
- Department of Medicine, Tufts Medical Center, Boston, MA, USA
- Jinny G Park
- Predictive Analytics and Comparative Effectiveness Center, Tufts Medical Center, Boston, MA, USA
- Joyce Sackey
- Department of Medicine, Stanford Medicine, Stanford, CA, USA
- Richard Sharp
- Center for Individualized Medicine Bioethics, Mayo Clinic, Rochester, MN, USA
- Ewout Steyerberg
- Department of Biomedical Data Sciences, Leiden University Medical Centre, Leiden, Netherlands
- Berk Ustun
- Halıcıoğlu Data Science Institute, University of California San Diego, San Diego, CA, USA
- David van Klaveren
- Predictive Analytics and Comparative Effectiveness Center, Tufts Medical Center, Boston, MA, USA
- Erasmus University Medical Centre, Rotterdam, Netherlands
- Saul N Weingart
- Department of Medicine, Tufts Medical Center, Boston, MA, USA
- David M Kent
- Predictive Analytics and Comparative Effectiveness Center, Tufts Medical Center, Boston, MA, USA
- Tufts Clinical and Translational Science Institute, Tufts University, Boston, MA, USA
2
Lin S, Hsu YJ, Kim JS, Jackson JW, Segal JB. Predictive Factors of Apparent Treatment Resistant Hypertension Among Patients With Hypertension Identified Using Electronic Health Records. J Gen Intern Med 2024. PMID: 39358502; DOI: 10.1007/s11606-024-09068-z.
Abstract
BACKGROUND Early identification of a patient with resistant hypertension (RH) enables prompt treatment intensification, short-interval follow-up, or case management to bring blood pressure under control and reduce the risk of complications. OBJECTIVE To identify predictors of RH among individuals with newly diagnosed hypertension (HTN), while comparing different prediction models and techniques for managing missing covariates in electronic health record data. DESIGN Risk prediction study in a retrospective cohort. PARTICIPANTS Adult patients with incident HTN treated in any of the primary care clinics of one health system between April 2013 and December 2016. MAIN MEASURES Predicted risk of RH at the time of HTN identification and candidate predictors for variable selection in future model development. KEY RESULTS Among 26,953 individuals with incident HTN, 613 (2.3%) met criteria for RH after 4.7 months (interquartile range, 1.2-11.3). Variables selected by the least absolute shrinkage and selection operator (LASSO) included baseline systolic blood pressure (SBP) and its missing indicator (a dummy variable created when baseline SBP is absent), use of antihypertensive medication at cohort entry, body mass index, and atherosclerosis risk. The random forest model achieved the highest area under the curve (AUC), 0.893 (95% CI, 0.881-0.904), and the best calibration, with a calibration slope of 1.01. Complete-case analysis performed poorly (AUC = 0.625). CONCLUSIONS Machine learning techniques and traditional logistic regression exhibited comparable predictive performance once missing data were handled. We suggest that the variables identified by this study may be good candidates for clinical prediction models that alert clinicians to the need for short-interval follow-up and more intensive early therapy for HTN.
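A minimal sketch of the modeling pattern this abstract describes: pairing an imputed baseline SBP with a missing-indicator dummy, then comparing an L1-penalised (LASSO-style) logistic regression against a random forest on AUC and calibration slope. The simulated data, column names, and hyperparameters below are illustrative assumptions, not the study's actual pipeline.

```python
# Sketch only: missing-indicator handling plus a LASSO / random-forest comparison.
# Data and column names are simulated, not from the study.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "baseline_sbp": np.where(rng.random(n) < 0.2, np.nan, rng.normal(135, 15, n)),
    "on_antihypertensive": rng.integers(0, 2, n),
    "bmi": rng.normal(29, 5, n),
})
df["sbp_missing"] = df["baseline_sbp"].isna().astype(int)            # dummy for absent baseline SBP
df["baseline_sbp"] = df["baseline_sbp"].fillna(df["baseline_sbp"].mean())
y = (rng.random(n) < 0.02 + 0.06 * (df["baseline_sbp"] > 145)).astype(int)  # toy resistant-HTN label

X = df[["baseline_sbp", "sbp_missing", "on_antihypertensive", "bmi"]].to_numpy()
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "LASSO logistic": LogisticRegression(penalty="l1", solver="liblinear", C=0.1),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
}
for name, model in models.items():
    p = model.fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
    # Calibration slope: coefficient from regressing the outcome on the log-odds of p
    logit_p = np.log((p + 1e-6) / (1 - p + 1e-6))
    slope = LogisticRegression().fit(logit_p.reshape(-1, 1), y_te).coef_[0, 0]
    print(f"{name}: AUC {roc_auc_score(y_te, p):.3f}, calibration slope {slope:.2f}")
```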
Affiliation(s)
- Shanshan Lin
- Division of General Internal Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Yea-Jen Hsu
- Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Ji Soo Kim
- Division of Rheumatology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- John W Jackson
- Center for Drug Safety and Effectiveness, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA
- Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Jodi B Segal
- Division of General Internal Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Center for Drug Safety and Effectiveness, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA
- Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
3
Cerdeña JP, Plaisime MV, Borrell LN. Race as a Risk Marker, Not a Risk Factor: Revising Race-Based Algorithms to Protect Racially Oppressed Patients. J Gen Intern Med 2024; 39:2565-2570. PMID: 38980468; PMCID: PMC11436499; DOI: 10.1007/s11606-024-08919-z.
Abstract
Emerging consensus in the medical and public health spheres encourages removing race and ethnicity from algorithms used in clinical decision-making. Although clinical algorithms remain appealing given their promise to lighten the cognitive load of medical practice and save time for providers, they risk exacerbating existing health disparities. Race is a strong risk marker of health outcomes, yet it is not a risk factor. The use of race as a factor in medical algorithms suggests that the effect of race is intrinsic to the patient or that its effects can be distinct or separated from other social and environmental variables. By contrast, incisive public health analysis coupled with a race-conscious perspective recognizes that race serves as a marker of countless other dynamic variables and that structural racism, rather than race, compromises the health of racially oppressed individuals. This perspective offers a historical and theoretical context for the current debates regarding the use of race in clinical algorithms, clinical and epidemiologic perspectives on "risk," and future directions for research and policy interventions that combat color-evasive racism and follow the principles of race-conscious medicine.
Affiliation(s)
- Jessica P Cerdeña
- Department of Family Medicine, Middlesex Health, Middletown, CT, USA.
- Institute for Collaboration On Health, Intervention, and Policy (InCHIP), University of Connecticut, Storrs, CT, USA.
- Department of Anthropology, University of Connecticut, Storrs, CT, USA.
- Marie V Plaisime
- FXB Center for Health and Human Rights, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Penn Program on Race, Science & Society (PRSS), Center for Africana Studies, University of Pennsylvania, Philadelphia, PA, USA
- Luisa N Borrell
- Department of Epidemiology & Biostatistics, Graduate School of Public Health & Health Policy, The City University of New York, New York, NY, USA
- Department of Surgery, Medical and Social Sciences, Universidad de Alcalá, Alcalá de Henares, Madrid, Spain
4
Yang Y, Zhang H, Gichoya JW, Katabi D, Ghassemi M. The limits of fair medical imaging AI in real-world generalization. Nat Med 2024; 30:2838-2848. PMID: 38942996; PMCID: PMC11485237; DOI: 10.1038/s41591-024-03113-4.
Abstract
As artificial intelligence (AI) rapidly approaches human-level performance in medical imaging, it is crucial that it does not exacerbate or propagate healthcare disparities. Previous research established AI's capacity to infer demographic data from chest X-rays, leading to a key concern: do models that use demographic shortcuts make unfair predictions across subpopulations? In this study, we conducted a thorough investigation into the extent to which medical AI uses demographic encodings, focusing on potential fairness discrepancies within both in-distribution training sets and external test sets. Our analysis covers three key medical imaging disciplines (radiology, dermatology and ophthalmology) and incorporates data from six global chest X-ray datasets. We confirm that medical imaging AI leverages demographic shortcuts in disease classification. Although algorithmically correcting these shortcuts effectively closes fairness gaps and yields 'locally optimal' models within the original data distribution, this optimality does not hold in new test settings. Surprisingly, we found that models with less encoding of demographic attributes are often most 'globally optimal', exhibiting better fairness during model evaluation in new test environments. Our work establishes best practices for medical imaging models that maintain their performance and fairness in deployments beyond their initial training contexts, underscoring critical considerations for AI clinical deployments across populations and sites.
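A minimal sketch of one way a "fairness gap" of the kind discussed here can be quantified: the difference in false-negative (underdiagnosis) rates between demographic subgroups, computed separately on an in-distribution and an external test set. The metric choice and the simulated data are assumptions for illustration, not the paper's exact evaluation protocol.

```python
# Sketch only: per-group false-negative-rate gap on two test sets.
# Data and model outputs are simulated stand-ins.
import numpy as np

def fnr_gap(y_true, y_pred, group):
    """Absolute gap in false-negative rate between two subgroups."""
    fnrs = []
    for g in np.unique(group):
        mask = (group == g) & (y_true == 1)          # diseased patients in this subgroup
        fnrs.append(np.mean(y_pred[mask] == 0))      # fraction missed by the model
    return abs(fnrs[0] - fnrs[1])

rng = np.random.default_rng(1)
for name in ["in-distribution test set", "external test set"]:
    y_true = rng.integers(0, 2, 2000)
    group = rng.integers(0, 2, 2000)                 # e.g., two self-reported demographic groups
    y_pred = np.where(rng.random(2000) < 0.8, y_true, 1 - y_true)  # stand-in model predictions
    print(name, "FNR gap:", round(fnr_gap(y_true, y_pred, group), 3))
```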
Affiliation(s)
- Yuzhe Yang
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Haoran Zhang
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Judy W Gichoya
- Department of Radiology, Emory University School of Medicine, Atlanta, GA, USA
- Dina Katabi
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Marzyeh Ghassemi
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA, USA
5
Matos J, Gallifant J, Chowdhury A, Economou-Zavlanos N, Charpignon ML, Gichoya J, Celi LA, Nazer L, King H, Wong AKI. A Clinician's Guide to Understanding Bias in Critical Clinical Prediction Models. Crit Care Clin 2024; 40:827-857. PMID: 39218488; DOI: 10.1016/j.ccc.2024.05.011.
Abstract
This narrative review focuses on the role of clinical prediction models in supporting informed decision-making in critical care, emphasizing their two forms: traditional scores and artificial intelligence (AI)-based models. Acknowledging the potential for both types to embed biases, the authors underscore the importance of critical appraisal to increase trust in these models. The authors outline recommendations and critical care examples for managing risk of bias in AI models. The authors advocate for enhanced interdisciplinary training for clinicians, who are encouraged to explore various resources (books, journals, news websites, and social media) and events (datathons) to deepen their understanding of risk of bias.
Affiliation(s)
- João Matos
- University of Porto (FEUP), Porto, Portugal; Institute for Systems and Computer Engineering, Technology and Science (INESC TEC), Porto, Portugal; Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Jack Gallifant
- Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Critical Care, Guy's and St Thomas' NHS Trust, London, UK
- Anand Chowdhury
- Division of Pulmonary, Allergy, and Critical Care Medicine, Department of Medicine, Duke University, Durham, NC, USA
- Marie-Laure Charpignon
- Institute for Data Systems and Society, Massachusetts Institute of Technology, Cambridge, MA, USA
- Judy Gichoya
- Department of Radiology, Emory University, Atlanta, GA, USA
- Leo Anthony Celi
- Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA; Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Lama Nazer
- Department of Pharmacy, King Hussein Cancer Center, Amman, Jordan
- Heather King
- Durham VA Health Care System, Health Services Research and Development, Center of Innovation to Accelerate Discovery and Practice Transformation (ADAPT), Durham, NC, USA; Department of Population Health Sciences, Duke University, Durham, NC, USA; Division of General Internal Medicine, Duke University School of Medicine, Durham, NC, USA
- An-Kwok Ian Wong
- Division of Pulmonary, Allergy, and Critical Care Medicine, Department of Medicine, Duke University, Durham, NC, USA; Department of Biostatistics and Bioinformatics, Division of Translational Biomedical Informatics, Duke University, Durham, NC, USA
6
Mullahy J. Clinical decisions, patient race, and flawed data. Proc Natl Acad Sci U S A 2024; 121:e2415152121. PMID: 39159382; PMCID: PMC11363271; DOI: 10.1073/pnas.2415152121.
Affiliation(s)
- John Mullahy
- Department of Population Health Sciences, University of Wisconsin-Madison School of Medicine and Public Health, Madison, WI 53726, USA
7
Teshale AB, Htun HL, Vered M, Owen AJ, Freak-Poli R. A Systematic Review of Artificial Intelligence Models for Time-to-Event Outcome Applied in Cardiovascular Disease Risk Prediction. J Med Syst 2024; 48:68. PMID: 39028429; PMCID: PMC11271333; DOI: 10.1007/s10916-024-02087-7.
Abstract
Artificial intelligence (AI)-based predictive models for early detection of cardiovascular disease (CVD) risk are increasingly being utilised. However, AI-based risk prediction models that account for right-censored data have been overlooked. This systematic review (PROSPERO protocol CRD42023492655) includes 33 studies that utilised machine learning (ML) and deep learning (DL) models for survival outcomes in CVD prediction. We provide details on the ML and DL models employed, the eXplainable AI (XAI) techniques used, and the types of variables included, with a focus on social determinants of health (SDoH) and gender stratification. Approximately half of the studies were published in 2023, with the majority from the United States. Random Survival Forest (RSF), Survival Gradient Boosting models, and Penalised Cox models were the most frequently employed ML models. DeepSurv was the most frequently employed DL model. DL models were better at predicting CVD outcomes than ML models. Permutation-based feature importance and Shapley values were the most utilised XAI methods for explaining AI models. Moreover, only one in five studies performed gender-stratified analyses, and very few incorporated the wide range of SDoH factors in their prediction models. In conclusion, the evidence indicates that RSF and DeepSurv models are currently the optimal models for predicting CVD outcomes. This study also highlights the better predictive ability of DL survival models compared with ML models. Future research should ensure appropriate interpretation of AI models, account for SDoH, and use gender stratification, as gender plays a significant role in CVD occurrence.
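A minimal sketch of the kind of time-to-event model the review highlights: a Random Survival Forest fit on toy right-censored data and evaluated with Harrell's concordance index. It assumes the scikit-survival package is available; the data, feature count, and hyperparameters are illustrative, not drawn from the reviewed studies.

```python
# Sketch only: Random Survival Forest on simulated right-censored data,
# evaluated with the concordance index (c-index).
import numpy as np
from sksurv.ensemble import RandomSurvivalForest
from sksurv.metrics import concordance_index_censored
from sksurv.util import Surv

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 5))                                  # stand-ins for age, SBP, cholesterol, ...
time = rng.exponential(scale=10 * np.exp(-0.5 * X[:, 0]))    # hazard increases with feature 0
event = rng.random(n) < 0.7                                  # roughly 30% right-censored
y = Surv.from_arrays(event=event, time=time)                 # structured (event, time) array

rsf = RandomSurvivalForest(n_estimators=200, min_samples_leaf=15, random_state=0)
rsf.fit(X[:700], y[:700])

risk = rsf.predict(X[700:])                                  # higher value = higher predicted risk
cindex = concordance_index_censored(event[700:], time[700:], risk)[0]
print("test concordance index:", round(cindex, 3))
```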
Affiliation(s)
- Achamyeleh Birhanu Teshale
- School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
- Department of Epidemiology and Biostatistics, Institute of Public Health, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia
- Htet Lin Htun
- School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
- Mor Vered
- Department of Data Science and AI, Faculty of Information Technology, Monash University, Clayton, VIC, Australia
- Alice J Owen
- School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
- Rosanne Freak-Poli
- School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
- Stroke and Ageing Research, Department of Medicine, School of Clinical Sciences at Monash Health, Monash University, Melbourne, VIC, Australia
8
Kolla L, Parikh RB. Uses and limitations of artificial intelligence for oncology. Cancer 2024; 130:2101-2107. PMID: 38554271; PMCID: PMC11170282; DOI: 10.1002/cncr.35307.
Abstract
Modern artificial intelligence (AI) tools built on high-dimensional patient data are reshaping oncology care, helping to improve goal-concordant care, decrease cancer mortality rates, and increase workflow efficiency and scope of care. However, data-related concerns and human biases that seep into algorithms during development and post-deployment phases affect performance in real-world settings, limiting the utility and safety of AI technology in oncology clinics. To this end, the authors review the current potential and limitations of predictive AI for cancer diagnosis and prognostication, as well as of generative AI, specifically modern chatbots, which interface with patients and clinicians. They conclude the review with a discussion of ongoing challenges and regulatory opportunities in the field.
Affiliation(s)
- Likhitha Kolla
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Ravi B. Parikh
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Corporal Michael J. Crescenz VA Medical Center, Philadelphia, PA, USA
9
Yu KH, Healey E, Leong TY, Kohane IS, Manrai AK. Medical Artificial Intelligence and Human Values. N Engl J Med 2024; 390:1895-1904. PMID: 38810186; DOI: 10.1056/nejmra2214183.
Affiliation(s)
- Kun-Hsing Yu, Elizabeth Healey, Tze-Yun Leong, Isaac S Kohane, Arjun K Manrai
- From the Department of Biomedical Informatics, Harvard Medical School (K.-H.Y., E.H., I.S.K., A.K.M.), the Department of Pathology, Brigham and Women's Hospital (K.-H.Y.), and the Harvard-MIT Division of Health Sciences and Technology (E.H.) - all in Boston; and the School of Computing, National University of Singapore, Singapore (T.-Y.L.)
10
Harmon I, Brailsford J, Sanchez-Cano I, Fishe J. Development of a Computable Phenotype for Prehospital Pediatric Asthma Encounters. Prehosp Emerg Care 2024:1-12. PMID: 38713633; DOI: 10.1080/10903127.2024.2352583.
Abstract
INTRODUCTION Asthma exacerbations are a common cause of pediatric Emergency Medical Services (EMS) encounters. Accordingly, prehospital management of pediatric asthma exacerbations has been designated an EMS research priority. However, accurately identifying pediatric asthma exacerbations from the prehospital record is nuanced and difficult because of the heterogeneity of asthma symptoms, especially in children. Therefore, this study's objective was to develop a prehospital-specific pediatric asthma computable phenotype (CP) that could accurately identify prehospital encounters for pediatric asthma exacerbations. METHODS This is a retrospective observational study of patient encounters for ages 2-18 years from the ESO Data Collaborative between 2018 and 2021. We modified two existing rule-based pediatric asthma CPs and created three new CPs (one rule-based and two machine learning-based). Two pediatric emergency medicine physicians independently reviewed encounters to assign labels of asthma exacerbation or not. The labeled encounters were divided into training and test sets using a 50/50 split, and a further 90/10 split of the training set produced a small validation set. We used specificity, sensitivity, positive predictive value (PPV), negative predictive value (NPV), and macro F1 to compare performance across all CP models. RESULTS After applying the inclusion and exclusion criteria, 24,283 patient encounters remained. The machine learning-based models exhibited the best performance for the identification of pediatric asthma exacerbations. A multi-layer perceptron-based model had the best performance on all metrics, with an F1 score of 0.95, specificity of 1.00, sensitivity of 0.91, negative predictive value of 0.98, and positive predictive value of 1.00. CONCLUSION We modified existing and developed new pediatric asthma CPs to retrospectively identify prehospital pediatric asthma exacerbation encounters. We found that machine learning-based models greatly outperformed rule-based models. Given their high performance, the development and application of machine learning-based CPs for other conditions and diseases could help accelerate EMS research and ultimately enhance clinical care by accurately identifying patients with conditions of interest.
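A minimal sketch of the evaluation this abstract reports: a multi-layer perceptron classifier scored with sensitivity, specificity, PPV, NPV, and macro F1 on a held-out 50% test set. The features, labels, and network size below are simulated assumptions, not the study's actual phenotype variables.

```python
# Sketch only: scoring a phenotype classifier with the metrics reported above.
# Data are simulated; this is not the study's model or feature set.
import numpy as np
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 10))                     # stand-ins for documented symptoms, meds, vitals
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 1, 4000) > 1.5).astype(int)  # toy "exacerbation" label

# 50/50 train/test split, mirroring the study design
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0).fit(X_tr, y_tr)
y_hat = clf.predict(X_te)

tn, fp, fn, tp = confusion_matrix(y_te, y_hat).ravel()
print("sensitivity:", round(tp / (tp + fn), 3))
print("specificity:", round(tn / (tn + fp), 3))
print("PPV:        ", round(tp / (tp + fp), 3))
print("NPV:        ", round(tn / (tn + fn), 3))
print("macro F1:   ", round(f1_score(y_te, y_hat, average="macro"), 3))
```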
Affiliation(s)
- Ira Harmon
- Center for Data Solutions, University of Florida College of Medicine - Jacksonville, Jacksonville, Florida
- Jennifer Brailsford
- Center for Data Solutions, University of Florida College of Medicine - Jacksonville, Jacksonville, Florida
- Isabel Sanchez-Cano
- Department of Emergency Medicine, University of Florida College of Medicine - Jacksonville, Jacksonville, Florida
- Jennifer Fishe
- Center for Data Solutions, University of Florida College of Medicine - Jacksonville, Jacksonville, Florida
- Department of Emergency Medicine, University of Florida College of Medicine - Jacksonville, Jacksonville, Florida
11
Zanger-Tishler M, Nyarko J, Goel S. Risk scores, label bias, and everything but the kitchen sink. Sci Adv 2024; 10:eadi8411. PMID: 38552013; PMCID: PMC10980258; DOI: 10.1126/sciadv.adi8411.
Abstract
In designing risk assessment algorithms, many scholars promote a "kitchen sink" approach, reasoning that more information yields more accurate predictions. We show, however, that this rationale often fails when algorithms are trained to predict a proxy of the true outcome, for instance, predicting arrest as a proxy for criminal behavior. With this "label bias," one should exclude a feature if its correlation with the proxy and its correlation with the true outcome have opposite signs, conditional on the other model features. This criterion is often satisfied when a feature is weakly correlated with the true outcome and, additionally, when that feature and the true outcome are both direct causes of the proxy outcome. For example, criminal behavior and geography may be weakly correlated and, because of patterns of police deployment, both direct causes of one's arrest record, suggesting that excluding geography in criminal risk assessment will weaken an algorithm's performance in predicting arrest but will improve its capacity to predict actual crime.
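A toy simulation of the trade-off this abstract describes: a feature ("geography") that is unrelated to the true outcome but, together with the true outcome, directly drives the proxy label ("arrest"). Including it helps the model predict the proxy while hurting its ability to rank people by the true outcome. Variable names and effect sizes are illustrative assumptions, not the paper's data or code.

```python
# Sketch only: label bias when training on a proxy outcome.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 50_000
behavior_risk = rng.normal(size=n)                     # feature genuinely predictive of behavior
geography = rng.normal(size=n)                         # unrelated to true behavior
true_outcome = (behavior_risk + rng.normal(0, 1, n) > 1).astype(int)            # e.g., criminal behavior
proxy = (true_outcome + 0.8 * (geography > 0.5) + rng.normal(0, 0.5, n) > 1).astype(int)  # arrest

X_small = behavior_risk.reshape(-1, 1)
X_sink = np.column_stack([behavior_risk, geography])   # "kitchen sink" feature set

for name, X in [("without geography", X_small), ("with geography", X_sink)]:
    model = LogisticRegression().fit(X[: n // 2], proxy[: n // 2])   # trained on the proxy label
    scores = model.predict_proba(X[n // 2:])[:, 1]
    print(name,
          "| AUC for proxy (arrest):", round(roc_auc_score(proxy[n // 2:], scores), 3),
          "| AUC for true outcome:", round(roc_auc_score(true_outcome[n // 2:], scores), 3))
```

Under these assumptions, adding geography raises the AUC against the proxy label while lowering the AUC against the true outcome, mirroring the paper's argument for excluding such features.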
Affiliation(s)
- Sharad Goel
- Harvard Kennedy School, Cambridge, MA 02138, USA
12
Patel SY, Baum A, Basu S. Prediction of non-emergent acute care utilization and cost among patients receiving Medicaid. Sci Rep 2024; 14:824. PMID: 38263373; PMCID: PMC10805799; DOI: 10.1038/s41598-023-51114-z.
Abstract
Patients receiving Medicaid often experience social risk factors for poor health and limited access to primary care, leading to high utilization of emergency departments and hospitals (acute care) for non-emergent conditions. As programs proactively reach out to Medicaid patients to offer primary care, they rely on risk models that have historically been limited by poor-quality data. Following initiatives to improve data quality and collect data on social risk, we tested alternative, widely debated strategies to improve Medicaid risk models. Among a sample of 10 million patients receiving Medicaid from 26 states and Washington DC, the best-performing model tripled the probability of prospectively identifying at-risk patients versus a standard model (sensitivity 11.3% [95% CI 10.5, 12.1%] vs 3.4% [95% CI 3.0, 4.0%]), without increasing "false positives" that reduce the efficiency of outreach (specificity 99.8% [95% CI 99.6, 99.9%] vs 99.5% [95% CI 99.4, 99.7%]), and with an approximately tenfold improvement in the coefficient of determination when predicting costs (R2 0.195-0.412 among population subgroups vs 0.022-0.050). Our best-performing model also reversed the lower sensitivity of risk prediction for Black versus White patients, a bias present in the standard cost-based model. Our results demonstrate a modeling approach that substantially improves risk prediction performance and equity for patients receiving Medicaid.
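A minimal sketch of the kind of comparison this abstract reports: flagging a fixed share of patients for outreach with two risk scores, then checking overall sensitivity and specificity plus sensitivity within subgroups as an equity check. The simulated data, scores, subgroup labels, and 1% outreach budget are illustrative assumptions, not the study's actual models.

```python
# Sketch only: comparing two risk scores at a fixed outreach capacity,
# overall and by subgroup. All data are simulated.
import numpy as np

def flag_top_k(scores, k):
    """Flag the k highest-risk patients for outreach."""
    flagged = np.zeros(len(scores), dtype=bool)
    flagged[np.argsort(scores)[-k:]] = True
    return flagged

rng = np.random.default_rng(0)
n = 100_000
y = (rng.random(n) < 0.03).astype(int)                        # future non-emergent acute care use
group = rng.integers(0, 2, n)                                 # e.g., two self-reported race groups
# The "standard" cost-based score is made less informative for group 1, mimicking known cost bias.
standard = y * rng.normal(1.0, 1.0, n) * (1 - 0.4 * group) + rng.normal(0, 1, n)
enhanced = y * rng.normal(2.0, 1.0, n) + rng.normal(0, 1, n)  # score built on richer data

budget = n // 100                                             # capacity to reach 1% of patients
for name, scores in [("standard model", standard), ("enhanced model", enhanced)]:
    flagged = flag_top_k(scores, budget)
    sens = flagged[y == 1].mean()
    spec = (~flagged[y == 0]).mean()
    by_group = [flagged[(y == 1) & (group == g)].mean() for g in (0, 1)]
    print(f"{name}: sensitivity {sens:.3f}, specificity {spec:.3f}, "
          f"sensitivity by group {by_group[0]:.3f} / {by_group[1]:.3f}")
```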
Affiliation(s)
- Sadiq Y Patel
- Clinical Product Development, Waymark, San Francisco, CA, USA.
- School of Social Policy and Practice, University of Pennsylvania, 3701 Locust Walk, Philadelphia, PA, 19104, USA.
- Aaron Baum
- Clinical Product Development, Waymark, San Francisco, CA, USA
- Icahn School of Medicine at Mt Sinai, New York, NY, USA
- Sanjay Basu
- Clinical Product Development, Waymark, San Francisco, CA, USA
- Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, ON, Canada
- Center for Vulnerable Populations, San Francisco General Hospital/University of California San Francisco, San Francisco, CA, USA
13
Hammonds EM. A discussion on the use of race in biomedical fields. Proc Natl Acad Sci U S A 2024; 121:e2322147121. PMID: 38198523; PMCID: PMC10801833; DOI: 10.1073/pnas.2322147121.
Affiliation(s)
- Evelynn M. Hammonds
- Department of the History of Science and the Department of African and African American Studies, Harvard University, Cambridge, MA 02138, USA