1. Huang Y, Guo J, Donahoo WT, Lee YA, Fan Z, Lu Y, Chen WH, Tang H, Bilello L, Saguil AA, Rosenberg E, Shenkman EA, Bian J. A fair individualized polysocial risk score for identifying increased social risk in type 2 diabetes. Nat Commun 2024; 15:8653. [PMID: 39369018 DOI: 10.1038/s41467-024-52960-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Accepted: 09/27/2024] [Indexed: 10/07/2024] Open
Abstract
Racial and ethnic minorities bear a disproportionate burden of type 2 diabetes (T2D) and its complications, with social determinants of health (SDoH) recognized as key drivers of these disparities. Implementing efficient and effective social needs management strategies is crucial. We propose a machine learning analytic pipeline to calculate the individualized polysocial risk score (iPsRS), which can identify T2D patients at high social risk for hospitalization, incorporating explainable AI techniques and algorithmic fairness optimization. We use electronic health records (EHR) data from T2D patients in the University of Florida Health Integrated Data Repository, incorporating both contextual SDoH (e.g., neighborhood deprivation) and person-level SDoH (e.g., housing instability). After fairness optimization across racial and ethnic groups, the iPsRS achieved a C statistic of 0.71 in predicting 1-year hospitalization. Our iPsRS can fairly and accurately screen patients with T2D who are at increased social risk for hospitalization.
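The C statistic reported above is the probability that a randomly chosen hospitalized patient receives a higher score than a randomly chosen non-hospitalized patient. The following is a minimal sketch of that definition only, not the authors' pipeline; the function name `c_statistic` and the toy labels and scores are illustrative assumptions.

```python
from itertools import product


def c_statistic(y_true, y_score):
    """Concordance (C statistic) for binary outcomes.

    Counts the share of (positive, negative) pairs in which the positive
    case has the higher score; tied scores count as half a concordant pair.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    concordant = 0.0
    for p, n in product(pos, neg):
        if p > n:
            concordant += 1.0
        elif p == n:
            concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Hypothetical toy data: 1 = hospitalized within a year, 0 = not.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
toy_c = c_statistic(toy_labels, toy_scores)  # 7 of 9 pairs are concordant
```

Computing the same quantity separately within each racial and ethnic group is one simple way to inspect whether discrimination differs across groups, although the abstract does not say which fairness metric the authors optimized.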
Affiliation(s)
- Yu Huang
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Jingchuan Guo
  - Department of Pharmaceutical Outcomes and Policy, University of Florida, Gainesville, FL, USA
- William T Donahoo
  - Division of Endocrinology, Diabetes and Metabolism, College of Medicine, University of Florida, Gainesville, FL, USA
- Yao An Lee
  - Department of Pharmaceutical Outcomes and Policy, University of Florida, Gainesville, FL, USA
- Zhengkang Fan
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Ying Lu
  - Department of Pharmaceutical Outcomes and Policy, University of Florida, Gainesville, FL, USA
- Wei-Han Chen
  - Department of Pharmaceutical Outcomes and Policy, University of Florida, Gainesville, FL, USA
- Huilin Tang
  - Department of Pharmaceutical Outcomes and Policy, University of Florida, Gainesville, FL, USA
- Lori Bilello
  - Department of Surgery, College of Medicine-Jacksonville, University of Florida, Jacksonville, FL, USA
- Aaron A Saguil
  - Department of Community Health and Family Medicine, College of Medicine, University of Florida, Jacksonville, FL, USA
- Eric Rosenberg
  - Division of General Internal Medicine, Department of Medicine, College of Medicine, University of Florida, Gainesville, FL, USA
- Elizabeth A Shenkman
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Jiang Bian
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
2. Agraz M, Deng Y, Karniadakis GE, Mantzoros CS. Enhancing severe hypoglycemia prediction in type 2 diabetes mellitus through multi-view co-training machine learning model for imbalanced dataset. Sci Rep 2024; 14:22741. [PMID: 39349500 PMCID: PMC11444036 DOI: 10.1038/s41598-024-69844-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2024] [Accepted: 08/09/2024] [Indexed: 10/02/2024] Open
Abstract
Severe hypoglycemia (SH) in patients with type 2 diabetes mellitus (T2DM) poses a considerable risk of long-term death, especially among the elderly, and demands urgent medical attention. Accurate prediction of SH remains challenging because of its multifaceted nature, with contributing factors such as medications, lifestyle choices, and metabolic measurements. In this study, we propose a systematic approach to improve the robustness and accuracy of SH predictions using machine learning models, guided by clinical feature selection. Our focus is on developing long-term SH prediction models using both semi-supervised learning and supervised learning algorithms. Using the Action to Control Cardiovascular Risk in Diabetes (ACCORD) trial, which includes electronic health records for over 10,000 individuals, we focus on studying adults with T2DM. Our results indicate that the application of a multi-view co-training method, incorporating the random forest algorithm, improves the specificity of SH prediction, while the same setup with Naive Bayes replacing random forest demonstrates better sensitivity. Our framework also provides interpretability of machine learning models by identifying key predictors for hypoglycemia, including fasting plasma glucose, hemoglobin A1c, general diabetes education, and NPH or L insulins. The integration of data routinely available in electronic health records significantly enhances our model's capability to predict SH events, showcasing its potential to transform clinical practice by facilitating early interventions and optimizing patient management. By enhancing prediction accuracy and identifying crucial predictive features, our study contributes to advancing the understanding and management of hypoglycemia in this population.
Affiliation(s)
- Melih Agraz
  - Division of Applied Mathematics, Brown University, Providence, RI, 02912, USA
  - Department of Statistics, Giresun University, Giresun, 28200, Turkey
  - Department of Endocrinology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, 02215, USA
- Yixiang Deng
  - Department of Computer and Information Science, College of Engineering, University of Delaware, Newark, DE, 19716, USA
  - Ragon Institute of Mass General, MIT and Harvard, Cambridge, MA, 02142, USA
- George Em Karniadakis
  - Division of Applied Mathematics, Brown University, Providence, RI, 02912, USA
  - School of Engineering, Brown University, Providence, RI, 02912, USA
- Christos Socrates Mantzoros
  - Department of Endocrinology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, 02215, USA
3. Parikh V, Tariq A, Patel B, Banerjee I. Comparative Analysis of Fusion Strategies for Imaging and Non-imaging Data - Use-case of Hospital Discharge Prediction. AMIA Jt Summits Transl Sci Proc 2024; 2024:652-661. [PMID: 38827051 PMCID: PMC11141810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
Accurate prediction of future clinical events such as discharge from hospital can not only improve hospital resource management but also provide an indicator of a patient's clinical condition. Within the scope of this work, we perform a comparative analysis of deep-learning-based fusion strategies against traditional single-source models for prediction of discharge from hospital by fusing information encoded in two diverse but relevant data modalities, i.e., chest X-ray images and tabular electronic health records (EHR). We evaluate multiple fusion strategies, including late, early, and joint fusion, in terms of their efficacy for target prediction compared to EHR-only and image-only predictive models. Results indicated the importance of merging information from the two modalities, as fusion models tended to outperform single-modality models, and showed that the joint fusion scheme was the most effective for target prediction. The joint fusion model merges the two modalities through a branched neural network that is jointly trained in an end-to-end fashion to extract target-relevant information from both modalities.
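Of the three strategies named above, late fusion is the simplest to illustrate: each modality gets its own model, and only the predicted probabilities are combined. The sketch below assumes a weighted average as the combination rule, which the abstract does not specify; `late_fusion`, `w_ehr`, and the toy probabilities are hypothetical. Early and joint fusion, which the study found more effective, combine features or train a shared network and are not shown.

```python
def late_fusion(p_ehr, p_image, w_ehr=0.5):
    """Combine per-patient probabilities from two separately trained models.

    p_ehr   -- discharge probabilities from an EHR-only model
    p_image -- discharge probabilities from an image-only model
    w_ehr   -- weight on the EHR model; the image model gets 1 - w_ehr
    """
    if not 0.0 <= w_ehr <= 1.0:
        raise ValueError("w_ehr must lie in [0, 1]")
    if len(p_ehr) != len(p_image):
        raise ValueError("both models must score the same patients")
    return [w_ehr * a + (1.0 - w_ehr) * b for a, b in zip(p_ehr, p_image)]


# Hypothetical outputs of the two single-modality models for two patients.
toy_fused = late_fusion([0.2, 0.8], [0.6, 0.4])
```

In practice the weight would be tuned on validation data, or replaced by a small meta-classifier trained on the two probability columns.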
4. Li Y, Huang Y, Yang S, Shychuk EM, Shenkman EA, Bian J, Angell AM, Guo Y. Machine Learning Prediction of Autism Spectrum Disorder Through Linking Mothers' and Children's Electronic Health Record Data. medRxiv 2024:2024.03.24.24304813. [PMID: 38585795 PMCID: PMC10996718 DOI: 10.1101/2024.03.24.24304813] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
Autism spectrum disorder (ASD) is a neurodevelopmental disorder typically diagnosed in children. Early detection of ASD, particularly in girls, who are often diagnosed late, can aid children's long-term development. We aimed to develop machine learning models for predicting ASD diagnosis in children, both boys and girls, using child-mother linked electronic health record (EHR) data from a large clinical research network. Model features were children's and mothers' risk factors in EHRs, including maternal health factors. We tested XGBoost and logistic regression with Random Oversampling (ROS) and Random Undersampling (RUS) to address imbalanced data. Logistic regression with RUS, considering a three-year observation window for children's risk factors, achieved the best performance for predicting ASD among the overall study population (AUROC = 0.798), boys (AUROC = 0.786), and girls (AUROC = 0.791). We calculated SHAP values to quantify the impacts of important clinical and sociodemographic risk factors.
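Random undersampling, as named above, balances the training set by discarding randomly chosen examples from the larger class until every class matches the smallest one. This is a minimal standard-library sketch of that idea under assumed names (`random_undersample`, the toy rows and labels); the abstract does not state which implementation or sampling ratio the authors used.

```python
import random


def random_undersample(rows, labels, seed=0):
    """Downsample every class to the size of the smallest class."""
    rng = random.Random(seed)  # seeded for reproducibility

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    n_min = min(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample without replacement; the minority class is kept whole.
        for row in rng.sample(members, n_min):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: 2 positive cases among 10 children.
toy_rows = list(range(10))
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
bal_rows, bal_labels = random_undersample(toy_rows, toy_labels)
```

Resampling should be applied to the training split only, so that evaluation metrics such as AUROC still reflect the original class balance.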
5. Huang Y, Guo J, Donahoo WT, Fan Z, Lu Y, Chen WH, Tang H, Bilello L, Saguil AA, Rosenberg E, Shenkman EA, Bian J. A Fair Individualized Polysocial Risk Score for Identifying Increased Social Risk in Type 2 Diabetes. Research Square 2023:rs.3.rs-3684698. [PMID: 38106012 PMCID: PMC10723535 DOI: 10.21203/rs.3.rs-3684698/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Background: Racial and ethnic minority groups and individuals facing social disadvantages, which often stem from their social determinants of health (SDoH), bear a disproportionate burden of type 2 diabetes (T2D) and its complications. It is crucial to implement effective social risk management strategies at the point of care.
Objective: To develop an electronic health records (EHR)-based machine learning (ML) analytical pipeline to address unmet social needs associated with hospitalization risk in patients with T2D.
Methods: We identified real-world patients with T2D from the EHR data of the University of Florida (UF) Health Integrated Data Repository (IDR), incorporating both contextual SDoH (e.g., neighborhood deprivation) and individual-level SDoH (e.g., housing instability). The 2015-2020 data were used for training and validation and the 2021-2022 data for independent testing. We developed a machine learning analytic pipeline, namely the individualized polysocial risk score (iPsRS), to identify high social risk associated with hospitalizations in T2D patients, along with explainable AI (XAI) and fairness optimization.
Results: The study cohort included 10,192 real-world patients with T2D, with a mean age of 59 years and 58% female. Of the cohort, 50% were non-Hispanic White, 39% were non-Hispanic Black, 6% were Hispanic, and 5% were other races/ethnicities. Our iPsRS, including both contextual and individual-level SDoH as input factors, achieved a C statistic of 0.72 in predicting 1-year hospitalization after fairness optimization across racial and ethnic groups. The iPsRS showed excellent utility for capturing individuals at high hospitalization risk because of SDoH: the actual 1-year hospitalization rate in the top 5% of iPsRS was 28.1%, ~13 times as high as in the bottom decile (2.2%).
Conclusion: Our ML pipeline iPsRS can fairly and accurately screen real-world patients with T2D for increased social risk leading to hospitalization.
Affiliation(s)
- Yu Huang
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Jingchuan Guo
  - Pharmaceutical Outcomes & Policy, University of Florida, Gainesville, FL, USA
- William T Donahoo
  - Division of Endocrinology, Diabetes and Metabolism, University of Florida College of Medicine
- Zhengkang Fan
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Ying Lu
  - Pharmaceutical Outcomes & Policy, University of Florida, Gainesville, FL, USA
- Wei-Han Chen
  - Pharmaceutical Outcomes & Policy, University of Florida, Gainesville, FL, USA
- Huilin Tang
  - Pharmaceutical Outcomes & Policy, University of Florida, Gainesville, FL, USA
- Lori Bilello
  - Department of Medicine, University of Florida College of Medicine
- Aaron A Saguil
  - Department of Community Health and Family Medicine, University of Florida College of Medicine
- Eric Rosenberg
  - Division of General Internal Medicine, Department of Medicine, University of Florida College of Medicine
- Elizabeth A Shenkman
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Jiang Bian
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
6. Zhang L, Yang L, Zhou Z. Data-based modeling for hypoglycemia prediction: Importance, trends, and implications for clinical practice. Front Public Health 2023; 11:1044059. [PMID: 36778566 PMCID: PMC9910805 DOI: 10.3389/fpubh.2023.1044059] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/14/2022] [Accepted: 01/10/2023] [Indexed: 01/27/2023] Open
Abstract
Background and objective: Hypoglycemia is a key barrier to achieving optimal glycemic control in people with diabetes, which has been proven to cause a set of deleterious outcomes, such as impaired cognition, increased cardiovascular disease, and mortality. Hypoglycemia prediction has come to play a role in diabetes management as big data analysis and machine learning (ML) approaches have become increasingly prevalent in recent years. As a result, a review is needed to summarize the existing prediction algorithms and models to guide better clinical practice in hypoglycemia prevention.
Materials and methods: PubMed, EMBASE, and the Cochrane Library were searched for relevant studies published between 1 January 2015 and 8 December 2022. Five hypoglycemia prediction aspects were covered: real-time hypoglycemia, mild and severe hypoglycemia, nocturnal hypoglycemia, inpatient hypoglycemia, and other hypoglycemia (postprandial, exercise-related).
Results: From the 5,042 records retrieved, we included 79 studies in our analysis. Two major categories of prediction models are identified by an overview of the chosen studies: simple or logistic regression models based on clinical data and data-based ML models (continuous glucose monitoring data is most commonly used). Models utilizing clinical data have identified a variety of risk factors that can lead to hypoglycemic events. Data-driven models based on various techniques such as neural networks, autoregressive, ensemble learning, supervised learning, and mathematical formulas have also revealed suggestive features in cases of hypoglycemia prediction.
Conclusion: In this study, we looked deep into the currently established hypoglycemia prediction models and identified hypoglycemia risk factors from various perspectives, which may provide readers with a better understanding of future trends in this topic.
7. Zou X, Liu Y, Ji L. Review: Machine learning in precision pharmacotherapy of type 2 diabetes-A promising future or a glimpse of hope? Digit Health 2023; 9:20552076231203879. [PMID: 37786401 PMCID: PMC10541760 DOI: 10.1177/20552076231203879] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Accepted: 09/08/2023] [Indexed: 10/04/2023] Open
Abstract
Precision pharmacotherapy of diabetes requires judicious selection of the optimal therapeutic agent for individual patients. Artificial intelligence (AI), a swiftly expanding discipline, holds substantial potential to transform current practices in diabetes diagnosis and management. This manuscript provides a comprehensive review of contemporary research investigating drug responses in patient subgroups, stratified via either supervised or unsupervised machine learning approaches. The prevalent algorithmic workflow for investigating drug responses using machine learning involves cohort selection, data processing, predictor selection, development and validation of machine learning methods, subgroup allocation, and subsequent analysis of drug response. Despite this promise, current research does not yet provide sufficient evidence to implement machine learning algorithms in routine clinical practice, due to a lack of simplicity, validation, or demonstrated efficacy. Nevertheless, we anticipate that the evolving evidence base will increasingly substantiate the role of machine learning in shaping precision pharmacotherapy for diabetes.
Affiliation(s)
- Xiantong Zou
  - Department of Endocrinology and Metabolism, Peking University People's Hospital, Beijing, 100044, China
- Linong Ji
  - Department of Endocrinology and Metabolism, Peking University People's Hospital, Beijing, 100044, China
8. Liu S, Schlesinger JJ, McCoy AB, Reese TJ, Steitz B, Russo E, Koh B, Wright A. New onset delirium prediction using machine learning and long short-term memory (LSTM) in electronic health record. J Am Med Inform Assoc 2022; 30:120-131. [PMID: 36303456 PMCID: PMC9748586 DOI: 10.1093/jamia/ocac210] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/01/2022] [Revised: 10/09/2022] [Accepted: 10/17/2022] [Indexed: 12/15/2022] Open
Abstract
Objective: To develop and test an accurate deep learning model for predicting new onset delirium in hospitalized adult patients.
Methods: Using electronic health record (EHR) data extracted from a large academic medical center, we developed a model combining long short-term memory (LSTM) and machine learning to predict new onset delirium and compared its performance with machine-learning-only models (logistic regression, random forest, support vector machine, neural network, and LightGBM). The labels of models were confusion assessment method (CAM) assessments. We evaluated models on a hold-out dataset. We calculated Shapley additive explanations (SHAP) measures to gauge the feature impact on the model.
Results: A total of 331 489 CAM assessments with 896 features from 34 035 patients were included. The LightGBM model achieved the best performance (AUC 0.927 [0.924, 0.929] and F1 0.626 [0.618, 0.634]) among the machine learning models. When combined with the LSTM model, the final model's performance improved significantly (P = .001) with AUC 0.952 [0.950, 0.955] and F1 0.759 [0.755, 0.765]. The precision value of the combined model improved from 0.497 to 0.751 with a fixed recall of 0.8. Using the mean absolute SHAP values, we identified the top 20 features, including age, heart rate, Richmond Agitation-Sedation Scale score, Morse fall risk score, pulse, respiratory rate, and level of care.
Conclusion: Leveraging LSTM to capture temporal trends and combining it with the LightGBM model can significantly improve the prediction of new onset delirium, providing an algorithmic basis for the subsequent development of clinical decision support tools for proactive delirium interventions.
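Precision at a fixed recall, as reported above, can be read as: lower the decision threshold until the model captures the target share of true cases, then ask what fraction of the flagged cases are real. The sketch below illustrates that reading with assumed names (`precision_at_recall`, toy labels and scores); it is not the authors' evaluation code, and it ignores tied scores for simplicity.

```python
def precision_at_recall(y_true, y_score, target_recall=0.8):
    """Precision at the highest threshold whose recall reaches the target."""
    if not 0.0 < target_recall <= 1.0:
        raise ValueError("target_recall must lie in (0, 1]")
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive case")

    # Walk down the ranking from the highest score, flagging one case
    # at a time, and stop as soon as recall reaches the target.
    ranked = sorted(zip(y_score, y_true), key=lambda t: t[0], reverse=True)
    tp = fp = 0
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        if tp / total_pos >= target_recall:
            return tp / (tp + fp)


# Hypothetical toy data: 5 positive assessments out of 8.
toy_y = [1, 1, 0, 1, 0, 1, 0, 1]
toy_s = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]
toy_precision = precision_at_recall(toy_y, toy_s)  # 4 true among 6 flagged
```

Fixing recall makes two models directly comparable on the false-alarm burden they impose for the same share of cases caught, which is the comparison the abstract draws between the LightGBM-only and combined models.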
Affiliation(s)
- Siru Liu
  - Corresponding Author: Siru Liu, PhD, Department of Biomedical Informatics, Vanderbilt University Medical Center, 2525 West End Ave #1475, Nashville, TN 37212, USA
- Joseph J Schlesinger
  - Division of Critical Care Medicine, Department of Anesthesiology, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Allison B McCoy
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Thomas J Reese
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Bryan Steitz
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Elise Russo
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Brian Koh
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Adam Wright
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA