1
Bernstein S, Gilson S, Zhu M, Nathan AG, Cui M, Press VG, Shah S, Zarei P, Laiteerapong N, Huang ES. Diabetes Life Expectancy Prediction Model Inputs and Results From Patient Surveys Compared With Electronic Health Record Abstraction: Survey Study. JMIR Aging 2023; 6:e44037. [PMID: 37962566; PMCID: PMC10662674; DOI: 10.2196/44037]
Abstract
Background: Prediction models are increasingly used in clinical practice, and some require patient-reported outcomes (PROs). The optimal approach to collecting the needed inputs is unknown.
Objective: Our objective was to compare mortality prediction model inputs and scores based on electronic health record (EHR) abstraction versus patient survey.
Methods: Older patients (aged ≥65 years) with type 2 diabetes at an urban primary care practice in Chicago were recruited to participate in a care management trial. All participants completed a survey via an electronic portal that included items on the presence of comorbid conditions and functional status, which are needed to complete a mortality prediction model. We compared the individual data inputs and the overall model performance based on data gathered from the survey versus chart review.
Results: For individual data inputs, the largest differences were in questions regarding functional status, such as pushing/pulling, where 41.4% (31/75) of participants reported difficulties that were not captured in the chart; differences for comorbid conditions were smaller. For the overall mortality score, differences between survey and chart-abstracted data were nonsignificant (P=.82). When allocating participants to life expectancy subgroups (<5 years, 5-10 years, >10 years), differences between survey and chart review data resulted in 20% of participants receiving different subgroup assignments and, therefore, discordant glucose control recommendations.
Conclusions: In this small exploratory study, we found that, despite differences in data inputs regarding functional status, the overall performance of a mortality prediction model was similar when using survey and chart-abstracted data. Larger studies comparing patient survey and chart data are needed to assess whether these findings are reproducible and clinically important.
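The survey-versus-chart comparison above is, at its core, an agreement analysis on binary model inputs. A minimal sketch of how such agreement could be quantified (percent discordance plus Cohen's kappa); the data below are invented for illustration, not the study's:

```python
# Sketch (not from the paper): comparing survey-reported vs chart-abstracted
# binary model inputs. The two toy sequences below are made-up data.

def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length binary (0/1) sequences."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n  # observed agreement
    # chance agreement: both say 1, plus both say 0
    pe = (sum(a) / n) * (sum(b) / n) + ((n - sum(a)) / n) * ((n - sum(b)) / n)
    return (po - pe) / (1 - pe) if pe != 1 else 1.0

# 1 = difficulty reported / comorbidity present
survey = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]
chart  = [0, 1, 0, 0, 0, 1, 1, 0, 0, 0]

discordant = sum(x != y for x, y in zip(survey, chart)) / len(survey)
print(f"discordant inputs: {discordant:.0%}, kappa = {cohen_kappa(survey, chart):.2f}")
```

In this toy example every discordance is a difficulty reported on survey but absent from the chart, the same direction of disagreement the abstract highlights for functional status items.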
Affiliation(s)
- Sean Bernstein
- Rush University Medical Center, Chicago, IL, United States
- Sarah Gilson
- Section of General Internal Medicine, Department of Medicine, University of Chicago, Chicago, IL, United States
- Mengqi Zhu
- Section of General Internal Medicine, Department of Medicine, University of Chicago, Chicago, IL, United States
- Aviva G Nathan
- Section of General Internal Medicine, Department of Medicine, University of Chicago, Chicago, IL, United States
- Michael Cui
- Rush University Medical Center, Chicago, IL, United States
- Valerie G Press
- Section of General Internal Medicine, Department of Medicine, University of Chicago, Chicago, IL, United States
- Sachin Shah
- Section of General Internal Medicine, Department of Medicine, University of Chicago, Chicago, IL, United States
- Parmida Zarei
- College of Medicine, University of Illinois Chicago, Chicago, IL, United States
- Neda Laiteerapong
- Section of General Internal Medicine, Department of Medicine, University of Chicago, Chicago, IL, United States
- Elbert S Huang
- Section of General Internal Medicine, Department of Medicine, University of Chicago, Chicago, IL, United States
2
Kent DM, Nelson J, Pittas A, Colangelo F, Koenig C, van Klaveren D, Ciemins E, Cuddeback J. An Electronic Health Record-Compatible Model to Predict Personalized Treatment Effects From the Diabetes Prevention Program: A Cross-Evidence Synthesis Approach Using Clinical Trial and Real-World Data. Mayo Clin Proc 2022; 97:703-715. [PMID: 34782125; DOI: 10.1016/j.mayocp.2021.09.012]
Abstract
OBJECTIVE: To develop an electronic health record (EHR)-based risk tool that provides point-of-care estimates of diabetes risk to support targeting interventions to patients most likely to benefit.
PATIENTS AND METHODS: A risk prediction model was developed and validated in a large observational database of patients with an index visit date between January 1, 2012, and December 31, 2016, with treatment effect estimates from risk-based reanalysis of clinical trial data. The risk model development cohort included 1.1 million patients with prediabetes from the OptumLabs Data Warehouse (OLDW); the validation cohort included a distinct sample of 1.1 million OLDW patients. The randomized clinical trial cohort included 3081 people from the Diabetes Prevention Program (DPP) study.
RESULTS: Eleven variables reliably obtainable from the EHR were used to predict diabetes risk. The model validated well in the OLDW (C statistic = 0.76); the observed 3-year diabetes rate was 1.8% (95% confidence interval [CI], 1.7 to 1.9) in the lowest-risk quarter and 19.6% (95% CI, 19.4 to 19.8) in the highest-risk quarter. In the DPP, the hazard ratio (HR) for lifestyle modification was constant across all levels of risk (HR, 0.43; 95% CI, 0.35 to 0.53), whereas the HR for metformin was highly risk dependent (HR, 1.1; 95% CI, 0.61 to 2.0 in the lowest-risk quarter vs HR, 0.45; 95% CI, 0.35 to 0.59 in the highest-risk quarter). Fifty-three percent of the benefits of population-wide dissemination of the DPP lifestyle modification and 73% of the benefits of population-wide metformin therapy can be obtained by targeting the highest-risk quarter of patients.
CONCLUSION: The Tufts-Predictive Analytics and Comparative Effectiveness DPP Risk model is an EHR-compatible tool that might support targeted diabetes prevention to more efficiently realize the benefits of the DPP interventions.
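The arithmetic behind "targeting the highest-risk quarter captures most of the benefit" can be illustrated with a back-of-envelope calculation. The per-quartile baseline risks and treatment hazard ratios below are hypothetical stand-ins shaped like the pattern the abstract describes for metformin; they are not the study's estimates:

```python
# Illustrative only: share of total events prevented if treatment is
# restricted to the highest-risk quartile. All numbers are hypothetical.

risks = [0.02, 0.05, 0.10, 0.20]   # baseline 3-year risk, lowest -> highest quartile
hrs   = [1.10, 0.90, 0.70, 0.45]   # treatment hazard ratio per quartile

# Approximate absolute risk reduction per quartile, treating the HR as a
# risk ratio (a reasonable approximation when baseline risks are modest).
arr = [r * (1 - hr) for r, hr in zip(risks, hrs)]

share_top = arr[-1] / sum(arr)
print(f"share of total events prevented by treating only the top quartile: {share_top:.0%}")
```

With a risk-dependent HR like this, the top quartile contributes the large majority of preventable events, which is the qualitative point behind the abstract's 73% figure for metformin.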
Affiliation(s)
- David M Kent
- Predictive Analytics and Comparative Effectiveness Center, Tufts Medical Center, Boston, MA
- Jason Nelson
- Predictive Analytics and Comparative Effectiveness Center, Tufts Medical Center, Boston, MA
- David van Klaveren
- Predictive Analytics and Comparative Effectiveness Center, Tufts Medical Center, Boston, MA; Department of Public Health, Erasmus MC University Medical Center, Rotterdam, the Netherlands
3
Asgari S, Khalili D, Hosseinpanah F, Hadaegh F. Prediction Models for Type 2 Diabetes Risk in the General Population: A Systematic Review of Observational Studies. Int J Endocrinol Metab 2021; 19:e109206. [PMID: 34567135; PMCID: PMC8453657; DOI: 10.5812/ijem.109206]
Abstract
OBJECTIVES: This study aimed to provide an overview of prediction models of undiagnosed type 2 diabetes mellitus (U-T2DM) and incident T2DM (I-T2DM) using the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) checklist and the prediction model risk of bias assessment tool (PROBAST).
DATA SOURCES: Both the PubMed and Embase databases were searched to ensure adequate and efficient coverage.
STUDY SELECTION: Articles published between December 2011 and October 2019 were considered.
DATA EXTRACTION: For each article, information on model development requirements, discrimination measures, calibration, overall performance, clinical usefulness, overfitting, and risk of bias (ROB) was reported.
RESULTS: Across the 46 studies, the median (interquartile range; IQR) development population size was 5711 (1971-27,426) individuals for I-T2DM models and 2457 (2060-6995) for U-T2DM models. The most commonly reported predictors were age and body mass index, and only the Qrisk-2017 study included social factors (e.g., Townsend score). Univariable analysis was reported in 46% of the studies, and the variable selection procedure was unclear in 17.4% of them. Internal and external validation was reported in 43% of the studies, while over 63% reported calibration. The median (IQR) AUC for I-T2DM models was 0.78 (0.74-0.82); the corresponding value for studies derived before October 2011 was 0.80 (0.77-0.83). The highest discrimination index was reported for Qrisk-2017, with C statistics of 0.89 for women and 0.87 for men. Low ROB was assessed in 18% of I-T2DM studies and 41% of U-T2DM studies.
CONCLUSIONS: The prediction models were of intermediate to poor quality in several aspects of model development and validation. In general, despite new risk factors and new methodological approaches, the newly developed models did not improve the ability to screen for or predict T2DM, largely because the prediction models lacked external validation.
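The discrimination index summarized throughout this review (the AUC, or C statistic) is simply the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case. A minimal pure-Python sketch on synthetic data, not tied to any reviewed model:

```python
# C statistic (AUC) by direct pairwise comparison of cases vs non-cases.
# O(cases * controls), fine for small illustrative samples.

def c_statistic(risks, outcomes):
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    pairs = concordant = 0.0
    for rc in cases:
        for rn in controls:
            pairs += 1
            if rc > rn:
                concordant += 1      # case ranked above non-case
            elif rc == rn:
                concordant += 0.5    # ties count half
    return concordant / pairs

# synthetic predicted risks and observed outcomes
risks    = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
outcomes = [1,   1,   0,   1,   0,   1,   0,   0]
print(c_statistic(risks, outcomes))  # -> 0.8125
```

A value of 0.5 means no discrimination; the review's reported medians around 0.78-0.80 mean roughly four out of five case/non-case pairs are correctly ordered.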
Affiliation(s)
- Samaneh Asgari
- Prevention of Metabolic Disorders Research Center, Research Institute for Endocrine Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Davood Khalili
- Prevention of Metabolic Disorders Research Center, Research Institute for Endocrine Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Farhad Hosseinpanah
- Obesity Research Center, Research Institute for Endocrine Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Farzad Hadaegh (corresponding author)
- Prevention of Metabolic Disorders Research Center, Research Institute for Endocrine Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
4
Naranjo FS, Sang Y, Ballew SH, Stempniewicz N, Dunning SC, Levey AS, Coresh J, Grams ME. Estimating Kidney Failure Risk Using Electronic Medical Records. KIDNEY360 2021; 2:415-424. [PMID: 35369014; PMCID: PMC8786004; DOI: 10.34067/kid.0005592020]
Abstract
Background: The four-variable kidney failure risk equation (KFRE) is a well-validated tool for patients with GFR <60 ml/min per 1.73 m2; it incorporates age, sex, GFR, and urine albumin-creatinine ratio (ACR) to forecast individual risk of kidney failure. Implementing the KFRE in electronic medical records is challenging, however, because ACR testing is uncommon in clinical practice. The aim of this study was to determine, when ACR is missing, whether to impute ACR from the protein-to-creatinine ratio (PCR) or dipstick protein for use in the four-variable KFRE, or to use the three-variable KFRE, which does not require ACR.
Methods: Using electronic health records from the OptumLabs Data Warehouse, patients with eGFR <60 ml/min per 1.73 m2 were categorized on the basis of the availability of ACR testing within the previous 3 years. For patients missing ACR, we extracted urine PCR and dipstick protein results, comparing the discrimination of the three-variable KFRE (age, sex, GFR) with the four-variable KFRE estimated using ACR imputed from PCR and dipstick protein levels.
Results: There were 976,299 patients in 39 health care organizations; 59% were women, the mean age was 72 years, and the mean eGFR was 47 ml/min per 1.73 m2. The proportion with ACR testing within the previous 3 years was 19%. An additional 2% had an available PCR and 36% had a dipstick protein; the remaining 43% had no form of albuminuria testing. The four-variable KFRE had significantly better discrimination than the three-variable KFRE among patients with ACR testing, PCR testing, and urine dipstick protein levels, even with imputed ACR for the latter two groups. Calibration of the four-variable KFRE was acceptable in each group, but the three-variable equation showed systematic bias in the groups that lacked ACR or PCR testing.
Conclusions: Implementation of the KFRE in electronic medical records should incorporate ACR, even if only imputed from PCR or urine dipstick protein levels.
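The fallback hierarchy the abstract describes (use measured ACR; otherwise impute it from PCR or dipstick protein; only then drop to the three-variable equation) can be sketched as a simple dispatch. The conversion values below are invented placeholders, not the paper's imputation equations:

```python
# Hypothetical sketch of the ACR-selection hierarchy from the abstract.
# The PCR factor and dipstick mapping are placeholders; the study derives
# its own imputation models.

DIPSTICK_TO_ACR = {"negative": 10, "trace": 25, "1+": 75, "2+": 300, "3+": 900}  # mg/g, made up

def best_acr(acr=None, pcr=None, dipstick=None):
    """Return an ACR value (mg/g) from the best available albuminuria test."""
    if acr is not None:
        return acr                      # measured ACR: use directly
    if pcr is not None:
        return 0.5 * pcr                # placeholder PCR -> ACR conversion
    if dipstick is not None:
        return DIPSTICK_TO_ACR[dipstick]
    return None                         # no albuminuria data: three-variable KFRE

print(best_acr(pcr=400))        # 200.0 via the (placeholder) PCR conversion
print(best_acr(dipstick="2+"))  # 300 via the (placeholder) dipstick mapping
print(best_acr())               # None -> fall back to the three-variable model
```

The study's conclusion maps onto this dispatch: the `None` branch (three-variable fallback) is the one that showed systematic bias, so imputing before falling back is preferred.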
Affiliation(s)
- Felipe S. Naranjo
- Division of Nephrology, Department of Medicine, University of Nebraska Medical Center, Omaha, Nebraska; Division of Nephrology, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Yingying Sang
- Welch Center for Prevention, Epidemiology, and Clinical Research, Johns Hopkins Medical Institutions, Baltimore, Maryland; Department of Epidemiology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland; OptumLabs Visiting Fellow, Cambridge, Massachusetts
- Shoshana H. Ballew
- Welch Center for Prevention, Epidemiology, and Clinical Research, Johns Hopkins Medical Institutions, Baltimore, Maryland; Department of Epidemiology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland
- Andrew S. Levey
- Division of Nephrology, Tufts Medical Center, Boston, Massachusetts
- Josef Coresh
- Welch Center for Prevention, Epidemiology, and Clinical Research, Johns Hopkins Medical Institutions, Baltimore, Maryland; Department of Epidemiology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland
- Morgan E. Grams
- Division of Nephrology, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland; Welch Center for Prevention, Epidemiology, and Clinical Research, Johns Hopkins Medical Institutions, Baltimore, Maryland
5
Nori VS, Hane CA, Sun Y, Crown WH, Bleicher PA. Deep neural network models for identifying incident dementia using claims and EHR datasets. PLoS One 2020; 15:e0236400. [PMID: 32970677; PMCID: PMC7514098; DOI: 10.1371/journal.pone.0236400]
Abstract
This study investigates the use of deep learning methods to improve the accuracy of a predictive model for dementia and compares its performance to a traditional machine learning model. With sufficient accuracy, the model could be deployed as a first-round screening tool for clinical follow-up, including neurological examination, neuropsychological testing, imaging, and recruitment to clinical trials. Seven cohorts were created, each with two years of data ending three to eight years prior to the index date, plus an incident cohort. For each cohort, four models were trained (boosted trees, a feed-forward network, a recurrent neural network, and a recurrent neural network with pretrained weights) and their performance was compared using validation and test data. The incident model had an AUC of 94.4% and an F1 score of 54.1%. Eight years removed from the index date, the AUC and F1 scores were 80.7% and 25.6%, respectively. Results for the remaining cohorts fell between these ranges. Deep learning models can yield significant improvements in performance but come at a cost in run times and hardware requirements. The results of the model at the index date indicate that this modeling can be effective at stratifying patients at risk of dementia. At this time, the inability to sustain this quality at longer lead times is more an issue of data availability and quality than of algorithm choice.
Affiliation(s)
- Vijay S. Nori
- OptumLabs, Boston, Massachusetts, United States of America
- Yezhou Sun
- OptumLabs, Boston, Massachusetts, United States of America
6
Nori VS, Hane CA, Martin DC, Kravetz AD, Sanghavi DM. Identifying incident dementia by applying machine learning to a very large administrative claims dataset. PLoS One 2019; 14:e0203246. [PMID: 31276468; PMCID: PMC6611655; DOI: 10.1371/journal.pone.0203246]
Abstract
Alzheimer's disease and related dementias (ADRD) are highly prevalent conditions, and prior efforts to develop predictive models have relied on demographic and clinical risk factors using traditional logistic regression methods. We hypothesized that machine-learning algorithms using administrative claims data may represent a novel approach to predicting ADRD. Using a national de-identified dataset of more than 125 million patients, including over 10,000 clinical, pharmaceutical, and demographic variables, we developed a cohort to train a machine learning model to predict ADRD 4-5 years in advance. The Lasso algorithm selected a 50-variable model with an area under the curve (AUC) of 0.693. Top diagnosis codes in the model were memory loss (780.93), Parkinson's disease (332.0), mild cognitive impairment (331.83), and bipolar disorder (296.80), and top pharmacy codes were psychoactive drugs. Machine learning algorithms can rapidly develop predictive models for ADRD with massive datasets, without requiring hypothesis-driven feature engineering.
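An L1-penalized ("Lasso") logistic model of the kind described selects a sparse subset of variables by shrinking most coefficients exactly to zero. A minimal sketch with scikit-learn on a synthetic claims-like matrix; the features, data, and penalty strength here are invented for illustration, not the study's:

```python
# Sketch: Lasso-style (L1-penalized) logistic regression on synthetic
# binary "claims code" features. scikit-learn assumed; data are random.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1000, 200)).astype(float)  # 200 binary codes

# outcome driven by a handful of features, mimicking a sparse true signal
logit = -2.0 + 1.5 * X[:, 0] + 1.0 * X[:, 1] - 1.0 * X[:, 2]
y = (rng.random(1000) < 1 / (1 + np.exp(-logit))).astype(int)

# L1 penalty with the liblinear solver; smaller C = stronger sparsity
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
model.fit(X, y)

n_selected = int(np.sum(model.coef_ != 0))
print(f"non-zero coefficients: {n_selected} of {X.shape[1]}")
```

In the study's setting, the analogous step reduced more than 10,000 candidate variables to a 50-variable model; the sparsity is tuned through the penalty strength rather than hypothesis-driven feature selection.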
7
Linder JR, Waterbury NV, Alexander B, Lund BC. Evaluation of the HealthImpact Diabetes Risk Model in the Veterans Health Administration. J Manag Care Spec Pharm 2018; 24:862-867. [PMID: 30156452; PMCID: PMC10398202; DOI: 10.18553/jmcp.2018.24.9.862]
Abstract
BACKGROUND: HealthImpact is a novel algorithm that uses administrative health care data to stratify patients according to risk for incident diabetes.
OBJECTIVES: To (a) independently assess the predictive validity of HealthImpact and (b) explore its utility in diabetes screening within a nationally integrated health care system.
METHODS: National Veterans Health Administration data were used to create 2 cohorts. The replication cohort included patients without diagnosed diabetes as of October 1, 2012, and was used to determine whether HealthImpact scores were significantly associated with diabetes (type 1 or 2) incidence within the subsequent 3 years. The utility cohort included patients without diagnosed diabetes as of August 1, 2015, and was used to assess diabetes screening rates in the 2 years surrounding this index date, stratified by HealthImpact score.
RESULTS: The 3-year incidence of diabetes in the replication cohort (n = 3,287,240) was 9.1%. Of the 100,617 (3.1%) patients with HealthImpact scores > 90, 30,028 developed diabetes, yielding a positive predictive value of 29.8%. These patients accounted for 9.9% of all incident diabetes cases (sensitivity). Sensitivity and negative predictive value improved with descending HealthImpact threshold scores (e.g., > 75, > 50), whereas specificity and positive predictive value declined. Of the 3,499,406 patients in the utility cohort, 85.3% received either a blood glucose or hemoglobin A1c test during the 2-year observation period. Among the 101,355 patients with a HealthImpact score > 90, nearly all (98.3%) were screened, and 86.3% had an A1c test.
CONCLUSIONS: Our independent analysis corroborates the validity of HealthImpact in stratifying patients according to diabetes risk. However, its practical utility for enhancing diabetes screening in a real-world clinical environment will depend strongly on the pattern and frequency of existing screening practices.
DISCLOSURES: This work was supported by the Iowa City VA Health Care System and by the Department of Veterans Affairs, Office of Research and Development, Health Services Research and Development Service (Lund, CIN 13-412). The authors have no conflicts of interest. The views expressed in this article are those of the authors and do not necessarily reflect the position or policy of the Department of Veterans Affairs or the U.S. government.
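The screening metrics reported in this abstract follow directly from the published counts, which makes them easy to sanity-check. A quick recomputation using only numbers taken from the abstract:

```python
# Recomputing PPV and sensitivity from the counts reported in the abstract.
n_cohort   = 3_287_240   # replication cohort size
incidence  = 0.091       # reported 3-year diabetes incidence
n_flagged  = 100_617     # patients with HealthImpact score > 90
n_true_pos = 30_028      # of those, developed diabetes

ppv = n_true_pos / n_flagged
sensitivity = n_true_pos / (n_cohort * incidence)  # share of all incident cases

print(f"PPV = {ppv:.1%}, sensitivity = {sensitivity:.1%}")
```

The recomputed PPV matches the reported 29.8%; the sensitivity comes out near 10%, close to the reported 9.9% (the small gap reflects rounding of the 9.1% incidence figure).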
Affiliation(s)
- Jonathan R Linder
- Department of Pharmacy Services, VA Health Care System, Iowa City, Iowa
- Nancee V Waterbury
- Department of Pharmacy Services, VA Health Care System, Iowa City, Iowa
- Bruce Alexander
- Center for Comprehensive Access & Delivery Research & Evaluation, Iowa City VA Health Care System, Iowa City, Iowa
- Brian C Lund
- Center for Comprehensive Access & Delivery Research & Evaluation, Iowa City VA Health Care System, Iowa City, Iowa