1
Reeves D, Morgan C, Stamate D, Ford E, Ashcroft DM, Kontopantelis E, Van Marwijk H, McMillan B. Identifying individuals at high risk for dementia in primary care: Development and validation of the DemRisk risk prediction model using routinely collected patient data. PLoS One 2024; 19:e0310712. [PMID: 39365767 PMCID: PMC11452046 DOI: 10.1371/journal.pone.0310712] [Received: 05/22/2024] [Accepted: 09/05/2024] [Indexed: 10/06/2024]
Abstract
INTRODUCTION Health policy on dementia, in the UK and globally, emphasises prevention and risk reduction. These goals could be facilitated by automated assessment of dementia risk in primary care using routinely collected patient data. However, existing applicable tools are weak at identifying patients at high risk for dementia. We set out to develop improved risk prediction models deployable in primary care. METHODS Electronic health records (EHRs) for patients aged 60-89 from 393 English general practices were extracted from the Clinical Practice Research Datalink (CPRD) GOLD database. Practices were randomly assigned to the development cohort (235 practices) or the validation cohort (158 practices). Separate dementia risk models were developed for patients aged 60-79 (development cohort n = 616,366; validation cohort n = 419,126) and 80-89 (n = 175,131 and n = 118,717). The outcome was incident dementia within 5 years, and more than 60 evidence-based risk factors were evaluated. Risk models were developed and validated using multivariable Cox regression. RESULTS The age 60-79 development cohort included 10,841 incident cases of dementia (6.3 per 1,000 person-years) and the age 80-89 development cohort included 15,994 (40.2 per 1,000 person-years). Discrimination and calibration for the resulting age 60-79 model were good (Harrell's C 0.78 (95% CI: 0.78 to 0.79); Royston's D 1.74 (1.70 to 1.78); calibration slope 0.98 (0.96 to 1.01)), with 37% of patients in the top 1% of risk scores receiving a dementia diagnosis within 5 years. Fit statistics were lower for the age 80-89 model, but dementia incidence was higher and 79% of those in the top 1% of risk scores subsequently developed dementia. CONCLUSION Our models can identify individuals at higher risk of dementia using routinely collected information from their primary care record, and outperform an existing EHR-based tool. Discriminative ability was greatest for those aged 60-79, but the model for those aged 80-89 may also be clinically useful.
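To make the discrimination statistic in this abstract concrete, the following is a minimal sketch of how Harrell's C can be computed for right-censored follow-up data. It is not the authors' code: the function name `harrell_c` and the toy follow-up times, event indicators, and risk scores are hypothetical, and the simple pairwise loop is only practical for small samples.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (O(n^2) sketch).

    times       -- observed follow-up time per subject
    events      -- 1 if the event (e.g. dementia diagnosis) was observed, 0 if censored
    risk_scores -- model output; a higher score should mean an earlier event
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's observed time (ties in time are skipped).
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: five subjects, two observed events, three censored.
toy_times = [1.0, 2.0, 3.0, 5.0, 5.0]
toy_events = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.2, 0.1]

print(harrell_c(toy_times, toy_events, toy_scores))
```

In the toy data, 6 of the 7 comparable pairs are ranked correctly, so the index is 6/7; a value of 0.5 would correspond to random ranking and 1.0 to perfect ranking.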
Affiliation(s)
- David Reeves
  - Division of Population Health, NIHR School for Primary Care Research, Centre for Primary Care, Health Services Research and Primary Care, University of Manchester, Manchester, United Kingdom
  - Division of Population Health, Centre for Biostatistics, Health Services Research and Primary Care, University of Manchester, Manchester, United Kingdom
- Catharine Morgan
  - Division of Population Health, NIHR School for Primary Care Research, Centre for Primary Care, Health Services Research and Primary Care, University of Manchester, Manchester, United Kingdom
- Daniel Stamate
  - Division of Population Health, NIHR School for Primary Care Research, Centre for Primary Care, Health Services Research and Primary Care, University of Manchester, Manchester, United Kingdom
  - Computing Department, Goldsmiths, University of London, London, United Kingdom
- Elizabeth Ford
  - Department of Primary Care and Public Health, Brighton and Sussex Medical School, Brighton, United Kingdom
- Darren M. Ashcroft
  - Division of Population Health, NIHR School for Primary Care Research, Centre for Primary Care, Health Services Research and Primary Care, University of Manchester, Manchester, United Kingdom
  - Division of Pharmacy and Optometry, NIHR Greater Manchester Patient Safety Research Collaboration, University of Manchester, Manchester, United Kingdom
  - Centre for Pharmacoepidemiology and Drug Safety, School of Health Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
- Evangelos Kontopantelis
  - Division of Population Health, NIHR School for Primary Care Research, Centre for Primary Care, Health Services Research and Primary Care, University of Manchester, Manchester, United Kingdom
  - Division of Informatics, Imaging and Data Sciences, University of Manchester, Manchester, United Kingdom
- Harm Van Marwijk
  - Department of Primary Care and Public Health, Brighton and Sussex Medical School, Brighton, United Kingdom
- Brian McMillan
  - Division of Population Health, NIHR School for Primary Care Research, Centre for Primary Care, Health Services Research and Primary Care, University of Manchester, Manchester, United Kingdom
2
Trares K, Stocker H, Stevenson-Hoare J, Perna L, Holleczek B, Beyreuther K, Schöttker B, Brenner H. Comparison of subjective cognitive decline and polygenic risk score in the prediction of all-cause dementia, Alzheimer's disease and vascular dementia. Alzheimers Res Ther 2024; 16:188. [PMID: 39160600 PMCID: PMC11331600 DOI: 10.1186/s13195-024-01559-9] [Received: 02/19/2024] [Accepted: 08/12/2024] [Indexed: 08/21/2024]
Abstract
BACKGROUND Polygenic risk scores (PRS) and subjective cognitive decline (SCD) are associated with the risk of developing dementia. It remains to be examined whether they can improve the established Cardiovascular Risk Factors, Aging and Dementia (CAIDE) model and how their predictive abilities compare. METHODS The CAIDE model was applied to a sub-sample of a large, population-based cohort study (n = 5,360; aged 50-75) and evaluated for the outcomes of all-cause dementia, Alzheimer's disease (AD) and vascular dementia (VD) by calculating Akaike's information criterion (AIC) and the area under the curve (AUC). The improvement of the CAIDE model by PRS and SCD was further examined using the net reclassification improvement (NRI) and the integrated discrimination improvement (IDI). RESULTS During 17 years of follow-up, 410 participants were diagnosed with dementia, including 139 AD and 152 VD diagnoses. Overall, the CAIDE model showed high discriminative ability for all outcomes, reaching AUCs of 0.785, 0.793, and 0.789 for all-cause dementia, AD, and VD, respectively. Adding information on SCD significantly increased NRI for all-cause dementia (4.4%, p = 0.04) and VD (7.7%, p = 0.01). In contrast, prediction models for AD improved further when PRS was added to the model (NRI, 8.4%, p = 0.03). When APOE ε4 carrier status was included (CAIDE Model 2), AUCs increased, but PRS and SCD did not further improve the prediction. CONCLUSIONS Information on SCD can be assessed more efficiently than PRS, so the model including SCD can be more easily transferred to the clinical setting. Nevertheless, the two variables seem negligible if APOE ε4 carrier status is available.
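As an illustration of the net reclassification improvement mentioned in this abstract, here is a minimal sketch of the category-free (continuous) variant. The abstract does not say which NRI variant the authors used, so treating it as category-free is an assumption; the function name `continuous_nri` and the toy risks are hypothetical.

```python
def continuous_nri(old_risk, new_risk, outcome):
    """Category-free NRI comparing a new model's predicted risks to an old model's.

    For subjects with the event, moving to a higher predicted risk is an
    improvement; for subjects without the event, moving lower is an improvement.
    """
    up_event = down_event = n_event = 0
    up_nonevent = down_nonevent = n_nonevent = 0
    for p_old, p_new, y in zip(old_risk, new_risk, outcome):
        if y == 1:
            n_event += 1
            up_event += p_new > p_old
            down_event += p_new < p_old
        else:
            n_nonevent += 1
            up_nonevent += p_new > p_old
            down_nonevent += p_new < p_old
    if n_event == 0 or n_nonevent == 0:
        raise ValueError("need at least one event and one non-event")
    nri_events = (up_event - down_event) / n_event
    nri_nonevents = (down_nonevent - up_nonevent) / n_nonevent
    return nri_events + nri_nonevents


# Hypothetical toy data: three events followed by four non-events.
toy_outcome = [1, 1, 1, 0, 0, 0, 0]
toy_old = [0.2, 0.3, 0.5, 0.2, 0.4, 0.1, 0.3]
toy_new = [0.3, 0.4, 0.4, 0.1, 0.3, 0.2, 0.3]

print(continuous_nri(toy_old, toy_new, toy_outcome))
```

In the toy data the event component is (2 - 1)/3 and the non-event component is (2 - 1)/4, so the two parts can be inspected separately before they are summed.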
Affiliation(s)
- Kira Trares
  - Division of Clinical Epidemiology and Aging Research, German Cancer Research Center, Im Neuenheimer Feld 581, 69120, Heidelberg, Germany
- Hannah Stocker
  - Division of Clinical Epidemiology and Aging Research, German Cancer Research Center, Im Neuenheimer Feld 581, 69120, Heidelberg, Germany
- Joshua Stevenson-Hoare
  - Division of Clinical Epidemiology and Aging Research, German Cancer Research Center, Im Neuenheimer Feld 581, 69120, Heidelberg, Germany
- Laura Perna
  - Department Genes and Environment, Max Planck Institute of Psychiatry, Kraepelinstraße 2-10, 80804, Munich, Germany
- Bernd Holleczek
  - Saarland Cancer Registry, Neugeländstraße 9, 66117, Saarbrücken, Germany
- Konrad Beyreuther
  - Network Aging Research, Heidelberg University, Bergheimer Straße 20, 69115, Heidelberg, Germany
- Ben Schöttker
  - Division of Clinical Epidemiology and Aging Research, German Cancer Research Center, Im Neuenheimer Feld 581, 69120, Heidelberg, Germany
  - Network Aging Research, Heidelberg University, Bergheimer Straße 20, 69115, Heidelberg, Germany
- Hermann Brenner
  - Division of Clinical Epidemiology and Aging Research, German Cancer Research Center, Im Neuenheimer Feld 581, 69120, Heidelberg, Germany
  - Network Aging Research, Heidelberg University, Bergheimer Straße 20, 69115, Heidelberg, Germany
3
Zhang C, Ren W, Lu X, Feng L, Li J, Zhu B. Empagliflozin's role in early tubular protection for type 2 diabetes patients. Mol Med 2024; 30:112. [PMID: 39085830 PMCID: PMC11293177 DOI: 10.1186/s10020-024-00881-0] [Received: 04/17/2024] [Accepted: 07/17/2024] [Indexed: 08/02/2024]
Abstract
BACKGROUND Patients with type 2 diabetes often face early tubular injury, necessitating effective treatment strategies. This study aimed to evaluate the impact of the SGLT2 inhibitor empagliflozin on early tubular injury biomarkers in type 2 diabetes patients with normoalbuminuria. METHODS A randomized controlled clinical study comprising 54 patients selected based on specific criteria was conducted. Patients were divided into an intervention group (empagliflozin, n = 27) and a control group (n = 27) and treated for 6 weeks. Tubular injury biomarkers KIM-1 and NGAL were assessed pre- and post-treatment. RESULTS Both groups demonstrated comparable baseline characteristics. Post-treatment, fasting and postprandial blood glucose levels decreased similarly in both groups. The intervention group exhibited better improvements in total cholesterol, low-density lipoprotein, and blood uric acid levels. Renal function indicators, including UACR and eGFR, showed greater enhancements in the intervention group. Significant reductions in KIM-1 and NGAL were observed in the intervention group. CONCLUSION Treatment with empagliflozin in type 2 diabetes patients with normoalbuminuria led to a notable decrease in tubular injury biomarkers KIM-1 and NGAL. These findings highlight the potential of SGLT2 inhibitors in early tubular protection, offering a new therapeutic approach.
Affiliation(s)
- Chuangbiao Zhang
  - Department of Endocrinology, First Affiliated Hospital of Jinan University, No. 613 West Huangpu Avenue, Tianhe District, Guangzhou, 510630, Guangdong Province, China
- Weiwei Ren
  - Department of Endocrinology, Guangzhou Baiyun District Maternal And Child Health Hospital, Guangzhou, 51000, Guangdong Province, China
- Xiaohua Lu
  - Department of Endocrinology, First Affiliated Hospital of Jinan University, No. 613 West Huangpu Avenue, Tianhe District, Guangzhou, 510630, Guangdong Province, China
- Lie Feng
  - Department of Endocrinology, First Affiliated Hospital of Jinan University, No. 613 West Huangpu Avenue, Tianhe District, Guangzhou, 510630, Guangdong Province, China
- Jiaying Li
  - Department of Endocrinology, First Affiliated Hospital of Jinan University, No. 613 West Huangpu Avenue, Tianhe District, Guangzhou, 510630, Guangdong Province, China
- Beibei Zhu
  - Endoscopy Center, First Affiliated Hospital of Jinan University, No. 613 West Huangpu Avenue, Tianhe District, Guangzhou, 510630, Guangdong Province, China
4
John LH, Fridgeirsson EA, Kors JA, Reps JM, Williams RD, Ryan PB, Rijnbeek PR. Development and validation of a patient-level model to predict dementia across a network of observational databases. BMC Med 2024; 22:308. [PMID: 39075527 PMCID: PMC11288076 DOI: 10.1186/s12916-024-03530-9] [Received: 10/27/2023] [Accepted: 07/15/2024] [Indexed: 07/31/2024]
Abstract
BACKGROUND A prediction model can be a useful tool to quantify a patient's risk of developing dementia in the coming years and to guide risk-factor-targeted interventions. Numerous dementia prediction models have been developed, but few have been externally validated, likely limiting their clinical uptake. In our previous work, we had limited success in externally validating some of these existing models due to inadequate reporting. As a result, we are compelled to develop and externally validate novel models to predict dementia in the general population across a network of observational databases. We assess regularization methods to obtain parsimonious models that are of lower complexity and easier to implement. METHODS Logistic regression models were developed across a network of five observational databases with electronic health records (EHRs) and claims data to predict 5-year dementia risk in persons aged 55-84. The regularization methods L1 and Broken Adaptive Ridge (BAR), as well as three candidate predictor sets, were assessed to optimize prediction performance. The predictor sets include a baseline set using only age and sex, a full set including all available candidate predictors, and a phenotype set which includes a limited number of clinically relevant predictors. RESULTS BAR can be used for variable selection, outperforming L1 when a parsimonious model is desired. Adding candidate predictors for disease diagnosis and drug exposure generally improves the performance of baseline models using only age and sex. While a model trained on German EHR data saw an increase in AUROC from 0.74 to 0.83 with additional predictors, a model trained on US EHR data showed only minimal improvement in AUROC, from 0.79 to 0.81. Nevertheless, the latter model, developed using BAR regularization on the clinically relevant predictor set, was ultimately chosen as the best-performing model because it demonstrated more consistent external validation performance and improved calibration. CONCLUSIONS We developed and externally validated patient-level models to predict dementia. Our results show that although dementia prediction is highly driven by age, adding predictors based on condition diagnoses and drug exposures further improves prediction performance. BAR regularization outperforms L1 regularization to yield the most parsimonious yet still well-performing prediction model for dementia.
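The AUROC values quoted in this abstract can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case. The sketch below computes it directly from that definition; the function name `auroc` and the toy scores are hypothetical, and the pairwise loop is meant for illustration rather than for large datasets.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases and non-cases.

    scores -- predicted risks
    labels -- 1 for a case (e.g. dementia within 5 years), 0 otherwise
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in cases:
        for q in non_cases:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(cases) * len(non_cases))


# Hypothetical toy data: three cases followed by three non-cases.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]

print(auroc(toy_scores, toy_labels))
```

Here 8 of the 9 case/non-case pairs are ordered correctly, giving 8/9; a model that only used a single strong predictor such as age could already score well above 0.5 on this measure.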
Affiliation(s)
- Luis H John
  - Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Egill A Fridgeirsson
  - Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Jan A Kors
  - Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Jenna M Reps
  - Janssen Research and Development, Raritan, NJ, USA
- Ross D Williams
  - Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Peter R Rijnbeek
  - Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
5
Fridgeirsson EA, Williams R, Rijnbeek P, Suchard MA, Reps JM. Comparing penalization methods for linear models on large observational health data. J Am Med Inform Assoc 2024; 31:1514-1521. [PMID: 38767857 PMCID: PMC11187433 DOI: 10.1093/jamia/ocae109] [Received: 01/15/2024] [Revised: 04/19/2024] [Accepted: 05/06/2024] [Indexed: 05/22/2024]
Abstract
OBJECTIVE This study evaluates regularization variants in logistic regression (L1, L2, ElasticNet, Adaptive L1, Adaptive ElasticNet, Broken adaptive ridge [BAR], and Iterative hard thresholding [IHT]) for discrimination and calibration performance, focusing on both internal and external validation. MATERIALS AND METHODS We use data from 5 US claims and electronic health record databases and develop models for various outcomes in a major depressive disorder patient population. We externally validate all models in the other databases. We use a train-test split of 75%/25% and evaluate performance with discrimination and calibration. Statistical analysis for difference in performance uses Friedman's test and critical difference diagrams. RESULTS Of the 840 models we develop, L1 and ElasticNet emerge as superior in both internal and external discrimination, with a notable AUC difference. BAR and IHT show the best internal calibration, without a clear external calibration leader. ElasticNet typically has larger model sizes than L1. Methods like IHT and BAR, while slightly less discriminative, significantly reduce model complexity. CONCLUSION L1 and ElasticNet offer the best discriminative performance in logistic regression for healthcare predictions, maintaining robustness across validations. For simpler, more interpretable models, L0-based methods (IHT and BAR) are advantageous, providing greater parsimony and calibration with fewer features. This study aids in selecting suitable regularization techniques for healthcare prediction models, balancing performance, complexity, and interpretability.
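The contrast this abstract draws between L1 and L0-based methods can be seen in the coefficient update each one applies. The sketch below is not the study's implementation: it shows only soft-thresholding, which is the proximal operator of the L1 penalty, next to the hard-thresholding step used in iterative hard thresholding, which keeps the `k` largest-magnitude coefficients. The function names, the coefficient vector, and the values of `lam` and `k` are hypothetical, and BAR is not illustrated.

```python
def soft_threshold(weights, lam):
    """L1 proximal step: shrink every coefficient toward zero by lam."""
    out = []
    for w in weights:
        shrunk = max(abs(w) - lam, 0.0)
        if shrunk == 0.0:
            out.append(0.0)  # small coefficients become exactly zero
        else:
            out.append(shrunk if w > 0 else -shrunk)
    return out


def hard_threshold(weights, k):
    """IHT-style step: keep the k largest-magnitude coefficients unchanged."""
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(order[:k])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


# Hypothetical coefficient vector.
toy_weights = [2.0, -0.25, 0.75, -1.5, 0.125]

print(soft_threshold(toy_weights, 0.5))
print(hard_threshold(toy_weights, 2))
```

Soft-thresholding shrinks the surviving coefficients as well as zeroing the small ones, whereas hard-thresholding leaves the survivors untouched and fixes the model size in advance, which is one way to see why the L0-based methods give smaller models.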
Affiliation(s)
- Egill A Fridgeirsson
  - Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
- Ross Williams
  - Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
- Peter Rijnbeek
  - Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
- Marc A Suchard
  - Department of Biostatistics, University of California, Los Angeles, Los Angeles, CA 90095-1772, United States
  - VA Informatics and Computing Infrastructure, United States Department of Veterans Affairs, Salt Lake City, UT 84148, United States
- Jenna M Reps
  - Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
  - Observational Health Data Analytics, Janssen Research and Development, Titusville, NJ 08560, United States
6
Han L, Chen X, Wang Y, Zhang R, Zhao T, Pu L, Huang Y, Sun H. A machine learning algorithm based on circulating metabolic biomarkers offers improved predictions of neurological diseases. Clin Chim Acta 2024; 558:119671. [PMID: 38621587 DOI: 10.1016/j.cca.2024.119671] [Received: 01/04/2024] [Revised: 04/01/2024] [Accepted: 04/10/2024] [Indexed: 04/17/2024]
Abstract
BACKGROUND AND AIMS A machine learning algorithm based on circulating metabolic biomarkers for the prediction of neurological diseases (NLDs) is lacking. We aimed to develop such an algorithm and to compare the performance of a metabolic biomarker-based model with that of a clinical model based on conventional risk factors for predicting three NLDs: dementia, Parkinson's disease (PD), and Alzheimer's disease (AD). MATERIALS AND METHODS The eXtreme Gradient Boosting (XGBoost) algorithm was used to construct a metabolic biomarker-based model (metabolic model), a clinical risk factor-based model (clinical model), and a combined model for the prediction of the three NLDs. Risk discrimination (c-statistic), net reclassification improvement (NRI) index, and integrated discrimination improvement (IDI) index values were determined for each model. RESULTS Incorporating metabolic biomarkers into the clinical model improved performance in the prediction of dementia, AD, and PD, as demonstrated by NRI values of 0.159 (0.039-0.279), 0.113 (0.005-0.176), and 0.201 (-0.021-0.423), respectively, and IDI values of 0.098 (0.073-0.122), 0.070 (0.049-0.090), and 0.085 (0.068-0.101), respectively. CONCLUSION The performance of the model based on circulating NMR spectroscopy-detected metabolic biomarkers was better than that of the clinical model in the prediction of dementia, AD, and PD.
Affiliation(s)
- Liyuan Han
  - Key Laboratory of Diagnosis and Treatment of Digestive System Tumors of Zhejiang Province, Ningbo No 2 Hospital, Ningbo 315000, China; Center for Cardiovascular and Cerebrovascular Epidemiology and Translational Medicine, Ningbo Institute of Life and Health Industry, University of Chinese Academy of Sciences, Ningbo 315000, China
- Xi Chen
  - Department of Economics, Yale University, USA; Yale Alzheimer's Disease Research Center, Yale University, USA
- Yue Wang
  - School of Public Health, Medical College of Soochow University, Suzhou, Jiangsu province, China
- Ruijie Zhang
  - Key Laboratory of Diagnosis and Treatment of Digestive System Tumors of Zhejiang Province, Ningbo No 2 Hospital, Ningbo 315000, China; Center for Cardiovascular and Cerebrovascular Epidemiology and Translational Medicine, Ningbo Institute of Life and Health Industry, University of Chinese Academy of Sciences, Ningbo 315000, China
- Tian Zhao
  - Key Laboratory of Diagnosis and Treatment of Digestive System Tumors of Zhejiang Province, Ningbo No 2 Hospital, Ningbo 315000, China; Center for Cardiovascular and Cerebrovascular Epidemiology and Translational Medicine, Ningbo Institute of Life and Health Industry, University of Chinese Academy of Sciences, Ningbo 315000, China
- Liyuan Pu
  - Key Laboratory of Diagnosis and Treatment of Digestive System Tumors of Zhejiang Province, Ningbo No 2 Hospital, Ningbo 315000, China; Center for Cardiovascular and Cerebrovascular Epidemiology and Translational Medicine, Ningbo Institute of Life and Health Industry, University of Chinese Academy of Sciences, Ningbo 315000, China
- Yi Huang
  - Laboratory of Neurological Diseases and Brain Function, Department of Neurosurgery, The First Affiliated Hospital of Ningbo University, Ningbo, Zhejiang 315010, China; Key Laboratory of Precision Medicine for Atherosclerotic Diseases of Zhejiang Province, Ningbo, Zhejiang 315010, China
- Hongpeng Sun
  - School of Public Health, Medical College of Soochow University, Suzhou, Jiangsu province, China
7
Naderalvojoud B, Curtin CM, Yanover C, El-Hay T, Choi B, Park RW, Tabuenca JG, Reeve MP, Falconer T, Humphreys K, Asch SM, Hernandez-Boussard T. Towards global model generalizability: independent cross-site feature evaluation for patient-level risk prediction models using the OHDSI network. J Am Med Inform Assoc 2024; 31:1051-1061. [PMID: 38412331 PMCID: PMC11031239 DOI: 10.1093/jamia/ocae028] [Received: 09/22/2023] [Revised: 01/26/2024] [Accepted: 02/01/2024] [Indexed: 02/29/2024]
Abstract
BACKGROUND Predictive models show promise in healthcare, but their successful deployment is challenging due to limited generalizability. Current external validation often focuses on model performance with restricted feature use from the original training data, lacking insights into their suitability at external sites. Our study introduces an innovative methodology for evaluating features during both the development phase and the validation, focusing on creating and validating predictive models for post-surgery patient outcomes with improved generalizability. METHODS Electronic health records (EHRs) from 4 countries (United States, United Kingdom, Finland, and Korea) were mapped to the OMOP Common Data Model (CDM), 2008-2019. Machine learning (ML) models were developed to predict post-surgery prolonged opioid use (POU) risks using data collected 6 months before surgery. Both local and cross-site feature selection methods were applied in the development and external validation datasets. Models were developed using Observational Health Data Sciences and Informatics (OHDSI) tools and validated on separate patient cohorts. RESULTS Model development included 41 929 patients, 14.6% with POU. The external validation included 31 932 (UK), 23 100 (US), 7295 (Korea), and 3934 (Finland) patients with POU of 44.2%, 22.0%, 15.8%, and 21.8%, respectively. The top-performing model, Lasso logistic regression, achieved an area under the receiver operating characteristic curve (AUROC) of 0.75 during local validation and 0.69 (SD = 0.02) (averaged) in external validation. Models trained with cross-site feature selection significantly outperformed those using only features from the development site through external validation (P < .05). CONCLUSIONS Using EHRs across four countries mapped to the OMOP CDM, we developed generalizable predictive models for POU. 
Our approach demonstrates the significant impact of cross-site feature selection in improving model performance, underscoring the importance of incorporating diverse feature sets from various clinical settings to enhance the generalizability and utility of predictive healthcare models.
Affiliation(s)
- Catherine M Curtin
  - Department of Surgery, Veterans Affairs Palo Alto Health Care System, Palo Alto, CA 94304, United States
- Chen Yanover
  - KI Research Institute, Kfar Malal, 4592000, Israel
- Tal El-Hay
  - KI Research Institute, Kfar Malal, 4592000, Israel
- Byungjin Choi
  - Department of Biomedical Informatics, Ajou University Graduate School of Medicine, Suwon, 16499, Korea
- Rae Woong Park
  - Department of Biomedical Informatics, Ajou University Graduate School of Medicine, Suwon, 16499, Korea
- Javier Gracia Tabuenca
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, 00014, Finland
- Mary Pat Reeve
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, 00014, Finland
- Thomas Falconer
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, United States
- Keith Humphreys
  - Department of Psychiatry and the Behavioral Sciences, Stanford University, Stanford, CA 94305, United States
  - Center for Innovation to Implementation, Veterans Affairs Palo Alto Health Care System, Palo Alto, CA 94304, United States
- Steven M Asch
  - Department of Medicine, Stanford University, Stanford, CA 94305, United States
  - Center for Innovation to Implementation, Veterans Affairs Palo Alto Health Care System, Palo Alto, CA 94304, United States
8
Brain J, Kafadar AH, Errington L, Kirkley R, Tang EY, Akyea RK, Bains M, Brayne C, Figueredo G, Greene L, Louise J, Morgan C, Pakpahan E, Reeves D, Robinson L, Salter A, Siervo M, Tully PJ, Turnbull D, Qureshi N, Stephan BC. What's New in Dementia Risk Prediction Modelling? An Updated Systematic Review. Dement Geriatr Cogn Dis Extra 2024; 14:49-74. [PMID: 39015518 PMCID: PMC11250535 DOI: 10.1159/000539744] [Received: 02/25/2024] [Accepted: 06/07/2024] [Indexed: 07/18/2024]
Abstract
Introduction Identifying individuals at high risk of dementia is critical to optimized clinical care, formulating effective preventative strategies, and determining eligibility for clinical trials. Since our previous systematic reviews in 2010 and 2015, there has been a surge in dementia risk prediction modelling. The aim of this study was to update our previous reviews to explore, and critically review, new developments in dementia risk modelling. Methods MEDLINE, Embase, Scopus, and Web of Science were searched from March 2014 to June 2022. Studies were included if they were population- or community-based cohorts (including electronic health record data), had developed a model for predicting late-life incident dementia, and included model performance indices such as discrimination, calibration, or external validation. Results In total, 9,209 articles were identified from the electronic search, of which 74 met the inclusion criteria. We found a substantial increase in the number of new models published from 2014 (>50 new models), including an increase in the number of models developed using machine learning. Over 450 unique predictor (component) variables have been tested. Nineteen studies (26%) undertook external validation of newly developed or existing models, with mixed results. For the first time, models have also been developed in low- and middle-income countries (LMICs) and others validated in racial and ethnic minority groups. Conclusion The literature on dementia risk prediction modelling is rapidly evolving with new analytical developments and testing in LMICs. However, it is still challenging to make recommendations about which one model is the most suitable for routine use in a clinical setting. There is an urgent need to develop a suitable, robust, validated risk prediction model in the general population that can be widely implemented in clinical practice to improve dementia prevention.
Affiliation(s)
- Jacob Brain
- Institute of Mental Health, School of Medicine, University of Nottingham, Innovation Park, Jubilee Campus, Nottingham, UK
- Freemasons Foundation Centre for Men’s Health, Discipline of Medicine, School of Psychology, The University of Adelaide, Adelaide, SA, Australia
| | - Aysegul Humeyra Kafadar
- Institute of Mental Health, School of Medicine, University of Nottingham, Innovation Park, Jubilee Campus, Nottingham, UK
| | - Linda Errington
- Walton Library, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- Rachael Kirkley
- Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Eugene Y.H. Tang
- Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Ralph K. Akyea
- PRISM Group, Centre for Academic Primary Care, Lifespan and Population Health, School of Medicine, University of Nottingham, Nottingham, UK
- Manpreet Bains
- Nottingham Centre for Public Health and Epidemiology, Lifespan and Population Health, School of Medicine, University of Nottingham, Nottingham, UK
- Carol Brayne
- Cambridge Public Health, University of Cambridge, Cambridge, UK
- Leanne Greene
- Exeter Clinical Trials Unit, Department of Health and Community Sciences, University of Exeter Medical School, Exeter, UK
- Jennie Louise
- Women’s and Children’s Hospital Research Centre and South Australian Health and Medical Research Institute, Adelaide, SA, Australia
- Catharine Morgan
- Division of Population Health, Health Services Research and Primary Care, University of Manchester, Manchester, UK
- Eduwin Pakpahan
- Department of Mathematics, Physics and Electrical Engineering, Northumbria University, Newcastle upon Tyne, UK
- David Reeves
- School for Health Sciences, University of Manchester, Manchester, UK
- Louise Robinson
- Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Amy Salter
- School of Public Health, Faculty of Health and Medical Sciences, University of Adelaide, Adelaide, SA, Australia
- Mario Siervo
- School of Population Health, Curtin University, Perth, WA, Australia
- Dementia Centre of Excellence, Curtin enAble Institute, Faculty of Health Sciences, Curtin University, Perth, WA, Australia
- Phillip J. Tully
- Freemasons Foundation Centre for Men’s Health, Discipline of Medicine, School of Psychology, The University of Adelaide, Adelaide, SA, Australia
- Faculty of Medicine and Health, School of Psychology, University of New England, Armidale, NSW, Australia
- Deborah Turnbull
- Freemasons Foundation Centre for Men’s Health, Discipline of Medicine, School of Psychology, The University of Adelaide, Adelaide, SA, Australia
- Nadeem Qureshi
- PRISM Group, Centre for Academic Primary Care, Lifespan and Population Health, School of Medicine, University of Nottingham, Nottingham, UK
- Blossom C.M. Stephan
- Institute of Mental Health, School of Medicine, University of Nottingham, Innovation Park, Jubilee Campus, Nottingham, UK
- Dementia Centre of Excellence, Curtin enAble Institute, Faculty of Health Sciences, Curtin University, Perth, WA, Australia
|
9
|
Fridgeirsson EA, Sontag D, Rijnbeek P. Attention-based neural networks for clinical prediction modelling on electronic health records. BMC Med Res Methodol 2023; 23:285. [PMID: 38062352 PMCID: PMC10701944 DOI: 10.1186/s12874-023-02112-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 11/27/2023] [Indexed: 12/18/2023] Open
Abstract
BACKGROUND Deep learning models have had a lot of success in various fields. However, on structured data they have struggled. Here we apply four state-of-the-art supervised deep learning models using the attention mechanism and compare against logistic regression and XGBoost using discrimination, calibration and clinical utility. METHODS We develop the models using a general practitioner database. We implement a recurrent neural network, a transformer with and without reverse distillation and a graph neural network. We measure discrimination using the area under the receiver operating characteristic curve (AUC) and the area under the precision recall curve (AUPRC). We assess smooth calibration using restricted cubic splines and clinical utility with decision curve analysis. RESULTS Our results show that deep learning approaches can improve discrimination by up to 2.5 percentage points in AUC and 7.4 percentage points in AUPRC. However, on average the baselines are competitive. Most models are calibrated similarly to the baselines, except for the graph neural network. The transformer using reverse distillation shows the best performance in clinical utility on two out of three prediction problems over most of the prediction thresholds. CONCLUSION In this study, we evaluated various approaches in supervised learning using neural networks and attention. Here we do a rigorous comparison, not only looking at discrimination but also calibration and clinical utility. There is value in using deep learning models on electronic health record data since it can improve discrimination and clinical utility while providing good calibration. However, good baseline methods are still competitive.
Affiliation(s)
- Egill A Fridgeirsson
- Department of Medical Informatics, Erasmus University Medical Center, Doctor Molewaterplein 40, 3015 GD, Rotterdam, the Netherlands
- David Sontag
- Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Peter Rijnbeek
- Department of Medical Informatics, Erasmus University Medical Center, Doctor Molewaterplein 40, 3015 GD, Rotterdam, the Netherlands
|
10
|
Tian M, Ma X, Liang M, Zang H. Application of Rapid Identification and Determination of Moisture Content of Coptidis Rhizoma From Different Species Based on Data Fusion. J AOAC Int 2023; 106:1389-1401. [PMID: 37171863 DOI: 10.1093/jaoacint/qsad058] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/22/2023] [Revised: 04/25/2023] [Accepted: 05/08/2023] [Indexed: 05/13/2023]
Abstract
BACKGROUND For thousands of years, traditional Chinese medicine (TCM) has been clinically proven, and doctors have highly valued the differences in utility between different species. OBJECTIVE This study aims to replace the complex methods traditionally used for empirical identification by compensating for the information loss of a single sensor through data fusion. The research object of the study is Coptidis rhizoma (CR). METHOD Using spectral optimization and data fusion technology, near infrared (NIR) and mid-infrared (MIR) spectra were collected for CR. PLS-DA (n = 134) and PLSR (n = 63) models were established to identify the medicinal materials and to determine the moisture content in the medicinal materials. RESULTS For the identification of the three species of CR, the mid-level fusion model performed better than the single-spectrum model. The sensitivity and specificity of the prediction set coefficients for NIR, MIR, and data fusion qualitative models were all higher than 0.95, with an AUC value of 1. The NIR data model was superior to the MIR data model. The results of low-level fusion were similar to those of the NIR optimization model. The RPD of the test set of NIR and low-level fusion model was 3.6420 and 3.4216, respectively, indicating good prediction ability of the model. CONCLUSIONS Data fusion technology using NIR and MIR can be applied to identify CR species and to determine the moisture content of CR. It provides technical support for the rapid determination of moisture content, with a fast analysis speed and without the need for complex pretreatment methods. HIGHLIGHTS This study is the first to introduce spectral data fusion technology to identify CR species. Data fusion technology is feasible for multivariable calibration model performance and reduces the cost of manual identification. The moisture content of CR can be quickly evaluated, reducing the difficulty of traditional methods.
Affiliation(s)
- Mengyin Tian
- Shandong University, NMPA Key Laboratory for Technology Research and Evaluation of Drug Products, Cheeloo College of Medicine, Jinan, Shandong 250012, China
- Shandong University, Key Laboratory of Chemical Biology (Ministry of Education), Jinan, Shandong 250012, China
- Xiaobo Ma
- Shandong University, NMPA Key Laboratory for Technology Research and Evaluation of Drug Products, Cheeloo College of Medicine, Jinan, Shandong 250012, China
- Shandong University, Key Laboratory of Chemical Biology (Ministry of Education), Jinan, Shandong 250012, China
- Mengying Liang
- Shandong University, NMPA Key Laboratory for Technology Research and Evaluation of Drug Products, Cheeloo College of Medicine, Jinan, Shandong 250012, China
- Shandong University, Key Laboratory of Chemical Biology (Ministry of Education), Jinan, Shandong 250012, China
- Hengchang Zang
- Shandong University, NMPA Key Laboratory for Technology Research and Evaluation of Drug Products, Cheeloo College of Medicine, Jinan, Shandong 250012, China
- Shandong University, Key Laboratory of Chemical Biology (Ministry of Education), Jinan, Shandong 250012, China
- Shandong University, National Glycoengineering Research Center, Jinan, Shandong 250012, China
|
11
|
Fehr J, Piccininni M, Kurth T, Konigorski S. Assessing the transportability of clinical prediction models for cognitive impairment using causal models. BMC Med Res Methodol 2023; 23:187. [PMID: 37598141 PMCID: PMC10439645 DOI: 10.1186/s12874-023-02003-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Accepted: 07/27/2023] [Indexed: 08/21/2023] Open
Abstract
BACKGROUND Machine learning models promise to support diagnostic predictions, but may not perform well in new settings. Selecting the best model for a new setting without available data is challenging. We aimed to investigate the transportability by calibration and discrimination of prediction models for cognitive impairment in simulated external settings with different distributions of demographic and clinical characteristics. METHODS We mapped and quantified relationships between variables associated with cognitive impairment using causal graphs, structural equation models, and data from the ADNI study. These estimates were then used to generate datasets and evaluate prediction models with different sets of predictors. We measured transportability to external settings under guided interventions on age, APOE ε4, and tau-protein, using performance differences between internal and external settings measured by calibration metrics and area under the receiver operating characteristic curve (AUC). RESULTS Calibration differences indicated that models predicting with causes of the outcome were more transportable than those predicting with consequences. AUC differences indicated inconsistent trends of transportability between the different external settings. Models predicting with consequences tended to show higher AUC in the external settings compared to internal settings, while models predicting with parents or all variables showed similar AUC. CONCLUSIONS We demonstrated with a practical prediction task example that predicting with causes of the outcome results in better transportability compared to anti-causal predictions when considering calibration differences. We conclude that calibration performance is crucial when assessing model transportability to external settings.
Affiliation(s)
- Jana Fehr
- Digital Engineering Faculty, University of Potsdam, Potsdam, Germany
- Digital Health and Machine Learning, Hasso-Plattner-Institute, Potsdam, Germany
- Marco Piccininni
- Institute of Public Health, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Center for Stroke Research Berlin, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Tobias Kurth
- Institute of Public Health, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Stefan Konigorski
- Digital Engineering Faculty, University of Potsdam, Potsdam, Germany
- Digital Health and Machine Learning, Hasso-Plattner-Institute, Potsdam, Germany
- Icahn School of Medicine at Mount Sinai, Hasso Plattner Institute for Digital Health at Mount Sinai, New York, NY, USA
|