1. Mostafaei S, Hoang MT, Jurado PG, Xu H, Zacarias-Pons L, Eriksdotter M, Chatterjee S, Garcia-Ptacek S. Machine learning algorithms for identifying predictive variables of mortality risk following dementia diagnosis: a longitudinal cohort study. Sci Rep 2023; 13:9480. [PMID: 37301891 PMCID: PMC10257644 DOI: 10.1038/s41598-023-36362-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Accepted: 06/02/2023] [Indexed: 06/12/2023]
Abstract
Machine learning (ML) may have advantages over traditional statistical models in identifying risk factors. Using ML algorithms, our objective was to identify the most important variables associated with mortality after dementia diagnosis in the Swedish Registry for Cognitive/Dementia Disorders (SveDem). A longitudinal cohort of 28,023 patients diagnosed with dementia was selected from SveDem. Sixty variables were considered as potential predictors of mortality risk, including age at dementia diagnosis, dementia type, sex, body mass index (BMI), Mini-Mental State Examination (MMSE) score, time from referral to initiation of work-up, time from initiation of work-up to diagnosis, dementia medications, comorbidities, and specific medications for chronic comorbidities (e.g., cardiovascular disease). We applied sparsity-inducing penalties to three ML algorithms and identified twenty important variables for the binary classification of mortality risk and fifteen variables for predicting time to death. The area under the ROC curve (AUC) was used to evaluate the classification algorithms. An unsupervised clustering algorithm was then applied to the twenty selected variables and found two main clusters that accurately matched the surviving and deceased patient groups. A support vector machine with an appropriate sparsity penalty classified mortality risk with accuracy = 0.7077, AUC = 0.7375, sensitivity = 0.6436, and specificity = 0.740. Across the three ML algorithms, most of the twenty identified variables were consistent with the literature and with our previous studies on SveDem. We also found new variables not previously reported in the literature as associated with mortality in dementia.
Performance of the basic dementia diagnostic work-up, time from referral to initiation of work-up, and time from initiation of work-up to diagnosis were the elements of the diagnostic process identified by the ML algorithms. The median follow-up time was 1053 days (IQR = 516-1771) in surviving patients and 1125 days (IQR = 605-1770) in deceased patients. For predicting time to death, the CoxBoost model identified 15 variables and ranked them by importance. The highest-ranked variables were age at diagnosis, MMSE score, sex, BMI, and Charlson Comorbidity Index, with selection scores of 23%, 15%, 14%, 12%, and 10%, respectively. This study demonstrates the potential of sparsity-inducing ML algorithms to improve our understanding of mortality risk factors in dementia patients and to be applied in clinical settings. Moreover, ML methods can be used as a complement to traditional statistical methods.
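For readers less familiar with the evaluation measures quoted in the abstract, the following is a minimal sketch of how accuracy, sensitivity, specificity, and AUC are conventionally defined for a binary classifier. It is not the authors' code: the confusion counts, the risk scores, and the names `ConfusionCounts`, `classification_metrics`, and `pairwise_auc` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier (positive class = died)."""
    tp: int  # died, predicted died
    fn: int  # died, predicted survived
    tn: int  # survived, predicted survived
    fp: int  # survived, predicted died


def classification_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Standard definitions of sensitivity, specificity, and accuracy."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # true positive rate
        "specificity": c.tn / (c.tn + c.fp),  # true negative rate
        "accuracy": (c.tp + c.tn) / total,
    }


def pairwise_auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is quadratic in the number of
    samples, which is fine for a small illustration.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts and scores, not taken from the study.
counts = ConfusionCounts(tp=60, fn=40, tn=150, fp=50)
metrics = classification_metrics(counts)
auc = pairwise_auc(pos_scores=[0.9, 0.8, 0.4], neg_scores=[0.7, 0.3, 0.2])

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
    print(f"auc: {auc:.4f}")
```

Sensitivity and specificity depend on the chosen decision threshold, whereas AUC summarises ranking quality across all thresholds, which is why both kinds of measure are commonly reported together.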
Affiliation(s)
- Shayan Mostafaei
  - Division of Clinical Geriatrics, Department of Neurobiology, Care Sciences and Society, Karolinska Institute, Stockholm, Sweden
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institute, Stockholm, Sweden
- Minh Tuan Hoang
  - Division of Clinical Geriatrics, Department of Neurobiology, Care Sciences and Society, Karolinska Institute, Stockholm, Sweden
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institute, Stockholm, Sweden
- Pol Grau Jurado
  - Division of Clinical Geriatrics, Department of Neurobiology, Care Sciences and Society, Karolinska Institute, Stockholm, Sweden
- Hong Xu
  - Division of Clinical Geriatrics, Department of Neurobiology, Care Sciences and Society, Karolinska Institute, Stockholm, Sweden
- Lluis Zacarias-Pons
  - Division of Clinical Geriatrics, Department of Neurobiology, Care Sciences and Society, Karolinska Institute, Stockholm, Sweden
  - Vascular Health Research Group of Girona (ISV-Girona), Institut Universitari d'Investigació en Atenció Primària Jordi Gol i Gurina (IDIAP Jordi Gol), Girona, Spain
  - Network for Research on Chronicity, Primary Care, and Health Promotion (RICAPPS), Tenerife, Spain
- Maria Eriksdotter
  - Division of Clinical Geriatrics, Department of Neurobiology, Care Sciences and Society, Karolinska Institute, Stockholm, Sweden
  - Aging and Inflammation Theme, Karolinska University Hospital, Stockholm, Sweden
- Saikat Chatterjee
  - Division of Information Science and Engineering, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
- Sara Garcia-Ptacek
  - Division of Clinical Geriatrics, Department of Neurobiology, Care Sciences and Society, Karolinska Institute, Stockholm, Sweden
  - Aging and Inflammation Theme, Karolinska University Hospital, Stockholm, Sweden
3. Beane S, Callahan CM, Stone RI, Zimmerman S. Research to Improve Care and Outcomes for Persons With Dementia and Their Caregivers: Immediate Needs, Equitable Care, and Funding Streams. J Am Med Dir Assoc 2021; 22:1363-1365. [PMID: 34274067 DOI: 10.1016/j.jamda.2021.05.014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/17/2021] [Accepted: 05/17/2021] [Indexed: 10/20/2022]
Affiliation(s)
- Christopher M Callahan
  - Eskenazi Health, Indiana University Center for Aging Research, Indiana University School of Medicine, Regenstrief Institute, Indianapolis, IN, USA
- Sheryl Zimmerman
  - Cecil G. Sheps Center for Health Services Research and Schools of Social Work and Public Health, University of North Carolina at Chapel Hill, NC, USA