1
Tefera E, de Souza HBD, Blewitt C, Mansoor A, Peters H, Teerawanichpol P, Henin S, Barr WB, Johnson SB, Liu A. Natural Language Processing Applied to Spontaneous Recall of Famous Faces Reveals Memory Dysfunction in Temporal Lobe Epilepsy Patients. bioRxiv 2024:2024.08.23.609193. [PMID: 39253429 PMCID: PMC11382998 DOI: 10.1101/2024.08.23.609193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/11/2024]
Abstract
Objective and Background: Epilepsy patients rank memory problems as their most significant cognitive comorbidity. Current clinical assessments are laborious to administer and score and may not always detect subtle memory decline. The Famous Faces Task (FF) has robustly demonstrated that left temporal lobe epilepsy (LTLE) patients remember fewer names and biographical details than right TLE (RTLE) patients and healthy controls (HCs). We adapted the FF task to capture subjects' entire spontaneous spoken recall, then scored responses using manual and natural language processing (NLP) methods. We expected to replicate previous group-level differences using spontaneous speech and semi-automated analysis.
Methods: Seventy-three adults (28 LTLE, 18 RTLE, and 27 HCs) were included in a prospective case-control study. Twenty famous faces from politics, sports, and entertainment (active 2008-2017) were shown to subjects, who were asked whether they recognized each person and to spontaneously recall as much biographical detail as possible. We created human-generated and automatically generated keyword dictionaries for each celebrity, based on a randomly selected training set of half of the HC transcripts. To control for speech output, we measured the speech duration, total word count, and content word count for the FF task and a Cookie Theft Control Task (CTT), in which subjects were merely asked to describe a visual scene. Subjects' responses to the FF and CTT tasks were recorded, transcribed, and analyzed in a blinded manner with a combination of manual and automated NLP approaches.
Results: Famous face recognition accuracy was similar between groups. LTLE patients recalled fewer biographical details than HCs and RTLE patients using both the gold-standard human-generated dictionary (24%±12% vs. 31%±12% and 30%±12%, p=0.007) and the automated dictionary (24%±12% vs. 31%±12% and 32%±13%, p=0.007). There were no group-level differences in speech duration, total word count, or content word count for either the FF or the CTT that could explain the difference in recall performance. There was a positive, statistically significant relationship between MoCA score and FF recall performance as scored by the human-generated (ρ=.327, p=.029) and automatically generated (ρ=.422, p=.004) dictionaries for TLE subjects, but not HCs, an effect that was driven by LTLE subjects.
Discussion: LTLE patients remember fewer details of famous people than HCs or RTLE patients, as shown by NLP analysis of spontaneous recall. Decreased biographical memory was not due to decreased speech output and correlated with lower MoCA scores. NLP analysis of spontaneous recall can detect memory dysfunction in clinical populations in a semi-automated, objective, and sensitive manner.
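The dictionary-based scoring described in this abstract can be illustrated with a minimal sketch. The abstract does not state the exact scoring formula; the version below assumes the score is the proportion of a celebrity's dictionary keywords that appear in the transcript, and the function name, keyword list, and transcript are all hypothetical.

```python
import re


def recall_score(transcript: str, keywords: set[str]) -> float:
    """Return the fraction of dictionary keywords found in the transcript.

    Assumption: keywords are single lowercase words; the study's real
    dictionaries and matching rules are not described in the abstract.
    """
    if not keywords:
        return 0.0
    # Lowercase the transcript and split it into word tokens.
    tokens = set(re.findall(r"[a-z']+", transcript.lower()))
    hits = {k for k in keywords if k.lower() in tokens}
    return len(hits) / len(keywords)


# Hypothetical dictionary and response for one invented celebrity.
example_keywords = {"tennis", "wimbledon", "champion", "sister"}
example_transcript = "She plays tennis and I think she won Wimbledon a few times."
example_score = recall_score(example_transcript, example_keywords)
```

Normalizing by dictionary size keeps scores comparable across celebrities whose dictionaries differ in length, which is one plausible reason the abstract reports recall as a percentage.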
2
Karako K, Hata T, Inoue A, Oyama K, Ueda E, Sakatani K. Importance of serum albumin in machine learning-based prediction of cognitive function in the elderly using a basic blood test. Front Neurol 2024; 15:1362560. [PMID: 39114530 PMCID: PMC11303288 DOI: 10.3389/fneur.2024.1362560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Accepted: 07/08/2024] [Indexed: 08/10/2024] Open
Abstract
Introduction: In this study, we investigated the correlation between serum albumin levels and cognitive function, and examined how including serum albumin values in the input layer affects prediction accuracy when forecasting cognitive function using deep learning and other machine learning models.
Methods: We analyzed electronic health record data from Osaka Medical and Pharmaceutical University Hospital between 2014 and 2021. The study included patients who underwent cognitive function tests during this period; patients with no blood test data from the 30 days before the cognitive function test, and those whose blood test results contained values attributable to measurement error, were excluded. The Mini-Mental State Examination (MMSE) was used as the cognitive function test, and albumin levels were examined as the explanatory variable. Furthermore, we estimated MMSE scores from blood test data using deep learning models (DLM), linear regression models, support vector machines (SVM), decision trees, random forests, extreme gradient boosting (XGBoost), and light gradient boosting machines (LightGBM).
Results: Of 5,017 patients who underwent cognitive function tests, 3,663 patients for whom blood test data had not been obtained recently and two patients with values due to measurement error were excluded. The final study population included 1,352 patients, with 114 patients (8.4%) aged below 65 and 1,238 patients (91.6%) aged 65 and above. In patients aged 65 and above, age and male sex showed significant associations with MMSE scores of less than 24, while albumin and potassium levels showed negative associations with MMSE scores of less than 24. Comparing MMSE estimation performance, in those aged below 65, the mean squared error (MSE) of DLM improved with the inclusion of albumin. Similarly, the MSE improved when using SVM, random forest, and XGBoost. In those aged 65 and above, the MSE improved in all models.
Discussion: Our results indicated a positive correlation between serum albumin levels and cognitive function, suggesting a positive correlation between nutritional status and cognitive function in the elderly. Serum albumin levels were shown to be an important explanatory variable in the estimation of cognitive function for individuals aged 65 and above.
Affiliation(s)
- Kenji Karako
  - Department of Human and Engineered Environmental Studies, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
- Takeo Hata
  - Department of Hospital Quality and Safety Management, Osaka Medical and Pharmaceutical University Hospital, Osaka, Japan
  - Department of Pharmacy, Osaka Medical and Pharmaceutical University Hospital, Osaka, Japan
- Atsushi Inoue
  - Graduate School of Life Science and Systems Engineering, Kyushu Institute of Technology, Fukuoka, Japan
- Katsunori Oyama
  - Department of Computer Science, College of Engineering, Nihon University, Tokyo, Japan
- Eiichiro Ueda
  - Department of Hospital Quality and Safety Management, Osaka Medical and Pharmaceutical University Hospital, Osaka, Japan
- Kaoru Sakatani
  - Department of Human and Engineered Environmental Studies, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
  - Institute of Gerontology, The University of Tokyo, Tokyo, Japan
3
B.T B, Chen JM. Performance Assessment of ChatGPT versus Bard in Detecting Alzheimer's Dementia. Diagnostics (Basel) 2024; 14:817. [PMID: 38667463 PMCID: PMC11048951 DOI: 10.3390/diagnostics14080817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 04/08/2024] [Accepted: 04/10/2024] [Indexed: 04/28/2024] Open
Abstract
Large language models (LLMs) find increasing applications in many fields. Here, three LLM chatbots (ChatGPT-3.5, ChatGPT-4, and Bard) are assessed in their current form, as publicly available, for their ability to recognize Alzheimer's dementia (AD) and Cognitively Normal (CN) individuals using textual input derived from spontaneous speech recordings. A zero-shot learning approach is used at two levels of independent queries, with the second query (chain-of-thought prompting) eliciting more detailed information than the first. Each LLM chatbot's performance is evaluated on the prediction generated in terms of accuracy, sensitivity, specificity, precision, and F1 score. LLM chatbots generated a three-class outcome ("AD", "CN", or "Unsure"). When positively identifying AD, Bard produced the highest true-positives (89% recall) and highest F1 score (71%), but tended to misidentify CN as AD, with high confidence (low "Unsure" rates); for positively identifying CN, GPT-4 resulted in the highest true-negatives at 56% and highest F1 score (62%), adopting a diplomatic stance (moderate "Unsure" rates). Overall, the three LLM chatbots can identify AD vs. CN, surpassing chance-levels, but do not currently satisfy the requirements for clinical application.
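The per-class metrics named in this abstract can be sketched for a three-class outcome as follows. This is an illustrative one-vs-rest calculation, not the authors' evaluation code: it assumes an "Unsure" answer simply counts as not identifying the target class, which the abstract does not confirm, and the labels below are invented.

```python
def one_vs_rest_metrics(truth, predicted, target):
    """Precision, recall, and F1 for one target label.

    Assumption: any prediction other than `target` (including "Unsure")
    counts as a failure to identify a true `target` case.
    """
    pairs = list(zip(truth, predicted))
    tp = sum(t == target and p == target for t, p in pairs)
    fp = sum(t != target and p == target for t, p in pairs)
    fn = sum(t == target and p != target for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Invented ground truth and chatbot outputs, for illustration only.
truth = ["AD", "AD", "AD", "CN", "CN", "CN"]
predicted = ["AD", "AD", "Unsure", "AD", "CN", "Unsure"]
ad_metrics = one_vs_rest_metrics(truth, predicted, "AD")
cn_metrics = one_vs_rest_metrics(truth, predicted, "CN")
```

Under this convention a chatbot that rarely answers "Unsure" can reach high recall for AD while still mislabeling CN cases as AD, which is the pattern the abstract reports for Bard.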
Affiliation(s)
- Balamurali B.T
  - Science, Mathematics & Technology (SMT), Singapore University of Technology & Design, 8 Somapah Rd, Singapore 487372, Singapore
- Jer-Ming Chen
  - Science, Mathematics & Technology (SMT), Singapore University of Technology & Design, 8 Somapah Rd, Singapore 487372, Singapore
4
Karako K. Predictive deep learning models for cognitive risk using accessible data. Biosci Trends 2024; 18:66-72. [PMID: 38382929 DOI: 10.5582/bst.2024.01026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]
Abstract
The early detection of mild cognitive impairment (MCI) is crucial to preventing the progression of dementia. However, detection depends on patients voluntarily undergoing cognitive function tests, and testing may come too late if it is prompted only once symptoms have become apparent. Recent advances in deep learning have improved model performance, leading to applied research on a variety of prediction problems. Studies that attempt to estimate dementia and the risk of MCI from readily available data are under way, with the hope of facilitating the early detection of MCI. The data used for these predictions vary widely and include facial imagery, voice recordings, blood tests, and inertial information recorded during walking. Deep learning models that make predictions from these data sources have been proposed. This article summarizes recent research efforts to predict the risk of dementia using easily accessible data. As research progresses and more accurate predictions become feasible, simple tests could be incorporated into daily life to monitor personal health status and to facilitate early intervention.
Affiliation(s)
- Kenji Karako
  - Department of Human and Engineered Environmental Studies, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
5
Liu AA, Barr WB. Overlapping and distinct phenotypic profiles in Alzheimer's disease and late onset epilepsy: a biologically-based approach. Front Neurol 2024; 14:1260523. [PMID: 38545454 PMCID: PMC10965692 DOI: 10.3389/fneur.2023.1260523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Accepted: 12/18/2023] [Indexed: 04/05/2024] Open
Abstract
Due to shared hippocampal dysfunction, patients with Alzheimer's dementia and patients with late-onset epilepsy (LOE) both report memory decline. Multiple studies have described the epidemiological, pathological, neurophysiological, and behavioral overlap between Alzheimer's disease (AD) and LOE, implying a bi-directional relationship. We describe the neurobiological decline occurring at different spatial in AD and LOE patients, which may explain why their phenotypes overlap and differ. We provide suggestions for clinical recognition of dual presentation and novel approaches for behavioral testing that reflect an "inside-out," or biologically-based, approach to testing memory. New memory and language assessments could detect, and help treat, memory impairment in AD and LOE at an earlier, actionable stage.
Affiliation(s)
- Anli A. Liu
  - Langone Medical Center, New York University, New York, NY, United States
  - Department of Neurology, School of Medicine, New York University, New York, NY, United States
  - Neuroscience Institute, Langone Medical Center, New York University, New York, NY, United States
- William B. Barr
  - Department of Neurology, School of Medicine, New York University, New York, NY, United States
6
García-Gutiérrez F, Alegret M, Marquié M, Muñoz N, Ortega G, Cano A, De Rojas I, García-González P, Olivé C, Puerta R, García-Sanchez A, Capdevila-Bayo M, Montrreal L, Pytel V, Rosende-Roca M, Zaldua C, Gabirondo P, Tárraga L, Ruiz A, Boada M, Valero S. Unveiling the sound of the cognitive status: Machine Learning-based speech analysis in the Alzheimer's disease spectrum. Alzheimers Res Ther 2024; 16:26. [PMID: 38308366 PMCID: PMC10835990 DOI: 10.1186/s13195-024-01394-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Accepted: 01/18/2024] [Indexed: 02/04/2024]
Abstract
BACKGROUND: Advances in screening tools accessible to the general population for the early detection of Alzheimer's disease (AD) and prediction of its progression are essential for achieving timely therapeutic interventions and conducting decentralized clinical trials. This study applies Machine Learning (ML) techniques to paralinguistic features extracted directly from a brief spontaneous speech (SS) protocol. We aimed to explore the capability of ML techniques to discriminate between different degrees of cognitive impairment based on SS. Furthermore, for the first time, this study investigates the relationship between paralinguistic features from SS and cognitive function within the AD spectrum.
METHODS: Physical-acoustic features were extracted from voice recordings of patients evaluated in a memory unit who underwent an SS protocol. We implemented several ML models, evaluated via cross-validation, to identify individuals without cognitive impairment (subjective cognitive decline, SCD), with mild cognitive impairment (MCI), and with dementia due to AD (ADD). In addition, we established models capable of predicting cognitive domain performance on a comprehensive neuropsychological battery from Fundació Ace (NBACE) using SS-derived information.
RESULTS: Based on a paralinguistic analysis of sound, it is possible to identify individuals with ADD (F1 = 0.92) and MCI (F1 = 0.84). Furthermore, our models, based on physical-acoustic information, exhibited correlations greater than 0.5 for predicting the cognitive domains of attention, memory, executive functions, language, and visuospatial ability.
CONCLUSIONS: We show the potential of a brief and cost-effective SS protocol for distinguishing between different degrees of cognitive impairment and forecasting performance in cognitive domains commonly affected within the AD spectrum. Our results demonstrate a high correspondence with protocols traditionally used to assess cognitive function. Overall, this opens up novel prospects for developing screening tools and remote disease monitoring.
Affiliation(s)
- Montserrat Alegret
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
  - Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Marta Marquié
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
  - Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Nathalia Muñoz
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Gemma Ortega
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
  - Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Amanda Cano
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
  - Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Itziar De Rojas
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
  - Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Pablo García-González
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Clàudia Olivé
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Raquel Puerta
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Ainhoa García-Sanchez
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- María Capdevila-Bayo
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Laura Montrreal
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Vanesa Pytel
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Maitee Rosende-Roca
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Lluís Tárraga
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
  - Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Agustín Ruiz
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
  - Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Mercè Boada
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
  - Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Sergi Valero
  - Ace Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
  - Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
7
Al-Hammadi M, Fleyeh H, Åberg AC, Halvorsen K, Thomas I. Machine Learning Approaches for Dementia Detection Through Speech and Gait Analysis: A Systematic Literature Review. J Alzheimers Dis 2024; 100:1-27. [PMID: 38848181 PMCID: PMC11307068 DOI: 10.3233/jad-231459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/19/2024] [Indexed: 06/09/2024]
Abstract
Background: Dementia is a general term for several progressive neurodegenerative disorders, including Alzheimer's disease. Timely and accurate detection is crucial for early intervention. Advancements in artificial intelligence present significant potential for using machine learning to aid in early detection.
Objective: To summarize the state-of-the-art machine learning-based approaches for dementia prediction, focusing on non-invasive methods because they place a lower burden on patients. Specifically, the analysis of gait and speech performance can offer insights into cognitive health through clinically cost-effective screening methods.
Methods: A systematic literature review was conducted following the PRISMA protocol (Preferred Reporting Items for Systematic Reviews and Meta-Analyses). The search was performed on three electronic databases (Scopus, Web of Science, and PubMed) to identify relevant studies published between 2017 and 2022. A total of 40 papers were selected for review.
Results: The most common machine learning method employed was the support vector machine, followed by deep learning. Studies suggested the use of multimodal approaches, as they can provide more comprehensive and better prediction performance. The application of deep learning in gait studies is still in its early stages, as few studies have applied it. Moreover, including features of whole-body movement contributes to better classification accuracy. Regarding speech studies, the combination of different parameters (acoustic, linguistic, cognitive testing) produced better results.
Conclusions: The review highlights the potential of machine learning, particularly non-invasive approaches, in the early prediction of dementia. The comparable prediction accuracies of manual and automatic speech analysis indicate that a fully automated approach for dementia detection is imminent.
Affiliation(s)
- Mustafa Al-Hammadi
  - School of Information and Engineering, Dalarna University, Falun, Sweden
- Hasan Fleyeh
  - School of Information and Engineering, Dalarna University, Falun, Sweden
- Anna Cristina Åberg
  - School of Health and Welfare, Dalarna University, Falun, Sweden
  - Department of Public Health and Caring Sciences, Geriatrics, Uppsala University, Uppsala, Sweden
- Ilias Thomas
  - School of Information and Engineering, Dalarna University, Falun, Sweden
8
Wang C, Liu S, Li A, Liu J. Text Dialogue Analysis for Primary Screening of Mild Cognitive Impairment: Development and Validation Study. J Med Internet Res 2023; 25:e51501. [PMID: 38157230 PMCID: PMC10787336 DOI: 10.2196/51501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Revised: 09/28/2023] [Accepted: 11/27/2023] [Indexed: 01/03/2024] Open
Abstract
BACKGROUND: Artificial intelligence models tailored to diagnose cognitive impairment have shown excellent results. However, it is unclear whether large language models can rival specialized models using text alone.
OBJECTIVE: In this study, we explored the performance of ChatGPT for primary screening of mild cognitive impairment (MCI) and standardized the design steps and components of the prompts.
METHODS: We gathered a total of 174 participants from the DementiaBank screening and assigned 70% of them to the training set and 30% to the test set. Only text dialogues were kept. Sentences were cleaned using a macro code, followed by a manual check. The prompt consisted of 5 main parts: character setting, scoring system setting, indicator setting, output setting, and explanatory information setting. Three dimensions of variables from published studies were included: vocabulary (ie, word frequency and word ratio, phrase frequency and phrase ratio, and lexical complexity), syntax and grammar (ie, syntactic complexity and grammatical components), and semantics (ie, semantic density and semantic coherence). We used R 4.3.0 for the analysis of variables and diagnostic indicators.
RESULTS: Three additional indicators related to the severity of MCI were incorporated into the final prompt for the model. These indicators were effective in discriminating between MCI and cognitively normal participants: tip-of-the-tongue phenomenon (P<.001), difficulty with complex ideas (P<.001), and memory issues (P<.001). The final GPT-4 model achieved a sensitivity of 0.8636, a specificity of 0.9487, and an area under the curve of 0.9062 on the training set; on the test set, the sensitivity, specificity, and area under the curve reached 0.7727, 0.8333, and 0.8030, respectively.
CONCLUSIONS: ChatGPT was effective in the primary screening of participants with possible MCI. Improved standardization of prompts by clinicians would also improve the performance of the model. It is important to note that ChatGPT is not a substitute for a clinician making a diagnosis.
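A short arithmetic check helps relate the three reported indicators. For a classifier that outputs a single binary decision, the ROC curve has one operating point, and the area under it equals the mean of sensitivity and specificity. The sketch below applies that identity to the figures in this abstract; whether the authors computed their area under the curve this way is an assumption, and the function name is hypothetical.

```python
def single_point_auc(sensitivity: float, specificity: float) -> float:
    """Area under a one-point ROC curve.

    The curve runs (0, 0) -> (1 - specificity, sensitivity) -> (1, 1),
    and the area under those two segments is (sens + spec) / 2.
    """
    return (sensitivity + specificity) / 2


# Sensitivity and specificity as reported in the abstract.
train_auc = single_point_auc(0.8636, 0.9487)
test_auc = single_point_auc(0.7727, 0.8333)
```

Both results agree with the reported values of 0.9062 and 0.8030 to within rounding, which is consistent with the area under the curve having been derived from a single decision threshold rather than from graded scores.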
Affiliation(s)
- Changyu Wang
  - Department of Medical Informatics, West China Medical School, Sichuan University, Chengdu, China
  - West China College of Stomatology, Sichuan University, Chengdu, China
- Siru Liu
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
- Aiqing Li
  - Department of Neurology, West China Hospital, Sichuan University, Chengdu, China
- Jialin Liu
  - Department of Medical Informatics, West China Medical School, Sichuan University, Chengdu, China
  - Information Center, West China Hospital, Sichuan University, Chengdu, China
  - Department of Otolaryngology-Head and Neck Surgery, West China Hospital, Sichuan University, Chengdu, China
9
Bucholc M, James C, Khleifat AA, Badhwar A, Clarke N, Dehsarvi A, Madan CR, Marzi SJ, Shand C, Schilder BM, Tamburin S, Tantiangco HM, Lourida I, Llewellyn DJ, Ranson JM. Artificial intelligence for dementia research methods optimization. Alzheimers Dement 2023; 19:5934-5951. [PMID: 37639369 DOI: 10.1002/alz.13441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 07/19/2023] [Accepted: 07/23/2023] [Indexed: 08/31/2023]
Abstract
Artificial intelligence (AI) and machine learning (ML) approaches are increasingly being used in dementia research. However, several methodological challenges exist that may limit the insights we can obtain from high-dimensional data and our ability to translate these findings into improved patient outcomes. To improve reproducibility and replicability, researchers should make their well-documented code and modeling pipelines openly available. Data should also be shared where appropriate. To enhance the acceptability of models and AI-enabled systems to users, researchers should prioritize interpretable methods that provide insights into how decisions are generated. Models should be developed using multiple, diverse datasets to improve robustness and generalizability and to reduce potentially harmful bias. To improve clarity and reproducibility, researchers should adhere to reporting guidelines that are co-produced with multiple stakeholders. If these methodological challenges are overcome, AI and ML hold enormous promise for changing the landscape of dementia research and care.
HIGHLIGHTS: Machine learning (ML) can improve diagnosis, prevention, and management of dementia. Inadequate reporting of ML procedures affects reproduction/replication of results. ML models built on unrepresentative datasets do not generalize to new datasets. Obligatory metrics for certain model structures and use cases have not been defined. Interpretability and trust in ML predictions are barriers to clinical translation.
Affiliation(s)
- Magda Bucholc
  - Cognitive Analytics Research Lab, School of Computing, Engineering & Intelligent Systems, Ulster University, Derry, UK
- Charlotte James
  - NIHR Bristol Biomedical Research Centre, University Hospitals Bristol and Weston NHS Foundation Trust and University of Bristol, Bristol, UK
- Ahmad Al Khleifat
  - Department of Basic and Clinical Neuroscience, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
- AmanPreet Badhwar
  - Multiomics Investigation of Neurodegenerative Diseases (MIND) Lab, Centre de Recherche de l'Institut Universitaire de Gériatrie de Montréal, Montréal, Quebec, Canada
  - Institut de génie biomédical, Université de Montréal, Montréal, Quebec, Canada
  - Département de Pharmacologie et Physiologie, Université de Montréal, Montréal, Quebec, Canada
- Natasha Clarke
  - Multiomics Investigation of Neurodegenerative Diseases (MIND) Lab, Centre de Recherche de l'Institut Universitaire de Gériatrie de Montréal, Montréal, Quebec, Canada
- Amir Dehsarvi
  - Aberdeen Biomedical Imaging Centre, School of Medicine, Medical Sciences, and Nutrition, University of Aberdeen, Aberdeen, UK
- Sarah J Marzi
  - UK Dementia Research Institute, Imperial College London, London, UK
  - Department of Brain Sciences, Imperial College London, London, UK
- Cameron Shand
  - Centre for Medical Image Computing, Department of Computer Science, University College London, London, UK
- Brian M Schilder
  - UK Dementia Research Institute, Imperial College London, London, UK
  - Department of Brain Sciences, Imperial College London, London, UK
- Stefano Tamburin
  - Department of Neurosciences, Biomedicine and Movement Sciences, University of Verona, Verona, Italy
- David J Llewellyn
  - University of Exeter Medical School, Exeter, UK
  - The Alan Turing Institute, London, UK