1
Chou CJ, Chang CT, Chang YN, Lee CY, Chuang YF, Chiu YL, Liang WL, Fan YM, Liu YC. Screening for early Alzheimer's disease: enhancing diagnosis with linguistic features and biomarkers. Front Aging Neurosci 2024; 16:1451326. [PMID: 39376506 PMCID: PMC11456453 DOI: 10.3389/fnagi.2024.1451326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2024] [Accepted: 09/11/2024] [Indexed: 10/09/2024] [Open Access]
Abstract
Introduction: Research has shown that speech analysis demonstrates sensitivity in detecting early Alzheimer's disease (AD), but the relation between linguistic features and cognitive tests or biomarkers remains unclear. This study aimed to investigate how linguistic features help identify cognitive impairments in patients in the early stages of AD. Method: This study analyzed connected speech from 80 participants and categorized the participants into early-AD and normal control (NC) groups. The participants underwent amyloid-β positron emission tomography scans, brain magnetic resonance imaging, and comprehensive neuropsychological testing. Participants' speech data from a picture description task were examined. A total of 15 linguistic features were analyzed to classify groups and predict cognitive performance. Results: We found notable linguistic differences between the early-AD and NC groups in lexical diversity, syntactic complexity, and language disfluency. Using machine learning classifiers (SVM, KNN, and RF), we achieved up to 88% accuracy in distinguishing early-AD patients from normal controls, with mean length of utterance (MLU) and long pauses ratio (LPR) serving as core linguistic indicators. Moreover, the integration of linguistic indicators with biomarkers significantly improved predictive accuracy for AD. Regression analysis also highlighted crucial linguistic features, such as MLU, LPR, Type-to-Token ratio (TTR), and passive construction ratio (PCR), which were sensitive to changes in cognitive function. Conclusion: Findings support the efficacy of linguistic analysis as a screening tool for the early detection of AD and the assessment of subtle cognitive decline. Integrating linguistic features with biomarkers significantly improved diagnostic accuracy.
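Illustrative sketch (not from the paper): the abstract names TTR, MLU, and LPR but does not define them, so the Python below uses common textbook definitions as assumptions. TTR is taken as unique word types over total tokens, MLU as mean words per utterance, and LPR as the share of pauses at or above a threshold. The function names, the whitespace tokenization, and the 2.0-second threshold are all hypothetical choices made for this example.

```python
def type_token_ratio(tokens):
    """Unique word types divided by total tokens (0.0 for empty input)."""
    if not tokens:
        return 0.0
    return len({t.lower() for t in tokens}) / len(tokens)


def mean_length_of_utterance(utterances):
    """Average number of whitespace-separated words per utterance."""
    if not utterances:
        return 0.0
    return sum(len(u.split()) for u in utterances) / len(utterances)


def long_pauses_ratio(pause_durations, threshold_s=2.0):
    """Share of pauses lasting at least threshold_s seconds.

    Assumed definition; the 2.0 s default is a placeholder, not a
    value reported by the study.
    """
    if not pause_durations:
        return 0.0
    long_pauses = sum(1 for d in pause_durations if d >= threshold_s)
    return long_pauses / len(pause_durations)


# Toy transcript and pause durations (seconds), invented for illustration.
utterances = ["the boy is taking a cookie", "the water is overflowing"]
tokens = " ".join(utterances).split()
features = {
    "TTR": type_token_ratio(tokens),
    "MLU": mean_length_of_utterance(utterances),
    "LPR": long_pauses_ratio([0.4, 2.5, 1.1, 3.0]),
}
print(features)
```

A feature dictionary of this kind could then be passed to a classifier such as the SVM, KNN, or RF models the abstract mentions; that step is omitted here because the study's feature set and preprocessing are not described in the abstract.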
Affiliation(s)
- Chia-Ju Chou
  - Department of Neurology, Cardinal Tien Hospital, Taipei, Taiwan
- Chih-Ting Chang
  - Department of Speech-Language Pathology and Audiology, National Taipei University of Nursing and Health Sciences, Taipei, Taiwan
- Ya-Ning Chang
  - Miin Wu School of Computing, National Cheng Kung University, Tainan, Taiwan
- Yi-Fang Chuang
  - Institute of Public Health, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - International Health Program, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Health Innovation Center, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Yen-Ling Chiu
  - Department of Medical Research, Far Eastern Memorial Hospital, Taipei, Taiwan
  - Graduate Program in Biomedical Informatics and Graduate Institute of Medicine, Yuan Ze University, Taoyuan, Taiwan
  - Graduate Institute of Clinical Medicine, National Taiwan University, Taipei, Taiwan
- Wan-Lin Liang
  - Department of Neurology, Cardinal Tien Hospital, Taipei, Taiwan
- Yu-Ming Fan
  - School of Medicine, Fu Jen Catholic University, Taipei, Taiwan
  - Department of Nuclear Medicine, Cardinal Tien Hospital, Taipei, Taiwan
- Yi-Chien Liu
  - Department of Neurology, Cardinal Tien Hospital, Taipei, Taiwan
  - School of Medicine, Fu Jen Catholic University, Taipei, Taiwan
2
Skirrow C, Meepegama U, Weston J, Miller MJ, Nosheny RL, Albala B, Weiner MW, Fristed E. Storyteller in ADNI4: Application of an early Alzheimer's disease screening tool using brief, remote, and speech-based testing. Alzheimers Dement 2024. [PMID: 39234647 DOI: 10.1002/alz.14206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2024] [Revised: 07/22/2024] [Accepted: 07/27/2024] [Indexed: 09/06/2024]
Abstract
INTRODUCTION: Speech-based testing shows promise for sensitive and scalable objective screening for Alzheimer's disease (AD), but research to date offers limited evidence of generalizability. METHODS: Data were taken from the AMYPRED (Amyloid Prediction in Early Stage Alzheimer's Disease from Acoustic and Linguistic Patterns of Speech) studies (N = 101, N = 46 mild cognitive impairment [MCI]) and Alzheimer's Disease Neuroimaging Initiative 4 (ADNI4) remote digital (N = 426, N = 58 self-reported MCI, mild AD or dementia) and in-clinic (N = 57, N = 13 MCI) cohorts, in which participants provided audio-recorded responses to automated remote story recall tasks in the Storyteller test battery. Text similarity, lexical, temporal, and acoustic speech feature sets were extracted. Models predicting early AD were developed in AMYPRED and tested out of sample in the demographically more diverse cohorts in ADNI4 (> 33% from historically underrepresented populations). RESULTS: Speech models generalized well to unseen data in ADNI4 remote and in-clinic cohorts. The best-performing models evaluated text-based metrics (text similarity, lexical features: area under the curve 0.71-0.84 across cohorts). DISCUSSION: Speech-based predictions of early AD from Storyteller generalize across diverse samples. HIGHLIGHTS: The Storyteller speech-based test is an objective digital prescreener for Alzheimer's Disease Neuroimaging Initiative 4 (ADNI4). Speech-based models predictive of Alzheimer's disease (AD) were developed in the AMYPRED (Amyloid Prediction in Early Stage Alzheimer's Disease from Acoustic and Linguistic Patterns of Speech) sample (N = 101). Models were tested out of sample in ADNI4 in-clinic (N = 57) and remote (N = 426) cohorts. Models showed good generalization out of sample. Models evaluating text matching and lexical features were most predictive of early AD.
Affiliation(s)
- Melanie J Miller
  - Northern California Institute for Research and Education (NCIRE), San Francisco, California, USA
  - VA Advanced Imaging Research Center, Department of Veterans Affairs Medical Center, San Francisco, California, USA
- Rachel L Nosheny
  - Northern California Institute for Research and Education (NCIRE), San Francisco, California, USA
  - Department of Psychiatry and Behavioral Sciences, University of California San Francisco, San Francisco, California, USA
  - Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, California, USA
- Bruce Albala
  - Department of Environmental & Occupational Health, Public Health, University of California Irvine, Irvine, California, USA
  - Department of Neurology, University of California Irvine School of Medicine, Irvine, California, USA
  - Department of Pharmaceutical Sciences, University of California Irvine School of Pharmacy & Pharmaceutical Sciences, Irvine, California, USA
  - Research Service, Veterans Administration Long Beach Healthcare System, Long Beach, California, USA
- Michael W Weiner
  - Northern California Institute for Research and Education (NCIRE), San Francisco, California, USA
  - VA Advanced Imaging Research Center, Department of Veterans Affairs Medical Center, San Francisco, California, USA
  - Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, California, USA
3
Kleiman MJ, Galvin JE. High frequency post-pause word choices and task-dependent speech behavior characterize connected speech in individuals with mild cognitive impairment. medRxiv 2024:2024.02.25.24303329. [PMID: 38464237 PMCID: PMC10925339 DOI: 10.1101/2024.02.25.24303329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Background: Alzheimer's disease (AD) is characterized by progressive cognitive decline, including impairments in speech production and fluency. Mild cognitive impairment (MCI), a prodrome of AD, has also been linked with changes in speech behavior but to a more subtle degree. Objective: This study aimed to investigate whether speech behavior immediately following both filled and unfilled pauses (post-pause speech behavior) differs between individuals with MCI and healthy controls (HCs), and how these differences are influenced by the cognitive demands of various speech tasks. Methods: Transcribed speech samples were analyzed from both groups across different tasks, including immediate and delayed narrative recall, picture descriptions, and free responses. Key metrics including lexical and syntactic complexity, lexical frequency and diversity, and part of speech usage, both overall and post-pause, were examined. Results: Significant differences in pause usage were observed between groups, with a higher incidence and longer latencies following these pauses in the MCI group. Lexical frequency following filled pauses was higher among MCI participants in the free response task but not in other tasks, potentially due to the relative cognitive load of the tasks. The immediate recall task was most useful at differentiating between groups. Predictive analyses utilizing random forest classifiers demonstrated high specificity in using speech behavior metrics to differentiate between MCI and HCs. Conclusions: Speech behavior following pauses differs between MCI participants and healthy controls, with these differences being influenced by the cognitive demands of the speech tasks. These post-pause speech metrics can be easily integrated into existing speech analysis paradigms.
Affiliation(s)
- Michael J. Kleiman
  - Comprehensive Center for Brain Health, Department of Neurology, University of Miami Miller School of Medicine, Boca Raton, FL 33433
- James E. Galvin
  - Comprehensive Center for Brain Health, Department of Neurology, University of Miami Miller School of Medicine, Boca Raton, FL 33433
4
Abid M, Asif M, Khemane Z, Jawaid A, Waqar Khan A, Naveed H, Naveed T, Farah AA, Siddiq MA. Advances in artificial intelligence for diagnosing Alzheimer's disease through speech. Ann Med Surg (Lond) 2024; 86:3822-3823. [PMID: 38989201 PMCID: PMC11230774 DOI: 10.1097/ms9.0000000000002200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Accepted: 05/08/2024] [Indexed: 07/12/2024] [Open Access]
Affiliation(s)
- Mishal Abid
  - Department of Medicine, Dow University of Health Sciences
- Zoya Khemane
  - Department of Medicine, Dow University of Health Sciences
- Afia Jawaid
  - Department of Medicine, Dow University of Health Sciences
- Hufsa Naveed
  - Department of Medicine, Ziauddin Medical College, Karachi, Pakistan
- Tooba Naveed
  - Department of Medicine, Ziauddin Medical College, Karachi, Pakistan
- Asma Ahmed Farah
  - Department of Medicine, East Africa University, Boosaaso, Somalia
5
Pourramezan Fard A, Mahoor MH, Alsuhaibani M, Dodge HH. Linguistic-based Mild Cognitive Impairment detection using Informative Loss. Comput Biol Med 2024; 176:108606. [PMID: 38763068 DOI: 10.1016/j.compbiomed.2024.108606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Revised: 04/17/2024] [Accepted: 05/11/2024] [Indexed: 05/21/2024]
Abstract
This paper presents a deep learning method using Natural Language Processing (NLP) techniques to distinguish between Mild Cognitive Impairment (MCI) and Normal Cognitive (NC) conditions in older adults. We propose a framework that analyzes transcripts generated from video interviews collected within the I-CONECT study project, a randomized controlled trial aimed at improving cognitive functions through video chats. Our proposed NLP framework consists of two Transformer-based modules, namely Sentence Embedding (SE) and Sentence Cross Attention (SCA). First, the SE module captures contextual relationships between words within each sentence. Subsequently, the SCA module extracts temporal features from a sequence of sentences. These features are then used by a Multi-Layer Perceptron (MLP) to classify subjects as MCI or NC. To build a robust model, we propose a novel loss function, called InfoLoss, that considers the reduction in entropy by observing each sequence of sentences to ultimately enhance the classification accuracy. The results of our comprehensive model evaluation using the I-CONECT dataset show that our framework can distinguish between MCI and NC with an average area under the curve of 84.75%.
Affiliation(s)
- Ali Pourramezan Fard
  - Ritchie School of Engineering and Computer Science, University of Denver, Denver, CO 80208, USA
- Mohammad H Mahoor
  - Ritchie School of Engineering and Computer Science, University of Denver, Denver, CO 80208, USA; DreamFace Technologies LLC, Centennial, CO 8011, USA
- Muath Alsuhaibani
  - Ritchie School of Engineering and Computer Science, University of Denver, Denver, CO 80208, USA; Department of Electrical Engineering, Prince Sattam Bin Abdulaziz University, Al-Kharj, 11942, Saudi Arabia
- Hiroko H Dodge
  - Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
6
Bóna J. Pausing and fluency in speech of patients with relapsing-remitting multiple sclerosis. Clin Linguist Phon 2024; 38:332-344. [PMID: 37339478 DOI: 10.1080/02699206.2023.2223347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2022] [Revised: 05/31/2023] [Accepted: 06/05/2023] [Indexed: 06/22/2023]
Abstract
Multiple Sclerosis (MS) causes a variety of symptoms in speech production, such as more frequent pauses and an increase in the duration of pauses in speech. However, there is almost no data on whether the disease affects speech fluency in other ways, such as changes in the frequency of disfluencies in speech. The main question of this study is the following: if we examine speech fluency in speech tasks requiring different cognitive loads, will there be a difference between patients and controls? Twenty people with relapsing-remitting MS (3 men and 17 women) and 20 age- and education-matched control speakers (4 men and 16 women) participated in the study. Speech samples were recorded with each participant in three speech tasks: 1) spontaneous narratives about their own lives, 2) narratives about their previous day, and 3) narrative recalls based on a heard text. In the speech samples, pauses and disfluencies were annotated and the duration of pauses was measured. Then, the frequencies of pauses and disfluencies were calculated and the types of disfluencies were examined. The results show that there are differences in the frequency and duration of pauses between people with MS and controls. However, there were no significant differences in the frequency of disfluencies between the groups. The same types of disfluencies occurred with the same frequency in both groups. The results help to better understand the speech production processes in MS.
Affiliation(s)
- Judit Bóna
  - Department of Applied Linguistics and Phonetics, ELTE Eötvös Loránd University, Budapest, Hungary
7
Luz S, Haider F, Fromm D, Lazarou I, Kompatsiaris I, MacWhinney B. An Overview of the ADReSS-M Signal Processing Grand Challenge on Multilingual Alzheimer's Dementia Recognition Through Spontaneous Speech. IEEE Open J Signal Process 2024; 5:738-749. [PMID: 38957540 PMCID: PMC11218814 DOI: 10.1109/ojsp.2024.3378595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2024]
Abstract
The ADReSS-M Signal Processing Grand Challenge was held at the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023. The challenge targeted difficult automatic prediction problems of great societal and medical relevance, namely, the detection of Alzheimer's Dementia (AD) and the estimation of cognitive test scores. Participants were invited to create models for the assessment of cognitive function based on spontaneous speech data. Most of these models employed signal processing and machine learning methods. The ADReSS-M challenge was designed to assess the extent to which predictive models built based on speech in one language generalise to another language. The language data compiled and made available for ADReSS-M comprised English, for model training, and Greek, for model testing and validation. To the best of our knowledge, no previous shared research task has investigated acoustic features of the speech signal or linguistic characteristics in the context of multilingual AD detection. This paper describes the context of the ADReSS-M challenge, its data sets, its predictive tasks, the evaluation methodology we employed, our baseline models and results, and the top five submissions. The paper concludes with a summary discussion of the ADReSS-M results, and our critical assessment of the future outlook in this field.
Affiliation(s)
- Saturnino Luz
  - Usher Institute, Edinburgh Medical School, The University of Edinburgh, Edinburgh EH16 4UX, U.K.
- Fasih Haider
  - School of Engineering, The University of Edinburgh, Edinburgh EH9 3JW, U.K.
- Davida Fromm
  - Department of Psychology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Ioulietta Lazarou
  - Information Technologies Institute, CERTH, Thessaloniki, Thermi-Thessaloniki 57001, Greece
- Ioannis Kompatsiaris
  - Information Technologies Institute, CERTH, Thessaloniki, Thermi-Thessaloniki 57001, Greece
- Brian MacWhinney
  - Department of Psychology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
8
Fromm D, Dalton SG, Brick A, Olaiya G, Hill S, Greenhouse J, MacWhinney B. The Case of the Cookie Jar: Differences in Typical Language Use in Dementia. J Alzheimers Dis 2024; 100:1417-1434. [PMID: 38995772 PMCID: PMC11380261 DOI: 10.3233/jad-230844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/14/2024]
Abstract
Background: Findings from language sample analyses can provide efficient and effective indicators of cognitive impairment in older adults. Objective: This study used newly automated core lexicon analyses of Cookie Theft picture descriptions to assess differences in typical use across three groups. Methods: Participants included adults without diagnosed cognitive impairments (Control), adults diagnosed with Alzheimer's disease (ProbableAD), and adults diagnosed with mild cognitive impairment (MCI). Cookie Theft picture descriptions were transcribed and analyzed using CLAN. Results: Results showed that the ProbableAD group used significantly fewer core lexicon words overall than the MCI and Control groups. For core lexicon content words (nouns, verbs), however, both the MCI and ProbableAD groups produced significantly fewer words than the Control group. The groups did not differ in their use of core lexicon function words. The ProbableAD group was also slower to produce most of the core lexicon words than the MCI and Control groups. The MCI group was slower than the Control group for only two of the core lexicon content words. All groups mentioned a core lexicon word in the top left quadrant of the picture early in the description. The ProbableAD group was then significantly slower than the other groups to mention a core lexicon word in the other quadrants. Conclusions: This standard and simple-to-administer task reveals group differences in overall core lexicon scores and the amount of time until the speaker produces the key items. Clinicians and researchers can use these tools for both early assessment and measurement of change over time.
Affiliation(s)
- Davida Fromm
  - Department of Psychology, Carnegie Mellon University, Pittsburgh, PA, USA
- Sarah Grace Dalton
  - Department of Speech Pathology and Audiology, Marquette University, Milwaukee, WI, USA
- Alexander Brick
  - Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Gbenuola Olaiya
  - Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Sophia Hill
  - Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Joel Greenhouse
  - Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Brian MacWhinney
  - Department of Psychology, Carnegie Mellon University, Pittsburgh, PA, USA
9
Liu J, Fu F, Li L, Yu J, Zhong D, Zhu S, Zhou Y, Liu B, Li J. Efficient Pause Extraction and Encode Strategy for Alzheimer's Disease Detection Using Only Acoustic Features from Spontaneous Speech. Brain Sci 2023; 13:477. [PMID: 36979287 PMCID: PMC10046767 DOI: 10.3390/brainsci13030477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Revised: 03/06/2023] [Accepted: 03/10/2023] [Indexed: 03/14/2023] [Open Access]
Abstract
Clinical studies have shown that speech pauses can reflect the cognitive function differences between Alzheimer's Disease (AD) and non-AD patients, but the value of pause information in AD detection has not been fully explored. Herein, we propose a speech pause feature extraction and encoding strategy for AD detection based only on acoustic signals. First, a voice activity detection (VAD) method was constructed to detect pause/non-pause features and encode them as binary pause sequences that are easier to calculate. Then, an ensemble machine-learning-based approach was proposed for the classification of AD from the participants' spontaneous speech, based on the VAD Pause feature sequence and common acoustic feature sets (ComParE and eGeMAPS). The proposed pause feature sequence was verified in five machine-learning models. The validation data included two public challenge datasets (ADReSS and ADReSSo, English voice) and a local dataset (10 audio recordings containing five patients and five controls, Chinese voice). Results showed that the VAD Pause feature was more effective than common feature sets (ComParE: 6373 features and eGeMAPS: 88 features) for AD classification, and that the ensemble method improved the accuracy by more than 5% compared to several baseline methods (8% on the ADReSS dataset; 5.9% on the ADReSSo dataset). Moreover, the pause-sequence-based AD detection method could achieve 80% accuracy on the local dataset. Our study further demonstrated the potential of pause information in speech-based AD detection, and also contributed to a more accessible and general pause feature extraction and encoding method for AD detection.
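Illustrative sketch (not from the paper): the abstract says a VAD step encodes speech as a binary pause sequence but does not say which VAD method was used. The Python below substitutes a simple frame-energy threshold as a stand-in, purely to show what such an encoding could look like. The function name, the 25 ms frame length, and the 0.01 RMS threshold are hypothetical values chosen for this example.

```python
import math


def pause_sequence(samples, sample_rate, frame_ms=25, energy_threshold=0.01):
    """Encode audio as a binary list: 1 = pause frame, 0 = speech frame.

    Uses an RMS energy threshold per non-overlapping frame. This is an
    assumed stand-in for the paper's VAD, not the authors' method.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    sequence = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        sequence.append(1 if rms < energy_threshold else 0)
    return sequence


# Synthetic signal at 8 kHz: 0.1 s tone, 0.1 s silence, 0.1 s tone.
rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 220 * n / rate) for n in range(800)]
signal = tone + [0.0] * 800 + tone
encoded = pause_sequence(signal, rate)
print(encoded)
```

On real recordings a fixed energy threshold is sensitive to background noise and microphone gain, so a trained or adaptive VAD would likely be needed; the sketch only shows the shape of the binary encoding.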
Affiliation(s)
- Jiamin Liu
  - Jiangsu Province Engineering Research Center of Smart Wearable and Rehabilitation Devices, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Fan Fu
  - Jiangsu Province Engineering Research Center of Smart Wearable and Rehabilitation Devices, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Liang Li
  - Jiangsu Province Engineering Research Center of Smart Wearable and Rehabilitation Devices, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Junxiao Yu
  - Jiangsu Province Engineering Research Center of Smart Wearable and Rehabilitation Devices, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Dacheng Zhong
  - Jiangsu Province Engineering Research Center of Smart Wearable and Rehabilitation Devices, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Songsheng Zhu
  - Jiangsu Province Engineering Research Center of Smart Wearable and Rehabilitation Devices, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Yuxuan Zhou
  - Jiangsu Province Engineering Research Center of Smart Wearable and Rehabilitation Devices, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Bin Liu
  - Jiangsu Province Engineering Research Center of Smart Wearable and Rehabilitation Devices, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Jianqing Li
  - Jiangsu Province Engineering Research Center of Smart Wearable and Rehabilitation Devices, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
  - The State Key Laboratory of Bioelectronics, School of Instrument Science and Engineering, Southeast University, Nanjing 211166, China
10
Agbavor F, Liang H. Predicting dementia from spontaneous speech using large language models. PLOS Digit Health 2022; 1:e0000168. [PMID: 36812634 PMCID: PMC9931366 DOI: 10.1371/journal.pdig.0000168] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 08/02/2022] [Accepted: 11/21/2022] [Indexed: 12/24/2022]
Abstract
Language impairment is an important biomarker of neurodegenerative disorders such as Alzheimer's disease (AD). Artificial intelligence (AI), particularly natural language processing (NLP), has recently been increasingly used for early prediction of AD through speech. Yet, relatively few studies exist on using large language models, especially GPT-3, to aid in the early diagnosis of dementia. In this work, we show for the first time that GPT-3 can be utilized to predict dementia from spontaneous speech. Specifically, we leverage the vast semantic knowledge encoded in the GPT-3 model to generate text embedding, a vector representation of the transcribed text from speech, that captures the semantic meaning of the input. We demonstrate that the text embedding can be reliably used to (1) distinguish individuals with AD from healthy controls, and (2) infer the subject's cognitive testing score, both solely based on speech data. We further show that text embedding considerably outperforms the conventional acoustic feature-based approach and even performs competitively with prevailing fine-tuned models. Together, our results suggest that GPT-3 based text embedding is a viable approach for AD assessment directly from speech and has the potential to improve early diagnosis of dementia.
Affiliation(s)
- Felix Agbavor
  - School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, United States of America
- Hualou Liang
  - School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, United States of America
11
Yang Q, Li X, Ding X, Xu F, Ling Z. Deep learning-based speech analysis for Alzheimer's disease detection: a literature review. Alzheimers Res Ther 2022; 14:186. [PMID: 36517837 PMCID: PMC9749308 DOI: 10.1186/s13195-022-01131-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/23/2022] [Accepted: 11/23/2022] [Indexed: 12/23/2022]
Abstract
BACKGROUND: Alzheimer's disease has become one of the most common neurodegenerative diseases worldwide, and it seriously affects the health of the elderly. Early detection and intervention are currently the most effective prevention methods. Compared with traditional detection methods such as scale tests, electroencephalograms, and magnetic resonance imaging, speech analysis is more convenient for automatic large-scale Alzheimer's disease detection and has attracted extensive attention from researchers. In particular, deep learning-based speech analysis and language processing techniques for Alzheimer's disease detection have been studied and have achieved impressive results. METHODS: To integrate the latest research progress, hundreds of relevant papers from ACM, DBLP, IEEE, PubMed, Scopus, Web of Science electronic databases, and other sources were retrieved. We used these keywords for paper search: (Alzheimer OR dementia OR cognitive impairment) AND (speech OR voice OR audio) AND (deep learning OR neural network). CONCLUSIONS: Fifty-two papers were finally retained after screening. We reviewed and presented the speech databases, deep learning methods, and model performances of these studies. In the end, we pointed out the main trends and limitations in the current studies and provided a direction for future research.
Affiliation(s)
- Qin Yang
  - iFlytek Research, iFlytek Co.Ltd, Hefei, China
- Xin Li
  - NELSLIP, University of Science and Technology of China, Hefei, China
  - iFlytek Research, iFlytek Co.Ltd, Hefei, China
- Xinyun Ding
  - iFlytek Research, iFlytek Co.Ltd, Hefei, China
- Feiyang Xu
  - iFlytek Research, iFlytek Co.Ltd, Hefei, China
- Zhenhua Ling
  - NELSLIP, University of Science and Technology of China, Hefei, China
12
Sun X, Sun X, Wang Q, Wang X, Feng L, Yang Y, Jing Y, Yang C, Zhang S. Biosensors toward behavior detection in diagnosis of Alzheimer’s disease. Front Bioeng Biotechnol 2022; 10:1031833. [PMID: 36338126 PMCID: PMC9626796 DOI: 10.3389/fbioe.2022.1031833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Accepted: 10/03/2022] [Indexed: 11/30/2022] [Open Access]
Abstract
In recent years, a huge number of individuals all over the world, elderly people, in particular, have been suffering from Alzheimer’s disease (AD), which has had a significant negative impact on their quality of life. To intervene early in the progression of the disease, accurate, convenient, and low-cost detection technologies are gaining increased attention. As a result of their multiple merits in the detection and assessment of AD, biosensors are being frequently utilized in this field. Behavioral detection is a prospective way to diagnose AD at an early stage, which is a more objective and quantitative approach than conventional neuropsychological scales. Furthermore, it provides a safer and more comfortable environment than those invasive methods (such as blood and cerebrospinal fluid tests) and is more economical than neuroimaging tests. Behavior detection is gaining increasing attention in AD diagnosis. In this review, cutting-edge biosensor-based devices for AD diagnosis together with their measurement parameters and diagnostic effectiveness have been discussed in four application subtopics: body movement behavior detection, eye movement behavior detection, speech behavior detection, and multi-behavior detection. Finally, the characteristics of behavior detection sensors in various application scenarios are summarized and the prospects of their application in AD diagnostics are presented as well.
Affiliation(s)
- Xiaotong Sun
  - Ningbo Innovation Center, School of Mechanical Engineering, Zhejiang University, Ningbo, China
  - Faculty of Science and Engineering, University of Nottingham Ningbo, Ningbo, China
- Xu Sun
  - Faculty of Science and Engineering, University of Nottingham Ningbo, Ningbo, China
  - Nottingham Ningbo China Beacons of Excellence Research and Innovation Institute, University of Nottingham Ningbo, Ningbo, China
  - *Correspondence: Sheng Zhang; Xu Sun
- Qingfeng Wang
  - Nottingham University Business School China, University of Nottingham Ningbo China, Ningbo, Zhejiang, China
- Xiang Wang
  - Ningbo Innovation Center, School of Mechanical Engineering, Zhejiang University, Ningbo, China
  - Faculty of Science and Engineering, University of Nottingham Ningbo, Ningbo, China
- Luying Feng
  - Ningbo Innovation Center, School of Mechanical Engineering, Zhejiang University, Ningbo, China
- Yifan Yang
  - Ningbo Innovation Center, School of Mechanical Engineering, Zhejiang University, Ningbo, China
  - Faculty of Science and Engineering, University of Nottingham Ningbo, Ningbo, China
- Ying Jing
  - Business School, NingboTech University, Ningbo, China
- Canjun Yang
  - Ningbo Innovation Center, School of Mechanical Engineering, Zhejiang University, Ningbo, China
- Sheng Zhang
  - Ningbo Innovation Center, School of Mechanical Engineering, Zhejiang University, Ningbo, China
  - Faculty of Science and Engineering, University of Nottingham Ningbo, Ningbo, China
  - *Correspondence: Sheng Zhang; Xu Sun
13
Fristed E, Skirrow C, Meszaros M, Lenain R, Meepegama U, Papp KV, Ropacki M, Weston J. Leveraging speech and artificial intelligence to screen for early Alzheimer's disease and amyloid beta positivity. Brain Commun 2022; 4:fcac231. [PMID: 36381988 PMCID: PMC9639797 DOI: 10.1093/braincomms/fcac231] [Received: 04/19/2022] [Revised: 06/30/2022] [Accepted: 09/13/2022] [Indexed: 08/27/2023]
Abstract
Early detection of Alzheimer's disease is required to identify patients suitable for disease-modifying medications and to improve access to non-pharmacological preventative interventions. Prior research shows detectable changes in speech in Alzheimer's dementia and its clinical precursors. The current study assesses whether a fully automated speech-based artificial intelligence system can detect cognitive impairment and amyloid beta positivity, which characterize early stages of Alzheimer's disease. Two hundred participants (aged 54-85, mean 70.6; 114 female, 86 male) from sister studies in the UK (NCT04828122) and the USA (NCT04928976) completed the same assessments and were combined in the current analyses. Participants were recruited from prior clinical trials where amyloid beta status (97 amyloid positive, 103 amyloid negative, as established via PET or CSF test) and clinical diagnostic status were known (94 cognitively unimpaired, 106 with mild cognitive impairment or mild Alzheimer's disease). The automatic story recall task was administered during supervised in-person or telemedicine assessments, where participants were asked to recall stories immediately and after a brief delay. An artificial intelligence text-pair evaluation model produced vector-based outputs from the original story text and the recorded and transcribed participant recalls, quantifying differences between them. Vector-based representations were fed into logistic regression models, trained with tournament leave-pair-out cross-validation analysis to predict amyloid beta status (primary endpoint), as well as mild cognitive impairment and amyloid beta status in diagnostic subgroups (secondary endpoints). Predictions were assessed by the area under the receiver operating characteristic curve for the test result in comparison with reference standards (diagnostic and amyloid status).
Simulation analysis evaluated two potential benefits of speech-based screening: (i) mild cognitive impairment screening in primary care compared with the Mini-Mental State Exam, and (ii) pre-screening prior to PET scanning when identifying an amyloid positive sample. Speech-based screening predicted amyloid beta positivity (area under the curve = 0.77) and mild cognitive impairment or mild Alzheimer's disease (area under the curve = 0.83) in the full sample, and predicted amyloid beta in subsamples (mild cognitive impairment or mild Alzheimer's disease: area under the curve = 0.82; cognitively unimpaired: area under the curve = 0.71). Simulation analyses indicated that in primary care, speech-based screening could modestly improve detection of mild cognitive impairment (+8.5%), while reducing false positives (-59.1%). Furthermore, speech-based amyloid pre-screening was estimated to reduce the number of PET scans required by 35.3% and 35.5% in individuals with mild cognitive impairment and cognitively unimpaired individuals, respectively. Speech-based assessment offers accessible and scalable screening for mild cognitive impairment and amyloid beta positivity.
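The evaluation idea in this abstract, estimating the area under the curve by leave-pair-out cross-validation, can be sketched roughly as follows. This is a minimal illustration under stated assumptions and not the authors' pipeline: it implements the plain leave-pair-out estimate (every positive-negative pair held out in turn) rather than the tournament variant, the nearest-centroid scorer `fit_scorer` is a hypothetical stand-in for their logistic regression, and the two-dimensional feature vectors are invented toy data.

```python
from itertools import product


def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def fit_scorer(X, y):
    """Fit a nearest-centroid scorer (hypothetical stand-in for logistic regression).

    Returns a function whose score grows as a vector moves from the
    negative-class centroid toward the positive-class centroid.
    """
    pos = centroid([x for x, label in zip(X, y) if label == 1])
    neg = centroid([x for x, label in zip(X, y) if label == 0])
    direction = [p - n for p, n in zip(pos, neg)]
    midpoint = [(p + n) / 2 for p, n in zip(pos, neg)]

    def score(x):
        return sum(d * (xi - m) for d, xi, m in zip(direction, x, midpoint))

    return score


def leave_pair_out_auc(X, y):
    """Estimate AUC by holding out every (positive, negative) pair in turn.

    Needs at least two examples per class so each training fold has both.
    """
    pos_idx = [i for i, label in enumerate(y) if label == 1]
    neg_idx = [i for i, label in enumerate(y) if label == 0]
    wins = 0.0
    for i, j in product(pos_idx, neg_idx):
        train = [k for k in range(len(y)) if k not in (i, j)]
        score = fit_scorer([X[k] for k in train], [y[k] for k in train])
        s_pos, s_neg = score(X[i]), score(X[j])
        # A correctly ordered pair counts 1, a tie counts 0.5.
        wins += 1.0 if s_pos > s_neg else 0.5 if s_pos == s_neg else 0.0
    return wins / (len(pos_idx) * len(neg_idx))


# Invented toy data (not from the study): 1 = amyloid positive, 0 = negative.
X = [[0.9, 0.8], [0.8, 0.6], [0.7, 0.9], [0.2, 0.1], [0.3, 0.4], [0.1, 0.3]]
y = [1, 1, 1, 0, 0, 0]
auc = leave_pair_out_auc(X, y)
print(f"leave-pair-out AUC: {auc:.2f}")
```

Because both members of each held-out pair are scored by a model that never saw them, the fraction of correctly ordered pairs gives an AUC estimate that is less optimistic than scoring the training data itself.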
Affiliation(s)
- Kathryn V Papp
  - Center for Alzheimer Research and Treatment, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, 02115, USA
  - Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, 02114, USA
- Michael Ropacki
  - Strategic Global Research & Development, Temecula, California, 94019, USA
14
Liu N, Luo K, Yuan Z, Chen Y. A Transfer Learning Method for Detecting Alzheimer's Disease Based on Speech and Natural Language Processing. Front Public Health 2022; 10:772592. [PMID: 35493375 PMCID: PMC9043451 DOI: 10.3389/fpubh.2022.772592] [Received: 09/08/2021] [Accepted: 02/24/2022] [Indexed: 11/13/2022]
Abstract
Alzheimer's disease (AD) is a neurodegenerative disease that is difficult to detect using convenient and reliable methods. Language change in patients with AD is an important signal of their cognitive status and can potentially help in early diagnosis. In this study, we developed a transfer learning model based on speech and natural language processing (NLP) technology for the early diagnosis of AD. The lack of large datasets limits the use of complex neural network models without feature engineering, and transfer learning can effectively address this problem. The model is first pre-trained on large text datasets to obtain a pre-trained language model; an AD classification model is then trained on top of it using small training sets. Concretely, a distilled bidirectional encoder representations from transformers (DistilBERT) embedding, combined with a logistic regression classifier, is used to distinguish AD from normal controls. The model was evaluated on the Alzheimer's Dementia Recognition through Spontaneous Speech (ADReSS) 2020 challenge dataset, which is balanced between 78 healthy controls (HC) and 78 patients with AD. The accuracy of the proposed model is 0.88, which is almost equivalent to the champion score in the challenge and a considerable improvement over the baseline of 75% established by the challenge organizers. The transfer learning method in this study thus improves AD prediction: it not only reduces the need for feature engineering but also addresses the lack of sufficiently large datasets.
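The classification stage described in this abstract, a logistic regression classifier fit on fixed embeddings from a pre-trained language model, can be sketched as below. This is a minimal sketch under stated assumptions rather than the authors' implementation: the three-dimensional vectors are invented stand-ins for DistilBERT embeddings (which are far higher-dimensional and would come from a pre-trained model), and the gradient-descent trainer `train_logreg` and its hyperparameters are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent on the log loss."""
    dim = len(X[0])
    w, b = [0.0] * dim, 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, label in zip(X, y):
            # Prediction error for this example drives the gradient.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - label
            for d in range(dim):
                grad_w[d] += err * x[d]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 (AD) if the predicted probability is at least 0.5, else 0 (control)."""
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0


# Invented stand-in embeddings (assumption): 1 = AD, 0 = healthy control.
train_X = [[0.9, 0.1, 0.2], [0.8, 0.2, 0.1], [0.7, 0.3, 0.3],
           [0.1, 0.9, 0.2], [0.2, 0.8, 0.1], [0.3, 0.7, 0.3]]
train_y = [1, 1, 1, 0, 0, 0]
w, b = train_logreg(train_X, train_y)

test_X = [[0.85, 0.15, 0.2], [0.15, 0.85, 0.2]]
test_y = [1, 0]
preds = [predict(w, b, x) for x in test_X]
accuracy = sum(p == t for p, t in zip(preds, test_y)) / len(test_y)
print(f"held-out accuracy on toy data: {accuracy:.2f}")
```

Keeping the embedding model frozen and training only a small linear classifier is what makes the approach workable on a dataset of this size, since only a few parameters are learned from the labelled transcripts.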
Affiliation(s)
- Ning Liu
  - School of Public Health, Hangzhou Normal University, Hangzhou, China
  - Department of Mathematics and Computer Science, Fujian Provincial Key Laboratory of Data-Intensive Computing, Quanzhou Normal University, Quanzhou, China
- Kexue Luo
  - Tongde Hospital of Zhejiang Province Geriatrics, Hangzhou, China
- Zhenming Yuan
  - School of Information Science and Technology, Hangzhou Normal University, Hangzhou, China
- Yan Chen
  - International Unresponsive Wakefulness Syndrome and Consciousness Science Institute, Hangzhou Normal University, Hangzhou, China
  - *Correspondence: Yan Chen